A simple example of converting text to HTML with Python

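This article presents a small Python program that converts a plain-text file to HTML and applies basic markup to improve readability. It splits the text into blocks separated by blank lines, wraps the first block in a heading and every other block in a paragraph, and uses a regular expression to turn *emphasized* text into <em> elements.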


Sample text file test_input.txt:

Welcome to World Wide Spam, Inc.



These are the corporate web pages of *World Wide Spam*, Inc. We hope
you find your stay enjoyable, and that you will sample many of our
products.

A short history of the company

World Wide Spam was started in the summer of 2000. The business
concept was to ride the dot-com wave and to make money both through
bulk email and by selling canned meat online.

After receiving several complaints from customers who weren't
satisfied by their bulk email, World Wide Spam altered their profile,
and focused 100% on canned goods. Today, they rank as the world's
13,892nd online supplier of SPAM.

Destinations

From this page you may visit several of our interesting web pages:

- What is SPAM? (http://wwspam.fu/whatisspam)

- How do they make it? (http://wwspam.fu/howtomakeit)

- Why should I eat it? (http://wwspam.fu/whyeatif)

How to get in touch with us

You can get in touch with us in *many* ways: By phone (555-1234), by
email (wwspam@wwspam.fu) or by visiting our customer feedback page
(http://wwspam.fu/feedback).

 

 

The module util.py, which splits the text file into blocks:

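def lines(file):
    # Yield every line, then one extra blank line so the last block is flushed.
    for line in file:
        yield line
    yield '\n'

def blocks(file):
    # Collect consecutive non-blank lines and yield each group as one string.
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []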
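To see what blocks() produces, the following minimal sketch feeds it a short in-memory text. Using io.StringIO in place of a real file is an assumption made for illustration, and the two functions are repeated so the snippet runs on its own.

```python
import io

def lines(file):
    for line in file:
        yield line
    yield '\n'  # sentinel blank line that flushes the final block

def blocks(file):
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []

# Hypothetical sample text: runs of blank lines separate the blocks.
sample = io.StringIO("Title\n\n\nFirst line\nsecond line\n\nLast block")
result = list(blocks(sample))
print(result)
```

Several blank lines in a row do not produce empty blocks, and the last block is returned even though the text does not end with a blank line, because lines() appends one.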

 

The simple conversion module simple_markup.py:

import sys
import re
from util import blocks

print('<html><body>')

title = True  # the first block becomes the page heading
for block in blocks(sys.stdin):
    # Turn *text* into <em>text</em>
    block = re.sub(r'\*(.+?)\*', r'<em>\1</em>', block)
    if title:
        print('<h1>')
        print(block)
        print('</h1>')
        title = False
    else:
        print('<p>')
        print(block)
        print('</p>')

print('</body></html>')
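The emphasis rule depends on the non-greedy quantifier in \*(.+?)\*. The sketch below compares it with a greedy pattern on a line that contains two emphasized words. The helper name mark_emphasis and the sample string are made up for this illustration.

```python
import re

EMPHASIS = re.compile(r'\*(.+?)\*')  # non-greedy: stop at the nearest closing '*'

def mark_emphasis(text):
    """Replace each *word* with <em>word</em> (hypothetical helper)."""
    return EMPHASIS.sub(r'<em>\1</em>', text)

sample = 'in *many* ways, *really*'
lazy = mark_emphasis(sample)
# Greedy variant for comparison: '.+' runs on to the last '*' in the string.
greedy = re.sub(r'\*(.+)\*', r'<em>\1</em>', sample)

print(lazy)
print(greedy)
```

With the non-greedy pattern each pair of asterisks becomes its own <em> element. The greedy pattern would wrap everything from the first asterisk to the last one in a single element.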

Conversion command: python simple_markup.py < test_input.txt > test_output.html

After the command runs, the current directory contains an HTML file, test_output.html. Open it in a browser to see the result.
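The same conversion can also be wrapped in a function that returns the HTML as a string, which is easier to test than a script tied to stdin and stdout. This is only a sketch under a few assumptions: the name to_html is hypothetical, blocks() flushes the last block after the loop instead of using the sentinel line from util.py, and the call to html.escape is an addition that the original script does not make.

```python
import html
import io
import re

def blocks(file):
    """Yield groups of consecutive non-blank lines as single strings."""
    block = []
    for line in file:
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []
    if block:  # flush the last block when the text has no trailing blank line
        yield ''.join(block).strip()

def to_html(file):
    """Return an HTML page: first block as <h1>, the rest as <p>."""
    parts = ['<html><body>']
    for index, block in enumerate(blocks(file)):
        # Assumption: escape &, < and > first (not done in the original script).
        block = html.escape(block, quote=False)
        block = re.sub(r'\*(.+?)\*', r'<em>\1</em>', block)
        tag = 'h1' if index == 0 else 'p'
        parts.append('<{0}>\n{1}\n</{0}>'.format(tag, block))
    parts.append('</body></html>')
    return '\n'.join(parts)

page = to_html(io.StringIO("Spam & Co\n\nWe sell *canned* meat.\n"))
print(page)
```

To convert a real file, pass an open file object, for example to_html(open('test_input.txt')), and write the returned string to test_output.html.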

For a commented walkthrough of the code, see http://1.imablog.sinaapp.com/exam-translate-txt-html/

Reposted from: https://www.cnblogs.com/micky1989/p/3281825.html
