Reading TXT and Excel Text, Replacing Keywords, and Writing Text Output: Follow-up Changes

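This post describes a way to do text replacement with a Word template saved in XML format, which solves the problem of formatting specific fields in a Word document. Data read from an Excel sheet is applied to an email template to generate formatted emails in bulk.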


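An earlier post described a program that reads looked-up paper information from an Excel sheet, reads an email template from a TXT file, replaces the keywords in the template, and writes the result out as a Word document. For details, see http://blog.youkuaiyun.com/py_wang/article/details/79186524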

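The program's basic functions worked, but the font of the output Word text could not be changed. Everything came out in the default style, so the goal of showing important fields in red was not met. After looking up some material online (see http://www.cnblogs.com/lcngu/p/5247179.html), I chose the simplest approach: set up the Word template with the desired formatting, save it as XML, read it and replace the keywords with the earlier program, and write the result out with a .doc extension. This produced Word output with formatting intact.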

The Word template is as follows:

Dear Professor %Author%,

  Your article titled as "  %PaperTitle%  " in XXXXXXXXXXXXXX, volume %volumeNum% issue %issueNum%. The latest new citation in Web of Science is %quotedNum% up to February, 2018.

  Now the journal has been indexed by SCI. Please promote and enlarge the influence of your article as much as possible in the future.

     Many thanks for your help!

     Best regards,

     Editorial Office of XXXXXXXXXXXXXXXXXXXXXX

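Saved as XML, the template looks as follows. The XML file contains all of the text in the Word template and records the font formatting of that text.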

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
	<pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512">
		<pkg:xmlData>
			<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
				<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/>
				<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/>
				<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
			</Relationships>
		</pkg:xmlData>
	</pkg:part>
	<pkg:part pkg:name="/word/_rels/document.xml.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="256">
		<pkg:xmlData>
			<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
				<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml"/>
			</Relationships>
		</pkg:xmlData>
	</pkg:part>
	<pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
		<pkg:xmlData>
			<w:document xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" 
xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se w16cid wp14">
				<w:body>
					<w:p w14:paraId="3BD95754" w14:textId="77777777" w:rsidR="00E65E0C" w:rsidRPr="0085072C" w:rsidRDefault="00E65E0C" w:rsidP="00E65E0C">
						<w:pPr>
							<w:rPr>
								<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
							</w:rPr>
						</w:pPr>
						<w:r w:rsidRPr="0085072C">
							<w:rPr>
								<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
							</w:rPr>
							<w:t>Dear Professor %Author%,</w:t>
						</w:r>
					</w:p>
					<w:p w14:paraId="1BF7F76D" w14:textId="77777777" w:rsidR="00E65E0C" w:rsidRPr="0085072C" w:rsidRDefault="00E65E0C" w:rsidP="00E65E0C">
						<w:pPr>
							<w:rPr>
								<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
							</w:rPr>
						</w:pPr>
					</w:p>
					<w:p w14:paraId="12DBD622" w14:textId="77777777" w:rsidR="00E65E0C" w:rsidRPr="0085072C" w:rsidRDefault="00E65E0C" w:rsidP="00E65E0C">
						<w:pPr>
							<w:rPr>
								<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
							</w:rPr>

						</w:r>...............................................................................

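Next, the original program is modified. Only the text replacement class changes.

Because the XML records the content and formatting of a single Word document, the letters can no longer be merged into one Word document. The text replacement class now has two output methods: saveBySingle(), which writes each letter to its own Word file, and saveByMerge(), which writes all letters into one Word file, as shown below.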

package excelOperate;

import java.util.ArrayList;
import java.util.Iterator;

public class StringReplace {
	static String letterName = "D:/Program_Files/eclipse se/workspace/excelOperate/letterMode.xml";
	static String fileName = "D:/Program_Files/eclipse se/workspace/excelOperate/2016-2017.xls";
	static String dateString = "2018-03-03";
	static int dateCol = ExcelFile.getDateCol(dateString, fileName);
	static String[] repedString = { "%Author%", "%PaperTitle%", "%volumeNum%", "%issueNum%", "%quotedNum%"};
	static String letter = TextFile.read(letterName);

	/**
	 * get the content of a rowContent
	 * 
	 * @param rowContent
	 * @return
	 */
	public String[] getRepString(RowContent rowContent) {
		String[] repString = new String[5]; // one value per keyword in repedString
		repString[0] = rowContent.getAuthor();
		repString[1] = rowContent.getPaperTitle();
		repString[2] = rowContent.getVolumeNum() + "";
		repString[3] = rowContent.getIssueNum() + "";
		repString[4] = rowContent.getQuotedNum() + "";
		return repString;
	}

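	/**
	 * Recursively replaces the first i keywords of the letter template.
	 * Uses String.replace() so that keywords and values are treated as
	 * literal text; replaceAll() would interpret '$' and '\' in a value.
	 * 
	 * @param i number of keywords to replace
	 * @param rowContent the Excel row that supplies the replacement values
	 * @return the letter with the first i keywords replaced
	 */
	public String replace(int i, RowContent rowContent) {
		if (i == 0) {
			return letter;
		} else {
			return replace(i - 1, rowContent).replace(repedString[i - 1], getRepString(rowContent)[i - 1]);
		}
	}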

	/**
	 * save these letters by merge
	 * 
	 * @param lettered
	 */
	public void saveByMerge(ArrayList<String> lettered) {
		StringBuilder out = new StringBuilder();
		Iterator<String> istring = lettered.iterator();
		while (istring.hasNext()) {
			out.append(istring.next());
		}
		System.out.println(out);
		TextFile.write("D:/Program_Files/eclipse se/workspace/excelOperate/letter/out-" + dateString + ".doc",
				out.toString());
	}
	
	/**
	 * save these letters by single
	 * 
	 * @param lettered
	 */
	public void saveBySingle(ArrayList<String> lettered) {
		Iterator<String> istring = lettered.iterator();
		int i = 1;
		while (istring.hasNext()) {
			String letter = istring.next();
			System.out.println(letter);
			TextFile.write("D:/Program_Files/eclipse se/workspace/excelOperate/letter/out-" + dateString + " - " + i + ".doc",
					letter.toString());
			i++;
		}
	}

	public static void main(String[] args) {
		ArrayList<RowContent> rowContents = ExcelFile.getAllRowContent(fileName, dateCol);
		Iterator<RowContent> it = rowContents.iterator();
		ArrayList<String> lettered = new ArrayList<>();
		StringReplace stringReplace = new StringReplace();
		while (it.hasNext()) {
			lettered.add(stringReplace.replace(5, it.next()));
		}
		// Merged output (all letters in one file)
		//stringReplace.saveByMerge(lettered);
		stringReplace.saveBySingle(lettered);
	}
} 
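
The keyword replacement step can also be written as a plain loop. The following is a minimal sketch, not the author's code: the class name PlaceholderDemo, the method fill, and the sample values are all hypothetical. It relies on String.replace, which treats both arguments as literal text, whereas String.replaceAll gives '$' and '\' special meaning in the replacement string, so a paper title containing either character could fail or be altered.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PlaceholderDemo {
    /** Replaces every placeholder key in the template with its value, literally. */
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            // String.replace does no regex processing, so '$' and '\'
            // in a value are copied into the output unchanged.
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the real values come from the Excel row.
        Map<String, String> values = new LinkedHashMap<>();
        values.put("%Author%", "Wang");
        values.put("%PaperTitle%", "Pricing at $5");
        String template = "<w:t>Dear Professor %Author%, your article %PaperTitle%</w:t>";
        System.out.println(fill(template, values));
    }
}
```

A LinkedHashMap keeps the keywords in insertion order, which mirrors the fixed order of the repedString array in the listing.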

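With that, the modification is complete, and the program outputs Word text in the specified format.

The following problems came up while modifying the program: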

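1. For Word templates that use certain fonts, the .doc file that the program writes from the generated XML cannot be opened on the computer.

Answer: Eclipse's default encoding was GBK, while the XML template declares UTF-8, so much of the text is not read and written back correctly.

Solution: change the encoding to UTF-8.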
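
The encoding can also be fixed in the code itself so the result does not depend on the IDE setting. The source of the TextFile helper used in the listing is not shown in this post, so the following is only a sketch under that assumption: Utf8TextFile and its read and write methods are hypothetical stand-ins that pass StandardCharsets.UTF_8 explicitly.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8TextFile {
    /** Reads the whole file as UTF-8, regardless of the platform default charset. */
    static String read(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    /** Writes the content as UTF-8, matching the encoding declared in the XML header. */
    static void write(Path path, String content) throws IOException {
        Files.write(path, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Round trip through a temporary file with non-ASCII text.
        Path tmp = Files.createTempFile("letter", ".xml");
        write(tmp, "<w:t>\u6559\u6388 %Author%</w:t>");
        System.out.println(read(tmp));
        Files.delete(tmp);
    }
}
```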

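2. When a value read from the Excel sheet contains the character &, the output .doc file cannot be opened.

Answer: & is a special character in XML. The program reads the & and inserts it into the XML text unescaped, so the XML is no longer well-formed. Word then fails to convert the file written with the .doc extension and cannot open it.

Solution: replace the character & in the Excel sheet with the XML escape sequence &amp;.

XML has five special characters: &, <, >, ", and '. Their escape sequences are listed below.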

Special character    Escape sequence
<                    &lt;
>                    &gt;
&                    &amp;
"                    &quot;
'                    &apos;
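
Instead of editing the Excel sheet by hand, the escaping could be done in the program before each value is inserted into the XML. The following is a minimal sketch of that idea, not part of the original program: the class XmlEscape and the method escape are hypothetical names, and it assumes the values are plain text that should never be interpreted as markup.

```java
public class XmlEscape {
    /** Replaces the five XML special characters with their escape sequences. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: R&amp;D &lt;fast&gt; &quot;x&quot;
        System.out.println(escape("R&D <fast> \"x\""));
    }
}
```

Handling each character in a single pass avoids escaping the & of an escape sequence that was just produced, which can happen when chained replace calls run in the wrong order.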