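I previously wrote a program that reads the retrieved paper information from an Excel sheet, reads an email template from a TXT file, replaces the keywords in the template, and writes the result out as a Word document. For details, see http://blog.youkuaiyun.com/py_wang/article/details/79186524
The basic functionality worked, but the font of the generated Word text could not be changed: everything came out in the default style, so the intended result (important fields shown in red) was not achieved. I then looked up some material online (see http://www.cnblogs.com/lcngu/p/5247179.html) and settled on the simplest approach: set up the formatting in a Word template, save it in XML format, let the existing program read it and replace the keywords, and write the result out with a .doc extension. This successfully produced formatted Word documents.
The Word template is as follows: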
Dear Professor %Author%,
Your article titled as " %PaperTitle% " in XXXXXXXXXXXXXX, volume %volumeNum% issue %issueNum%. The latest new citation in Web of Science is %quotedNum% up to February, 2018.
Now the journal has been indexed by SCI. Please promote and enlarge the influence of your article as much as possible in the future.
Many thanks for your help!
Best regards,
Editorial Office of XXXXXXXXXXXXXXXXXXXXXX
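Saved in XML format, the template looks as follows. The XML file contains all of the text in the Word template and also records the font formatting of that text.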
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
<pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512">
<pkg:xmlData>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/>
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/>
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>
</pkg:xmlData>
</pkg:part>
<pkg:part pkg:name="/word/_rels/document.xml.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="256">
<pkg:xmlData>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml"/>
</Relationships>
</pkg:xmlData>
</pkg:part>
<pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
<pkg:xmlData>
<w:document xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" 
xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se w16cid wp14">
<w:body>
<w:p w14:paraId="3BD95754" w14:textId="77777777" w:rsidR="00E65E0C" w:rsidRPr="0085072C" w:rsidRDefault="00E65E0C" w:rsidP="00E65E0C">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
</w:rPr>
</w:pPr>
<w:r w:rsidRPr="0085072C">
<w:rPr>
<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
</w:rPr>
<w:t>Dear Professor %Author%,</w:t>
</w:r>
</w:p>
<w:p w14:paraId="1BF7F76D" w14:textId="77777777" w:rsidR="00E65E0C" w:rsidRPr="0085072C" w:rsidRDefault="00E65E0C" w:rsidP="00E65E0C">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
</w:rPr>
</w:pPr>
</w:p>
<w:p w14:paraId="12DBD622" w14:textId="77777777" w:rsidR="00E65E0C" w:rsidRPr="0085072C" w:rsidRDefault="00E65E0C" w:rsidP="00E65E0C">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman"/>
</w:rPr>
</w:r>...............................................................................
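In the excerpt above, %Author% sits intact inside a single <w:t> element, which is what a plain text replacement needs. Word can also split a piece of text across several runs when it saves a document, and a placeholder split that way would not be found by a string search. The following is a minimal sketch of a sanity check, assuming the template has already been read into a String; the class PlaceholderCheck and the method findMissing are hypothetical names and not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports placeholders that do not occur verbatim in the template XML. */
final class PlaceholderCheck {

    /** Returns the placeholders that are not present as one contiguous string. */
    static List<String> findMissing(String templateXml, String[] placeholders) {
        List<String> missing = new ArrayList<>();
        for (String placeholder : placeholders) {
            if (!templateXml.contains(placeholder)) {
                missing.add(placeholder);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative input: %Author% is intact, %PaperTitle% is split across two runs.
        String xml = "<w:r><w:t>Dear Professor %Author%,</w:t></w:r>"
                + "<w:r><w:t>%Paper</w:t></w:r><w:r><w:t>Title%</w:t></w:r>";
        String[] placeholders = { "%Author%", "%PaperTitle%" };
        System.out.println(findMissing(xml, placeholders)); // prints [%PaperTitle%]
    }
}
```

If the check reports a placeholder, retyping it in one go in the Word template and saving the XML again is usually enough to get it back into a single run.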
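Next, the original program is modified; only the text-replacement class changes.
Because the XML file records the text and formatting of one single Word document, the letters can no longer be merged into one Word document. The text-replacement class now has a method that writes each letter to its own Word file, saveBySingle(), and a method that merges all letters into one file, saveByMerge(), as shown below.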
package excelOperate;

import java.util.ArrayList;
import java.util.Iterator;

public class StringReplace {
    // XML template saved from Word, the Excel workbook, and the query date to process
    static String letterName = "D:/Program_Files/eclipse se/workspace/excelOperate/letterMode.xml";
    static String fileName = "D:/Program_Files/eclipse se/workspace/excelOperate/2016-2017.xls";
    static String dateString = "2018-03-03";
    static int dateCol = ExcelFile.getDateCol(dateString, fileName);
    // Placeholders in the template, in the same order as the values from getRepString()
    static String[] repedString = { "%Author%", "%PaperTitle%", "%volumeNum%", "%issueNum%", "%quotedNum%" };
    static String letter = TextFile.read(letterName);

    /**
     * Get the replacement values for one row of the Excel sheet.
     *
     * @param rowContent one row read from the Excel sheet
     * @return the values, in the same order as repedString
     */
    public String[] getRepString(RowContent rowContent) {
        String[] repString = new String[repedString.length];
        repString[0] = rowContent.getAuthor();
        repString[1] = rowContent.getPaperTitle();
        repString[2] = rowContent.getVolumeNum() + "";
        repString[3] = rowContent.getIssueNum() + "";
        repString[4] = rowContent.getQuotedNum() + "";
        return repString;
    }

    /**
     * Recursively replace the first i placeholders of the letter template.
     * String.replace() is used rather than replaceAll() so that characters
     * such as $ or \ in a value are inserted literally.
     *
     * @param i number of placeholders to replace
     * @param rowContent the row that supplies the values
     * @return the letter with the first i placeholders filled in
     */
    public String replace(int i, RowContent rowContent) {
        if (i == 0) {
            return letter;
        } else {
            return replace(i - 1, rowContent).replace(repedString[i - 1], getRepString(rowContent)[i - 1]);
        }
    }

    /**
     * Save all letters merged into one file. With the XML template each letter
     * is a complete document, so this output is not usable for that template.
     *
     * @param lettered the generated letters
     */
    public void saveByMerge(ArrayList<String> lettered) {
        StringBuilder out = new StringBuilder();
        Iterator<String> istring = lettered.iterator();
        while (istring.hasNext()) {
            out.append(istring.next());
        }
        System.out.println(out);
        TextFile.write("D:/Program_Files/eclipse se/workspace/excelOperate/letter/out-" + dateString + ".doc",
                out.toString());
    }

    /**
     * Save each letter to its own file, numbered from 1.
     *
     * @param lettered the generated letters
     */
    public void saveBySingle(ArrayList<String> lettered) {
        Iterator<String> istring = lettered.iterator();
        int i = 1;
        while (istring.hasNext()) {
            // Local name chosen so that it does not shadow the static field "letter"
            String singleLetter = istring.next();
            System.out.println(singleLetter);
            TextFile.write("D:/Program_Files/eclipse se/workspace/excelOperate/letter/out-" + dateString + " - " + i + ".doc",
                    singleLetter);
            i++;
        }
    }

    public static void main(String[] args) {
        ArrayList<RowContent> rowContents = ExcelFile.getAllRowContent(fileName, dateCol);
        Iterator<RowContent> it = rowContents.iterator();
        ArrayList<String> lettered = new ArrayList<>();
        StringReplace stringReplace = new StringReplace();
        while (it.hasNext()) {
            // Replace all placeholders for this row
            lettered.add(stringReplace.replace(repedString.length, it.next()));
        }
        // Merged output
        //stringReplace.saveByMerge(lettered);
        stringReplace.saveBySingle(lettered);
    }
}
With that, the modification is complete, and the program outputs Word documents with the specified formatting.
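One detail of the replacement step is worth illustrating. String.replaceAll treats its first argument as a regular expression and gives $ and \ a special meaning in the replacement text, so a value such as a paper title containing "$5" would be rejected as an invalid group reference. String.replace substitutes both arguments literally. The sketch below shows the literal variant; the class LiteralReplace, the method fill, and the sample values are hypothetical and only serve as an illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: fills placeholders using literal (non-regex) replacement. */
final class LiteralReplace {

    /** Replaces every placeholder key with its value, treating both as plain text. */
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // String.replace is literal: no regex syntax, no "$1"-style group references.
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("%Author%", "Wang");
        values.put("%PaperTitle%", "Pricing at $5 per unit");
        // prints: Dear Professor Wang, your article "Pricing at $5 per unit"
        System.out.println(fill("Dear Professor %Author%, your article \"%PaperTitle%\"", values));
    }
}
```

If replaceAll is preferred for other reasons, wrapping the value in java.util.regex.Matcher.quoteReplacement achieves the same literal behaviour for the replacement text.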
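The following problems came up while modifying the program:
1. For XML generated from Word documents that use certain fonts, the .doc file written by the program could not be opened.
Answer: the cause is that Eclipse's default encoding was GBK, which cannot correctly handle the XML text, declared as UTF-8 in its first line.
Solution: change the encoding to UTF-8.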
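Another way to avoid depending on the IDE's default encoding is to name the charset explicitly wherever the template is read and the letter is written. The sketch below does this with java.nio.file.Files. The TextFile class used by the program comes from the earlier post and its implementation is not shown here, so Utf8TextIo is a hypothetical stand-in, not the author's code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a text file helper that always uses UTF-8. */
final class Utf8TextIo {

    /** Reads the whole file as UTF-8, regardless of the platform default encoding. */
    static String read(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    /** Writes the content as UTF-8, creating or overwriting the file. */
    static void write(Path path, String content) throws IOException {
        Files.write(path, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("letter", ".xml");
        // \u738B is a Chinese character, used to exercise a non-ASCII round trip.
        String text = "<w:t>Dear Professor \u738B,</w:t>";
        write(tmp, text);
        System.out.println(read(tmp).equals(text));
        Files.delete(tmp);
    }
}
```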
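2. When the character & was read from the Excel sheet, the resulting .doc file could not be opened.
Answer: & is a special character in XML. The program reads the & and places it into the XML text unescaped; when that text is written out as a .doc file, the XML is no longer well-formed, the conversion fails, and the file cannot be opened.
Solution: replace the character & in the Excel sheet with the XML escape sequence &amp;.
XML has five special characters: &, <, >, ", and '. Their escape sequences are listed below.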
Special character | Escape sequence | Special character | Escape sequence |
< | &lt; | " | &quot; |
> | &gt; | ' | &apos; |
& | &amp; |
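Instead of editing the Excel sheet by hand, the values could also be escaped in code before they are inserted into the XML text. The following is a minimal sketch under that assumption; the class XmlEscape and the method escapeXml are hypothetical names that do not appear in the original program.

```java
/** Hypothetical helper: escapes the five XML special characters in a value. */
final class XmlEscape {

    /** Returns the text with &, <, >, " and ' replaced by their XML escape sequences. */
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Research &amp; Development of &lt;X&gt;
        System.out.println(escapeXml("Research & Development of <X>"));
    }
}
```

Each character is handled in a single pass, so an & that is part of an inserted escape sequence is never escaped a second time. Only the values taken from the sheet should go through escapeXml, not the template itself.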