The following error occurred while using pdfbox:
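import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

File sourceFile = new File(fileName);
PDDocument doc = PDDocument.load(sourceFile);
ImageIO.scanForPlugins();
PDFRenderer renderer = new PDFRenderer(doc);
int pageCount = doc.getNumberOfPages();
BufferedImage[] images = new BufferedImage[pageCount];
for (int i = 0; i < pageCount; ++i) {
    // Render page i at 160 DPI in grayscale and keep the result
    // (the original snippet discarded the return value and stored null).
    images[i] = renderer.renderImageWithDPI(i, 160, ImageType.GRAY);
}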
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getProvider(FontMapperImpl.java:149)
at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFont(FontMapperImpl.java:413)
at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getTrueTypeFont(FontMapperImpl.java:321)
at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:219)
at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:75)
at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:143)
at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:838)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:495)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:469)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:203)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:145)
at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:94)
at error.ConvertPdf2tif.convertPdf2Tiff(ConvertPdf2tif.java:62)
at error.ConvertPdf2tif.main(ConvertPdf2tif.java:31)
Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.pdfbox.pdmodel.font.FontFormat. T T F
at java.lang.Enum.valueOf(Enum.java:238)
at org.apache.pdfbox.pdmodel.font.FontFormat.valueOf(FontFormat.java:25)
at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:408)
at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.<init>(FileSystemFontProvider.java:217)
at org.apache.pdfbox.pdmodel.font.FontMapperImpl$DefaultFontProvider.<clinit>(FontMapperImpl.java:130)
... 107 more
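A close look at the source code shows that when pdfbox reads font information, it stores it in a cache file named ".pdfbox.cache". This file records metadata about the fonts installed on the system.
Relevant pdfbox source: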
FileSystemFontProvider
/**
 * Saves the font metadata cache to disk.
 */
private void saveDiskCache()
{
    BufferedWriter writer = null;
    try
    {
        File file = getDiskCacheFile();
        try
        {
            writer = new BufferedWriter(new FileWriter(file));
        }
        catch (SecurityException e)
        {
            return;
        }
        for (FSFontInfo fontInfo : fontInfoList)
        {
            writer.write(fontInfo.postScriptName.trim());
            writer.write("|");
            writer.write(fontInfo.format.toString());
            writer.write("|");
            if (fontInfo.cidSystemInfo != null)
            {
                writer.write(fontInfo.cidSystemInfo.getRegistry() + '-' +
                             fontInfo.cidSystemInfo.getOrdering() + '-' +
                             fontInfo.cidSystemInfo.getSupplement());
            }
            writer.write("|");
            if (fontInfo.usWeightClass > -1)
            {
                writer.write(Integer.toHexString(fontInfo.usWeightClass));
            }
            writer.write("|");
            if (fontInfo.sFamilyClass > -1)
            {
                writer.write(Integer.toHexString(fontInfo.sFamilyClass));
            }
            writer.write("|");
            writer.write(Integer.toHexString(fontInfo.ulCodePageRange1));
            writer.write("|");
            writer.write(Integer.toHexString(fontInfo.ulCodePageRange2));
            writer.write("|");
            if (fontInfo.macStyle > -1)
            {
                writer.write(Integer.toHexString(fontInfo.macStyle));
            }
            writer.write("|");
            if (fontInfo.panose != null)
            {
                byte[] bytes = fontInfo.panose.getBytes();
                for (int i = 0; i < 10; i ++)
                {
                    String str = Integer.toHexString(bytes[i]);
                    if (str.length() == 1)
                    {
                        writer.write('0');
                    }
                    writer.write(str);
                }
            }
            writer.write("|");
            writer.write(fontInfo.file.getAbsolutePath());
            writer.newLine();
        }
    }
    catch (IOException e)
    {
        LOG.warn("Could not write to font cache", e);
        LOG.warn("Installed fonts information will have to be reloaded for each start");
        LOG.warn("You can assign a directory to the 'pdfbox.fontcache' property");
    }
    finally
    {
        IOUtils.closeQuietly(writer);
    }
}
private File getDiskCacheFile()
{
    String path = System.getProperty("pdfbox.fontcache");
    if (path == null)
    {
        path = System.getProperty("user.home");
        if (path == null)
        {
            path = System.getProperty("java.io.tmpdir");
        }
    }
    return new File(path, ".pdfbox.cache");
}
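The lookup order in getDiskCacheFile() suggests one way to sidestep a damaged cache: point the "pdfbox.fontcache" property at a different directory before pdfbox is first used. The following is only a sketch under that assumption. The class name FontCacheLocation and the helper resolveCacheFile() are hypothetical; the helper simply mirrors the lookup quoted above so the effect of the property can be observed without pdfbox on the classpath.

```java
import java.io.File;

class FontCacheLocation {
    /** Mirrors the lookup order of pdfbox's getDiskCacheFile() quoted above. */
    static File resolveCacheFile() {
        String path = System.getProperty("pdfbox.fontcache");
        if (path == null) {
            path = System.getProperty("user.home");
            if (path == null) {
                path = System.getProperty("java.io.tmpdir");
            }
        }
        return new File(path, ".pdfbox.cache");
    }

    public static void main(String[] args) {
        // Default location (normally the user's home directory).
        System.out.println("default:  " + resolveCacheFile());

        // Redirect the cache; this must happen before pdfbox loads any font.
        System.setProperty("pdfbox.fontcache", System.getProperty("java.io.tmpdir"));
        System.out.println("override: " + resolveCacheFile());
    }
}
```

The same property can also be passed on the command line as -Dpdfbox.fontcache=<directory>, which avoids touching the application code.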
Enum
public static <T extends Enum<T>> T valueOf(Class<T> enumType,
                                            String name) {
    T result = enumType.enumConstantDirectory().get(name);
    if (result != null)
        return result;
    if (name == null)
        throw new NullPointerException("Name is null");
    throw new IllegalArgumentException(
        "No enum constant " + enumType.getCanonicalName() + "." + name);
}
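To see why the lookup fails with a name that looks like " T T F", here is a small sketch that reproduces the effect without pdfbox. It assumes the cache was written as UTF-16 big endian and read back with a single-byte charset (ISO-8859-1 stands in for the platform's ANSI code page here). The enum Format is a hypothetical stand-in for pdfbox's FontFormat.

```java
import java.nio.charset.StandardCharsets;

class EnumDecodeDemo {
    // Hypothetical stand-in for org.apache.pdfbox.pdmodel.font.FontFormat.
    enum Format { TTF, OTF, PFB }

    /** Encodes text as UTF-16BE, then decodes the bytes with a single-byte charset. */
    static String misdecode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_16BE);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Each character is now preceded by a NUL (0x00) character.
        String name = misdecode("TTF");
        System.out.println("decoded length: " + name.length());
        try {
            Format.valueOf(name);
        } catch (IllegalArgumentException e) {
            // Same failure mode as in the stack trace above.
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

Each UTF-16BE code unit for an ASCII letter is a zero byte followed by the letter, so the decoded name contains NUL characters that most consoles render as blanks. That matches the spaced-out "T T F" in the exception message.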
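Reading the source above together with the error message shows that pdfbox got garbled text when it read this cache file. I suspected that the file's encoding had been changed, so I renamed the old file and let pdfbox generate a new cache file. Comparing the two confirmed it: the encoding had indeed changed. The correct encoding is ANSI, while the broken file was encoded as Unicode big endian.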
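The manual check and rename can be automated. The sketch below is one possible approach, not part of pdfbox: it assumes the damaged file starts with a UTF-16 byte order mark (as files saved as "Unicode big endian" by Windows Notepad do) and that the cache sits in the default user.home location. The class FontCacheCheck and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class FontCacheCheck {
    /** Returns true if the file starts with a UTF-16 BOM (FE FF or FF FE). */
    static boolean hasUtf16Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int b0 = in.read();
            int b1 = in.read();
            return (b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE);
        }
    }

    /** Renames a UTF-16 cache file to "<name>.bak"; returns true if it was moved. */
    static boolean moveAsideIfUtf16(Path file) throws IOException {
        if (!Files.isRegularFile(file) || !hasUtf16Bom(file)) {
            return false;
        }
        Path backup = file.resolveSibling(file.getFileName() + ".bak");
        Files.move(file, backup, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; adjust if 'pdfbox.fontcache' is set.
        Path cache = Paths.get(System.getProperty("user.home"), ".pdfbox.cache");
        if (moveAsideIfUtf16(cache)) {
            System.out.println("Moved damaged cache aside: " + cache);
        } else {
            System.out.println("No UTF-16 cache file found at: " + cache);
        }
    }
}
```

Once the damaged file is out of the way, pdfbox should write a fresh cache on its next start, which is what the manual rename described above achieved.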