Tika支持的文件格式
下面的表显示了Tika支持的文件格式。
文件格式 | 类库 | Tika中的类 |
---|---|---|
XML | org.apache.tika.parser.xml | XMLParser |
HTML | org.apache.tika.parser.htmll and it uses Tagsoup Library | HtmlParser |
MS-Office compound document Ole2 till 2007 ooxml 2007 onwards | org.apache.tika.parser.microsoft org.apache.tika.parser.microsoft.ooxml and it uses Apache Poi library | OfficeParser(ole2) OOXMLParser(ooxml) |
OpenDocument Format openoffice | org.apache.tika.parser.odf | OpenOfficeParser |
portable Document Format(PDF) | org.apache.tika.parser.pdf and this package uses Apache PdfBox library | PDFParser |
Electronic Publication Format (digital books) | org.apache.tika.parser.epub | EpubParser |
Rich Text format | org.apache.tika.parser.rtf | RTFParser |
Compression and packaging formats | org.apache.tika.parser.pkg and this package uses Common compress library | PackageParser and CompressorParser and its sub-classes |
Text format | org.apache.tika.parser.txt | TXTParser |
Feed and syndication formats | org.apache.tika.parser.feed | FeedParser |
Audio formats | org.apache.tika.parser.audio and org.apache.tika.parser.mp3 | AudioParser MidiParser Mp3- for mp3parser |
Imageparsers | org.apache.tika.parser.jpeg | JpegParser-for jpeg images |
Videoformats | org.apache.tika.parser.mp4 and org.apache.tika.parser.video this parser internally uses Simple Algorithm to parse flash video formats | Mp4parser FlvParser |
java class files and jar files | org.apache.tika.parser.asm | ClassParser CompressorParser |
Mobxformat (email messages) | org.apache.tika.parser.mbox | MobXParser |
Cad formats | org.apache.tika.parser.dwg | DWGParser |
FontFormats | org.apache.tika.parser.font | TrueTypeParser |
executable programs and libraries | org.apache.tika.parser.executable | ExecutableParser |
标签:TIKA 文件格式
本站文章除注明转载外,均为本站原创或编译
欢迎任何形式的转载,但请务必注明出处,尊重他人劳动共创优秀实例教程
转载请注明:文章转载自: 易百教程 [ http:/www.yiibai.com]
本文标题: TIKA文件格式
本文地址: http://www.yiibai.com/tika/tika_file_formats.html
欢迎任何形式的转载,但请务必注明出处,尊重他人劳动共创优秀实例教程
转载请注明:文章转载自: 易百教程 [ http:/www.yiibai.com]
本文标题: TIKA文件格式
本文地址: http://www.yiibai.com/tika/tika_file_formats.html