Original paper: https://arxiv.org/pdf/2410.13848
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu$^{1,2}$, Xiaokang Chen$^{1,*,\dagger}$, Zhiyu Wu$^{1,3}$, Yiyang Ma$^{1,3}$, Xingchao Liu$^{1}$, Zizheng Pan$^{1}$, Wen Liu$^{1}$, Zhenda Xie$^{1}$, Xingkai Yu$^{1}$, Chong Ruan$^{1}$, Ping Luo$^{2,*}$