pip install marker-pdf
marker_single 11.pdf --output_dir ./mkd
Loaded layout model datalab-to/surya_layout on device cpu with dtype torch.float32
Loaded texify model datalab-to/texify on device cpu with dtype torch.float32
Loaded recognition model vikp/surya_rec2 on device cpu with dtype torch.float32
Loaded table recognition model datalab-to/surya_tablerec on device cpu with dtype torch.float32
Loaded detection model vikp/surya_det3 on device cpu with dtype torch.float32
Loaded detection model datalab-to/inline_math_det0 on device cpu with dtype torch.float32
Recognizing layout: 100% 3/3 [00:23<00:00, 7.74s/it]
Running OCR Error Detection: 100% 4/4 [00:02<00:00, 1.47it/s]
Detecting bboxes: 0it [00:00, ?it/s]
Recognizing tables: 100% 2/2 [00:15<00:00, 7.64s/it]
Saved markdown to ./mkd/11
Total time: 44.10177683830261
The first run downloads nearly 1 GB of data. Converting a simple 14-page exam paper takes about 44 s on CPU.
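
To repeat the measurement from a script, the same command can be wrapped in Python. This is a minimal sketch, not part of marker itself: the helper names `build_marker_command`, `seconds_per_page`, and `convert` are hypothetical, and the only CLI usage assumed is the `marker_single <pdf> --output_dir <dir>` call shown in the log.

```python
import shutil
import subprocess
import time
from pathlib import Path


def build_marker_command(pdf_path: str, output_dir: str) -> list[str]:
    """Return the argv for the marker_single call shown in the log (hypothetical helper)."""
    return ["marker_single", pdf_path, "--output_dir", output_dir]


def seconds_per_page(total_seconds: float, pages: int) -> float:
    """Average conversion time per page."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return total_seconds / pages


def convert(pdf_path: str, output_dir: str) -> float:
    """Run marker_single as a subprocess and return the wall-clock time in seconds."""
    if shutil.which("marker_single") is None:
        raise FileNotFoundError(
            "marker_single not found on PATH; run `pip install marker-pdf` first"
        )
    # Create the output directory up front so the run does not depend on it existing.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    start = time.perf_counter()
    subprocess.run(build_marker_command(pdf_path, output_dir), check=True)
    return time.perf_counter() - start
```

Calling `convert("11.pdf", "./mkd")` reproduces the run above and returns the elapsed time. With the logged total, `seconds_per_page(44.10177683830261, 14)` comes to roughly 3.15 s per page. The wrapper measures the whole process, including model loading, so its figure may come out slightly higher than the "Total time" that marker prints.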