Replies: 3 comments 1 reply
-
From my experience in prod:
-
Reporting my results.
Setup:
What we tested:
Results:
Key findings:
Pipeline profiling breakdown (OCR=on, VLM=on):
Further questions:
-
Your benchmark suggests the pipeline is not GPU-saturated; it is stage-bound. The important clue is:

That usually means the GPU work happens in short bursts, while the rest of the pipeline waits between bursts on CPU work: preprocessing, image conversion, Python scheduling, model handoff, or per-page orchestration. In that situation, increasing batch size can raise VRAM usage without improving throughput, because the stages are not continuously feeding large batches into the model.

For optimization, I would separate three questions:
I would also benchmark by stage with a bigger corpus split into homogeneous groups:

Expected throughput will vary heavily across those categories, so a single pages/sec number can be misleading.

Based on your data, the next useful experiment is probably not

RapidOCR also looks like the right default to test first if VRAM efficiency matters. I would only pay the EasyOCR cost if you have a measured accuracy win on your document set.
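To make the per-stage, per-group benchmarking idea concrete, here is a minimal sketch of a timing harness in plain Python. It does not call Docling at all: `StageTimer`, `fake_stage`, the group names (`born_digital`, `scanned`) and the stage names are all hypothetical placeholders that you would swap for your own pipeline calls and corpus categories. It only measures wall-clock time, so if a stage launches asynchronous GPU work (for example through PyTorch), you would probably need a `torch.cuda.synchronize()` before the stage ends to get meaningful numbers.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock seconds per (group, stage) pair."""

    def __init__(self):
        self.seconds = defaultdict(float)  # (group, stage) -> seconds
        self.pages = defaultdict(int)      # group -> pages processed

    @contextmanager
    def stage(self, group, name):
        # Time whatever runs inside the `with` block, even if it raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds[(group, name)] += time.perf_counter() - start

    def add_pages(self, group, count):
        self.pages[group] += count

    def report(self):
        """Per group: pages/sec over the timed stages, plus each stage's share."""
        result = {}
        for group, page_count in self.pages.items():
            stages = {s: t for (g, s), t in self.seconds.items() if g == group}
            total = sum(stages.values())
            result[group] = {
                "pages_per_sec": page_count / total if total > 0 else 0.0,
                "share": {s: t / total for s, t in stages.items()} if total > 0 else {},
            }
        return result


def fake_stage(seconds):
    # Hypothetical stand-in for a real pipeline stage.
    time.sleep(seconds)


if __name__ == "__main__":
    timer = StageTimer()
    corpus = {"born_digital": 3, "scanned": 3}  # hypothetical groups, one page per doc
    for group, n_docs in corpus.items():
        for _ in range(n_docs):
            with timer.stage(group, "preprocess"):
                fake_stage(0.002)
            with timer.stage(group, "ocr"):
                fake_stage(0.005 if group == "scanned" else 0.0)
            with timer.stage(group, "layout_model"):
                fake_stage(0.003)
            timer.add_pages(group, 1)
    for group, stats in timer.report().items():
        print(group, round(stats["pages_per_sec"], 1), stats["share"])
```

If one stage's share stays dominant no matter what batch size you choose, that is the stage to optimize first, and comparing the shares between groups shows whether the bottleneck moves with document type.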
-
What is the latest guide to GPU optimization for Docling? What is the expected throughput for parsing PDFs?
Are the below suggestions still valid:
Looking for suggestions and ideas for optimization. I'm currently running benchmarks in parallel and will report back with my findings. Hoping to get a community answer or a Dosu suggestion with the latest updates, as most issues/discussions seem outdated for 2026.