GitHub - allenai/olmocr: Toolkit for linearizing PDFs for LLM datasets/training

GitHub Daily Trend - A podcast by VoiceFeed

https://github.com/allenai/olmocr