
#11427: fix(pdf): provide standardFontDataUrl to pdfjs-dist getDocument

by TheUnderdev open 2026-02-07 20:08
stale
pdfjs-dist's NodeStandardFontDataFactory requires a standardFontDataUrl to locate bundled .pfb/.ttf font files when running in Node.js. Without it, PDFs using standard fonts (Helvetica, Times, Courier, etc.) throw:

UnknownErrorException: Ensure that the standardFontDataUrl API parameter is provided.

Resolve the standard_fonts directory within the pdfjs-dist package at runtime and pass it to getDocument(). Also update the type declaration to include standardFontDataUrl and useSystemFonts parameters.

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR updates the PDF extraction path to pass `standardFontDataUrl` into `pdfjs-dist`’s `getDocument()` call, resolving the `standard_fonts` directory at runtime so Node.js can load bundled standard font files. It also extends the local `pdfjs-dist/legacy/build/pdf.mjs` type declaration to include `standardFontDataUrl` and `useSystemFonts` options.

The change lives in `src/media/input-files.ts`’s `extractPdfContent()` path and is intended to prevent runtime failures when parsing PDFs that rely on standard fonts (Helvetica/Times/Courier) under Node.js.

<h3>Confidence Score: 4/5</h3>

- This PR is likely safe to merge once the Windows path/URL handling for `standardFontDataUrl` is corrected.
- The change is small and scoped, but the current construction of `standardFontDataUrl` can produce mixed path separators on Windows, which can break pdfjs’s standard font loading in Node environments.
- src/media/input-files.ts
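The mixed-separator concern raised in the review could be addressed along the following lines. This is a minimal sketch, not the code in this PR: `standardFontDir`, `PathOps`, and `pkgRoot` are hypothetical names, and it assumes pdfjs-dist appends the font file name directly to the value it receives, so the directory must end in a separator that matches the rest of the path.

```typescript
import * as path from "node:path";

// Minimal subset of the node:path API used below, so the helper can be
// exercised with path.posix and path.win32 on any host OS.
interface PathOps {
  join(...parts: string[]): string;
  sep: string;
}

// Hypothetical helper: given the root directory of the installed pdfjs-dist
// package, return its standard_fonts directory with a trailing separator
// that matches the platform's own separator (no "\" and "/" mixing).
export function standardFontDir(pkgRoot: string, ops: PathOps = path): string {
  return ops.join(pkgRoot, "standard_fonts") + ops.sep;
}

// Hypothetical usage, assuming `pkgRoot` was located by whatever means the
// project already uses to find the installed pdfjs-dist package:
//   getDocument({ data, standardFontDataUrl: standardFontDir(pkgRoot) });
```

Building the trailing separator from the same path implementation that joined the directory avoids hard-coding `/`, which is one way mixed separators can appear on Windows.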
