Fix three reliability issues: request hangs, infinite retries, malformed PDFs#1146

Open
chen3feng wants to merge 3 commits into
PDFMathTranslate:mainfrom
chen3feng:fix/translation-robustness

Summary

Three independent robustness fixes found while translating real-world PDFs with the deepseek/openai services. Each one could make a run hang forever or crash outright. All three are validated end-to-end (a 114-page and a 47-page paper now translate cleanly; the 47-page one previously hung 100% of the time on its last page).

Changes

1. translator.py — add a request timeout to the OpenAI client

openai.OpenAI(...) was created without a timeout, so a single request that silently stalls (network blip / unresponsive endpoint) blocks the entire document indefinitely (0% CPU, no error, no progress). Added timeout (default 120s, override via PDF2ZH_OPENAI_TIMEOUT) and max_retries=2, so a stalled request fails fast and is retried instead of hanging forever.
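A minimal sketch of how the timeout setting could be resolved, assuming the environment variable holds a number of seconds. The helper name `openai_client_kwargs` and the fallback-on-invalid-value behavior are illustrative assumptions, not taken from the diff; only `timeout`, `max_retries=2`, the 120s default, and `PDF2ZH_OPENAI_TIMEOUT` come from the description above.

```python
import os
from typing import Mapping

DEFAULT_TIMEOUT_S = 120.0  # default stated in the PR description


def openai_client_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build the timeout-related keyword arguments for the OpenAI client.

    Hypothetical helper: the actual patch may read the variable inline.
    """
    raw = env.get("PDF2ZH_OPENAI_TIMEOUT", "")
    try:
        timeout = float(raw) if raw else DEFAULT_TIMEOUT_S
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        timeout = DEFAULT_TIMEOUT_S
    if timeout <= 0:
        # Assumption: non-positive values are treated as invalid.
        timeout = DEFAULT_TIMEOUT_S
    return {"timeout": timeout, "max_retries": 2}
```

The result would be passed straight to the constructor, for example `openai.OpenAI(api_key=..., base_url=..., **openai_client_kwargs())`; both `timeout` and `max_retries` are keyword arguments the client accepts.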

2. converter.py — bound paragraph translation retries

worker() used @retry(wait=wait_fixed(1)) with no stop condition, so a permanently-failing paragraph retried every second forever and the whole run got stuck. Added stop_after_attempt (default 5, override via PDF2ZH_TRANSLATE_RETRIES) and a retry_error_callback that falls back to the source text on exhaustion, so the document still completes instead of stalling.
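The bounded-retry behavior can be illustrated without any dependency. The sketch below is a plain-Python stand-in for what the PR does with tenacity's `stop_after_attempt` and `retry_error_callback`; the decorator `bounded_retry`, its callback signature, and the always-failing `translate_paragraph` are hypothetical and exist only to show the control flow.

```python
import functools
import os
import time

# Assumption: the variable holds a positive integer; default 5 as in the PR.
MAX_ATTEMPTS = max(1, int(os.environ.get("PDF2ZH_TRANSLATE_RETRIES", "5")))


def bounded_retry(attempts, wait_s=1.0, on_exhausted=None):
    """Retry the wrapped function up to `attempts` times, then fall back."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
                    if attempt < attempts:
                        time.sleep(wait_s)  # fixed wait between attempts
            if on_exhausted is not None:
                # Fallback instead of raising, so the caller can continue.
                return on_exhausted(last_exc, *args, **kwargs)
            raise last_exc
        return wrapper
    return decorate


calls = {"n": 0}  # counts attempts, for demonstration only


@bounded_retry(MAX_ATTEMPTS, wait_s=0, on_exhausted=lambda exc, text: text)
def translate_paragraph(text):
    """Stand-in for a translation call that fails permanently."""
    calls["n"] += 1
    raise RuntimeError("service unavailable")
```

Calling `translate_paragraph("hello")` makes `MAX_ATTEMPTS` attempts and then returns the source text unchanged, which is the fallback the description refers to. Note that tenacity's own `retry_error_callback` receives a retry-state object rather than the original arguments, so the real callback is shaped differently.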

3. high_level.py — auto-repair malformed input PDFs

Some structurally broken PDFs make MuPDF raise during the initial doc_en.save() (cannot parse object (N 0 R)), aborting before translation even starts. Caught that failure and repaired the input with pikepdf (already a dependency, already imported in this module) before retrying.
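The repair-then-retry flow might look roughly like the following. This is a sketch under stated assumptions, not the code in the diff: `run_with_repair` and `repair_with_pikepdf` are hypothetical names, `process` stands in for the step that opens the file and calls `doc_en.save()`, and the broad `except Exception` is used because the exact exception type raised by MuPDF is not given above.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def repair_with_pikepdf(src_path: str) -> str:
    """Rewrite a PDF through pikepdf and return the path of the new copy."""
    import pikepdf  # third-party; imported lazily so the sketch loads without it

    fd, dst_path = tempfile.mkstemp(suffix=".pdf")
    os.close(fd)
    # Opening and re-saving lets pikepdf (qpdf) rebuild the file structure.
    with pikepdf.open(src_path) as pdf:
        pdf.save(dst_path)
    return dst_path


def run_with_repair(
    path: str,
    process: Callable[[str], T],
    repair: Callable[[str], str] = repair_with_pikepdf,
) -> T:
    """Run `process` on `path`; on failure, repair the file and retry once."""
    try:
        return process(path)
    except Exception:
        # Assumption: any failure here is treated as a malformed input.
        repaired_path = repair(path)
        return process(repaired_path)
```

Healthy inputs never reach the repair branch, which matches the claim that behavior is unchanged for them. If the repaired copy also fails, the second exception propagates to the caller.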

Notes

  • All three are safe by default: behavior is unchanged on healthy inputs, and the two new knobs (PDF2ZH_OPENAI_TIMEOUT and PDF2ZH_TRANSLATE_RETRIES) are environment variables with sensible defaults.
  • No new dependencies (pikepdf was already required).

🤖 Generated with Claude Code

chen3feng and others added 3 commits June 7, 2026 16:13
The OpenAI client was created without a timeout, so a single request
that silently hangs (network stall / unresponsive endpoint) blocks the
whole document indefinitely. Add a timeout (default 120s, override via
PDF2ZH_OPENAI_TIMEOUT) plus max_retries so a stalled request fails fast
and is retried instead of hanging forever.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
worker() was decorated with @retry(wait=wait_fixed(1)) and no stop
condition, so a permanently failing paragraph retried forever and the
whole run got stuck. Cap attempts (default 5, override via
PDF2ZH_TRANSLATE_RETRIES) and fall back to the source text on exhaustion
so the document still completes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Some structurally broken PDFs make MuPDF raise during the initial
doc_en.save(), aborting before translation starts. Catch that failure
and repair the input with pikepdf (already a dependency) before retrying.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>