Frequently Asked Questions
Everything you need to know about PDF deduplication.
Our tool analyzes both the text and the appearance of each page. It compares the extracted text of pages using a Jaccard index, and it renders a micro-thumbnail of each page to check visual similarity via image hashing. This catches both digital and scanned duplicates.
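As a rough illustration of the two checks described above, here is a minimal sketch. It is not the tool's actual code. It assumes that text is tokenized into lowercased, whitespace-separated words and that the image hash is a simple average hash over a small grayscale thumbnail. The function names and both thresholds are hypothetical.

```typescript
// Assumption: tokens are lowercased, whitespace-separated words.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\s+/).filter((w) => w.length > 0));
}

// Jaccard index: size of the intersection divided by size of the union.
function jaccard(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  // Pages without text (e.g. scans) give no textual evidence either way.
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((word) => {
    if (setB.has(word)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// Average hash: one bit per pixel, set when the pixel is at least as bright
// as the mean of the thumbnail. `gray` holds grayscale values (0-255).
function averageHash(gray: number[]): boolean[] {
  const mean = gray.reduce((sum, v) => sum + v, 0) / gray.length;
  return gray.map((v) => v >= mean);
}

// Number of bit positions where two hashes differ.
function hammingDistance(h1: boolean[], h2: boolean[]): number {
  if (h1.length !== h2.length) throw new Error("hash lengths differ");
  let diff = 0;
  for (let i = 0; i < h1.length; i++) {
    if (h1[i] !== h2[i]) diff++;
  }
  return diff;
}

// Hypothetical decision rule: either signal alone marks a likely duplicate.
function isLikelyDuplicate(
  textA: string,
  textB: string,
  grayA: number[],
  grayB: number[],
  textThreshold = 0.9, // hypothetical value
  maxHashDistance = 5 // hypothetical value
): boolean {
  const textMatch = jaccard(textA, textB) >= textThreshold;
  const visualMatch =
    hammingDistance(averageHash(grayA), averageHash(grayB)) <= maxHashDistance;
  return textMatch || visualMatch;
}
```

In this sketch, a scanned page with no extractable text can only match through the image hash, while a digital page can match through either signal.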
Removing pages changes the structure and bytes of the document. If your PDF is cryptographically signed, removing duplicate pages will invalidate the signatures. We recommend running the deduplication before applying signatures.
No. The scanning process runs entirely in your browser. The file is only sent to the server for the final compiling step, where pdf-lib deletes the selected pages in memory and streams the output directly back. No files are stored.
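The page-deletion step can be sketched as follows. This is an assumption about how such a step might look, not the server's actual code. The `PageRemovable` interface and the `removePages` helper are hypothetical. The interface lists only the two methods the sketch needs, both of which pdf-lib's `PDFDocument` provides. Pages are removed from the highest index down so that earlier removals do not shift the positions of pages still to be removed.

```typescript
// Hypothetical minimal shape of the document object. pdf-lib's PDFDocument
// has both of these methods, so it can be passed in directly.
interface PageRemovable {
  getPageCount(): number;
  removePage(index: number): void;
}

// Removes the given zero-based page indices and returns the indices that
// were actually removed, in the order they were removed.
function removePages(doc: PageRemovable, indices: number[]): number[] {
  const count = doc.getPageCount();
  // Drop repeated and out-of-range indices before touching the document.
  const valid = Array.from(new Set(indices)).filter(
    (i) => Number.isInteger(i) && i >= 0 && i < count
  );
  // Highest index first, so remaining indices stay correct after each removal.
  valid.sort((a, b) => b - a);
  valid.forEach((i) => doc.removePage(i));
  return valid;
}
```

With pdf-lib, the document would typically come from `PDFDocument.load(bytes)` and be serialized afterwards with `save()`, all without writing anything to disk.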
You can upload and process PDF files up to 50 MB.
Guests get 5 free deduplications per day. Free accounts get 10 per day, and Lifetime Pro users get 100 per day. Limits are completely separate from the signature verifier and dark mode converter limits.