Dockitt

PDF Utility Tools

Advanced PDF utilities — crop, OCR, repair, reorder pages and more.

Crop PDF

Crop PDF pages online for free.

Repair PDF

Repair corrupted PDF files online for free.

OCR PDF

Make scanned PDF files searchable online for free.

Extract PDF Pages

Extract pages from PDF files online for free.

Reorder PDF Pages

Reorder pages in PDF files online for free.

Frequently Asked Questions

What kinds of PDF files can OCR make searchable?

OCR works on PDFs that are purely image-based: typically scanned documents, photographed pages, or PDFs exported from image-editing tools where the text was flattened into the page image. If you open a PDF and cannot select, highlight, or copy any text, it is image-based and OCR will help. If you can already select text, the PDF has a text layer and OCR is unnecessary. Dockitt uses ocrmypdf with the Tesseract engine, which is one of the most accurate open-source OCR systems available and supports a wide range of document types. Accuracy is highest on clean, high-resolution scans of printed text. Handwriting, unusual fonts, low-contrast pages, and heavily skewed scans produce less accurate results.
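For readers who want to try the same approach locally, here is a minimal sketch of how an ocrmypdf command line might be assembled. The helper name `build_ocr_command` and the file names are hypothetical, and the flags chosen are an assumption, not Dockitt's actual configuration. The `--skip-text` flag tells ocrmypdf to leave pages that already have a text layer alone, and `--deskew` straightens skewed scans before recognition.

```python
from typing import List


def build_ocr_command(src: str, dst: str, language: str = "eng",
                      deskew: bool = True) -> List[str]:
    """Return an argv list for the ocrmypdf CLI. Nothing is executed here."""
    # --skip-text: do not OCR pages that already contain selectable text.
    cmd = ["ocrmypdf", "--skip-text", "-l", language]
    if deskew:
        # Straighten skewed scans, which tends to help recognition.
        cmd.append("--deskew")
    cmd += [src, dst]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ocrmypdf and Tesseract are installed.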

What causes a PDF to become corrupted and can it always be repaired?

PDF corruption most commonly happens when a file is written or transferred incompletely: a download that was cut off midway, a save interrupted by an application crash, or storage errors on a USB drive or hard disk with bad sectors. Cloud sync conflicts can also produce corrupted files. Dockitt's Repair PDF tool uses Ghostscript to rebuild the internal structure of the file, reconstructing the cross-reference table and rewriting the object streams. This works well for mildly damaged files, but not every file can be repaired. If the file data itself has been overwritten, or large portions are missing due to truncation, that content cannot be recovered. The repair tool will always attempt to produce a valid PDF from whatever data is readable.
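As a rough illustration, the sketch below pairs a cheap truncation check with a Ghostscript command that rewrites a file through the `pdfwrite` device. Both helper names are hypothetical, and the command is a commonly used invocation, not necessarily the one Dockitt runs. The check only looks for the `%PDF-` header and the `%%EOF` marker, so passing it does not prove a file is healthy.

```python
from typing import List


def has_pdf_markers(data: bytes) -> bool:
    """Cheap structural check: header near the start, %%EOF near the end."""
    has_header = b"%PDF-" in data[:1024]
    # A missing %%EOF marker is a common sign of a truncated file.
    has_trailer = b"%%EOF" in data[-1024:]
    return has_header and has_trailer


def build_repair_command(src: str, dst: str) -> List[str]:
    """Return an argv list asking Ghostscript to rewrite src into dst."""
    # -o sets the output file and implies batch, non-interactive mode.
    return ["gs", "-o", dst, "-sDEVICE=pdfwrite", src]
```

A file that fails `has_pdf_markers` was probably cut off during transfer, which is the case where the least content can be recovered.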

Does cropping a PDF permanently delete the content outside the crop area?

Not by itself. In the PDF format, cropping sets a property called the CropBox, which defines the visible area of each page. Content outside the CropBox is hidden from view but remains in the file. As a result, the file size does not decrease significantly after cropping, and the hidden content can be made visible again by changing the CropBox in a PDF editor. Dockitt's Crop PDF tool sets the CropBox on each page, which is the standard way to crop a PDF. If you need to permanently and irreversibly remove content from the edges of pages, for example for privacy reasons, a dedicated PDF redaction tool provides a stronger guarantee.
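To make the CropBox idea concrete, here is a small sketch of the arithmetic involved. A PDF box is four numbers in points (1/72 inch): left, bottom, right, top. The function name `crop_box` and the margin parameters are hypothetical. The sketch only computes the new rectangle and does not touch any file.

```python
from typing import Tuple

# (left, bottom, right, top) in PDF points, 72 points per inch.
Box = Tuple[float, float, float, float]


def crop_box(media_box: Box, left: float, bottom: float,
             right: float, top: float) -> Box:
    """Shrink the full page rectangle by the given margins."""
    llx, lly, urx, ury = media_box
    new_box = (llx + left, lly + bottom, urx - right, ury - top)
    if new_box[0] >= new_box[2] or new_box[1] >= new_box[3]:
        raise ValueError("margins leave no visible area")
    return new_box
```

For a US Letter page of 612 by 792 points, trimming half an inch (36 points) from every edge gives a CropBox of `(36, 36, 576, 756)`. The page content itself is unchanged, which is why the hidden area can be restored later.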

How do I fix a multi-page scanned document where the pages are in the wrong order?

Use the Reorder PDF Pages tool. After uploading your document, the tool displays thumbnail previews of all pages so you can see the content of each one. You can then drag and drop the thumbnails into the correct sequence. This is particularly useful for scanned documents where pages were fed into the scanner in the wrong order, or for documents assembled from multiple scans that need to be arranged chronologically or logically. Once the order looks correct, click to save and download the reordered document.
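Behind a drag-and-drop interface, reordering amounts to applying a new page sequence to the original pages. The sketch below shows that step on a plain list. The function name `reorder` and the use of 1-based page numbers are assumptions made for illustration, not a description of Dockitt's internals.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def reorder(pages: Sequence[T], new_order: Sequence[int]) -> List[T]:
    """Place original page new_order[i] (1-based) at position i."""
    # Every page must appear exactly once, so nothing is lost or duplicated.
    if sorted(new_order) != list(range(1, len(pages) + 1)):
        raise ValueError("new_order must list every page exactly once")
    return [pages[n - 1] for n in new_order]
```

For example, if a scanner swapped each pair of pages, `reorder(["A", "B", "C", "D"], [2, 1, 4, 3])` returns `["B", "A", "D", "C"]`.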

Can I extract non-consecutive pages from a PDF into a new document?

Yes. The Extract PDF Pages tool is designed for exactly this. Unlike Split PDF, which works with consecutive page ranges, Extract PDF Pages lets you specify individual page numbers in any order, for example pages 3, 8, 12, and 17. The resulting PDF contains only those pages, in the order you specified. This is useful when working with reports that have relevant sections scattered throughout, reference documents where you need specific appendices, or any situation where the content you need is not in a continuous block.
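A page selection such as "3, 8, 12, 17" has to be turned into a validated list of page numbers before any pages are copied. The sketch below shows one way to do that while keeping the order the user typed. The function name `parse_pages` and the comma-separated input format are assumptions for illustration.

```python
from typing import List


def parse_pages(spec: str, page_count: int) -> List[int]:
    """Parse '3, 8, 12, 17' into page numbers, keeping the given order."""
    pages: List[int] = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        number = int(token)  # raises ValueError on non-numeric input
        if not 1 <= number <= page_count:
            raise ValueError(f"page {number} is outside 1..{page_count}")
        pages.append(number)
    return pages
```

Because the order is preserved, `parse_pages("17, 3", 20)` returns `[17, 3]`, so page 17 would come first in the extracted document.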

What should I do if a repaired PDF still shows blank or missing pages?

If specific pages appear blank after repair, the data for those pages was too damaged for Ghostscript to reconstruct. The repair process recovers what it can, but if a page's content data was overwritten or is missing at the byte level, it cannot be recovered from the file alone. In this situation, check whether you have an earlier version of the file saved elsewhere, whether the sender can resend the original, or whether you have a printed copy that could be scanned. For future documents, keeping backups in at least two locations, such as a local drive and a cloud service, greatly reduces the risk of permanent data loss.