Why Is My PDF So Large? 7 Causes and How to Fix Each One
You exported a simple five-page document and it came out at 18 MB. Or you received a PDF you need to email on, but it's 40 MB and Gmail won't take it. Something is hiding inside that file — and it's not the text. This guide identifies every common cause, tells you how to spot it, and gives you a concrete fix for each one.
- 80% of large PDFs are caused by unoptimised images
- 300 KB: typical size of a clean 10-page text PDF
- 4×: file size penalty of scanning at 300 DPI vs 150 DPI
- 25 MB: Gmail attachment limit, where most people first notice the problem
What a PDF actually contains
PDF (Portable Document Format, ISO 32000) is a container format — a single .pdf file can hold any combination of the following:
- Text streams — the actual characters in the document, usually compressed with Flate/Deflate
- Raster images: JPEG, JPEG 2000, JBIG2, CCITT, or Flate-compressed image data embedded per page (PNG files are not stored as-is; their pixels are re-wrapped as Flate streams)
- Vector graphics — lines, shapes, and curves described as PostScript-like path commands
- Font data — full or partial copies of the fonts used, so the document renders identically on any device
- Metadata — author, title, creation date, revision history, XMP data packets
- Annotations and form fields — interactive elements that can add significant overhead
- Embedded previews and thumbnails — pre-rendered images used by file browsers and apps
- Cross-reference tables — an internal index of every object in the file
An A4 page of plain text might weigh 30 KB. The same page scanned at 300 DPI as an image might weigh 2–8 MB. Knowing which of these components dominates your specific PDF is the key to shrinking it.
Size benchmarks — what's normal?
Before diagnosing a problem, you need a baseline. Here are the expected sizes for common PDF types at 10 and 50 pages:
| Document type | 10 pages | 50 pages | What drives the size |
|---|---|---|---|
| Text only (digitally born) | 50–200 KB | 200–800 KB | Character streams + fonts |
| Text + charts and diagrams | 200–600 KB | 1–3 MB | Vector graphics + fonts |
| Word / Office export | 500 KB – 2 MB | 2–10 MB | Fonts + embedded images + metadata |
| PowerPoint export | 3–15 MB | 15–80 MB | Full-bleed slide images + fonts per slide |
| Design portfolio (screen quality) | 2–8 MB | 10–40 MB | JPEG images at 150 DPI |
| Design portfolio (print quality) | 10–40 MB | 50–200 MB | JPEG/TIFF images at 300 DPI |
| Scanned document at 300 DPI | 5–20 MB | 25–100 MB | Raster image per page |
| Scanned document at 150 DPI | 1–5 MB | 5–25 MB | Raster image per page (lower res) |
| Scanned document at 72 DPI | 200–800 KB | 1–4 MB | Low-res raster — readable on screen only |
Cause 1: High-resolution embedded images
This is the most common cause of large PDFs by a wide margin. When you insert an image into Word, InDesign, Figma, or any other tool, the application embeds the original image file at its original resolution — even if you've scaled it down to fit on a page.
A single 4K photograph (3840×2160 pixels) might be 8–12 MB on disk as a JPEG. Stretched across the full width of an A4 page, it has an effective resolution of roughly 460 DPI. Exported to PDF, Word can embed that full 12 MB image even though a screen or casual-print PDF only needs about 150 DPI, which would be under 1 MB. The extra resolution is invisible, but the file size is very real.
| Image resolution | A4 page dimensions at this DPI | Approx. JPEG size per page | Use case |
|---|---|---|---|
| 72 DPI | 595 × 842 px | 40–150 KB | Screen-only, lowest quality |
| 96 DPI | 794 × 1123 px | 80–300 KB | Web display |
| 150 DPI | 1240 × 1754 px | 200–700 KB | Screen + casual print — the sweet spot |
| 300 DPI | 2480 × 3508 px | 800 KB – 3 MB | Professional print quality |
| 600 DPI | 4961 × 7016 px | 3–12 MB | Prepress / large-format — rarely needed |
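The pixel dimensions in the table follow directly from the A4 paper size (210 × 297 mm). A minimal sketch of that arithmetic, assuming a full-page image with no margins; the function names are illustrative only:

```python
A4_MM = (210, 297)   # ISO 216 A4, width x height in millimetres
MM_PER_INCH = 25.4

def a4_pixels(dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a full A4 page rendered at the given DPI."""
    width_mm, height_mm = A4_MM
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))

def effective_dpi(image_width_px: int, placed_width_mm: float) -> float:
    """Resolution an image ends up with once scaled to a placed width."""
    return image_width_px / (placed_width_mm / MM_PER_INCH)

for dpi in (72, 96, 150, 300, 600):
    w, h = a4_pixels(dpi)
    print(f"{dpi:>3} DPI: {w} x {h} px ({w * h / 1e6:.1f} megapixels)")

# A 3840 px wide photo placed across the full 210 mm page width:
print(round(effective_dpi(3840, 210)))  # 464
```

Comparing the effective DPI of a placed image against the 150 DPI target gives a quick sense of how much resolution an export is carrying for no visible benefit.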
Fix: For PDFs you're emailing or sharing on screen, re-export from the source file with images downsampled to 150 DPI. In Word: File → Save As → PDF → Options → select “Minimum size (publishing online)” in the “Optimise for” section instead of the default “Standard.” In InDesign: Export → Compression tab → Downsample to 150 PPI. If you don't have the source file, use Ghostscript's /ebook preset (the -dPDFSETTINGS=/ebook option) or an online compressor that re-encodes images.
If you only need lossless structural savings (no image re-encoding), ClickyFix PDF Compressor can reduce Word and Office exports by 15–35% with zero quality loss.
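For the Ghostscript route, the usual invocation uses the pdfwrite device with the /ebook preset. The sketch below only builds and runs that command from Python. It assumes Ghostscript is installed and exposed as `gs` on PATH (on Windows the executable is typically named differently), and the file names are placeholders:

```python
import shutil
import subprocess

def ghostscript_command(src: str, dst: str, preset: str = "/ebook") -> list[str]:
    """Build a Ghostscript command that rewrites a PDF with a quality preset."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",                 # write a new PDF rather than an image
        f"-dPDFSETTINGS={preset}",           # /ebook downsamples images to about 150 DPI
        "-dNOPAUSE", "-dBATCH", "-dQUIET",   # run non-interactively
        f"-sOutputFile={dst}",
        src,
    ]

def compress(src: str, dst: str, preset: str = "/ebook") -> None:
    """Run Ghostscript; raises if it is not installed or exits with an error."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript ('gs') was not found on PATH")
    subprocess.run(ghostscript_command(src, dst, preset), check=True)

# Example (placeholder file names):
# compress("report.pdf", "report-small.pdf")
```

This is a lossy pass: images are re-encoded, so keep the original file until you have checked the output.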
Cause 2: Scanned pages
A scanned PDF is not a text document — it's a photograph of each page, wrapped in a PDF envelope. Every page is a raster image, and its size is determined entirely by the scan resolution and compression used by your scanner.
Most office scanners default to 300 DPI, which is designed for printing. If you're sending the document digitally, you almost never need that resolution. A 10-page contract scanned at 300 DPI might be 15–25 MB. The same contract scanned at 150 DPI would be 3–6 MB and looks nearly identical on most screens.
Fix options:
- Re-scan at 150 DPI if you have access to the original document — this is the cleanest fix.
- Use Ghostscript /ebook to downsample the images inside the existing PDF to 150 DPI without re-scanning.
- Use an online compressor like iLovePDF. These tools re-encode the embedded images at lower quality and lower DPI, typically reducing scanned PDFs by 60–80%.
Note that lossless tools (including ClickyFix) will have little impact on a purely scanned PDF, because there is no structural overhead to remove — the bulk of the file is the image data itself.
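The 300 DPI versus 150 DPI gap comes from pixel count, which grows with the square of the resolution. A rough sketch of the arithmetic, assuming file size scales in proportion to pixel count (real results vary with compression settings and page content):

```python
def pixel_ratio(from_dpi: int, to_dpi: int) -> float:
    """How many times more pixels a page has at from_dpi than at to_dpi."""
    return (from_dpi / to_dpi) ** 2

def rescaled_size_mb(size_mb: float, from_dpi: int, to_dpi: int) -> float:
    """Estimated size after downsampling, assuming size tracks pixel count."""
    return size_mb / pixel_ratio(from_dpi, to_dpi)

print(pixel_ratio(300, 150))  # 4.0
for size in (15, 25):
    print(f"{size} MB at 300 DPI -> about {rescaled_size_mb(size, 300, 150)} MB at 150 DPI")
```

Applied to the 15–25 MB contract above, this gives 3.75–6.25 MB, in line with the 3–6 MB estimate.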
Cause 3: Fully embedded fonts
PDF fonts come in two forms: fully embedded (the complete font file is included) and subsetted (only the characters actually used in the document are included).
A full font file for a single typeface weight (e.g., “Helvetica Neue Regular”) is typically 250–800 KB. A document using four fonts — say, a regular, bold, italic, and monospace — could be carrying 1–3 MB of font data alone. A subsetted version of those same fonts, containing only the characters that appear in the text, might weigh 30–80 KB total.
The difference is invisible to the reader. Both look identical on screen. But on the figures above, the fully embedded version carries roughly 10–100× more font data.
Fix: When exporting from the source application:
- Word / Office: File → Save As → PDF → Options → check “ISO 19005-1 compliant (PDF/A)” — this forces font subsetting
- InDesign: Export → set Compatibility to Acrobat 5 or higher → Advanced tab → Fonts → set “Subset fonts when percent of characters used is less than” to 100%
- Adobe Acrobat: File → Save As Other → Optimised PDF → Fonts tab → check all embedded fonts and click “Unembed” or reduce the subsetting threshold
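To check whether a PDF's fonts are already subsetted, you can look at the font names: a subsetted font carries a tag of six uppercase letters and a plus sign in front of its name (for example ABCDEF+Helvetica). The sketch below scans the raw bytes for that pattern. It is a heuristic only: PDFs that store their dictionaries in compressed object streams will hide font names from a raw scan, and an untagged name may be either a fully embedded font or one that is not embedded at all.

```python
import re

# /BaseFont /ABCDEF+Helvetica  -> subset tag "ABCDEF", name "Helvetica"
# /BaseFont /Arial-BoldMT      -> no tag, so not a subset
FONT_NAME = re.compile(rb"/BaseFont\s*/(?:([A-Z]{6})\+)?([^\s/<>\[\]()]+)")

def classify_fonts(pdf_bytes: bytes) -> dict[str, set[str]]:
    """Split the font names found in raw PDF bytes into subset and non-subset."""
    found: dict[str, set[str]] = {"subset": set(), "not_subset": set()}
    for tag, name in FONT_NAME.findall(pdf_bytes):
        key = "subset" if tag else "not_subset"
        found[key].add(name.decode("latin-1"))
    return found

# Usage with a real file (placeholder name):
# with open("yourfile.pdf", "rb") as fh:
#     print(classify_fonts(fh.read()))
```

If most fonts land in the non-subset group and the document is text-heavy, re-exporting with subsetting enabled is likely to help.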
Cause 4: Hidden metadata and revision history
Every PDF carries metadata — some visible, much invisible. The most common sources of metadata bloat:
| Metadata type | Where it comes from | Typical size | Safe to remove? |
|---|---|---|---|
| Author / Title / Subject / Keywords | Office app document properties | 1–5 KB | Yes |
| XMP metadata packet | Adobe Creative Cloud, InDesign, Acrobat | 5–30 KB | Yes |
| Document thumbnail preview | Most desktop export apps | 20–200 KB | Yes |
| Revision history / tracked changes | Word exports | 50 KB – 3 MB | Yes (unless you need to track edits) |
| Embedded ICC colour profile | Design tools, scanners, cameras | 300 KB – 4 MB | Usually yes — sRGB is fine for screen |
| Undo data / editor state | Word, Google Docs exports | 100 KB – 5 MB | Yes |
| Digital signatures | Signed contracts, certified PDFs | Varies | No — removing breaks the signature |
Fix: Use Save As (not Save) in Word before exporting — this creates a clean copy without accumulated undo history. In Acrobat: Document → Examine Document → Remove All to strip metadata. Lossless tools like ClickyFix also strip most metadata automatically as part of the compression process.
Cause 5: Office app export bloat
Microsoft Word, Excel, and PowerPoint are notorious for producing oversized PDFs. A 2 MB Word document often exports to a 6–12 MB PDF. The reasons stack up:
- Embedded preview images: Office embeds a screen-resolution preview of each page for file browsers to display without opening the file. On a 50-page document this alone can be 5–10 MB.
- Full font embedding: Office embeds complete font files by default unless you specifically enable subsetting.
- Image re-sampling disabled: By default, Word does not downsample images on export. The full-resolution version of every inserted photo is included.
- Accumulated file history: Repeated saves in Word compound the undo history embedded in the exported file.
Fixes specific to Office apps:
- Word: File → Save As → PDF → Options → select “Minimum size (publishing online)” in the “Optimise for” section
- PowerPoint: File → Export → Create PDF/XPS → click Options → check “Bitmap text when fonts may not be embedded” and lower the image DPI
- Excel: Before exporting, delete unused sheets and clear any chart data series you don't need
- All Office apps: Before exporting, do File → Info → Check for Issues → Inspect Document → Inspect → Remove All to clear document properties, personal info, and revision data
Cause 6: Duplicate embedded resources
When PDFs are merged — either by combining multiple files or by appending pages — each source file brings its own copy of every resource it contains. A font used across 10 merged documents may be embedded 10 separate times. An image used as a header or footer on every page might be stored as a separate object on each page rather than referenced once.
This is a common hidden cause of large PDFs in workflows involving templates, monthly report generation, or document assembly pipelines. The content looks fine — the problem is entirely internal duplication.
Fix:
- Adobe Acrobat PDF Optimiser: File → Save As Other → Optimised PDF → check “Discard Duplicate Content Streams.” This deduplicates identical objects throughout the file.
- Ghostscript: Running any Ghostscript compression pass will also consolidate duplicate streams automatically.
- ClickyFix: The lossless restructure removes some but not all types of duplication — it's a useful first pass but won't catch embedded-font duplication in merged files.
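Conceptually, deduplication means finding objects whose bytes are identical and keeping one copy that every page references. A minimal sketch of that idea, assuming you already have a mapping from object number to stream bytes; the `streams` input is hypothetical, and extracting it from a real PDF needs a parser that is not shown here. This illustrates the principle, not how Acrobat or Ghostscript implement it.

```python
import hashlib
from collections import defaultdict

def find_duplicate_streams(streams: dict[int, bytes]) -> list[list[int]]:
    """Group object numbers whose stream bytes are byte-for-byte identical."""
    by_digest: dict[str, list[int]] = defaultdict(list)
    for obj_num, data in streams.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(obj_num)
    return [sorted(nums) for nums in by_digest.values() if len(nums) > 1]

def wasted_bytes(streams: dict[int, bytes], groups: list[list[int]]) -> int:
    """Bytes that would be saved by keeping one copy from each duplicate group."""
    return sum(len(streams[group[0]]) * (len(group) - 1) for group in groups)

# Hypothetical merged file: the same 400-byte font stream stored three times.
streams = {1: b"font" * 100, 2: b"logo", 3: b"font" * 100, 4: b"font" * 100}
groups = find_duplicate_streams(streams)
print(groups, wasted_bytes(streams, groups))  # [[1, 3, 4]] 800
```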
Cause 7: Embedded thumbnails and preview images
Several applications embed additional preview or thumbnail images inside the PDF that are invisible to the reader but consumed by operating system file browsers, Adobe Bridge, and document management systems.
- Adobe InDesign embeds a full-resolution rasterised version of every page as a “display proxy” — on a 20-page design document, this alone can add 5–15 MB that readers never see.
- Microsoft Office embeds a screen-resolution thumbnail of the first page.
- Older Acrobat versions embedded per-page thumbnails directly in the file.
- Canva and Figma exports sometimes include original asset data alongside the rendered output.
Fix:
- InDesign: When exporting PDF, go to General tab → uncheck “Optimise for Fast Web View” and “Include Rasterized Pages Preview.” This alone can save 30–50% on design-heavy files.
- Acrobat: Document → Examine Document → check for “Embedded Thumbnails” → Remove All.
- Any PDF: A Ghostscript pass with /ebook strips embedded thumbnails as part of its restructuring.
How to diagnose your PDF
You don't need specialist software to diagnose what's making your PDF large. Start with this quick diagnostic table based on where your PDF came from:
| PDF origin | Most likely cause | Best fix |
|---|---|---|
| Exported from Word / Excel / PowerPoint | Full embedded fonts + metadata + export settings | Re-export using 'Minimum size' option, or use ClickyFix for lossless savings |
| Exported from InDesign or Figma | Embedded preview images + high-res raster images + full fonts | Re-export without preview, downsample images to 150 DPI |
| Exported from Canva | High-res images + embedded original assets | Use Canva's 'Compress PDF' export option, or run through Ghostscript |
| Scanned with a photocopier or scanner | High-DPI raster image on every page | Re-scan at 150 DPI, or use Ghostscript /ebook to downsample |
| Merged from multiple PDFs | Duplicate fonts and resources from each source file | Run through Acrobat Optimiser to deduplicate resources |
| Old PDF, origin unknown | Accumulated metadata, old-format streams, full fonts | ClickyFix lossless pass first; then Ghostscript if still too large |
| Looks mostly text but is huge | Embedded previews, revision history, or ICC colour profile | Use Document Inspector in Acrobat to identify; strip with Acrobat or Ghostscript |
| PDF from a design studio or print shop | Print-quality images at 300+ DPI, full fonts, CMYK colour profile | If for screen use, run Ghostscript /ebook — accept some image quality reduction |
Going deeper: PDF structure inspection
If the table above doesn't pinpoint the problem, you can inspect the PDF's internal structure:
- Adobe Acrobat Pro: File → Properties → Description tab shows document info; File → Save As Other → Optimised PDF → Audit Space Usage shows exactly what each component contributes to the file size.
- pdfinfo (command line): run pdfinfo yourfile.pdf (part of the poppler-utils package on Linux/Mac) to see the page count, PDF version, and a metadata summary.
- PDF Examiner (online): Uploads and analyses the PDF structure, listing embedded fonts, images, and object types. Use only for non-sensitive files.
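If you want to compare a file against the benchmark table, bytes per page is a useful first number. The sketch below parses the "Key: value" lines that pdfinfo prints and derives that figure. It assumes pdfinfo is installed and that its output includes the Pages and File size fields; the file name is a placeholder.

```python
import subprocess

def run_pdfinfo(path: str) -> str:
    """Return pdfinfo's text output for a file (requires poppler-utils)."""
    result = subprocess.run(["pdfinfo", path], capture_output=True,
                            text=True, check=True)
    return result.stdout

def parse_pdfinfo(output: str) -> dict[str, str]:
    """Turn 'Key:   value' lines into a dictionary."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def bytes_per_page(info: dict[str, str]) -> float:
    """Average bytes per page, from the 'Pages' and 'File size' fields."""
    pages = int(info["Pages"])
    size = int(info["File size"].split()[0])  # field looks like "18874368 bytes"
    return size / pages

# Usage (placeholder file name):
# info = parse_pdfinfo(run_pdfinfo("yourfile.pdf"))
# print(f"{bytes_per_page(info) / 1024:.0f} KB per page")
```

A few tens of kilobytes per page points to a text document. Several hundred kilobytes or more per page usually means images dominate.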
Frequently Asked Questions
Q: Why is my PDF larger than the Word document it was exported from?
A: A .docx file is a ZIP container: the text is stored as compressed XML, and fonts are normally referenced by name rather than embedded. When you export to PDF, the application embeds copies of every font used alongside every image in the document, plus metadata, thumbnails, and sometimes revision history. A 2 MB Word file easily becomes an 8–12 MB PDF if images are inserted at high resolution and fonts are fully embedded.
Q: Why is my PDF larger after re-saving it in Acrobat?
A: Each time you save a PDF in Acrobat without using 'Save As', Acrobat appends an update section at the end of the file rather than rewriting the whole file. After many saves, these accumulated update sections become significant. Use File → Save As to create a compacted version of the file — it will typically be noticeably smaller.
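Each appended update section ends with its own %%EOF marker, so counting those markers gives a rough view of how many times a file has been incrementally saved and how much each save added. This is a heuristic sketch: linearised ("fast web view") PDFs contain two markers even without any updates, and a PDF embedded inside another will add its own.

```python
def revision_sizes(pdf_bytes: bytes) -> list[int]:
    """Byte length of each section of the file that ends in an %%EOF marker."""
    marker = b"%%EOF"
    sizes, start = [], 0
    pos = pdf_bytes.find(marker)
    while pos != -1:
        end = pos + len(marker)
        sizes.append(end - start)   # bytes added since the previous marker
        start = end
        pos = pdf_bytes.find(marker, end)
    return sizes

# Usage (placeholder file name):
# with open("yourfile.pdf", "rb") as fh:
#     sizes = revision_sizes(fh.read())
# print(len(sizes), "sections:", sizes)
```

A long list of sections is a sign that Save As will produce a noticeably smaller file.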
Q: Why do scanned PDFs take up so much more space than typed documents?
A: A typed document stores text as characters — each character is just a byte or two. A scanned document replaces each page with a photograph, which is millions of pixels. At 300 DPI on A4, that's over 8 million pixels per page. Even with JPEG compression, each page is 1–3 MB. Typed text on the same page would be 5–20 KB.
Q: Does compressing a PDF always make it smaller?
A: No. If the PDF is already well-optimised — subsetted fonts, compressed images at the right DPI, no excess metadata — a lossless compressor will have little to work with and the output may be the same size or fractionally larger. This is normal. It means the PDF is already efficiently structured.
Q: Why is my emailed PDF attachment larger than the file I sent?
A: Email systems encode binary attachments in Base64, which adds roughly 33% to the size. A 19 MB PDF becomes a 25 MB email — right at Gmail's limit. This is why the email bounces even when your file appears to be under the limit. Compress the PDF to around 18 MB to safely clear Gmail's threshold.
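The Base64 overhead is easy to work out: every 3 bytes of binary data become 4 characters of text. A short sketch of the arithmetic, assuming decimal megabytes and ignoring the line breaks and headers that a real email adds on top (which push the usable limit slightly lower still):

```python
import base64

def base64_size(raw_bytes: int) -> int:
    """Length of the Base64 text for raw_bytes of binary data (no line breaks)."""
    return 4 * ((raw_bytes + 2) // 3)

def max_raw_size(limit_bytes: int) -> int:
    """Largest binary payload whose Base64 encoding fits within limit_bytes."""
    return (limit_bytes // 4) * 3

MB = 1_000_000
print(base64_size(19 * MB) / MB)   # 25.333336
print(max_raw_size(25 * MB) / MB)  # 18.75

# Sanity check against the standard library encoder on a small payload.
payload = b"x" * 1000
assert base64_size(len(payload)) == len(base64.b64encode(payload))
```

The 18.75 MB ceiling is why aiming for around 18 MB leaves a safe margin.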
Q: What is the absolute smallest a PDF can be?
A: A single page of plain text with one subsetted system font and no images can be under 10 KB. A 100-page text-only document with efficient font subsetting and Flate-compressed text streams can be under 500 KB. The lower bound for image-containing PDFs is set by the image resolution and compression ratio you are willing to accept; there is no fixed floor beyond that.
Published 10 June 2026 · ClickyFix Blog · File size measurements based on representative samples across common document types.