Why Is My PDF So Large? 7 Causes and How to Fix Each One

10 June 2026 · 8 min read · ~2,200 words

You exported a simple five-page document and it came out at 18 MB. Or you received a PDF you need to email on, but it's 40 MB and Gmail won't take it. Something is hiding inside that file — and it's not the text. This guide identifies every common cause, tells you how to spot it, and gives you a concrete fix for each one.

80% of large PDFs are caused by unoptimised images

300 KB: typical size of a clean 10-page text PDF

4×: file size penalty of scanning at 300 DPI vs 150 DPI

25 MB: Gmail attachment limit, where most people first notice the problem

What a PDF actually contains

PDF (Portable Document Format, ISO 32000) is a container format — a single .pdf file can hold any combination of the following:

Text streams — the actual characters in the document, usually compressed with Flate/Deflate
Raster images — JPEG, PNG, JBIG2, or CCITT-encoded image data embedded per-page
Vector graphics — lines, shapes, and curves described as PostScript-like path commands
Font data — full or partial copies of the fonts used, so the document renders identically on any device
Metadata — author, title, creation date, revision history, XMP data packets
Annotations and form fields — interactive elements that can add significant overhead
Embedded previews and thumbnails — pre-rendered images used by file browsers and apps
Cross-reference tables — an internal index of every object in the file

An A4 page of plain text might weigh 30 KB. The same page scanned at 300 DPI as an image might weigh 1–3 MB. Working out which of these components dominates your specific PDF is the key to shrinking it.

Size benchmarks — what's normal?

Before diagnosing a problem, you need a baseline. Here are the expected sizes for common PDF types at 10 and 50 pages:

Document type	10 pages	50 pages	What drives the size
Text only (digitally born)	50–200 KB	200–800 KB	Character streams + fonts
Text + charts and diagrams	200–600 KB	1–3 MB	Vector graphics + fonts
Word / Office export	500 KB – 2 MB	2–10 MB	Fonts + embedded images + metadata
PowerPoint export	3–15 MB	15–80 MB	Full-bleed slide images + fonts per slide
Design portfolio (screen quality)	2–8 MB	10–40 MB	JPEG images at 150 DPI
Design portfolio (print quality)	10–40 MB	50–200 MB	JPEG/TIFF images at 300 DPI
Scanned document at 300 DPI	5–20 MB	25–100 MB	Raster image per page
Scanned document at 150 DPI	1–5 MB	5–25 MB	Raster image per page (lower res)
Scanned document at 72 DPI	200–800 KB	1–4 MB	Low-res raster — readable on screen only

💡 Quick check: If your PDF is significantly larger than the range for its type, something avoidable is inflating it. Use the causes below to identify what.

Cause 1: High-resolution embedded images

This is the most common cause of large PDFs by a wide margin. When you insert an image into Word, InDesign, Figma, or any other tool, the application often embeds the image at its original resolution, even if you've scaled it down to fit on a page.

A single 4K photograph (3840×2160 pixels) might be 8–12 MB on disk as a JPEG. Placed on an A4 page and exported to PDF, Word may embed that full 12 MB image even though a page meant for screen viewing only needs about 150 DPI, which would be under 1 MB. The extra resolution is invisible but the file size is very real.

Image resolution	A4 page dimensions at this DPI	Approx. JPEG size per page	Use case
72 DPI	595 × 842 px	40–150 KB	Screen-only, lowest quality
96 DPI	794 × 1123 px	80–300 KB	Web display
150 DPI	1240 × 1754 px	200–700 KB	Screen + casual print — the sweet spot
300 DPI	2480 × 3508 px	800 KB – 3 MB	Professional print quality
600 DPI	4961 × 7016 px	3–12 MB	Prepress / large-format — rarely needed
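
The pixel dimensions in the table follow from the page size in inches multiplied by the DPI. As a rough sketch (the function names are illustrative and not taken from any particular tool), the same arithmetic also shows the effective resolution of a photo once it has been scaled to fit the page:

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # A4 width and height in millimetres


def page_pixels(dpi, size_mm=A4_MM):
    """Pixel dimensions of a page rendered at the given DPI."""
    width_mm, height_mm = size_mm
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def effective_dpi(image_width_px, placed_width_mm):
    """Resolution an image ends up with once scaled to a printed width."""
    return image_width_px / (placed_width_mm / MM_PER_INCH)


for dpi in (72, 96, 150, 300, 600):
    print(dpi, page_pixels(dpi))

# A 3840-pixel-wide photo stretched across the full width of an A4 page
print(round(effective_dpi(3840, 210)))  # 464
```

At about 464 DPI, that photo carries roughly three times the linear resolution a 150 DPI screen-quality page needs, and more if it is placed smaller than full width.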

Fix: For PDFs you're emailing or sharing on screen, re-export from the source file with images downsampled to 150 DPI. In Word: File → Save As → PDF, then under “Optimise for” choose “Minimum size (publishing online)” instead of “Standard.” In InDesign: Export → Compression tab → Downsample to 150 PPI. If you don't have the source file, use Ghostscript's -dPDFSETTINGS=/ebook preset or an online compressor that re-encodes images.

If you only need lossless structural savings (no image re-encoding), ClickyFix PDF Compressor can reduce Word and Office exports by 15–35% with zero quality loss.

Cause 2: Scanned pages

A scanned PDF is not a text document — it's a photograph of each page, wrapped in a PDF envelope. Every page is a raster image, and its size is determined entirely by the scan resolution and compression used by your scanner.

Most office scanners default to 300 DPI, which is designed for printing. If you're sending the document digitally, you almost never need that resolution. A 10-page contract scanned at 300 DPI might be 15–25 MB. The same contract scanned at 150 DPI would be 3–6 MB and look nearly identical on most screens.

ℹ️ The 4× rule: Halving the scan resolution (e.g., 300 DPI → 150 DPI) halves the image dimensions in each direction, so the resulting image has a quarter of the pixels and is roughly 4× smaller in file size after compression.
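
The 4× figure is simply the square of the resolution ratio. A minimal sketch of the arithmetic follows. Actual savings after compression depend on the page content, so treat the result as an estimate rather than a guarantee:

```python
def pixel_ratio(old_dpi, new_dpi):
    """How many times fewer pixels a page has when scanned at new_dpi."""
    return (old_dpi / new_dpi) ** 2


print(pixel_ratio(300, 150))  # 4.0
print(pixel_ratio(600, 150))  # 16.0
print(pixel_ratio(300, 200))  # 2.25
```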

Fix options:

Re-scan at 150 DPI if you have access to the original document — this is the cleanest fix.
Use Ghostscript /ebook to downsample the images inside the existing PDF to 150 DPI without re-scanning.
Use an online compressor like iLovePDF — these re-encode the embedded images at lower quality and lower DPI, typically reducing scanned PDFs by 60–80%.

Note that lossless tools (including ClickyFix) will have little impact on a purely scanned PDF, because there is no structural overhead to remove — the bulk of the file is the image data itself.
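
For the Ghostscript route, the command looks roughly like the one built below. This is a sketch, assuming Ghostscript is installed as `gs` (on Windows the executable is typically `gswin64c`); the file names are placeholders, and the command only runs if both the tool and the input file are found:

```python
import os
import shutil
import subprocess


def ghostscript_command(src, dst, preset="/ebook", executable="gs"):
    """Build a Ghostscript command that rewrites src into dst using a preset."""
    return [
        executable,
        "-sDEVICE=pdfwrite",        # write a new PDF
        f"-dPDFSETTINGS={preset}",  # /ebook targets roughly 150 DPI images
        "-dNOPAUSE",                # do not pause between pages
        "-dBATCH",                  # exit when the job is finished
        "-dQUIET",                  # suppress routine output
        f"-sOutputFile={dst}",
        src,
    ]


# Placeholder file names for illustration
cmd = ghostscript_command("scan.pdf", "scan-150dpi.pdf")
print(" ".join(cmd))

# Only run it if Ghostscript is installed and the input exists.
if shutil.which(cmd[0]) and os.path.exists("scan.pdf"):
    subprocess.run(cmd, check=True)
```

Writing to a new output file keeps the original scan intact, which matters because the downsampling is lossy.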

Cause 3: Fully embedded fonts

PDF fonts come in two forms: fully embedded (the complete font file is included) and subsetted (only the characters actually used in the document are included).

A full font file for a single typeface weight (e.g., “Helvetica Neue Regular”) is typically 250–800 KB. A document using four fonts — say, a regular, bold, italic, and monospace — could be carrying 1–3 MB of font data alone. A subsetted version of those same fonts, containing only the characters that appear in the text, might weigh 30–80 KB total.

The difference is invisible to the reader. Both look identical on screen. But going by the figures above, the fully embedded version carries at least 10× more font data, and potentially far more.

Fix: When exporting from the source application:

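Word / Office: File → Save As → PDF → Options → check “ISO 19005-1 compliant (PDF/A)” with care: PDF/A requires every font to be embedded and allows subsets, but it does not by itself force subsetting, so compare the output size against a normal export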
InDesign: Export → Adobe PDF → Advanced tab → Fonts → set “Subset fonts when percent of characters used is less than” to 100%
Adobe Acrobat: File → Save As Other → Optimised PDF → Fonts tab → check all embedded fonts and click “Unembed” or reduce the subsetting threshold
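
To check whether an existing PDF already uses subsetted fonts, one clue is the font name: the PDF specification gives a subsetted font a tag of six uppercase letters followed by a plus sign, such as ABCDEF+Helvetica. Tools like pdffonts (part of poppler-utils) list these names. Below is a small sketch of that check, using made-up font names as sample input:

```python
import re

# Subset fonts carry a six-uppercase-letter tag and "+" before the real name.
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def is_subset_font(name):
    """True if a PDF font name carries a subset tag, e.g. 'ABCDEF+Arial'."""
    return bool(SUBSET_TAG.match(name.lstrip("/")))


# Hypothetical font names, as a PDF inspection tool might list them
fonts = ["/BCDEEE+Calibri", "/Arial-BoldMT", "ZXCVBN+TimesNewRomanPSMT"]
for name in fonts:
    print(name, "subset" if is_subset_font(name) else "not tagged as a subset")
```

A name without the tag does not prove the font is fully embedded, since it may not be embedded at all, so the check is a hint about where to look rather than a verdict.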

Cause 4: Hidden metadata and revision history

Every PDF carries metadata — some visible, much invisible. The most common sources of metadata bloat:

Metadata type	Where it comes from	Typical size	Safe to remove?
Author / Title / Subject / Keywords	Office app document properties	1–5 KB	Yes
XMP metadata packet	Adobe Creative Cloud, InDesign, Acrobat	5–30 KB	Yes
Document thumbnail preview	Most desktop export apps	20–200 KB	Yes
Revision history / tracked changes	Word exports	50 KB – 3 MB	Yes (unless you need to track edits)
Embedded ICC colour profile	Design tools, scanners, cameras	300 KB – 4 MB	Usually yes — sRGB is fine for screen
Undo data / editor state	Word, Google Docs exports	100 KB – 5 MB	Yes
Digital signatures	Signed contracts, certified PDFs	Varies	No — removing breaks the signature

⚠️ Privacy note: Revision history and tracked changes in a Word export can include every edit ever made to the document, including deleted text and previous versions of clauses. If you're sharing a contract or report externally, always export from a freshly saved “Save As” copy to strip this data.

Fix: Use Save As (not Save) in Word before exporting — this creates a clean copy without accumulated undo history. In Acrobat: Document → Examine Document → Remove All to strip metadata. Lossless tools like ClickyFix also strip most metadata automatically as part of the compression process.

Cause 5: Office app export bloat

Microsoft Word, Excel, and PowerPoint are notorious for producing oversized PDFs. A 2 MB Word document often exports to a 6–12 MB PDF. The reasons stack up:

Embedded preview images: Office embeds a screen-resolution preview of each page for file browsers to display without opening the file. On a 50-page document this alone can be 5–10 MB.
Full font embedding: Office embeds complete font files by default unless you specifically enable subsetting.
Image re-sampling disabled: By default, Word does not downsample images on export. The full-resolution version of every inserted photo is included.
Accumulated file history: Repeated saves in Word compound the undo history embedded in the exported file.

Fixes specific to Office apps:

Word: File → Save As → PDF → select “Minimum size (publishing online)” under “Optimise for” in the save dialog
PowerPoint: File → Export → Create PDF/XPS → click Options → check “Bitmap text when fonts may not be embedded” and lower the image DPI
Excel: Before exporting, delete unused sheets and clear any chart data series you don't need
All Office apps: Before exporting, do File → Info → Inspect Document → Inspect → Remove All to clear document properties, personal info, and revision data

Cause 6: Duplicate embedded resources

When PDFs are merged — either by combining multiple files or by appending pages — each source file brings its own copy of every resource it contains. A font used across 10 merged documents may be embedded 10 separate times. An image used as a header or footer on every page might be stored as a separate object on each page rather than referenced once.

This is a common hidden cause of large PDFs in workflows involving templates, monthly report generation, or document assembly pipelines. The content looks fine — the problem is entirely internal duplication.

Fix:

Adobe Acrobat PDF Optimiser: File → Save As Other → Optimised PDF → check “Discard Duplicate Content Streams.” This deduplicates identical objects throughout the file.
Ghostscript: Running any Ghostscript compression pass will also consolidate duplicate streams automatically.
ClickyFix: The lossless restructure removes some but not all types of duplication — it's a useful first pass but won't catch embedded-font duplication in merged files.
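
Conceptually, deduplication means spotting objects with identical bytes and pointing every reference at a single copy. The sketch below illustrates the idea on a simplified stand-in for a PDF's objects (a plain dictionary of ID to bytes). It is not how Acrobat, Ghostscript, or ClickyFix implement it, and the sample data is invented:

```python
import hashlib


def deduplicate(objects):
    """Map each object ID to the first object with identical content.

    `objects` is a dict of object ID -> raw bytes, a simplified stand-in
    for the streams inside a PDF.
    """
    first_seen = {}  # content hash -> ID of the first object with that content
    remap = {}       # every object ID -> ID of the copy that should be kept
    for obj_id, data in objects.items():
        digest = hashlib.sha256(data).hexdigest()
        remap[obj_id] = first_seen.setdefault(digest, obj_id)
    return remap


# Hypothetical merged file: the same logo stored once per source document
objects = {1: b"logo-bytes", 2: b"page-1-text", 3: b"logo-bytes", 4: b"logo-bytes"}
remap = deduplicate(objects)
print(remap)  # {1: 1, 2: 2, 3: 1, 4: 1}

kept = set(remap.values())
saved = sum(len(data) for obj_id, data in objects.items() if obj_id not in kept)
print(saved)  # 20 bytes no longer stored twice
```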

Cause 7: Embedded thumbnails and preview images

Several applications embed additional preview or thumbnail images inside the PDF that are invisible to the reader but consumed by operating system file browsers, Adobe Bridge, and document management systems.

Adobe InDesign embeds a full-resolution rasterised version of every page as a “display proxy” — on a 20-page design document, this alone can add 5–15 MB that readers never see.
Microsoft Office embeds a screen-resolution thumbnail of the first page.
Older Acrobat versions embedded per-page thumbnails directly in the file.
Canva and Figma exports sometimes include original asset data alongside the rendered output.

Fix:

InDesign: When exporting PDF, go to General tab → uncheck “Optimise for Fast Web View” and “Include Rasterized Pages Preview.” This alone can save 30–50% on design-heavy files.
Acrobat: Document → Examine Document → check for “Embedded Thumbnails” → Remove All.
Any PDF: A Ghostscript pass with /ebook strips embedded thumbnails as part of its restructuring.

How to diagnose your PDF

You don't need specialist software to diagnose what's making your PDF large. Start with this quick diagnostic table based on where your PDF came from:

PDF origin	Most likely cause	Best fix
Exported from Word / Excel / PowerPoint	Full embedded fonts + metadata + export settings	Re-export using 'Minimum size' option, or use ClickyFix for lossless savings
Exported from InDesign or Figma	Embedded preview images + high-res raster images + full fonts	Re-export without preview, downsample images to 150 DPI
Exported from Canva	High-res images + embedded original assets	Use Canva's 'Compress PDF' export option, or run through Ghostscript
Scanned with a photocopier or scanner	High-DPI raster image on every page	Re-scan at 150 DPI, or use Ghostscript /ebook to downsample
Merged from multiple PDFs	Duplicate fonts and resources from each source file	Run through Acrobat Optimiser to deduplicate resources
Old PDF, origin unknown	Accumulated metadata, old-format streams, full fonts	ClickyFix lossless pass first; then Ghostscript if still too large
Looks mostly text but is huge	Embedded previews, revision history, or ICC colour profile	Use Audit Space Usage in Acrobat's PDF Optimiser to identify; strip with Acrobat or Ghostscript
PDF from a design studio or print shop	Print-quality images at 300+ DPI, full fonts, CMYK colour profile	If for screen use, run Ghostscript /ebook — accept some image quality reduction

Going deeper: PDF structure inspection

If the table above doesn't pinpoint the problem, you can inspect the PDF's internal structure:

Adobe Acrobat Pro: File → Properties → Description tab shows document info; File → Save As Other → Optimised PDF → Audit Space Usage shows exactly what each component contributes to the file size.
pdfinfo (command line): pdfinfo yourfile.pdf — part of the poppler-utils package (Linux/Mac) — shows page count, PDF version, and metadata summary.
PDF Examiner (online): Uploads and analyses the PDF structure, listing embedded fonts, images, and object types. Use only for non-sensitive files.

ℹ️ The fastest diagnostic: If you compress your PDF with ClickyFix and the file barely changes (under 5% reduction), the bulk of your file is image data. No lossless tool can help further. You need Ghostscript or an image-downsampling compressor. If ClickyFix reduces it by 20% or more, the original had significant structural overhead that didn't need to be there.
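
The rule of thumb above can be written down as a small helper. This is a sketch only: the 5% and 20% thresholds come from the text, while the label for the band in between is an assumption added for completeness, and the byte counts are invented examples:

```python
def interpret_reduction(original_bytes, compressed_bytes):
    """Interpret the result of a lossless compression pass."""
    reduction = (original_bytes - compressed_bytes) / original_bytes * 100
    if reduction < 5:
        verdict = "mostly image data: downsample or re-encode the images"
    elif reduction >= 20:
        verdict = "significant structural overhead was removed"
    else:
        # The 5-20% band is not covered by the rule of thumb; this label is an assumption.
        verdict = "mixed: some overhead removed, images may still dominate"
    return round(reduction, 1), verdict


print(interpret_reduction(18_000_000, 17_600_000))  # about 2.2% reduction
print(interpret_reduction(6_000_000, 4_200_000))    # about 30% reduction
```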

Frequently Asked Questions

Q: Why is my PDF larger than the Word document it was exported from?

A: A .docx file is a zip archive that stores text as compressed XML and normally does not embed fonts. When you export to PDF, the application embeds the fonts used and every image in the document, plus metadata, thumbnails, and sometimes revision history. A 2 MB Word file easily becomes an 8–12 MB PDF if images are inserted at high resolution and fonts are fully embedded.

Q: Why is my PDF larger after re-saving it in Acrobat?

A: Each time you save a PDF in Acrobat without using 'Save As', Acrobat appends an update section at the end of the file rather than rewriting the whole file. After many saves, these accumulated update sections become significant. Use File → Save As to create a compacted version of the file — it will typically be noticeably smaller.
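
Each incremental save normally ends with its own %%EOF marker, so counting those markers gives a rough indication of how many update sections a file carries. The sketch below works under that assumption. Linearised ("fast web view") files can legitimately contain two markers, and the byte sequence could in principle appear inside binary data, so treat the count as a hint only:

```python
def count_eof_markers(pdf_bytes):
    """Count %%EOF markers; more than one or two suggests incremental saves."""
    return pdf_bytes.count(b"%%EOF")


# Tiny stand-in for a file that was saved once and then updated twice
sample = (b"%PDF-1.7\n...body...\n%%EOF\n"
          b"...update 1...\n%%EOF\n"
          b"...update 2...\n%%EOF\n")
print(count_eof_markers(sample))  # 3

# For a real file (the path is hypothetical):
# with open("report.pdf", "rb") as f:
#     print(count_eof_markers(f.read()))
```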

Q: Why do scanned PDFs take up so much more space than typed documents?

A: A typed document stores text as characters — each character is just a byte or two. A scanned document replaces each page with a photograph, which is millions of pixels. At 300 DPI on A4, that's over 8 million pixels per page. Even with JPEG compression, each page is 1–3 MB. Typed text on the same page would be 5–20 KB.

Q: Does compressing a PDF always make it smaller?

A: No. If the PDF is already well-optimised — subsetted fonts, compressed images at the right DPI, no excess metadata — a lossless compressor will have little to work with and the output may be the same size or fractionally larger. This is normal. It means the PDF is already efficiently structured.

Q: Why is my emailed PDF attachment larger than the file I sent?

A: Email systems encode binary attachments in Base64, which adds roughly 33% to the size, plus a little more for line breaks. A 19 MB PDF becomes a 25–26 MB email, just over Gmail's limit. This is why the email bounces even when your file appears to be under the limit. Compress the PDF to around 18 MB to safely clear Gmail's threshold.
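
The overhead is easy to estimate: Base64 turns every 3 bytes into 4 characters, and MIME wraps the result into lines of at most 76 characters. The sketch below assumes a CRLF after every line and 1 MB = 1,000,000 bytes. It ignores message headers and the email body, and how Gmail itself measures the limit is not something this calculation can confirm:

```python
import math


def base64_size(raw_bytes, line_length=76):
    """Size of a payload after Base64 encoding with CRLF line breaks."""
    encoded = 4 * math.ceil(raw_bytes / 3)              # 3 bytes -> 4 characters
    line_breaks = 2 * math.ceil(encoded / line_length)  # CRLF after each line
    return encoded + line_breaks


MB = 1_000_000
for size_mb in (18, 19):
    print(size_mb, round(base64_size(size_mb * MB) / MB, 1))
# 18 24.6
# 19 26.0
```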

Q: What is the absolute smallest a PDF can be?

A: A single page of plain text with one subsetted system font and no images can be under 10 KB. A 100-page text-only document with efficient font subsetting and Flate-compressed text streams can be under 500 KB. The lower bound for image-containing PDFs is set entirely by the image resolution and compression ratio; there is no fixed floor beyond that.

Published 10 June 2026 · ClickyFix Blog · File size measurements based on representative samples across common document types.
