PDF TOC Generator
Auto-detect headings by font size in any PDF and generate a clickable Table of Contents page with PDF bookmark outlines. Runs entirely in your browser.
How to Use
Generate a clickable Table of Contents for any PDF in four steps:
- Upload your PDF — Drag and drop a PDF onto the upload area, or click to browse. The tool scans every page for text content and analyzes font sizes to identify headings. This typically takes 1-3 seconds for a 50-page document.
- Review detected headings — The tool displays all identified headings with their classification (H1, H2, H3), page number, and a checkbox to include or exclude each one. The detection algorithm compares each text element's font size against the most common size in the document (body text) and classifies larger text as headings.
- Adjust heading levels — Use the dropdown next to each heading to change its level. H1 headings appear as top-level entries, H2 entries are indented once, and H3 entries are indented twice. You can also customize the TOC title and font size using the settings above the heading list.
- Click "Generate TOC" — The tool creates a new PDF with a formatted Table of Contents page inserted at the beginning. Each entry is a clickable link that jumps to the correct page, and PDF bookmark outlines are added for sidebar navigation in PDF readers like Adobe Acrobat and Preview.
The tool uses pdfjs-dist for text extraction and pdf-lib for document modification, both running entirely in your browser.
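The review and adjustment steps above amount to filtering and relabelling a list of detected headings. A minimal sketch of that idea, assuming a hypothetical `DetectedHeading` record and `applyReview` helper (neither name comes from the tool itself):

```typescript
type HeadingLevel = 1 | 2 | 3;

// Hypothetical shape of one row in the review list.
interface DetectedHeading {
  text: string;
  page: number;        // 1-based page number in the original PDF
  level: HeadingLevel; // level proposed by the detector
  included: boolean;   // state of the include/exclude checkbox
}

// Level overrides chosen in the dropdowns, keyed by row index.
type LevelOverrides = { [index: number]: HeadingLevel | undefined };

// Apply the user's choices: use the overridden level where one exists,
// then drop every heading whose checkbox is unchecked.
function applyReview(
  detected: DetectedHeading[],
  overrides: LevelOverrides,
): DetectedHeading[] {
  return detected
    .map((h, i) => ({ ...h, level: overrides[i] ?? h.level }))
    .filter((h) => h.included);
}

const detected: DetectedHeading[] = [
  { text: "Introduction", page: 1, level: 1, included: true },
  { text: "Page 3 of 50", page: 3, level: 3, included: false }, // false positive
  { text: "Background", page: 4, level: 1, included: true },
];

// The user demotes "Background" (row 2) from H1 to H2.
const reviewed = applyReview(detected, { 2: 2 });
```

Here `reviewed` keeps "Introduction" as H1 and "Background" as H2, and the unchecked false positive is gone.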
About This Tool
A Table of Contents is a navigational structure that maps a document's headings to their page numbers, allowing readers to locate specific sections without scrolling through every page. In digital PDFs, a TOC serves a dual purpose: it provides a visual reference on a dedicated page and, when properly linked, enables click-to-jump navigation that mirrors the experience of web page anchor links.
The heading detection algorithm in this tool operates on a statistical principle. Every text element in a PDF carries metadata including its font size, position, and font name. The algorithm first identifies the "body text" font size by finding the most frequently occurring size across the entire document. Text elements with font sizes significantly larger than the body size are classified as headings. Specifically, text at 1.6x the body size or larger becomes an H1, text from 1.3x up to 1.6x becomes an H2, and text from 1.15x up to 1.3x becomes an H3; anything below 1.15x is treated as body text. These ratios align with common typographic conventions: for 12pt body text the cutoffs fall at 19.2pt, 15.6pt, and 13.8pt, so a typical 20pt chapter title lands in H1, a 16pt section heading in H2, and a 14pt subsection heading in H3.
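The classification rule can be sketched in a few lines. This is an illustration under stated assumptions, not the tool's actual source: the function names are hypothetical, sizes are bucketed to one decimal place before counting, and each text element counts once regardless of its length.

```typescript
type HeadingLevel = 1 | 2 | 3;

// Ratios quoted in the text above.
const H1_RATIO = 1.6;
const H2_RATIO = 1.3;
const H3_RATIO = 1.15;

// Body size = most frequent font size among all text elements.
// Bucketing to one decimal (an assumption) makes 11.96 and 12.04 count
// as the same size.
function findBodyFontSize(sizes: number[]): number {
  const counts: { [size: string]: number } = {};
  for (const size of sizes) {
    const key = size.toFixed(1);
    counts[key] = (counts[key] ?? 0) + 1;
  }
  let bodySize = 0;
  let bestCount = 0;
  for (const key of Object.keys(counts)) {
    const count = counts[key] ?? 0;
    if (count > bestCount) {
      bestCount = count;
      bodySize = parseFloat(key);
    }
  }
  return bodySize;
}

// Returns the heading level for a font size, or null for body text.
function classify(fontSize: number, bodySize: number): HeadingLevel | null {
  if (bodySize <= 0) return null;
  const ratio = fontSize / bodySize;
  if (ratio >= H1_RATIO) return 1;
  if (ratio >= H2_RATIO) return 2;
  if (ratio >= H3_RATIO) return 3;
  return null;
}

const sizes = [12, 12, 20, 12, 16, 12, 14, 12];
const body = findBodyFontSize(sizes); // 12
const levels = sizes.map((s) => classify(s, body));
```

With these inputs the 20pt, 16pt, and 14pt elements come out as H1, H2, and H3, and every 12pt element is classified as body text (`null`).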
PDF bookmark outlines are a separate navigational structure from the visual TOC page. Outlines appear in the sidebar panel of PDF readers and provide a collapsible tree view of the document's structure. This tool generates both: a TOC page with clickable links and a bookmark outline tree with hierarchical nesting. The bookmark tree mirrors the heading hierarchy (H1 entries contain nested H2 entries, which in turn contain H3 entries), giving readers two complementary ways to navigate the document.
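One way to turn a flat, document-ordered heading list into such a nested tree is a stack of currently open ancestors. The sketch below is a generic technique, not necessarily how the tool does it; the `Heading` and `OutlineNode` types and the `buildOutline` function are hypothetical names.

```typescript
interface Heading {
  text: string;
  level: 1 | 2 | 3;
  page: number;
}

interface OutlineNode extends Heading {
  children: OutlineNode[];
}

// Nest headings by level. A heading becomes a child of the nearest
// preceding heading with a smaller level number; if there is none
// (for example an H2 before any H1), it becomes a top-level entry.
function buildOutline(headings: Heading[]): OutlineNode[] {
  const roots: OutlineNode[] = [];
  const stack: OutlineNode[] = []; // open ancestors, outermost first
  for (const h of headings) {
    const node: OutlineNode = { ...h, children: [] };
    let parent: OutlineNode | undefined = stack[stack.length - 1];
    // Close every open node at the same level or deeper.
    while (parent !== undefined && parent.level >= node.level) {
      stack.pop();
      parent = stack[stack.length - 1];
    }
    if (parent === undefined) {
      roots.push(node);
    } else {
      parent.children.push(node);
    }
    stack.push(node);
  }
  return roots;
}

const outline = buildOutline([
  { text: "Chapter 1", level: 1, page: 1 },
  { text: "1.1", level: 2, page: 2 },
  { text: "1.1.1", level: 3, page: 3 },
  { text: "1.2", level: 2, page: 5 },
  { text: "Chapter 2", level: 1, page: 8 },
]);
```

The result has two top-level entries; "Chapter 1" contains "1.1" and "1.2", and "1.1" contains "1.1.1".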
Unlike word processors that embed structural heading metadata directly into the document, PDF is a presentation-oriented format. PDF files store text as positioned glyphs with font references rather than semantic headings. This is why heading detection must rely on visual properties like font size rather than structural tags. The approach works well for documents produced by word processors (Microsoft Word, Google Docs, LibreOffice), LaTeX systems, and professional typesetting software that follow consistent typographic hierarchies. It may produce inaccurate results for highly decorative layouts, documents using uniform font sizes throughout, or scanned documents containing only image data without selectable text.
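Because a PDF has no heading tags, font size has to be read from each text item's geometry. A minimal sketch, assuming text items shaped like those returned by pdfjs-dist's `getTextContent()` (`str`, `transform`, `fontName`); the helper names `fontSizeOf` and `originOf` are hypothetical, and the tool may derive sizes differently:

```typescript
// Subset of the fields pdfjs-dist reports for each text item;
// only what this sketch needs.
interface TextItemLike {
  str: string;
  transform: number[]; // [a, b, c, d, e, f] text matrix
  fontName: string;
}

// Effective font size = length of the matrix's vertical scale vector (c, d).
// Using the vector length keeps the result stable for rotated text.
function fontSizeOf(item: TextItemLike): number {
  const [, , c = 0, d = 0] = item.transform;
  return Math.sqrt(c * c + d * d);
}

// Position of the text origin on the page, in PDF units.
function originOf(item: TextItemLike): { x: number; y: number } {
  const [, , , , e = 0, f = 0] = item.transform;
  return { x: e, y: f };
}

const upright: TextItemLike = {
  str: "1. Introduction",
  transform: [18, 0, 0, 18, 72, 700],
  fontName: "g_d0_f1", // illustrative internal font id
};
const rotated: TextItemLike = {
  str: "Sidebar label",
  transform: [0, 12, -12, 0, 500, 100], // rotated 90 degrees
  fontName: "g_d0_f2",
};
```

For the two sample items, `fontSizeOf` gives 18 and 12 respectively, so rotation does not distort the measured size.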
The generated TOC page uses Helvetica (a PDF standard font that requires no embedding) and differentiates heading levels visually: H1 entries are bold at full size, H2 entries are regular-weight at 93% size with one level of indentation, and H3 entries appear at 86% size with two levels of indentation. Dotted leader lines connect each heading to its page number, as in a conventional printed TOC. Page numbers in the TOC are automatically adjusted to account for the TOC page itself being inserted at the beginning of the document.
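The per-level styling and the page-number shift reduce to simple arithmetic. In the sketch below the size factors come from the description above; everything else is an assumption made for illustration: the 16pt indent step, regular weight for H3, the `tocPageCount` parameter, and all function names.

```typescript
type HeadingLevel = 1 | 2 | 3;

interface TocEntryStyle {
  fontName: "Helvetica-Bold" | "Helvetica";
  fontSize: number; // points
  indent: number;   // points from the left margin
}

// Size factors per level as described above.
const SIZE_FACTOR: { [L in HeadingLevel]: number } = { 1: 1.0, 2: 0.93, 3: 0.86 };
const INDENT_STEP = 16; // assumed indentation per level, in points

function styleFor(level: HeadingLevel, baseSize: number): TocEntryStyle {
  return {
    // H1 is bold; H3 is assumed to be regular weight like H2.
    fontName: level === 1 ? "Helvetica-Bold" : "Helvetica",
    fontSize: baseSize * SIZE_FACTOR[level],
    indent: (level - 1) * INDENT_STEP,
  };
}

// Page number to print once tocPageCount pages are inserted at the front.
function shiftedPage(originalPage: number, tocPageCount: number): number {
  return originalPage + tocPageCount;
}

// Dotted leader that fills the gap between heading text and page number.
// Both widths are in points and would come from font metrics.
function leader(gapWidth: number, dotWidth: number): string {
  const count = Math.max(0, Math.floor(gapWidth / dotWidth));
  return new Array(count + 1).join("."); // joins count + 1 empty slots with count dots
}

const h2Style = styleFor(2, 12);
const printedPage = shiftedPage(5, 1); // a heading on page 5 is listed as page 6
```

With a 12pt base size, an H2 entry is set at about 11.16pt with a 16pt indent, and a heading originally on page 5 is listed as page 6 after a one-page TOC is prepended.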
The link annotations on the TOC page use the PDF /Dest array format with the /Fit destination type, which tells the PDF reader to display the entire target page when the link is clicked. /Fit is widely supported across PDF viewers: Adobe Acrobat, Chrome's built-in PDF viewer, Firefox's pdf.js, Apple Preview, and Foxit Reader all handle it consistently.
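To make the /Dest array concrete, the sketch below writes out the PDF syntax of a link annotation dictionary with a /Fit destination. It only builds a string for illustration; a library such as pdf-lib would construct the equivalent dictionary through its own object API. The function name, the rectangle, and the object reference `12 0 R` are hypothetical.

```typescript
// Rectangle in PDF user space: [left, bottom, right, top].
type Rect = [number, number, number, number];

// Serialise a link annotation whose destination is an explicit
// destination array of the form [pageRef /Fit].
function linkAnnotation(rect: Rect, targetPageRef: string): string {
  return [
    "<<",
    "/Type /Annot",
    "/Subtype /Link",
    `/Rect [${rect.join(" ")}]`,     // clickable area on the TOC page
    "/Border [0 0 0]",               // no visible border around the link
    `/Dest [${targetPageRef} /Fit]`, // show the whole target page
    ">>",
  ].join(" ");
}

const annot = linkAnnotation([72, 700, 540, 714], "12 0 R");
// "<< /Type /Annot /Subtype /Link /Rect [72 700 540 714] /Border [0 0 0] /Dest [12 0 R /Fit] >>"
```

The first element of the /Dest array is an indirect reference to the target page object, and /Fit takes no further arguments, which is part of why it behaves predictably across viewers.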
Why Use This Tool
Generating a Table of Contents from an existing PDF addresses several common document management challenges:
- Academic papers and theses — Dissertation committees and journal reviewers expect a table of contents in long-form academic documents. When exporting from LaTeX or Google Docs without a TOC, this tool adds one without re-editing the source document. The PDF bookmark outlines can also help meet the accessibility requirements of many university repositories.
- Technical documentation and manuals — Product manuals, API documentation exports, and internal knowledge bases often span hundreds of pages. A navigable TOC with sidebar bookmarks transforms a flat PDF into a reference document where engineers can locate specific sections in seconds rather than scrolling through the entire file.
- Legal and regulatory filings — Court submissions, compliance reports, and contract bundles require structured navigation for judges, regulators, and opposing counsel to review efficiently. Some courts and regulators require a table of contents or PDF bookmarks for filings above a certain page count, and this tool adds both.
- Meeting minutes and board packages — Corporate secretaries compile board packages from multiple source documents merged into a single PDF. Adding a TOC creates a professional presentation that allows board members to navigate directly to specific agenda items, financial statements, or committee reports.
- E-books and digital publications — Self-published authors and small publishers distributing PDF e-books benefit from both a visual TOC page and sidebar bookmarks. E-reader applications that support PDF use the bookmark outline for chapter navigation, matching the experience of EPUB or MOBI formats.
- Scanned document organization — Once a scanned document has been run through OCR, its text layer contains font size information that this tool can use to detect chapter and section breaks. OCR-estimated sizes tend to be less consistent than those in digitally produced PDFs, so review the detected headings before generating. Adding a TOC to digitized archives, historical records, or scanned textbooks dramatically improves their usability.
This tool processes everything locally in your browser. The PDF text is extracted and analyzed using pdfjs-dist, and the modified document is assembled using pdf-lib — both are client-side JavaScript libraries. No server receives your document, making it suitable for confidential academic work, privileged legal documents, proprietary technical manuals, and any file where privacy is essential.