Creating Accessible PDFs in the Browser: A Guide to Using PDFKit

Recently, I was tasked with creating an accessible PDF document in the browser. This turned out to be a challenging and time-consuming process due to the lack of clear resources. To help others who may face a similar challenge, I’ve decided to write this article. Here, I’ll demonstrate how to create accessible PDFs in the browser using PDFKit.

Is This Necessary?

Before diving in, a caveat: as a front-end developer, I usually avoid resource-intensive work in the browser and prefer to offload it to the back end. This time, however, I was working on a Chrome extension with no back end, so everything had to happen in the browser. Whenever a back end isn’t an option, as in an offline-first app, PDF creation has to happen in the front end. I still advocate keeping the front end as lightweight as possible, but I’ve now learned how to create accessible PDFs directly in the browser.

Which Tools Can I Use?

I began by searching for a suitable PDF generation library. While several libraries can transform HTML or Markdown into PDFs, most don’t support creating accessible PDFs. I eventually settled on PDFKit, a powerful library that, though not perfect, provides documentation on how to create accessible PDFs.

What Is an Accessible PDF?

Not all PDFs are accessible. In fact, many aren’t. Creating an accessible PDF involves several key requirements:

A meaningful title,
Properly tagged elements (Headings, Paragraphs, Figures, Tables, etc.),
A set document language,
Alternative text for images,
Sufficient text contrast (at least 3:1 for large text, meaning 18 pt or larger, or 14 pt bold or larger, and at least 4.5:1 for regular text),
Keyboard usability,
PDF/UA (Universal Accessibility) compliance,
Embedded fonts.

These requirements align with general web accessibility principles, ensuring that users with disabilities can effectively use PDF documents, just as they do with websites.
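
The contrast thresholds in the list above can be checked in code before any color goes into a document. The following is a minimal sketch of the WCAG contrast-ratio formula; the function names (relativeLuminance, contrastRatio, meetsContrast) are my own and not part of PDFKit, and the sketch assumes six-digit hex colors such as '#1a1a1a'.

```javascript
// Relative luminance of an sRGB color given as '#rrggbb' (WCAG definition).
function relativeLuminance(hex) {
    const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255);
    const [r, g, b] = channels.map((c) =>
        c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4
    );
    return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, ranging from 1 (identical) to 21 (black on white).
function contrastRatio(colorA, colorB) {
    const a = relativeLuminance(colorA);
    const b = relativeLuminance(colorB);
    const lighter = Math.max(a, b);
    const darker = Math.min(a, b);
    return (lighter + 0.05) / (darker + 0.05);
}

// True if the pair meets the 4.5:1 threshold, or 3:1 when isLargeText is set.
function meetsContrast(foreground, background, isLargeText = false) {
    return contrastRatio(foreground, background) >= (isLargeText ? 3 : 4.5);
}
```

A call like meetsContrast('#767676', '#ffffff') could guard the color values you later pass to PDFKit. It only checks text color against a flat background, so text placed over images still needs a manual check.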

Fortunately, the PDF Accessibility Checker (PAC) is a helpful tool for verifying PDF accessibility. I used PAC throughout development and followed up with manual accessibility testing using a keyboard and screen reader.

Why PDFKit?

PDFKit was the only library I found that supports creating accessible PDFs programmatically, which made it an easy choice for my task.

However, PDFKit has its drawbacks:

It doesn’t accept HTML or Markdown files as templates, nor can it convert these formats to PDFs. All content must be written in JavaScript using PDFKit’s API.

The documentation, while extensive, focuses on specific use cases rather than providing a comprehensive API reference, which can lead to time-consuming research.

Most examples in the documentation aren’t accessible, except for those in the dedicated accessibility section.

PDFKit is primarily a Node.js package, relying heavily on Node.js functionality that isn’t available in the browser. This can be addressed with browserify, but it’s not straightforward with modern frameworks like Vite. I ended up using a precompiled file, pdfkit.standalone.js, along with another dependency, blob-stream, to convert the writable stream generated by PDFKit into a downloadable blob.

Implementing an Accessible Document with PDFKit

Below, I’ll walk you through the basics of programmatically creating an accessible document with PDFKit in the browser. You can experiment with the code in this CodePen.

Document Setup

The central element in PDFKit is the PDFDocument. To start, initialize the PDFDocument with the necessary options:

JavaScript

const doc = new PDFDocument({
    pdfVersion: '1.7',
    tagged: true,
    info: {
        Title: "Our accessible PDF document"
    },
    displayTitle: true,
    lang: 'en-US',
    subset: 'PDF/UA',
    size: 'A4'
});

Initialization Breakdown:

pdfVersion: Set to 1.7, the PDF version that the PDF/UA-1 standard builds on,
tagged: Enables the structure tree that tagged content requires,
info.Title: Sets the document title, which screen readers announce,
displayTitle: Asks the PDF viewer to show the document title instead of the file name in its title bar,
lang: Specifies the document’s language so that screen readers pick the right pronunciation,
subset: Declares that the document conforms to PDF/UA (the content still has to meet the standard),
size: Defines the paper size, which is a preference and not related to accessibility.
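
If you create more than one document, it may help to wrap these options in a small factory so that the title and language can never be forgotten. This is only a sketch: accessibleDocOptions is a hypothetical helper of mine, not a PDFKit function, and it uses only the option keys shown above.

```javascript
// Hypothetical helper: builds the PDFDocument options used in this article
// and refuses to continue without a title or a language.
function accessibleDocOptions({ title, lang, size = 'A4' }) {
    if (typeof title !== 'string' || title.trim() === '') {
        throw new Error('An accessible PDF needs a meaningful title.');
    }
    if (typeof lang !== 'string' || lang.trim() === '') {
        throw new Error('An accessible PDF needs a document language.');
    }
    return {
        pdfVersion: '1.7',
        tagged: true,
        info: { Title: title.trim() },
        displayTitle: true,
        lang: lang.trim(),
        subset: 'PDF/UA',
        size
    };
}
```

The result would be passed straight to the constructor, for example new PDFDocument(accessibleDocOptions({ title: 'Our accessible PDF document', lang: 'en-US' })).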

Streaming the Document

PDFKit generates a writable stream, but to download the final PDF, it needs to be converted into a blob.

JavaScript

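// Pipe the document into a blob stream before writing any content
const stream = doc.pipe(blobStream());

// ... add all content to the document here ...

// Finalize the document; the stream finishes once everything is flushed
doc.end();

stream.on('finish', () => {
    const blob = stream.toBlob('application/pdf');
    const url = window.URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.href = url;
    a.download = 'accessible PDF.pdf';
    document.body.appendChild(a);
    a.click();
    // Clean up the temporary link and release the object URL
    a.remove();
    window.URL.revokeObjectURL(url);
});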

The blob-stream package allows piping the document into the stream. Once the document is finished with doc.end(), the stream is completed, and the blob can be accessed and downloaded.
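
If the rest of your code is written with async/await, the finish event can be wrapped in a promise. The sketch below assumes only what the snippet above already uses: a stream object with an on method and a toBlob method. The name streamToBlob is my own.

```javascript
// Resolves with the finished blob once the stream emits 'finish',
// and rejects if the stream emits 'error'.
function streamToBlob(stream, mimeType = 'application/pdf') {
    return new Promise((resolve, reject) => {
        stream.on('finish', () => resolve(stream.toBlob(mimeType)));
        stream.on('error', reject);
    });
}
```

In an async function you would create the promise right after doc.pipe(blobStream()), write your content, call doc.end(), and then await the promise to get the blob.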

Creating Tagged Structure

Each tagged PDF has a root Document element that contains all other content. Let’s create that structure and add it to the document:

JavaScript

const myDocument = doc.struct('Document');
doc.addStructure(myDocument);

We’ll create two sections in the document: one with a primary heading and text, the other with a secondary heading and an image. Here’s how to create the first section:

JavaScript

// 1. Creating a section and adding it to myDocument
const myTitleSection = doc.struct('Sect');
myDocument.add(myTitleSection);

// 2. Creating the structure for a heading and adding it to the section
const myTitle = doc.struct('H1');
myTitleSection.add(myTitle);

// 3. Creating the content for the heading and adding it to the heading
const myTitleContent = doc.markStructureContent('H1');
myTitle.add(myTitleContent);

// 4. Writing the heading itself with a given font and fontSize
doc.font('Helvetica').fontSize(24).text('This will be the h1 heading');

// 5. Ending the heading
myTitle.end();

// 6. Moving the cursor down the page
doc.moveDown(1); // moving down one line.

Explanation:

1. Create a new section structure (Sect) and add it to myDocument.
2. Create the structure for a level-one heading (H1) and add it to the section.
3. Mark the heading’s content with doc.markStructureContent('H1') and add it to the heading structure.
4. Write the text for the heading, chaining functions to set the font and size.
5. End the heading, finalizing its structure.
6. Add spacing after the heading by moving the cursor down.

This process can be repetitive, so consider writing helper functions like addParagraph or addHeading.
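
As a starting point, such helpers could look like the sketch below. The names addTextBlock, addHeading and addParagraph, the option names and the default values are my own assumptions. The PDFKit calls are exactly the ones used in the walkthrough above, and 'P' is the standard structure tag for a paragraph.

```javascript
// Creates a tagged text element under `parent`, writes the text, and closes the element.
function addTextBlock(doc, parent, tag, text, { font = 'Helvetica', fontSize = 12, gap = 1 } = {}) {
    const element = doc.struct(tag);              // structure element, e.g. H1 or P
    parent.add(element);                          // attach it to its parent (Sect, Document, ...)
    element.add(doc.markStructureContent(tag));   // mark the content that follows as this element
    doc.font(font).fontSize(fontSize).text(text); // write the visible text
    element.end();                                // finalize the element
    doc.moveDown(gap);                            // spacing below the block
    return element;
}

// Heading of the given level (1 produces H1, 2 produces H2, ...).
function addHeading(doc, parent, level, text, options = {}) {
    return addTextBlock(doc, parent, `H${level}`, text, { fontSize: 24, ...options });
}

// Paragraph tagged as P.
function addParagraph(doc, parent, text, options = {}) {
    return addTextBlock(doc, parent, 'P', text, options);
}
```

With these in place, the first section shrinks to addHeading(doc, myTitleSection, 1, 'This will be the h1 heading') followed by addParagraph(doc, myTitleSection, 'Some text').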

Embedding Fonts

So far, we’ve used “PDF standard fonts,” which are supported by PDFKit but not embedded in the PDF document. For accessibility, and for PDF/UA compliance in particular, fonts must be embedded so that the text renders reliably on any device. To embed custom fonts, fetch them, convert each response into an ArrayBuffer, and register the result with PDFKit.

JavaScript

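async function fetchSrc(src) {
    const res = await fetch(src);
    // Fail loudly: a null result would only surface later as a less helpful error
    if (!res.ok) {
        throw new Error(`Failed to fetch ${src}: ${res.status}`);
    }
    return res.arrayBuffer();
}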

const fontRegular = await fetchSrc('/OpenSans-Regular.ttf');
const fontBold = await fetchSrc('/OpenSans-Bold.ttf');

doc.registerFont('regular', fontRegular);
doc.registerFont('bold', fontBold);

Use the registered font in your document by calling doc.font('bold').

Adding Images to the Document

To add a secondary heading and an image:

JavaScript

const myImageSection = doc.struct('Sect');
myDocument.add(myImageSection);

const myHeading = doc.struct('H2');
myImageSection.add(myHeading);

const myHeadingContent = doc.markStructureContent('H2');
myHeading.add(myHeadingContent);

doc.font('bold').fontSize(24).text('This will be the h2 heading');
myHeading.end();

doc.moveDown(1); // moving down one line.

const image = await fetchSrc('path/to/image');
myImageSection.add(
    doc.struct(
        'Figure',
        {
            alt: 'this is the alt text',
            actual: 'required to create a bounding box for the figure structure'
        },
        () => {
            doc.image(image, 100, 200, { width: 200 });
        }
    )
);
myImageSection.end();

Here, we fetched the image, added it to the section, and provided necessary alt text for accessibility. Note that you may need to handle x- and y-coordinates manually for precise placement.

Problems with PDFKit

Positioning: The Virtual Cursor

While PDFKit handles basic text positioning, more complex layouts (like images) require manual management of x- and y-coordinates. PDFKit offers functions for calculating positions, but it’s essential to understand how the virtual cursor works.
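
To make the cursor handling concrete, here is a sketch that places an image at the current cursor position, centers it between the page margins, and then moves the cursor below it. The helper name placeImageAtCursor and its parameters are my own. It assumes you already know the image’s natural pixel dimensions, and it reads doc.page.width, doc.page.margins and doc.y from PDFKit.

```javascript
// Places `image` at the current cursor, horizontally centered, scaled to `targetWidth`.
// Returns the box that was used so that callers can position further content.
function placeImageAtCursor(doc, image, naturalWidth, naturalHeight, targetWidth, gap = 12) {
    const height = naturalHeight * (targetWidth / naturalWidth); // keep the aspect ratio
    const { left, right } = doc.page.margins;
    const usableWidth = doc.page.width - left - right;
    const x = left + (usableWidth - targetWidth) / 2;
    const y = doc.y;

    doc.image(image, x, y, { width: targetWidth });

    // Set the cursor explicitly instead of relying on how image() moves it.
    doc.y = y + height + gap;
    return { x, y, width: targetWidth, height };
}
```

Inside the Figure structure from the previous section, the call to doc.image(image, 100, 200, { width: 200 }) could then be swapped for placeImageAtCursor(doc, image, 400, 300, 200), so that the image follows the heading instead of sitting at fixed coordinates.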

Page Breaks and Accessibility

Page breaks can break accessibility in PDFKit. When content exceeds the page length, PDFKit creates a new page but doesn’t extend the tagging structure, resulting in broken accessibility. To avoid this, check if the content fits on the current page before inserting it:

JavaScript

function willFitOnCurrentPage(doc, heading, text) {
    const currentPosition = doc.y;
    const width = doc.page.width - 144; // usable width, assuming PDFKit's default 72pt margins on both sides
    doc.fontSize(24).font('bold');