PDF Accessibility
Creating HTML Alternatives to PDF Files

Create an HTML File From the Original Document

When converting a PDF file to HTML, it is always best to go back to the source. It will be easier to convert a Microsoft Word or Adobe InDesign document to an accessible HTML file than it is to do the same with an Adobe PDF.

In Microsoft Office

Important

When converting from Microsoft Office, only well-marked-up Office documents will produce well-marked-up HTML or PDF files! Garbage in, garbage out.

In Microsoft Office products, you must use real headings (not just large, bold fonts), bulleted lists, numbered lists, and other structural elements in the original document. If you don't, the correct tags will not be created when the document is converted to HTML or PDF. For many people, this means learning how to use the structural elements within Word, because many of us ignore the "style" options in word processors and pay attention only to the visual output. This has to change in order to make the content accessible and usable in screen readers.
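To illustrate the difference between visual formatting and real structure, here is a minimal Python sketch that counts heading and list elements in a piece of HTML. The two sample strings and the names `StructureCounter` and `structure_summary` are hypothetical, made up for this illustration. They are not the actual output of Word's converter, which will look different.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that carry document structure a screen reader can navigate by.
STRUCTURAL_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "li"}


class StructureCounter(HTMLParser):
    """Count heading and list elements in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in STRUCTURAL_TAGS:
            self.counts[tag] += 1


def structure_summary(html):
    """Return a Counter of the structural tags found in the given HTML."""
    parser = StructureCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical samples: the first only looks like a heading and a list,
# the second uses real heading and list elements.
visual_only = '<p><b><font size="5">Budget</font></b></p><p>- Item one</p>'
structured = "<h1>Budget</h1><ul><li>Item one</li></ul>"

print(dict(structure_summary(visual_only)))  # prints {}
print(dict(structure_summary(structured)))   # prints {'h1': 1, 'ul': 1, 'li': 1}
```

Both samples could look alike on screen, but only the second gives a screen reader headings and list items to announce and navigate by.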

Important

You must add alternative text in the Office file in order for it to be carried over into the HTML or PDF file.

To do this, right-click the image and select Format Picture.

screenshot of the menu that appears from right-clicking an image in Word

A dialog box will appear. Select the Web tab and add the appropriate alt text.

screenshot of adding alt text

Once you have correctly marked up the Microsoft Word document, choose File > Save as Web Page > Save as type > Web Page, Filtered. It will still be necessary to check this final HTML document for accessibility, especially if the original Word document contained data tables.
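Part of that final check can be automated. The sketch below uses Python's standard `html.parser` module to flag two common problems: images with no `alt` attribute and tables with no `th` header cells. The class name, function name, and sample markup are hypothetical, and the check is deliberately narrow. It cannot judge whether alt text is meaningful, and it will also flag layout tables, so a manual review is still needed.

```python
from html.parser import HTMLParser


class AccessibilityChecker(HTMLParser):
    """Flag images without alt attributes and tables without header cells."""

    def __init__(self):
        super().__init__()
        self.problems = []
        # One flag per open table: has a <th> been seen inside it yet?
        self._table_stack = []

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            # An empty alt="" is allowed here; it marks a decorative image.
            self.problems.append("img without alt attribute")
        elif tag == "table":
            self._table_stack.append(False)
        elif tag == "th" and self._table_stack:
            self._table_stack[-1] = True

    def handle_endtag(self, tag):
        if tag == "table" and self._table_stack:
            if not self._table_stack.pop():
                self.problems.append("table without th header cells")


def check_html(html):
    """Return a list of problem descriptions found in the given HTML."""
    checker = AccessibilityChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


# Hypothetical sample: one image and one table are missing accessibility markup.
sample = (
    '<img src="chart.png">'
    '<img src="logo.png" alt="Company logo">'
    "<table><tr><td>Year</td><td>Total</td></tr></table>"
    "<table><tr><th>Year</th><th>Total</th></tr></table>"
)

for problem in check_html(sample):
    print(problem)
```

To check a saved file instead of the sample string, read it into a string first and pass that string to `check_html`.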

Convert a PDF File to HTML

Sometimes the original file used to create the PDF is unavailable. In that case you can create an HTML file using Acrobat, but the file will probably be more complex and will require more work to make it accessible.

To create an HTML file in Adobe Acrobat, choose File > Save As > Save as Type and then select HTML 4.01 if you have Acrobat 6 or 7, or HTML 3.2 if you are using Acrobat 5. The HTML produced by this conversion is so poor that you may spend as much time cleaning up the file as you would creating it from scratch. If the PDF contains images, only their alternate descriptions are saved, not the images themselves, and the HTML file will contain no tables, even if a table is an appropriately tagged data table in the original PDF file.

WebAIM is an initiative of:
Center for Persons with Disabilities (CPD) Utah State University