Chris Phillips over at Curb Cut Learning has a rant, which I agree with, that HTML is better than PDF for accessibility.
In fact, I go further than he does: HTML is already good enough for most people. Maybe professional printing houses still need PDF or Microsoft Word, but John Q. Public, who merely wants to print a flyer for his lost dog, can do that just fine in HTML with a little CSS on the side.
I remember when I was a temp at Microsoft and some W3C guys came by to give a presentation about web standards and web accessibility, which mostly fell on deaf ears. It was telling to me that my boss converted their presentation, which was in nice, simple HTML, into PowerPoint. My boss’s fear was legitimate. He was afraid other staff in the company wouldn’t read the W3C presentation if it remained in HTML on some file server someplace. People stick with what they know.
That incident stuck with me, though. I realized that most of my job as the webmaster for the Microsoft Accessibility site was simply converting Microsoft Word files into something that could reasonably be called HTML 3.2. To this day, as webmaster for several customers in my own business, most of my work consists of converting Acrobat and Word files into XHTML 1.0 Strict. This is work that shouldn’t exist.
The problem is that semantic purity requires thought on the part of an author. Most people don’t want to think about whether a piece of text is a book title, a programming variable, an author’s contact information or emphasis and intonation. They just italicize it, at least in anglophone countries, and let the world figure it out.
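To make the distinction concrete, here is a rough sketch of the four cases above in markup. The book title, variable name, person, and email address are all made up for illustration. In most browsers' default styles each of these elements is rendered in italics, so the page looks the same as if everything had been italicized by hand, but a screen reader or search engine can tell the cases apart.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Four kinds of italics</title>
</head>
<body>
  <!-- A book title: cite marks the title of a work -->
  <p>I just finished <cite>Moby-Dick</cite>.</p>

  <!-- A programming variable (hypothetical name) -->
  <p>Set <var>maxWidth</var> before drawing the page.</p>

  <!-- Emphasis and intonation -->
  <p>I <em>really</em> mean it.</p>

  <!-- Contact information for the page's author (hypothetical person) -->
  <address>
    Written by Jane Doe,
    <a href="mailto:jane@example.com">jane@example.com</a>
  </address>
</body>
</html>
```

Nothing here is harder to type than a generic italic tag. The hard part is stopping to decide which of the four you meant.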
Somehow we have to make authoring tools that handle this semantic stuff automatically for people who have better things to do. Otherwise Berners-Lee’s dream isn’t going to manifest.
Even so, HTML is already good enough to supplant most proprietary document formats for most people for most purposes.