Today is a good day to code

HTML 5 Simplification: Fixing HTML by Douglas Crockford Response

Posted: December 31st, 1969 | Author: | Filed under: java, Programming, Uncategorized | Tags: | No Comments »

In almost every way I agree with Douglas Crockford's Fixing HTML. I think that in many ways XHTML has started to take us down the wrong path. CSS was great, but it is rapidly becoming a mess, largely unusable for anyone except experts. HTML is the same way. This becomes evident quickly when you start trying to write a browser.

I really love the idea that we can abolish doctypes. They are arguably the most misunderstood part of HTML, and quite probably of XML for that matter. Making the browser responsible for determining whether the markup matches a doctype, after it has already had to fix unclosed tags and improper syntax, is part of what makes browsers so complex, error-prone, and vulnerable to attack.

The best part of giving the html tag a version number in Douglas' proposal is that browsers no longer have to figure out which version of HTML you are coding for by looking at the doctype and your markup. If you tell the browser your page is html version="5" and it doesn't conform, it shouldn't render. Browsers would need more robust built-in error checking and more descriptive error messages, but for the most part those should be in there anyway.
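As a rough illustration of that rule, here is a minimal JavaScript sketch of the first gate a renderer might apply. The names `SUPPORTED_VERSIONS`, `declaredVersion`, and `checkPage` are made up for this example, and a real browser would do this in its parser rather than with a regular expression. It only checks the declaration; verifying that the markup actually conforms to the declared version is the harder part and is not shown.

```javascript
// Hypothetical: the versions this imaginary browser knows how to render.
const SUPPORTED_VERSIONS = new Set(["5"]);

// Read the version attribute from the opening <html> tag, or return null.
function declaredVersion(markup) {
  const match = /<html\b[^>]*\bversion\s*=\s*"([^"]*)"/i.exec(markup);
  return match ? match[1] : null;
}

// Render only when the page declares a version we support;
// otherwise report a descriptive error instead of guessing.
function checkPage(markup) {
  const version = declaredVersion(markup);
  if (version === null) {
    return { render: false, error: "No version attribute on <html>" };
  }
  if (!SUPPORTED_VERSIONS.has(version)) {
    return { render: false, error: `Unsupported HTML version "${version}"` };
  }
  return { render: true, error: null };
}
```

The point of the design is that the browser never has to sniff: a missing or unknown version produces an error message rather than a best-effort rendering.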

Allowing only one scripting language on a page is just common sense. I suppose there could be a justifiable business case for using more than one, but in almost all day-to-day use it is not practical and should not be supported.

JavaScript should never execute before the page has finished rendering. I can partially agree with processing scripts included in the head once the /head is encountered, but I don't think any scripts should be executed until everything has rendered. I know that this may limit the flexibility of designing pages where the JavaScript creates elements without explicitly calling an “init” method at the bottom of the page, but that is really, IMHO, the proper way to kick off your scripts.
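Something close to this can be sketched with the standard `DOMContentLoaded` event and `document.readyState`. Note that `DOMContentLoaded` fires once the markup has been parsed, which is not quite the same as "finished rendering", so treat this as an approximation. The helper name `runWhenReady` and the `init` body are assumptions for the example; the document is passed in as a parameter only so the sketch can be tried outside a browser.

```javascript
// Run `init` once the document has been parsed.
// `doc` is the page's document object.
function runWhenReady(doc, init) {
  if (doc.readyState === "loading") {
    // Still parsing: wait for the DOMContentLoaded event.
    doc.addEventListener("DOMContentLoaded", init, { once: true });
  } else {
    // Parsing has already finished ("interactive" or "complete").
    init();
  }
}

// Hypothetical entry point for the page's scripts.
function init() {
  console.log("page parsed, starting scripts");
}

// In a browser: runWhenReady(document, init);
```

With a helper like this, every script on the page starts from one predictable place, wherever its script element happens to sit.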

There should definitely be no more document.write, and no more javascript: URLs. Those were always just terrible, and they allow for easy scripting attacks.

Largely, the changes that Douglas Crockford proposes for the script elements in HTML will result in faster, more scalable JavaScript. If the browser always processes scripts in the same way, and doesn't have to worry about processing script elements at the same time as it is rendering the DOM, the execution of the script will be more predictable and the page more robust.

Clearly frames, framesets, and iframes have to go. If they are removed, though, it would be nice either for the browser to treat XHRs and scripted DOM changes as though they were page refreshes, so that the back button keeps working, or for the scripting language to get a safe API for handling the back button press. The proposed module tag could easily replace the functionality lost by removing the frame metaphor.
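The History API (`history.pushState` and the `popstate` event) is one shape such a back-button API can take. The sketch below is only an illustration: `createNavigator`, `render`, and the `view` state field are hypothetical names, and the window object is passed in as `win` so the code can be exercised without a browser.

```javascript
// Record scripted view changes so the back button can return to them.
// `win` is the browser's window object; `render` is a hypothetical
// function that redraws the page for a given view name.
function createNavigator(win, render) {
  // When the user presses back or forward, redraw the view stored in that entry.
  win.addEventListener("popstate", (event) => {
    if (event.state && event.state.view) {
      render(event.state.view);
    }
  });

  return {
    // Call this instead of silently changing the DOM after an XHR.
    show(view) {
      win.history.pushState({ view }, "", "#" + view);
      render(view);
    },
  };
}

// In a browser: const nav = createNavigator(window, drawView); nav.show("inbox");
```

Each call to `show` adds a history entry, so a scripted change behaves like a page load as far as the back button is concerned.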

I totally agree with him on the need for CSS content to be standardized, but I think that by allowing scripts to grab document elements by their CSS selectors, as well as improving encoding, the application of CSS can be cleaned up fairly nicely. CSS 3 and its support for namespaces could simplify the application of CSS as well, but the trick is to get all of the major browsers to support it.

I don't agree about empty tags; I think that they should be required to self-close. When writing a browser, that would allow standard DOM parsers to process the markup directly, instead of having to go through the document, close the tags, and then feed it to the DOM parser. Doing this in Java, even with JTidy, as fast as it is, still costs too much compared with simply requiring the tags to self-close.
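To make the difference concrete, here is a deliberately tiny well-formedness check in JavaScript. It is a sketch of the kind of test a strict parser applies, not a real parser: the function name `isWellFormed` is made up, and it ignores comments, CDATA, doctypes, and ">" characters inside attribute values.

```javascript
// Return true if every opened tag is closed, in the right order.
function isWellFormed(markup) {
  const tagPattern = /<(\/?)([A-Za-z][\w-]*)[^>]*?(\/?)>/g;
  const open = [];
  let match;
  while ((match = tagPattern.exec(markup)) !== null) {
    const [, closing, name, selfClosing] = match;
    if (selfClosing) continue;   // <br/> opens and closes itself
    if (!closing) {
      open.push(name);           // <p> waits for its </p>
    } else if (open.pop() !== name) {
      return false;              // </x> does not match the innermost open tag
    }
  }
  return open.length === 0;      // anything still open is an error
}

console.log(isWellFormed("<p>line one<br/>line two</p>")); // true
console.log(isWellFormed("<p>line one<br>line two</p>"));  // false
```

With a self-closed `<br/>` the markup passes as-is; with a bare `<br>` something has to repair the document before a strict parser will accept it, and that repair pass is the cost I am complaining about.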

Custom tags and custom attributes are a must. CSS is robust enough now to fully control how an item displays, whether it is a block element, inline, and so on. I worry a little about overuse of these things, and about how to ensure they do not interfere with microformats and the like, but in general I think that allowing this flexibility, as long as the custom tags and attributes meet the requirements of HTML, should be OK.

Originally I was really in the XHTML 2.0 camp, but I can see the need for more robustness in the basic markup tags than XHTML can allow. HTML blows up the encapsulation of data, logic, and presentation in many ways, but it needs to remain, and to become, an easy way for relatively inexpert individuals to create rich pages. It is currently far too complicated, and the proposed methods of handling mixed HTML 5 / 4 / XHTML pages sound scary and prone to exploits. I hope this little treatise doesn't fall on deaf ears. Are you listening, what3g?