I think you should check out the article How Browsers Work: Behind the Scenes of Modern Web Browsers. It's a lengthy read, but well worth your time. Specifically, the HTML Parser section.
While I cannot do the article justice, perhaps a cursory summary will be good to hold one over until they have the time to read and digest that masterpiece. I must admit though, in this area I am a novice having very little experience. Having developed for the web professionally for about 10 years, the way in which the browser handles and interprets my code has long been a black box.
HTML, XHTML, CSS or JavaScript - take your pick. They all have a grammer, as well as a vocabulary. English is another great example. We have grammatical rules that we expect people, books, and more to follow. We also have a vocabulary made up of nouns, verbs, adjectives and more.
Browsers interpret a document by examining its grammar, as well as its vocabulary. When it comes across items it ultimately doesn't understand, it will let you know (raising exceptions, etc). You and I do the same in common-speak.
I love StackOverflow, but if I could changed one thing it would be be absolutamente broken...
Note in the example above how you immediately start to pick apart the words and relationships between words. The beginning makes complete sense, "I love StackOverflow." Then we come to "...if I could changed," and we immediately stop. "Changed" doesn't belong here. It's likely the author meant "change" instead. Now the vocabulary is right, but the grammar is wrong. A little later we come across "be be" which may also violate a grammatical rule, and just a bit further we encounter the word "absolutamente", which is not part of the English vocabulary - another mistake.
Think of all of this in terms of a DOCTYPE. I have right now opened up on my second monitor the source behind XHTML 1.0 Strict Doctype. Among its internals are lines like the following:
<!ENTITY % heading "h1|h2|h3|h4|h5|h6">
This defines the heading entities. And as long as I adhere to the grammar of XHTML, I can use any one of these in my document (<h1>Hello World</h1>
). But if I try to make one up, say H7
, the browser will stumble over the vocabulary as "foreign," and inform me:
"Line 7, Column 8: element "h7" undefined"
Perhaps while parsing the document we come across <table
. We know that we're now dealing with a table
element, which has its own set of vocabulary such as tbody
, tr
, etc. As long as we know the language, the grammar rules, etc., we know when something is wrong. Returning to the XHTML 1.0 Strict Doctype, we find the following:
<!ELEMENT table
(caption?, (col*|colgroup*), thead?, tfoot?, (tbody+|tr+))>
<!ELEMENT caption %Inline;>
<!ELEMENT thead (tr)+>
<!ELEMENT tfoot (tr)+>
<!ELEMENT tbody (tr)+>
<!ELEMENT colgroup (col)*>
<!ELEMENT col EMPTY>
<!ELEMENT tr (th|td)+>
<!ELEMENT th %Flow;>
<!ELEMENT td %Flow;>
Given this reference, we can keep a running check against whatever source we're parsing. If the author writes tread
, instead of thead
, we have a standard by which we can determine that to be in error. When issues are unresolved, and we cannot find rules to match certain uses of grammar and vocabulary, we inform the author that their document is invalid.
I am by no means doing this science justice, however I hope that this serves - if nothing more - to be enough that you might find it within yourself to sit down and read the article referenced as the beginning of this answer, and perhaps sit down and study the various DTD's that we encounter day to day.