Eric Vasilik: Code Karma

Sunday, July 02, 2006

Code Karma

I've recently been writing client-side JavaScript for an HTML user interface I've been building at work, and I ran into an issue with Internet Explorer which I was at least partially responsible for 10 years ago! Let me explain what it is, why it exists, and an effective way to work around it.

I was attempting to dynamically replace a number of rows in a table with a different set of rows. I was using the innerHTML property of an element, which takes a string, parses it as HTML and replaces the contents of that element with the new HTML. In this case, I tried to replace the contents of a TBODY with a new set of rows:

  <table>
    <tbody id="rows">
      <tr>....
      <tr>....
      ...
    </tbody>
  </table>

  var tb = document.getElementsByName('rows')[0];
  tb.innerHTML = "<tr>......";

This works just fine in Firefox. The new rows replace the old and the display updates. However, in Internet Explorer, one gets a script error stating that there was a runtime error!

At first, I thought that IE was not capable of performing the redraw for modified tables with innerHTML, but then I remembered that I was responsible for this limitation! How many developers get to deal with the consequences of their decisions about products at a later date? Probably not many. Let me describe how this came to be.

About 10 years ago, I was part of the Trident team. Trident, also known as mshtml.dll, was responsible for implementing the parsing, rendering and object model for the next version of IE after 3.0. I was a developer responsible for the in-memory representation of the HTML and the dynamic manipulation of that HTML.

One of the things I did during this time was invent the innerHTML property, along with innerText, outerHTML, outerText and the lesser known insertAdjacentHTML and insertAdjacentText methods. These took HTML or raw text and replaced or inserted that new content into the document.

Now, Microsoft documents setting innerHTML as not applicable to table elements. Why? They don't say. However, I remembered why.

When one sets the innerHTML property of an element, the string containing the HTML is run through the parser. Now, HTML parsers are not simple, straightforward parsers like XML parsers. The HTML parser (implemented brilliantly by David Bau) takes arbitrary text and, usually, produces an HTML tree of elements. For example, parsing a file containing only "Foo" will result in the tree:

  <HTML><HEAD></HEAD>
  <BODY>Foo</BODY></HTML>

You can see this for yourself by running the following through IE (Firefox won't work, as they did not implement outerHTML):

Foo<script>alert(document.body.parentNode.outerHTML)</script>

Now, parsing something like "<tr><td>Foo" where there is no TABLE tag preceding the TR causes the parser to ignore the TR tag altogether. This was probably done by the IE parser for backwards compatibility with the Netscape browser of the time. In fact, much of the complexity of the parser is influenced by backwards compatibility.

So, attempting to set the innerHTML of a TBODY with "<tr><td>Foo" would result in setting the contents of the TBODY to just "Foo". This is not terribly "valid" or displayable HTML. In order to get that TR created, you need to precede it with a TABLE tag. However, attempting to set the contents of the TBODY with "<table><tr>..." makes even less sense because injecting a TABLE directly in a TBODY is also meaningless.
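
The behavior described above can be modeled with a toy sketch. This is not the actual IE parser, only an illustration under simplifying assumptions: the function name dropOrphanTableTags and the list of "table-only" tags are made up for the example.

```javascript
// Toy model only: real HTML parsers are far more involved.
// Tags treated as meaningless unless a TABLE is open (assumed list, for illustration).
const TABLE_ONLY_TAGS = new Set(["tbody", "thead", "tfoot", "tr", "td", "th"]);

// Hypothetical helper: returns the markup with table-only tags removed
// whenever they occur while no <table> is open.
function dropOrphanTableTags(html) {
  let openTables = 0;
  return html.replace(/<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g, (tag, slash, name) => {
    const lower = name.toLowerCase();
    if (lower === "table") {
      // Track how many tables are currently open.
      openTables = slash ? Math.max(0, openTables - 1) : openTables + 1;
      return tag;
    }
    if (TABLE_ONLY_TAGS.has(lower) && openTables === 0) {
      return ""; // the tag is ignored, only its text content survives
    }
    return tag;
  });
}

console.log(dropOrphanTableTags("<tr><td>Foo"));        // Foo
console.log(dropOrphanTableTags("<table><tr><td>Foo")); // <table><tr><td>Foo
```

With no TABLE in sight the row and cell tags vanish and only the text is left, which is the result a TBODY would end up with.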

What this all calls for is what I used to call "Contextual HTML Parsing". This is a mode of parsing in which a branch of an existing HTML tree "seeds" the parser with a context, and the parser then interprets the string to be parsed within that context. Thus, if the branch of tags were (from the bottom) TBODY, TABLE, BODY, HTML and one were to parse "<tr>...", the "Contextual HTML Parser" would know that creating a TR was okay because its immediate parent would implicitly be a TBODY, a valid container for a TR.
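
A minimal sketch of the idea, assuming a tiny hand-written table of valid parents (the names ALLOWED_PARENTS and acceptsTag are hypothetical, and a real parser would need far more rules than this):

```javascript
// Toy model of "contextual" parsing: the parser is seeded with the chain of
// ancestor tags, so it can decide whether a new tag is acceptable there.
// The rules below are a small assumed subset, for illustration only.
const ALLOWED_PARENTS = new Map([
  ["tr", ["tbody", "thead", "tfoot"]],
  ["td", ["tr"]],
  ["th", ["tr"]],
  ["tbody", ["table"]],
]);

// context: ancestor tag names from the innermost outward,
// e.g. ["tbody", "table", "body", "html"].
function acceptsTag(context, tagName) {
  const parents = ALLOWED_PARENTS.get(tagName.toLowerCase());
  if (parents === undefined) return true; // no rule in this sketch: accept anywhere
  const innermost = context.length > 0 ? context[0].toLowerCase() : "";
  return parents.includes(innermost);
}

console.log(acceptsTag(["tbody", "table", "body", "html"], "tr")); // true
console.log(acceptsTag(["body", "html"], "tr"));                   // false
```

Seeded with the TBODY branch, the TR is accepted. Without that context, the same string gives the parser no reason to keep it.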

Nifty concept, this contextual parsing. The problem was that we never had enough time to implement such a feature. And, in order to deal with attempts to modify tables in such a manner, I prohibited the modification of tables with innerHTML and other methods.

An alternative to all this would have been to "hack" something up. For example, I could have checked to see if the innerHTML of a TBODY was being set to something which began with a "<tr>". Under these circumstances I could have prepended a "<table>" to the string, and then plucked the TR's out of the resulting tree and replaced the contents of the TBODY with them.

Sounds simple enough until you have to consider all the variations. Like, what if the string to be parsed looks like "<tr>...". Pretty soon you start doing all the work the real parser has to do.
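
To make the point concrete, here is a sketch of the literal check described above. The function name naiveWrapRows and the particular list of variations are assumptions chosen for illustration, not cases taken from the IE code.

```javascript
// Sketch of the naive check: wrap the string in a TABLE only when it
// literally begins with "<tr>". The function name is hypothetical.
function naiveWrapRows(html) {
  return html.startsWith("<tr>") ? "<table>" + html : html;
}

// Inputs a real parser would treat alike, but which defeat the literal check.
const variations = [
  "<tr><td>Foo",              // handled
  "<TR><td>Foo",              // upper case tag
  "  <tr><td>Foo",            // leading whitespace
  "<tr class='x'><td>Foo",    // attributes on the tag
  "<!-- note --><tr><td>Foo", // leading comment
];

for (const v of variations) {
  const outcome = naiveWrapRows(v) !== v ? "wrapped" : "missed";
  console.log(JSON.stringify(v), "->", outcome);
}
```

Only the first input is wrapped. Handling the rest means recognizing case, whitespace, attributes and comments, which is the parser's job.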

So instead of hacking up something very incomplete and possibly erroneous in many cases, I left the modification of tables with innerHTML out of the product. It would have been fun to modify the parser to deal with non-textual context!

I wonder how Firefox implemented this. Perhaps I'll find the time to look at the code sometime....

The workaround for this is actually not all that bad. What I did was to insert a SPAN tag into my original page with the visibility style set to hidden. When I wanted to replace the rows, I would set the innerHTML of this span with something like "<table><tbody><tr>...". Because the span is not visible, this does not cause the page to redraw. Then, I would use the DOM method replaceChild of elements to remove the old TBODY and replace it with the newly parsed TBODY. This resulted in the table changing and being redrawn correctly!

  <table>
    <tbody id="tb">
      <tr>....
      <tr>....
      ...
    </tbody>
  </table>
  <span id=temp style='visibility:hidden'></span>

  var temp = document.getElementsByName('temp')[0];
  temp.innerHTML = '<table><tbody><tr><td>New Row';
  var tb = document.getElementsByName('tb')[0];
  tb.parentNode.replaceChild(temp.firstChild.firstChild, tb);
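
The same steps can be wrapped in a reusable function. This is a sketch rather than the code from the page: the function name and parameters are illustrative, it looks elements up with getElementById because the markup above uses id attributes rather than name attributes, and it copies the id onto the new TBODY so that the call can be repeated.

```javascript
// Sketch of the workaround as a function. `doc` is the page's document object;
// the function name and parameter names are hypothetical.
function replaceTableBody(doc, scratchId, tbodyId, rowsHtml) {
  const scratch = doc.getElementById(scratchId); // the hidden SPAN
  const oldBody = doc.getElementById(tbodyId);   // the TBODY to replace

  // Parse the rows inside a complete TABLE so the parser keeps the TR tags.
  // Carrying the id over lets the same call find the TBODY again next time.
  scratch.innerHTML =
    '<table><tbody id="' + tbodyId + '">' + rowsHtml + "</tbody></table>";

  const newBody = scratch.firstChild.firstChild; // SPAN > TABLE > TBODY
  oldBody.parentNode.replaceChild(newBody, oldBody);
  return newBody;
}

// In a page with the markup above:
//   replaceTableBody(document, "temp", "tb", "<tr><td>New Row</td></tr>");
```

Passing the document in as a parameter keeps the function free of globals, which also makes it easy to exercise with a stand-in object outside the browser.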

You can see this in action here.

# posted by Eric Vasilik @ 9:11 PM

Comments:

Thanks for a very helpful and insightful piece of writing. Ran into the exact same problem earlier today, and your little workaround totally saved my afternoon.

But shame on you for not implementing innerHTML properly 10 years ago. :-)

# posted by Jens Christian Mikkelsen : 11:51 AM

yuck..

Thanks for the post. It clears up why I am having trouble. A couple of things though which aren't clear.

1. why, when creating the temp table does the html stop? shouldn't (angles replaced with squares)

var newRows = '[table][tbody][tr][td]New Row';

really be

var newRows = '[table][tbody][tr][td]New Row[/td][/tr][/tbody][/table]';

?

2. also, I am trying to replace a single row with two rows (an "expand this row out to see the detail" function) how would this fit?

# posted by metamind : 2:27 AM

The inventor of innerHTML ! *respect*

This post saved me a lot of trouble as well, really appreciate this.

Now I can get on with easily replacing whole chunks of a table using Ajax, and not break IE...

Thanks!

# posted by Peter Thomas : 6:51 AM

metamind,

The HTML I use does not include the end tags because HTML parsers do not require them and implicitly close those tags when the 'EOF' is reached.

If you want to replace single rows, simply wrap them in their own TBODYs. Or, you should be able to change the code slightly to replace the TR directly, instead of using the TBODY as a container.

# posted by Eric Vasilik : 9:48 AM

Thanks. This is very helpful! I just can't seem to get it to replace a [tr] element (like you suggested should be possible). This should be an easy DOM traversal, and I'm probably doing something stupid, but I can't get a reference to the [tr] element. I either just get all the cells in the row, or all the rows in the table: This is an example to show my problem:

[html]
[body]

[table]
[tr id="abc"][td]first row[/td][/tr]
[tr][td]2nd row[/td][/tr]
[/table]

[script type="text/javascript"]
var aNode = document.getElementById("abc")
alert(aNode.innerHTML); // does not contain the [tr] tag
alert(aNode.parentNode.innerHTML); // ok has a few [tr] tags, but looks like text
alert(aNode.parentNode.firstChild.innerHTML); // back to [td] tag like the first one
[/script]

[/body]
[/html]

# posted by Gerrat Rickert : 11:51 AM

Sorry, nevermind.
Figured out what I needed to do.

# posted by Gerrat Rickert : 12:32 PM

Great help. When I told my manager about the problem I was having, and that I had found the guy who caused it, he laughed. This has saved me hours of trouble.

Thanks,
Chris

# posted by Chris : 12:31 PM

Do an alert on the innerHTML of the modified object... it doesn't exist. How would you attach an event to the replaced data? This workaround seems a little thin.

# posted by Anonymous : 2:05 PM

Thank you very much, I really needed this workaround!
I was searching for two days and was close to giving up!

Thank you
Federica

# posted by Anonymous : 6:44 AM

Oh, I have been stuck on this one for 4 months. What a find!

innerHTML is a great invention, nice work!

# posted by John Newman : 1:21 AM

Thank you!
This keeps on helping programmers..

# posted by Ron Merom : 1:00 AM

Let me show my admiration for your work 10 years ago. About that time, in 1998-1999 I learned and researched Web development mainly with IE4 and Netscape 4, the big two back then.

I found IE4 was a big leap from IE3 and much better than NN4 in DHTML, live DOM manipulation. I remember using innerHTML and innerText a lot. On every update, the page layout was instantly recalculated. NN had nothing comparable: just a layer.document.open(); layer.document.write("something")... IIRC, that did not recalculate any layout or element sizes.

Then came the standard DOM API which I thought was much less convenient than MS's way (and I'm not an MS fan). Something like div.innerHTML="My name is [b]Boraski[/b]"; (square brackets for angle brackets in post) requires a bunch of API calls with standard DOM manipulation. Fortunately they accepted .innerHTML (though it has its detractors).

And I also now remember that I indeed had problems using it with tables, and I had to use functions such as insertRow(), etc. I now, 10 years later, understand why.

Let me also point out that HTML's decision of allowing optional end tags, while good for "lazy" hand code writers, must be a pain for parser writers. Parsers need extra intelligence to know where elements end. And, BTW, Netscape was also quite bad at doing this. I remember some optional end tags were badly inferred by it.

Now modern Web apps rely on good manipulation of all elements of Web pages (adding, removing, resizing, changing styles, etc.) and that was an invention of those people 10 years ago. IMO, in more recent years, however, Web technology has advanced much more outside of MS (better DOM, better CSS, SVG, canvas, video, better JS, ...).

# posted by Boraski : 10:28 AM

As metamind said before me: the inventor of innerHTML, wow, respect man!

Always wondered why IE won't replace my table tags, so I resorted to lists with spans and divs styled as tables.
Oh well, it's you to blame! :)

Had fun reading this post! :)

# posted by Caseless : 1:42 PM

I also always wondered why IE was such a pain in this way - it makes more sense now of course. But it's still good to know who I can blame :)

Thanks for explaining.

# posted by Phil Freo : 2:26 PM

This helped me a lot and I could understand that there are *real* people involved in creating browsers and there are *real* challenges in it.

Thanks again

# posted by Rishav Dixit : 1:02 AM

Thanks

# posted by Vlad : 11:02 AM

Thanks for clearing this up. I've lost some hair over this. :)

# posted by Thermo : 8:08 PM

Here, three years later, I too would like to add my thanks for this insight and workaround. I do have a suggestion for a change to the workaround, however, that will work in both IE and FF.

[table]
[tbody id="tb"]
[tr]....
[tr]....
...
[/tbody]
[/table]
[span id='temp' style='visibility:hidden'][/span]

var temp = document.getElementById('temp');
temp.innerHTML = '[table][tbody id="tb"][tr][td]New Row[/tbody][/table]';
var tb = document.getElementById('tb');
tb.parentNode.replaceChild(temp.firstChild.firstChild, tb);

Note, I substituted [] for <> since the post was not acceptable with <> in it.
First, I changed the getElementsByName to getElementById since FF does not find temp as it is an id not a name (IE has no trouble with this!). Next, the replacement string being stored into temp needs to have an id='tb' assigned to its tbody to maintain this identification. If you leave it out, this code works the first time but will not work after the first replaceChild since there is no longer a tbody id='tb' to get. Lastly, I added the closing tags to the replace string to be clean (call me a purist). Note also that I added quotes around temp in the span tag id to keep FF happy.
All in all a very useful post, it saved me hours of debugging, thanks again.

# posted by Sandy Williams : 4:56 AM
