Is there a way to measure the complexity of the HTML markup?

I am working on cleaning up a very convoluted ASP.NET project and I have a tool that measures the complexity of the project in different ways, so I can show the results of my work: as I clean up, the complexity goes down.

One of my metrics is the number of lines of HTML markup, but I realized this is not a good measure, because the line count depends on how the markup is formatted. This snippet:

<span><em>This is bold</em></span>

should produce the same result as the pretty-printed version:

<span>
  <em>This is bold</em>
</span>

But simply counting lines reports the second snippet as three times larger.

What would be the best way to calculate the complexity of the markup in order to capture the structural complexity rather than just the line count?

Update: Commenters asked what I mean by complexity. I mean how the page is structured. My original example was not the best, because the two snippets are structurally identical. My ultimate goal is to convert sloppy table-driven layouts to CSS, and I want to measure how much less markup there is when I'm done. Simply counting the number of nodes does not capture the nesting structure. Is there a metric that captures both the node count and the nesting depth?



1 answer


You can use the HTML Agility Pack to parse your HTML into a tree of nodes (effectively the DOM) and then count the nodes.

This is a good measure of how complex an HTML page is: fewer nodes means simpler markup. As a side benefit, a smaller tree lets JavaScript find any given element faster.
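To illustrate the idea, here is a minimal sketch in Python using only the standard library's html.parser, as a stand-in for the HTML Agility Pack (the approach is the same, the API is not). The names MarkupMetrics and measure are made up for this example. It counts element nodes only, so whitespace between tags has no effect, and it also tracks the maximum nesting depth and the sum of all element depths, which is one simple way to make deeply nested table layouts score worse than flat ones.

```python
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MarkupMetrics(HTMLParser):
    """Hypothetical helper: element count, max nesting depth, summed depth."""

    def __init__(self):
        super().__init__()
        self.node_count = 0   # number of element nodes seen
        self.max_depth = 0    # deepest nesting level reached
        self.depth_sum = 0    # sum of the depth of every element
        self._stack = []      # currently open elements

    def handle_starttag(self, tag, attrs):
        depth = len(self._stack) + 1
        self.node_count += 1
        self.depth_sum += depth
        self.max_depth = max(self.max_depth, depth)
        if tag not in VOID_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Tolerate sloppy markup: ignore stray closing tags, and close any
        # unclosed children when their ancestor is closed.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass


def measure(html):
    """Return (node_count, max_depth, depth_sum) for a markup string."""
    parser = MarkupMetrics()
    parser.feed(html)
    parser.close()
    return parser.node_count, parser.max_depth, parser.depth_sum


compact = "<span><em>This is bold</em></span>"
pretty = "<span>\n  <em>This is bold</em>\n</span>\n"
nested = "<table><tr><td><table><tr><td>x</td></tr></table></td></tr></table>"

print(measure(compact))  # (2, 2, 3)
print(measure(pretty))   # (2, 2, 3), formatting does not matter
print(measure(nested))   # (6, 6, 21), nesting is penalized
```

Whether summed depth is the right weighting for your project is a judgment call. With the HTML Agility Pack you would compute the same numbers by walking the parsed document, but keep in mind that a DOM parser may also report whitespace-only text nodes, so count elements rather than all nodes if you want the result to be independent of formatting.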



See also Best Practices for Speeding Up Your Web Site from Yahoo.

Other links:
How to use the HTML Agility Pack
How to get the number of tables in an HTML file using C# and html-agility-pack
Counting specific child nodes with HtmlAgilityPack
