How to determine whether HTML content is empty or whitespace-only when rendered?

Consider the following code:

<p>&nbsp;</p><!-- comment -->
<span></span><br />
<div><span class="foo"></span></div>

      

which the browser effectively renders as nothing but blank space.

Given this or similar markup, is there a simple, programmatic way to detect that the rendered result, with whitespace removed, is an empty string?

The implementation here is JavaScript, but I'm also interested in a more general (language agnostic) solution, if it exists.

Note that simply removing the tags and checking whether any text remains is not a real solution, because many tags display visible content without containing any text (img, hr, and so on).

1 answer


This is the answer I came up with. It uses a whitelist of tags that are displayed on the page whether or not they have content. All other tags are expected to display only if they contain actual text. With that in place, the solution is quite simple: it relies on the innerText property, which returns the text with all tags removed.

This solution ignores elements that become visible purely through CSS (such as boxes with a background color, or content set on the :after or :before pseudo-elements), but fortunately that does not matter for my use case.



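function htmlIsWhitespace(input) {
	// Tags that render visibly even when they contain no text.
	var visible = [
			'img','iframe','object','hr', 
			'audio', 'video', 
			'form', 'button', 'input', 'select', 'textarea'
		],
		// Detached container: the markup is parsed but never attached to the page.
		container = document.createElement('div');
	container.innerHTML = input;
	// Whitespace-only means no text survives trim() (which also strips the
	// non-breaking space produced by &nbsp;) and no whitelisted element exists.
	return !(container.innerText.trim().length > 0 || container.querySelector(visible.join(',')));
}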

// And the tests (I believe these are comprehensive):

var testStringsYes = [
		"",
		"<a href='#'></a>",
		"<a href='#'><span></span></a>",
		"<a href='#'><span> <!-- comment --></span></a>",
		"<a href='#'><span> &nbsp;</span></a>",
		"<a href='#'><span> &nbsp; </span></a>",
		"<a href='#'><span> &nbsp;</span></a> &nbsp;",
		"<p><a href='#'><span> &nbsp;</span></a> &nbsp;</p>",
		" <p><a href='#'><span> &nbsp;</span></a> &nbsp;</p> &nbsp; <p></p>",
		"<p>\n&nbsp;\n</p><ul><li></li></ul>"
	],
	testStringsNo = [
		"<a href='#'><span> &nbsp;hi</span></a>",
		"<img src='#foo'>",
		"<hr />",
		"<div><object /></div>",
		"<div><iframe /></div>",
		"<div><!-- hi -->bye</div>",
		"<div><!-- what --><audio></audio></div>",
		"<div><!-- what --><video></video></div>",
		'<form><!-- empty --></form>',
		'<input type="text">',
		'<select name="foo"><option>1</option></select>',
		'<textarea>',
		'<form><input type="button"></form>',
		'<button />',
		'<button>Push</button>',
		"yo"
	];

for(var yy=0, yl=testStringsYes.length; yy < yl; yy += 1) {
	console.debug("Testing", testStringsYes[yy]);
	console.assert(htmlIsWhitespace(testStringsYes[yy]));
}

for(var nn=0, nl=testStringsNo.length; nn < nl; nn += 1) {
	console.debug("Testing", testStringsNo[nn]);
	console.assert(!htmlIsWhitespace(testStringsNo[nn]));
}
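
As for the language-agnostic part of the question, the same whitelist idea can be sketched without a DOM at all, using only string operations that exist in most languages. This is a rough sketch, not a real HTML parser: the names htmlIsWhitespaceNoDom and VISIBLE_TAGS are made up for illustration, only the &nbsp; entity (and its numeric forms) is decoded, and it assumes no ">" inside attribute values and no script or style blocks in the input.

```javascript
// Hypothetical DOM-free variant of the whitelist check (illustrative names).
const VISIBLE_TAGS = [
  'img', 'iframe', 'object', 'hr',
  'audio', 'video',
  'form', 'button', 'input', 'select', 'textarea'
];

function htmlIsWhitespaceNoDom(input) {
  // Drop comments first so tags mentioned inside them are not counted.
  const withoutComments = input.replace(/<!--[\s\S]*?-->/g, '');

  // An opening tag from the whitelist counts as visible content.
  // The lookahead keeps "<a " from matching "<audio", "<i>" from "<img", etc.
  const visibleTag = new RegExp(
    '<(?:' + VISIBLE_TAGS.join('|') + ')(?=[\\s/>])', 'i'
  );
  if (visibleTag.test(withoutComments)) {
    return false;
  }

  // Strip the remaining tags, then decode the non-breaking space entity
  // so that trim() can remove it along with ordinary whitespace.
  const text = withoutComments
    .replace(/<[^>]*>/g, '')
    .replace(/&nbsp;|&#160;|&#xa0;/gi, '\u00a0');

  return text.trim().length === 0;
}
```

The testStringsYes and testStringsNo arrays above can be fed to this function unchanged. For anything beyond simple fragments, a proper HTML parser for the target language is the safer choice.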
      
