Dump HTML page including iframes

I would like to dump the HTML content of a web page, including the HTML of the frames embedded in its <iframe> elements. The Elements tab of the Chrome developer tools can display iframe content this way.

By "dump the HTML content" I mean doing so with a browser automation tool such as Selenium or PhantomJS. Do any of these tools have this capability built in?

For example, the HTML dump I want for this page should include the HTML source of this inline page.

1 answer


You can use PhantomJS to achieve this.

Here is a snippet from PhantomJS server code:



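// Usage: phantomjs <script> <url>
var system = require('system');
var url = system.args[1] || '';
if (url.length > 0) {
  var page = require('webpage').create();
  page.open(url, function (status) {
    if (status === 'success') {
      // Poll every 100 ms until the page marks itself as ready.
      var timer = setInterval(function () {
        var html = page.evaluate(function () {
          // Runs inside the page; returns undefined until <body data-status="ready">.
          var body = document.getElementsByTagName('body')[0];
          if (body.getAttribute('data-status') === 'ready') {
            return document.getElementsByTagName('html')[0].outerHTML;
          }
        });
        if (html) {
          clearInterval(timer); // stop polling before printing and exiting
          console.log(html);
          phantom.exit();
        }
      }, 100);
    } else {
      // Exit on a failed load so the process does not hang.
      console.error('Failed to load ' + url);
      phantom.exit(1);
    }
  });
} else {
  console.error('Usage: phantomjs <script> <url>');
  phantom.exit(1);
}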
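Note that the outerHTML of the top-level document contains the <iframe> tags but not the documents loaded inside them, because each frame has its own DOM. The following is a minimal sketch of one way to collect those as well, assuming PhantomJS's frame-switching API on the page object (`framesCount`, `switchToFrame`, `switchToParentFrame`, `frameContent`, `frameUrl`). The helper name `collectFrames` and the shape of the returned records are illustrative, not part of PhantomJS.

```javascript
// Hypothetical helper: recursively collect the HTML of the currently focused
// frame and of every frame nested inside it.
// `page` is assumed to be a PhantomJS WebPage, or any object exposing the same
// frame API (frameContent, frameUrl, framesCount, switchToFrame,
// switchToParentFrame).
function collectFrames(page, path, out) {
  path = path || 'main';
  out = out || [];
  // Record the frame that currently has focus.
  out.push({ path: path, url: page.frameUrl, html: page.frameContent });
  // framesCount is the number of child frames of the focused frame.
  var count = page.framesCount;
  for (var i = 0; i < count; i++) {
    if (page.switchToFrame(i)) {
      collectFrames(page, path + '/' + i, out);
      page.switchToParentFrame(); // restore focus before visiting the next child
    }
  }
  return out;
}

// Possible use inside the page.open callback, once the page is ready:
//
//   collectFrames(page).forEach(function (frame) {
//     console.log('<!-- frame ' + frame.path + ' (' + frame.url + ') -->');
//     console.log(frame.html);
//   });
//   phantom.exit();
```

Each record carries a path such as main/0/1, so nested frames can be matched back to their position in the page. Frames that load late may still be empty at the moment of the dump, so the readiness check still matters.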

      

If the HTML is yours, set the data-status attribute on <body> to "ready" once the page has finished loading, so that PhantomJS knows when to dump it. If the page does not belong to you, the other option is to wait for a sufficiently long timeout before dumping.
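The two options can be combined: poll for the ready marker, but give up and dump whatever has loaded once a deadline passes. This is a minimal sketch under that assumption. The helper name `waitUntil`, its parameters, and the 5000 ms value in the usage comment are illustrative choices, not anything prescribed by PhantomJS.

```javascript
// Hypothetical helper: call isReady() every intervalMs milliseconds.
// Invoke done(false) as soon as it returns a truthy value, or done(true)
// once timeoutMs milliseconds have passed without it doing so.
function waitUntil(isReady, done, timeoutMs, intervalMs) {
  var start = Date.now();
  var timer = setInterval(function () {
    var ready = Boolean(isReady());
    var timedOut = !ready && (Date.now() - start >= timeoutMs);
    if (ready || timedOut) {
      clearInterval(timer); // stop polling before reporting the result
      done(timedOut);
    }
  }, intervalMs || 100);
}

// Possible use inside the page.open callback (5000 ms is an assumed value):
//
//   waitUntil(function () {
//     return page.evaluate(function () {
//       return document.body.getAttribute('data-status') === 'ready';
//     });
//   }, function (timedOut) {
//     console.log(page.content); // dump the page whether or not it signalled ready
//     phantom.exit();
//   }, 5000);
```

For a page you do not control, isReady can simply return false, which turns the helper into a plain fixed delay of timeoutMs.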
