Get HTML generated by JavaScript using PhantomJS

I am trying to use PhantomJS to get the HTML generated by a dynamic page. I assumed it would be easy, but after hours of trying I have had no luck.

The page has the following source, which is all that ends up being saved to 1.html:

<!doctype html>
<html lang="cs" ng-app="appId">
<head ng-controller="MainCtrl">
     (omitted some lines)
    <script src="/js/conf/config.js?pars"></script>
    <script src="/js/all.js?pars"></script>
</head>
<body>
<!--<![endif]-->
    <div site-loader></div>
    <div page-layout>
        <div ng-view></div>
    </div>
</body>
</html>

      

All web content is loaded inside the site-loader div, but I have had no luck, even though I wait on a timeout before saving the HTML from PhantomJS. Here is the code I'm using:

var url = 'http:...';
var page = require('webpage').create();
var fs = require('fs');

page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Fail');
        phantom.exit();
    } else {
        window.setTimeout(function () {
            fs.write('1.html', page.content, 'w');
            phantom.exit();
        }, 2000); // Change timeout as required to allow sufficient time
    }
});

      

Please, what am I doing wrong?

EDIT: I decided to try the PJscrapper framework and set it up to copy the entire content of the div block. All I got back was this garbage:

["","\n\t\tif (window.DOT) {\n\t\t\tDOT.cfg({service: 'sreality', impress: false});\n\t\t}\n\t","","Loader.load()","",""]

      

It seems I am missing something fundamental: I always get the markup as it was before Loader.load() takes effect, and a timeout alone evidently doesn't solve the problem.


1 answer


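This will do the trick. After a short delay, it runs a function inside the page with page.evaluate and returns the HTML of the live DOM: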



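var url = 'http:...'; // same placeholder URL as in the question
var page = require('webpage').create();

page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Unable to load the url!');
        phantom.exit();
    } else {
        window.setTimeout(function () {
            // Runs inside the page context and returns the current DOM as HTML
            var results = page.evaluate(function () {
                return document.documentElement.innerHTML;
            });
            console.log(results);
            phantom.exit();
        }, 200); // increase if the page needs longer to render
    }
});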
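
A fixed delay is a guess: too short and you get the unrendered markup, too long and you wait for nothing. One alternative is to poll until the content appears. The following is only a sketch under assumptions: waitFor is a hypothetical helper (not part of the PhantomJS API), the 10 s and 250 ms values are arbitrary, and "ready" is assumed to mean that the ng-view div from the question has at least one child element, which may not hold for the real page.

```javascript
// Poll testFn every intervalMs until it returns true or timeoutMs elapses.
// Hypothetical helper, written in plain ES5 so it also runs under PhantomJS.
function waitFor(testFn, onReady, onTimeout, timeoutMs, intervalMs) {
    var start = Date.now();
    var timer = setInterval(function () {
        if (testFn()) {
            clearInterval(timer);
            onReady();
        } else if (Date.now() - start >= timeoutMs) {
            clearInterval(timer);
            onTimeout();
        }
    }, intervalMs);
}

// PhantomJS-only usage; this branch is skipped in any other runtime.
if (typeof phantom !== 'undefined') {
    var url = 'http:...'; // same placeholder URL as in the question
    var page = require('webpage').create();

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Unable to load the url!');
            phantom.exit(1);
            return;
        }
        waitFor(
            function () {
                // Assumption: the page counts as rendered once ng-view has children.
                return page.evaluate(function () {
                    var view = document.querySelector('[ng-view]');
                    return !!view && view.children.length > 0;
                });
            },
            function () {
                console.log(page.content);
                phantom.exit();
            },
            function () {
                console.log('Timed out waiting for ng-view to be filled');
                phantom.exit(1);
            },
            10000, // give up after 10 s (arbitrary)
            250    // check every 250 ms (arbitrary)
        );
    });
}
```

If the readiness check never becomes true, the script exits with a non-zero code instead of silently writing the unrendered page, which makes the failure easier to spot than with a plain timeout.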

      
