Node.js request.js HPE_INVALID_HEADER_TOKEN

I have been stuck on one problem and need help. I am using Node.js to crawl a list of websites, and some of them give me this error. For example, http://www.fz-juelich.de/portal/DE/Home/home_node.html fails with: Parse Error, HPE_INVALID_HEADER_TOKEN

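const request = require('request');

// uri, timeout and domain are assumed to be defined elsewhere in the crawler.
request.get({
    url: uri,
    timeout: timeout,
    headers: {
        referer: domain
    }
}, (error, response, body) => {
    if (error) {
        // The failing sites end up here with HPE_INVALID_HEADER_TOKEN.
        console.log(error);
        return;
    }
    console.log(body);
});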

      

However, curl -i --raw http://www.fz-juelich.de/portal/DE/Home/home_node.html works just fine:

HTTP/1.1 404 Not Found
Server: Apache-Coyote/1.1
Cache-Control: no-cache
JSESSIONID=E594677A6CCA13BE0338E1D00A729C34; Path=/cae:
Content-Type: text/html;charset=utf-8
Content-Language: de
Set-Cookie: JSESSIONID=E594677A6CCA13BE0338E1D00A729C34; Path=/
Content-Length: 19677

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >

      

The site also loads fine in Chrome.

Any ideas where I should dig to get rid of these errors?



1 answer


In the end, I stopped using Node.js for crawling and parsing.

A crawler written in Go is much better suited: its http library is more flexible, and it is easier to write truly parallel code.
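
For anyone who wants to stay on Node.js, a likely cause (not confirmed against the server) is visible in the curl output in the question: the line starting with JSESSIONID= has no valid header name before its colon, and Node's HTTP parser appears to be stricter about this than curl or Chrome. Below is a minimal sketch that checks each header line's name against the RFC 7230 token characters. The helper name findInvalidHeaderLines is hypothetical, and raw is simply the header block copied from the question.

```javascript
// RFC 7230 "token" characters allowed in a header field name.
const TOKEN = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// Hypothetical helper: return the lines of a raw header block
// whose field name is missing or is not a valid token.
function findInvalidHeaderLines(rawHeaders) {
    return rawHeaders
        .split(/\r?\n/)
        .slice(1)                           // skip the status line
        .filter((line) => line.length > 0)  // ignore the blank terminator
        .filter((line) => {
            const colon = line.indexOf(':');
            // No colon at all, or a name containing non-token characters.
            return colon === -1 || !TOKEN.test(line.slice(0, colon));
        });
}

// Header block copied from the curl output in the question.
const raw = [
    'HTTP/1.1 404 Not Found',
    'Server: Apache-Coyote/1.1',
    'Cache-Control: no-cache',
    'JSESSIONID=E594677A6CCA13BE0338E1D00A729C34; Path=/cae:',
    'Content-Type: text/html;charset=utf-8',
    'Content-Language: de',
    'Set-Cookie: JSESSIONID=E594677A6CCA13BE0338E1D00A729C34; Path=/',
    'Content-Length: 19677',
].join('\r\n');

console.log(findInvalidHeaderLines(raw));
```

On this input, only the JSESSIONID= line is flagged. The sketch does not handle folded (continuation) header lines, so treat it as a diagnostic aid rather than a full parser.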
