Viewing a very large HTML file in a browser?

I am trying to learn Python while working on an interesting project: a Facebook post parser. I downloaded my data from Facebook, which includes a bunch of HTML files. One of them, messages.htm, contains all my messages. My goal is to parse this HTML file and output interesting data such as the most common word, the number of messages, etc.

The problem is that my messages.htm file is 270 MB. I can view it in vim, but there are interesting patterns in the file, and I would like to compare the HTML code with how it is actually rendered in the browser, to get a better understanding of what's going on. When I try to open the file in Firefox, Firefox crashes. I can open it in Chrome, but it just keeps loading messages, and about 10 minutes in it still hasn't fully loaded even a single message thread, however tiny the scrollbar gets. So this is not workable.

Is it even possible to fully render such a large, long HTML file?

1 answer


You can use lynx, a text-based browser, to view a large HTML file. I have a 139 MB HTML file and was able to view it very easily with lynx. lynx divides the entire document into pages and can quickly load any page. It also supports hyperlinks, so navigating within the HTML document (which was my use case) worked like a charm.
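
If the aim is only to compare the markup with how a graphical browser renders it, a small excerpt of the file may be enough. The following is a minimal sketch of that idea, not part of the lynx approach above: it copies roughly the first couple of megabytes into a separate file and trims back to the last `>` so the excerpt does not end in the middle of a tag. The function name `write_excerpt`, the output file name, and the 2 MB default are assumptions chosen for illustration. The excerpt will have unclosed elements, which browsers generally tolerate.

```python
from pathlib import Path


def write_excerpt(src, dst, max_bytes=2_000_000):
    """Copy roughly the first max_bytes of src into dst, ending on a tag boundary.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    # Read only the beginning of the file, so the full 270 MB is never loaded.
    with open(src, "rb") as f:
        chunk = f.read(max_bytes)

    # Trim back to the last '>' so the excerpt does not stop inside a tag.
    # '>' is a single ASCII byte, so this cut cannot split a UTF-8 character.
    end = chunk.rfind(b">")
    if end != -1:
        chunk = chunk[: end + 1]

    Path(dst).write_bytes(chunk)
    return len(chunk)
```

For example, `write_excerpt("messages.htm", "messages_sample.htm")` would produce a sample file that Firefox or Chrome should be able to open, while lynx remains the option for browsing the complete document.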


