Java - How to Download Complete HTML Site Source

I am trying to load a site's complete HTML source code into a String in Java. I've tried several approaches, but each time I get only almost all of the source code. To make it worse, one of the main parts I don't get is the part I need the most!



2 answers

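import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

URL url = new URL(""); // put the page address here (left blank in the original)
URLConnection spoof = url.openConnection();

// Spoof the User-Agent header so the request looks like it comes from a web browser
spoof.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0;    H010818)");

StringBuilder finalHTML = new StringBuilder();
// try-with-resources closes the reader once the whole response has been read
try (BufferedReader in = new BufferedReader(new InputStreamReader(spoof.getInputStream()))) {
    String strLine;
    // Loop through every line in the source
    while ((strLine = in.readLine()) != null) {
        // readLine() strips the line terminator, so add it back
        finalHTML.append(strLine).append('\n');
    }
}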




This may be because the content you are looking for is loaded dynamically, via AJAX / JavaScript. A plain HTTP request returns only the HTML the server sends initially; it does not run the page's scripts.

For example, a website might contain an empty DIV tag that is filled with content only after the page has loaded (via an AJAX call elsewhere).
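
One rough way to check whether this is what is happening is to look for an empty placeholder in the source you did manage to download. The sketch below is only an illustration: the class name PlaceholderCheck, the helper isEmptyDiv, the id "articles" and both sample HTML strings are made up for this example, and a regular expression is a crude stand-in for a real HTML parser.

```java
import java.util.regex.Pattern;

public class PlaceholderCheck {

    // Hypothetical helper: returns true if the HTML contains a <div> with the
    // given id that has nothing but whitespace between its opening and closing tags.
    public static boolean isEmptyDiv(String html, String id) {
        Pattern p = Pattern.compile(
                "<div\\b[^>]*\\bid\\s*=\\s*[\"']" + Pattern.quote(id) + "[\"'][^>]*>\\s*</div>",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(html).find();
    }

    public static void main(String[] args) {
        // Made-up sample: what a plain HTTP download might return.
        String staticSource = "<html><body><h1>News</h1><div id=\"articles\"></div>"
                + "<script src=\"load.js\"></script></body></html>";

        // Made-up sample: what the browser might show after scripts have run.
        String afterScripts = "<html><body><h1>News</h1>"
                + "<div id=\"articles\"><p>Story 1</p></div></body></html>";

        System.out.println(isEmptyDiv(staticSource, "articles")); // true
        System.out.println(isEmptyDiv(afterScripts, "articles")); // false
    }
}
```

If the part you need sits in a DIV that is empty in the downloaded String but filled in when you view the page in a browser, the content is most likely added by a script after loading, and changing the User-Agent will not help.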


