C# - WebRequest doesn't return different pages

The purpose of my console program: make a web request > save the results > use the query string to get the next page > save those results > use the query string to get the next page, and so on.

So, here's some pseudo code for how I set up the code.

    for (int i = 0; i < 3; i++)
    {
        string strPageNo = Convert.ToString(i);

        //creates the url I want, with incrementing pages
        string strURL = "http://www.website.com/results.aspx?page=" + strPageNo;

        //makes the web request
        WebRequest wrGETURL = WebRequest.Create(strURL);

        //gets the web page for me
        Stream objStream = wrGETURL.GetResponse().GetResponseStream();

        //for reading the web page
        StreamReader objReader = new StreamReader(objStream);

        //--------
        // -snip- code that saves it to file, etc.
        //--------

        objReader.Close();   //close the reader before the underlying stream
        objStream.Close();

        //so the server doesn't get hammered
        System.Threading.Thread.Sleep(1000);
    }

      

Pretty simple, isn't it? The problem is that even though it increments the page number to get another webpage, I get the same results page every time I run the loop.

It compiles correctly, and I can cut and paste the strURL it generates into a web browser, where it works fine. If I manually enter &page=1, &page=2, &page=3, the browser returns the correct page each time. Somehow, incrementing the page number inside the loop isn't having any effect.

Does it have something to do with sessions? I make sure to close both the stream and the reader before it loops again ...



5 answers


Have you tried creating a new WebRequest object each time through the loop? It might be that the Create() method isn't completely clearing out all of the old data.

Another thing to check is that the ResponseStream is fully read and closed before the next iteration of the loop.
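
A minimal sketch of that idea, assuming the same loop as in the question: hold the WebResponse in its own variable so it can be closed as well (the question's code closes the stream and the reader but never the response itself), and read the stream to the end before moving on.

    //makes a brand-new request for every page
    WebRequest request = WebRequest.Create(strURL);
    WebResponse response = request.GetResponse();
    Stream objStream = response.GetResponseStream();
    StreamReader objReader = new StreamReader(objStream);

    //ReadToEnd() drains the response stream completely
    string html = objReader.ReadToEnd();

    // -snip- save html to file, etc.

    objReader.Close();
    objStream.Close();
    response.Close();   //release the underlying connection too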



This code works fine for me:



var urls = new [] { "http://www.google.com", "http://www.yahoo.com", "http://www.live.com" };

foreach (var url in urls)
{
    WebRequest request = WebRequest.Create(url);
    using (Stream responseStream = request.GetResponse().GetResponseStream())
    using (Stream outputStream = new FileStream("file" + DateTime.Now.Ticks.ToString(), FileMode.Create, FileAccess.Write, FileShare.None))
    {
        const int chunkSize = 1024;
        byte[] buffer = new byte[chunkSize];
        int bytesRead;
        while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            //write only the bytes actually read on this pass
            outputStream.Write(buffer, 0, bytesRead);
        }
    }
    Thread.Sleep(1000);
}

      



Just a suggestion: try disposing of the Stream and the Reader. I've seen some weird cases where failing to dispose of objects like these and reusing them in loops can give some wacky results ...
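
In C#, using blocks are the idiomatic way to guarantee that; they call Dispose() even if an exception is thrown. A minimal sketch of the question's loop body written that way (same strURL as in the question):

    using (WebResponse response = WebRequest.Create(strURL).GetResponse())
    using (Stream objStream = response.GetResponseStream())
    using (StreamReader objReader = new StreamReader(objStream))
    {
        string html = objReader.ReadToEnd();
        // -snip- code that saves it to file, etc.
    }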



This url doesn't make sense to me unless you are using MVC or something that can interpret the request correctly.

http://www.website.com/results.aspx&page=

      

it should be:

http://www.website.com/results.aspx?page=

      

Some browsers will accept poorly formed URLs and render them fine. Others may not, which could be the problem in your console application.
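
If hand-concatenating the query string is the culprit, UriBuilder makes the ? versus & distinction explicit. A small sketch, reusing the question's placeholder host and loop counter i:

    var builder = new UriBuilder("http://www.website.com/results.aspx");
    builder.Query = "page=" + i;             //UriBuilder prepends the '?' itself
    string strURL = builder.Uri.ToString();  //http://www.website.com/results.aspx?page=0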



Here's my awful, hack-ish workaround:

Make a second console application that calls THIS one, passing the page number as an argument that gets appended to the end of strURL. It works, but it feels so messy.
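
For what it's worth, here is a sketch of that workaround as a separate driver program. The name Fetcher.exe is hypothetical and stands in for the original console application, which would read args[0] and append it to strURL:

    using System.Diagnostics;

    class Driver
    {
        static void Main()
        {
            for (int i = 0; i < 3; i++)
            {
                //run the fetching app once per page, passing the page number
                using (Process p = Process.Start("Fetcher.exe", i.ToString()))
                {
                    p.WaitForExit();
                }
            }
        }
    }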







