Use HttpWebRequest to load web pages without too much trouble




4 answers


[update: I don't know why, but both examples below work great! I originally saw 403 on page2 as well. Was it a server problem?]

First, WebClient is easier. In fact, I've seen this before. It turned out to be case sensitivity in the URL when accessing Wikipedia; try making sure you use the same case in your request to Wikipedia.

[update] As Bruno Conde and Gimel pointed out, using %27 should help make it consistent (the intermittent behavior suggests that some Wikipedia servers may be configured differently from others).



I just checked, and in this case the case-sensitivity issue doesn't seem to be the problem ... however, if it worked (it doesn't), this would be the easiest way to request the page:

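        // Requires: using System.Net;
        using (WebClient wc = new WebClient())
        {
            // A plain article title needs no special handling.
            string page1 = wc.DownloadString("http://en.wikipedia.org/wiki/Algeria");

            // The leading apostrophe of 'Abadilah is percent-encoded as %27.
            string page2 = wc.DownloadString("http://en.wikipedia.org/wiki/%27Abadilah");
        }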

      

I'm afraid I can't think of what to do with the leading apostrophe that breaks things ...



I also got strange results. The first URL,

http://en.wikipedia.org/wiki/'Abadilah

didn't work at first, and after some unsuccessful attempts it started working.

The second URL,

http://en.wikipedia.org/wiki/'t_Zand_(Alphen-Chaam)

always fails for me.

It seems that the apostrophe is responsible for these problems. If you replace it with %27, all the URLs work fine.



Try escaping the special characters using percent-encoding (RFC 3986, Section 2.1). For example, a single quote is represented as %27 in the URL (IRI).
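The apostrophe replacement suggested in these answers could be sketched as below. This is only an illustration: `WikiUrl` and `ForTitle` are hypothetical names, and the helper assumes the rest of the title is already URL-safe, so it escapes nothing except the apostrophe.

```csharp
using System;

static class WikiUrl
{
    // Hypothetical helper: builds an article URL with apostrophes percent-encoded.
    public static string ForTitle(string title)
    {
        // Only the apostrophe is escaped here, since it is the character
        // discussed in this thread. Other characters are assumed to be safe.
        return "http://en.wikipedia.org/wiki/" + title.Replace("'", "%27");
    }
}
```

For example, `WikiUrl.ForTitle("'Abadilah")` yields the same URL that the WebClient answer above passes to `DownloadString`. A general-purpose escaper such as `Uri.EscapeDataString` may be a better fit for arbitrary titles, but whether it escapes the apostrophe can depend on the framework version, so check its output before relying on it.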



I'm sure the OP has this sorted by now, but I just ran into the same problem: an intermittent 403 when downloading from Wikipedia via a WebClient. Setting the user-agent header sorts it out:

client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
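Since the question title mentions HttpWebRequest, here is a rough sketch of the same user-agent fix using that class instead of WebClient. The names `PageLoader`, `CreateRequest`, and `Load`, as well as the user-agent string itself, are assumptions made for illustration. Note also that newer .NET versions flag `WebRequest.Create` as obsolete, so this is aimed at the classic .NET Framework setup the thread appears to be about.

```csharp
using System.IO;
using System.Net;

static class PageLoader
{
    // Hypothetical value for illustration; use a string that identifies your own client.
    public const string UserAgent = "ExampleWikiReader/1.0";

    // Builds the request without sending it, so the headers can be inspected.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = UserAgent;
        return request;
    }

    // Sends the request and returns the response body as a string.
    public static string Load(string url)
    {
        using (var response = (HttpWebResponse)CreateRequest(url).GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `PageLoader.Load("http://en.wikipedia.org/wiki/%27Abadilah")` would then combine both suggestions from this thread: the percent-encoded apostrophe and an explicit user agent.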

      
