Scraping a price from a website
I am trying to scrape a price from a web page using PHP and regular expressions. The price will be in the format £ 123.12 or $ 123.12 (i.e. pounds or dollars).
I am downloading the content using libcurl, and its output then goes into preg_match_all. So it looks something like this:
$contents = curl_exec($curl);
preg_match_all('/(?:\$|£)[0-9]+(?:\.[0-9]{2})?/', $contents, $matches);
So far, so simple. The problem is that PHP doesn't match anything, even though there are prices on the page. I've narrowed it down to a problem with the "£" symbol: PHP doesn't seem to like it.
I think it might be an encoding issue. But whatever I do, I can't get PHP to match this! Does anyone have any idea?
(Edit: I should note that if I try the same regex and page content in a regex testing tool, it works fine.)
Have you tried putting a \ before the £?
preg_match_all('/(\$|\£)[0-9]+(\.[0-9]{2})/', $contents, $matches);
I tried this expression in .NET with \£ and it works. I just edited it and removed some ":".
See my comment on this post about the possibility that cURL is giving you the content in the wrong encoding.
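If encoding is the culprit, something along these lines might help. This is only a minimal sketch, not a confirmed fix: extract_prices is a hypothetical helper name, it assumes the mbstring extension is available, and it assumes you already know the page's encoding (for example from the Content-Type header). The idea is to normalise the downloaded bytes to UTF-8 and then match the pound sign by its code point with the /u modifier, so the result no longer depends on how the .php file itself is saved.

```php
<?php
// Hypothetical helper (not from the original post): returns every price found in $contents.
// $sourceEncoding is an assumption supplied by the caller, e.g. 'ISO-8859-1' or 'UTF-8'.
function extract_prices(string $contents, string $sourceEncoding = 'UTF-8'): array
{
    // Normalise to UTF-8 so the pound sign is always the byte sequence C2 A3.
    if (strcasecmp($sourceEncoding, 'UTF-8') !== 0) {
        $contents = mb_convert_encoding($contents, 'UTF-8', $sourceEncoding);
    }

    // \x{00A3} is the pound sign; /u makes PCRE treat pattern and subject as UTF-8.
    // \s* tolerates an optional space between the currency symbol and the amount.
    $pattern = '/(?:\$|\x{00A3})\s*[0-9]+(?:\.[0-9]{2})?/u';

    // preg_match_all returns false if the subject is not valid UTF-8; fall back to no matches.
    if (preg_match_all($pattern, $contents, $matches) === false) {
        return [];
    }
    return $matches[0];
}
```

With the question's code, this would be called as extract_prices($contents, 'ISO-8859-1') if the page turns out to be Latin-1, or with the default if it is already UTF-8.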