Simple regex question?
I have a string stream that contains many lines like this one:
<A style="FONT-WEIGHT: bold" id=thread_title_559960 href="http://microsoft.com/forum/f80/topicName-1234/">Beautiful Topic Name</A> </DIV>
I'm trying to extract only the relevant links, the ones whose tag starts with:
style="FONT-WEIGHT: bold
From each match, I want to get the link:
http://microsoft.com/forum/f80/topicName-1234/
Topic Id:
1234
Topic Display Name:
Beautiful Topic Name
This is the pattern I'm using right now, but it doesn't do everything I need:
"href=\"(?<url>.*?)\">(?<title>.*?)</A>"
It also matches the other links on the page, because they contain href too.
Also, in order to use Regex, I joined all the lines into a single line. Does regex support newlines? I.e., can a match span multiple lines?
Please help me with the pattern.
By default, the dot wildcard in a regular expression does not match newline characters. If you want to match any character, including newlines, use [^\x00] instead of the dot. This matches every character except the null character, which for ordinary text means it matches everything.
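A quick way to see the difference, sketched in Python rather than .NET. The sample text and variable names are made up for illustration:

```python
import re

# Hypothetical sample: a title that wraps onto a second line.
text = "<A>Beautiful\nTopic Name</A>"

# The plain dot stops at the newline, so this search finds nothing.
dot_match = re.search(r"<A>(.*?)</A>", text)

# [^\x00] accepts any character except NUL, newlines included.
any_match = re.search(r"<A>([^\x00]*?)</A>", text)

print(dot_match)                  # no match object
print(repr(any_match.group(1)))   # the title, with its embedded newline
```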
Try the following:
<A\s+style="FONT-WEIGHT: bold"\s+id=(\S+)\s+href="([^"]*)">([^\x00]*?)</A>
If you are assigning this to a string literal in double quotes, you will need to escape the quotes and backslashes. It will look something like this:
myVar = "<A\\s+style=\"FONT-WEIGHT: bold\"\\s+id=(\\S+)\\s+href=\"([^\"]*)\">([^\\x00]*?)</A>";
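As a rough illustration, here is the same pattern applied to the sample line from the question, sketched in Python (a raw string avoids the doubled backslashes). The helper name `extract_links` and the second regex are assumptions: the pattern above captures the `id` attribute, not the topic id, so the sketch assumes the topic id is the run of digits after the last hyphen in the URL.

```python
import re

# The pattern from this answer, as a Python raw string.
LINK_RE = re.compile(
    r'<A\s+style="FONT-WEIGHT: bold"\s+id=(\S+)\s+href="([^"]*)">([^\x00]*?)</A>'
)

# Assumption: the topic id is the digits after the last hyphen in the URL.
TOPIC_ID_RE = re.compile(r"-(\d+)/?$")

html = (
    '<A style="FONT-WEIGHT: bold" id=thread_title_559960 '
    'href="http://microsoft.com/forum/f80/topicName-1234/">'
    'Beautiful Topic Name</A> </DIV>'
)


def extract_links(text):
    """Return one dict per bold link found in text (hypothetical helper)."""
    results = []
    for m in LINK_RE.finditer(text):
        element_id, url, title = m.groups()
        id_match = TOPIC_ID_RE.search(url)
        results.append({
            "url": url,
            "topic_id": id_match.group(1) if id_match else None,
            "title": title.strip(),
        })
    return results


links = extract_links(html)
print(links)
```

Links without the bold style attribute are skipped, because the pattern requires the attribute right after the tag name.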
You can make . match newlines in the pattern by using the RegexOptions.Singleline option:
Specifies single-line mode. Changes the meaning of the dot (.) so it matches every character (instead of every character except \n).
So, if your title spans multiple lines, with this option enabled the (?<title>.*?) portion of the pattern will continue across line breaks while trying to find a match.
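The same idea can be sketched in Python, where `re.DOTALL` plays the role of RegexOptions.Singleline and named groups are written `(?P<name>...)` instead of `(?<name>...)`. The style prefix is prepended to the question's pattern here as one possible way of skipping the other links. The input text, including the example.com link, is made up for illustration.

```python
import re

# Named groups use (?P<name>...) in Python; .NET writes (?<name>...).
PATTERN = re.compile(
    r'style="FONT-WEIGHT: bold"[^>]*?href="(?P<url>[^"]*)">(?P<title>.*?)</A>',
    re.DOTALL,  # Python's counterpart of RegexOptions.Singleline
)

# Hypothetical input: one plain link, then a bold link whose title wraps.
page = (
    '<A href="http://example.com/other/">Other link</A>\n'
    '<A style="FONT-WEIGHT: bold" id=thread_title_559960 '
    'href="http://microsoft.com/forum/f80/topicName-1234/">Beautiful\n'
    'Topic Name</A> </DIV>'
)

matches = [(m.group("url"), m.group("title")) for m in PATTERN.finditer(page)]
print(matches)
```

Only the bold link is returned, and its title keeps the embedded newline because the dot is allowed to cross line breaks.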