
Split a string into an array of text and HTML tags

I have a string like this:

string html = "truongpm<b><i>bold italic</i></b><b>bold</b><i>italic</i>";

      

How can I get an array like

a[0] = "truongpm", a[1] = "<b><i>bold italic</i></b>", a[2] = "<b>bold</b>", a[3] = "<i>italic</i>"

      

from that string? Currently I am using this code:

string tagRegex = @"<\s*([^ >]+)[^>]*>.*?<\s*/\s*\1\s*>";
MatchCollection matchesImgSrc = Regex.Matches(html, tagRegex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
foreach (Match m in matchesImgSrc)
{
    // add m.Value to the result array
}

      

But it only returns

a[0] = "<b><i>bold italic</i></b>", a[1] = "<b>bold</b>", a[2] = "<i>italic</i>"

      

The leading "truongpm" is missing. Please help, thanks!



2 answers


Here's the code you can use:



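// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
var l = new List<string>();
var html = "truongpm<b><i>bold italic</i></b><b>bold</b><i>italic</i>";
// First alternative: a run of plain text outside any tag.
// Second alternative: an element from its opening tag to the matching closing tag (\1 is the tag name).
var tagRegex = @"[^<>]+|<\s*([^ >]+)[^>]*>.*?<\s*/\s*\1\s*>";
var matchesImgSrc = Regex.Matches(html, tagRegex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
foreach (Match m in matchesImgSrc)
    l.Add(m.Value);
// l now holds: "truongpm", "<b><i>bold italic</i></b>", "<b>bold</b>", "<i>italic</i>"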
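
For reuse, the same pattern could be wrapped in a small helper. This is only a sketch: the class name `HtmlSplitter` and the method name `SplitTextAndTags` are made up for illustration, and the pattern is the one shown above. It assumes tags of the same name are not nested inside each other, since the lazy `.*?` stops at the first matching closing tag.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class HtmlSplitter
{
    // Alternative 1: a run of characters that contains no '<' or '>'.
    // Alternative 2: an opening tag, lazily matched content, and the
    // closing tag whose name equals capture group 1.
    private const string Pattern = @"[^<>]+|<\s*([^ >]+)[^>]*>.*?<\s*/\s*\1\s*>";

    // Hypothetical helper: returns the text runs and top-level elements in order.
    public static List<string> SplitTextAndTags(string html)
    {
        var parts = new List<string>();
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        foreach (Match m in Regex.Matches(html, Pattern, options))
            parts.Add(m.Value);
        return parts;
    }
}
```

Calling `HtmlSplitter.SplitTextAndTags("truongpm<b><i>bold italic</i></b><b>bold</b><i>italic</i>")` yields four items: `truongpm`, `<b><i>bold italic</i></b>`, `<b>bold</b>` and `<i>italic</i>`.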

      



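Your regex only matches text enclosed in tags. To also capture text that is outside any tag, you must add an alternative to your regex. This can be done by prepending ([^<>]+) so that your expression looks like ([^<>]+)|{your existing expression}. Note that the added parentheses become capture group 1, so the \1 backreference in your existing expression has to be changed to \2; writing the alternative without parentheses, as [^<>]+, avoids the renumbering. Websites such as Regex Pal can help you build and test regular expressions.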
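
One way to try out the alternation idea is with named groups, which keep the backreference independent of group numbering. The following is a minimal sketch rather than a definitive solution: the class name `NamedGroupSplit`, the method name `Describe`, and the group names `text` and `tag` are assumptions introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class NamedGroupSplit
{
    // "text" captures a run outside any tag; "tag" captures the tag name,
    // and \k<tag> requires the closing tag to carry the same name.
    private const string Pattern =
        @"(?<text>[^<>]+)|<\s*(?<tag>[^ >]+)[^>]*>.*?<\s*/\s*\k<tag>\s*>";

    // Hypothetical helper: labels each piece as plain text or as a tagged element.
    public static List<string> Describe(string html)
    {
        var lines = new List<string>();
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        foreach (Match m in Regex.Matches(html, Pattern, options))
        {
            // Exactly one of the two alternatives matched; check which one.
            string kind = m.Groups["text"].Success
                ? "text"
                : "tag <" + m.Groups["tag"].Value + ">";
            lines.Add(kind + ": " + m.Value);
        }
        return lines;
    }
}
```

For the string from the question, `Describe` returns `text: truongpm`, `tag <b>: <b><i>bold italic</i></b>`, `tag <b>: <b>bold</b>` and `tag <i>: <i>italic</i>`.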


