String.Split in C# keeps text outside of delimiters

I am trying to take a list of email addresses along with first and last names and convert them to CSV format. My email addresses are in the following format:

First, Last <email1@example.com>; First, Last <email2@example.com>;

      

The output I need is the following:

email1@example.com,email2@example.com

      

I am using the following code:

string[] addresses = addresses_Delimited.Split(new Char[] { '<', '>' });

      

addresses_Delimited is my list of addresses in the original format.

The problem is that this doesn't exclude the first and last names; instead, it returns both the names and the email addresses as elements of the addresses array. So addresses[0] = "First, Last ", addresses[1] = "email1@example.com" and addresses[2] = "; First, Last ". Every name entry after the first starts with a semicolon.

How do I get string.Split to discard all text outside the "<" and ">"? Do I need to use something else?

+3




5 answers


Instead of using Split, which doesn't care whether the delimiters are paired, use a regex like this:

<([^>]+)>

      

When you apply this regex to your input string, the content between the angle brackets is captured in group 1:



var s = "First, Last <email1@example.com>; First, Last <email2@example.com>;";
Regex regex = new Regex(@"<([^>]+)>");
foreach (Match m in regex.Matches(s)) {
    Console.WriteLine(m.Groups[1]);
}
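To get from the captured groups to the single comma-separated line the question asks for, the matches can be joined with string.Join. The following is a minimal sketch, not a definitive implementation: the class name EmailCsv and the method name FromBracketed are made up for illustration, and it assumes every address in the input is wrapped in "<" and ">".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EmailCsv
{
    // Hypothetical helper (name is an assumption, not from the question):
    // returns every "<...>" payload in the input, joined with commas.
    public static string FromBracketed(string input)
    {
        var emails = Regex.Matches(input, @"<([^>]+)>")
            .Cast<Match>()                           // MatchCollection is non-generic on older frameworks
            .Select(m => m.Groups[1].Value.Trim());  // group 1 = text between the brackets

        return string.Join(",", emails);
    }
}
```

Calling EmailCsv.FromBracketed with the sample string from the question should return "email1@example.com,email2@example.com". Text outside the brackets, including the names and the semicolons, is simply never matched.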

      


+6




Split won't work in this case. You need to use regular expressions. Try this:



// using System.Text.RegularExpressions;
// pattern = any number of arbitrary characters between < and >.
var pattern = @"\<(.*?)\>";
var matches = Regex.Matches(addresses_Delimited, pattern);

foreach (Match m in matches) {
    Console.WriteLine(m.Groups[1]);
}

      

+2




Split on ";" first, then on "<" and ">".

string inputEmails = "First1, Last1 <email1@example.com>; First2, Last2 <email2@example.com>;";
string[] inputEmailsArray = inputEmails.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string email in inputEmailsArray)
{
    string[] inputEmailArray = email.Split(new char[] { '<', '>' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (string emailPart in inputEmailArray)
    {
        string s = emailPart;   // First1, Last1     // email1@example.com
    }
}
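The loop above visits both the name and the address of each entry. If only the addresses are wanted as one CSV line, the second element of each inner split can be collected and joined. This is a rough sketch under stated assumptions: the names SplitEmailCsv and FromBracketed are hypothetical, and it assumes each entry has the shape "Name <address>" with some text before the "<".

```csharp
using System;
using System.Collections.Generic;

public static class SplitEmailCsv
{
    // Hypothetical helper (name is an assumption): Split-only variant.
    public static string FromBracketed(string input)
    {
        var emails = new List<string>();

        // First split into "Name <address>" entries.
        foreach (string entry in input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Then split each entry into its name and address parts.
            string[] parts = entry.Split(new[] { '<', '>' }, StringSplitOptions.RemoveEmptyEntries);

            // Assumption: parts[0] is the name and parts[1] the address.
            // Entries without both parts are skipped.
            if (parts.Length >= 2)
            {
                emails.Add(parts[1].Trim());
            }
        }

        return string.Join(",", emails);
    }
}
```

An entry that consists of a bare "<address>" with no name in front would be skipped by the length check, which is one reason the regex approach is usually the more robust choice here.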

      

0




You can do it with Split, but it's really ugly:

var text = "First, Last <email1@example.com>; First, Last <email2@example.com>;";

var t = text.TrimEnd(';').Split(';');
foreach (var m in t)
{
    Console.WriteLine(m.Split('<')[1].TrimEnd('>'));
}

      

Use a regular expression instead.

0




Assuming (and that's a big assumption) that no name or email address contains a ";" character and that no email address contains a "," character, this will work:

using System.Linq;
using System.Net.Mail;

...

var input = "First, Last <email1@example.com>; First, Last <email2@example.com>;";

var emails = String.Join(",", input
  .Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
  .Select(s => new MailAddress(s).Address));

      

0

