C#: split string by date, keeping the date intact

I have the following sample data:

21/10/2012 blahblah blah blahblah 265 blah 25 22/10/2012 blahblah blah blahblah 10 blah 14 blah 66 NK blahblah blah blahblah 25

      

I want the following data in the output:

21/10/2012 blahblah blah blahblah 265 blah 25
22/10/2012 blahblah blah blahblah 10 blah 14 blah 66 NK blahblah blah blahblah 25

      

I've tried the following:

var regex = new Regex(@"(\d{1,2})/(\d{1,2})/(\d{4})"); // verbatim string, so \d is not treated as a C# escape
var matches = regex.Matches(str); // str is given above
foreach (Match item in matches)
{
  // my logic to do operations
}

      

This gives me a collection of the dates only. How can I split the string at each date instead?


1 answer


You can split the string on the empty string just before each date. For this you need the following regex:

string[] arr = Regex.Split(str, @"(?<!\d)(?=\d{1,2}/\d{1,2}/\d{4})");

      

Splitting by the above regex will give you the result you want. It splits the string at every empty string that is followed by a date of the form 21/10/2012 and not preceded by a digit. We need the negative look-behind so that the split doesn't tear the day part of the date apart. Without it, the regex would also split on the empty string before the 1 in 21, leaving 2 and 1/10/2012 as separate items.

Also note that you will get an empty string as the first element of your array, since the empty string at the very start of your input also satisfies the split criteria.
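
As a minimal sketch of the point above, the split can be wrapped in a small helper that also drops the empty first element and trims the space left at the end of each piece. The class and method names (DateSplitter, SplitAtDates) are made up for illustration and are not part of the question or of any library; only Regex.Split and the LINQ calls are real .NET APIs.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DateSplitter
{
    // Zero-width split point: not preceded by a digit, followed by d/m/yyyy.
    private static readonly Regex DateBoundary =
        new Regex(@"(?<!\d)(?=\d{1,2}/\d{1,2}/\d{4})");

    // Hypothetical helper: splits before each date, trims each piece,
    // and discards empty pieces such as the one before the first date.
    public static string[] SplitAtDates(string input)
    {
        return DateBoundary.Split(input)
            .Select(part => part.Trim())
            .Where(part => part.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        string str = "21/10/2012 blahblah blah blahblah 265 blah 25 " +
                     "22/10/2012 blahblah blah blahblah 10 blah 14 blah 66 NK blahblah blah blahblah 25";

        // Prints one record per line, each starting with its date.
        foreach (string record in SplitAtDates(str))
        {
            Console.WriteLine(record);
        }
    }
}
```

Filtering after the split, rather than changing the regex, keeps the pattern identical to the one shown above; whether trimming is wanted depends on how the pieces are used afterwards.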




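Validating dates with a regex is tricky, especially if you want to reject every invalid date, e.g. 30 Feb. Still, if you want something stricter, you can try the regex below. It requires a two-digit day from 01 to 31, a two-digit month from 01 to 12 and a year from 1900 to 2999, but it will still match 30 and 31 Feb, and even 31 Nov.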

string[] arr = Regex.Split(str, "(?<!\\d)(?=(?:0[1-9]|[1-2][0-9]|[3][01])/(?:0[1-9]|1[0-2])/(?:19[0-9]{2}|2[0-9]{3}))");
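
Since no regex of this kind rejects dates like 30 Feb, one possible alternative is to keep the pattern loose and check the date text separately with DateTime.TryParseExact, which knows the real calendar. This is only a sketch under assumptions: the helper name IsRealDate is hypothetical, and the "d/M/yyyy" format with the invariant culture is assumed to match the day/month/year layout of the sample data.

```csharp
using System;
using System.Globalization;

public static class DateCheck
{
    // Hypothetical helper: true only if text is a real calendar date
    // written as day/month/year (one or two digits for day and month).
    public static bool IsRealDate(string text)
    {
        DateTime parsed;
        return DateTime.TryParseExact(
            text,
            "d/M/yyyy",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out parsed);
    }

    public static void Main()
    {
        Console.WriteLine(IsRealDate("21/10/2012")); // True
        Console.WriteLine(IsRealDate("30/02/2012")); // False: February has no 30th
        Console.WriteLine(IsRealDate("31/11/2012")); // False: November has 30 days
    }
}
```

In practice the first token of each split piece could be passed to such a check, so that pieces starting with an impossible date can be handled separately.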

      
