Retrieving data from URL path through regex

I am trying to extract data from a URL path like this:

/12345678901234567890123456789012/1230345035/wibble/wobble/

With this regex, I can extract the path into 3 groups:

\/([^\/]*)\/([^\/]*)(\/wibble\/wobble)

      

Which gives me:

group 1 = 12345678901234567890123456789012  
group 2 = /1230345035  
group 3 = /wibble/wobble  

      

However, this is not exactly what I need. I want the data captured in group 2 to appear in group 3 as well, like this:

group 1 = 12345678901234567890123456789012  
group 2 = /1230345035  
group 3 = /1230345035/wibble/wobble 

      

But I am afraid that I am struggling with a regex to extract such data.

Thanks



1 answer


First, the regex you gave shouldn't put the leading path delimiter in groups 1 and 2. Those slashes sit outside the capture groups, so only group 3, which has the slash inside its parentheses, keeps one. You should see something like this:

group 1 = 12345678901234567890123456789012  
group 2 = 1230345035  
group 3 = /wibble/wobble
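
One way to check what the original pattern really captures is to run it. The following is a minimal sketch in Python (the question does not name a language, so that choice is an assumption), using only the standard `re` module. It prints each group so the presence or absence of a leading slash can be seen directly.

```python
import re

# Pattern from the question, unchanged.
original = re.compile(r"\/([^\/]*)\/([^\/]*)(\/wibble\/wobble)")

# Sample path from the question.
path = "/12345678901234567890123456789012/1230345035/wibble/wobble/"

match = original.search(path)
# groups() returns the captured groups as a tuple; empty tuple if no match.
groups = match.groups() if match else ()

for number, value in enumerate(groups, start=1):
    print(f"group {number} = {value}")
```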

      

It's a little easier to capture the last three path elements together as group 2, and then capture the first of those three elements as group 3, using a nested capture group. Note that this swaps the group numbering relative to what you asked for:



\/([^\/]*)\/(([^\/]*)\/wibble\/wobble)

\/               # opening slash
([^\/]*)         # anything that is not a slash, repeated 0+ times, as group 1
\/               # separating slash
(                # begin group 2
([^\/]*)         # anything that is not a slash, repeated 0+ times, as group 3
\/wibble\/wobble # literal text to match
)                # end group 2

      

This should give you the following matches:

group 1 = 12345678901234567890123456789012  
group 2 = 1230345035/wibble/wobble
group 3 = 1230345035
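
To try the nested-group pattern, here is a short sketch in Python (again an assumed language, standard `re` module only). The second pattern, `with_slash`, is a hypothetical variant that is not part of the answer above: it moves the separating slash inside the outer group, which may be closer to the output requested in the question if the leading slashes matter.

```python
import re

# Nested-group pattern from the answer above.
nested = re.compile(r"\/([^\/]*)\/(([^\/]*)\/wibble\/wobble)")

# Hypothetical variant (not in the original answer): the separating slash
# is moved inside the outer group, so groups 2 and 3 keep a leading slash.
with_slash = re.compile(r"\/([^\/]*)((\/[^\/]*)\/wibble\/wobble)")

# Sample path from the question.
path = "/12345678901234567890123456789012/1230345035/wibble/wobble/"

nested_groups = nested.search(path).groups()
slash_groups = with_slash.search(path).groups()

print(nested_groups)
print(slash_groups)
```

In both patterns the inner group is numbered after the outer one because groups are numbered by the position of their opening parenthesis.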

      
