R regex extract number after / before string

I'm trying to write a regex that checks whether a string contains "pack" / "pck" / "packs" / "Set" (case-insensitive) and, if the word is present, extracts the number that precedes or follows it. Examples:

"Fregon EcoClean Multipurpose Scrubber For Pots, Pans, Kitchen, and Bathroom, Green, 3-Pack" -> 3
"... Bathroom, Green, 3 Pack" -> 3
"Franklin Sports NHL Mini Hockey Goal Set of 2" -> 2
"Make: Electronics Components Pack 2" -> 2
"Make: Electronics Components Pack of 2" -> 2

      

I tried using the following expression:

sub(".*pack(\\d+).*", "\\1", "inflow100 pack6 distance12")

      

However, it does not fit all the cases mentioned above. Any ideas?

+3




2 answers


The following regex matches all of the examples above (it does not handle the "pck" spelling or capitalizations such as "PACK"):

\b(?:(\d+)[-\s][Pp]ack|(?:[Pp]ack|[Ss]et)\s?(?:of\s)?(\d+))

      

See https://regex101.com/r/jZ4vE2/1



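If you try it, you will see that the number ends up in capture group \1 or \2, depending on which alternative matched. Using "\\1 \\2" as the replacement therefore leaves a leading or trailing space, so the only thing left to do is strip that whitespace.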

> gsub(".*\\b(?:(\\d+)[-\\s][Pp]ack|(?:[Pp]ack|[Ss]et)\\s?(?:of\\s)?(\\d+)).*", "\\1 \\2", "inflow100 pack6 distance12", perl=TRUE)
[1] " 6"
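
The whitespace cleanup can be done with trimws(). Below is a minimal sketch rather than a definitive implementation: the helper name extract_pack_number is hypothetical, and the (?i) flag, the optional "a" (so "pck" is accepted) and the optional plural "s" are additions that go beyond the pattern above. It also returns NA when no keyword/number pair is found, instead of handing back the unchanged string.

```r
# Hypothetical helper; the name and the NA behaviour are assumptions for illustration.
extract_pack_number <- function(x) {
  # (?i) makes the whole pattern case-insensitive; pa?ck also accepts "pck".
  pattern <- paste0(
    "(?i).*\\b(?:(\\d+)[-\\s]pa?ck",
    "|(?:pa?cks?|set)\\s?(?:of\\s)?(\\d+)).*"
  )
  # Only rewrite strings that actually match; the rest stay NA.
  hit <- grepl(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  # Only one of the two groups is filled, so trimws() removes the stray separator space.
  out[hit] <- trimws(sub(pattern, "\\1 \\2", x[hit], perl = TRUE))
  as.integer(out)
}

examples <- c(
  "Fregon EcoClean Multipurpose Scrubber For Pots, Pans, Kitchen, and Bathroom, Green, 3-Pack",
  "Franklin Sports NHL Mini Hockey Goal Set of 2",
  "Make: Electronics Components Pack 2",
  "Make: Electronics Components Pack of 2",
  "inflow100 pack6 distance12",
  "no keyword here 7"
)
extract_pack_number(examples)
# expected values: 3 2 2 2 6 NA
```

Because the leading .* is greedy, the last keyword/number pair in the string wins when several are present.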

      

+5




Just extract the last number in the string.

sub(".*\\b(\\d+).*", "\\1", str)

      



or

gsub("(\\d+)\\D*$|.", "\\1", str)
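
To see how the two "last number" variants behave, here is a small sketch. The input vector and the names last_a and last_b are assumptions for illustration, and perl = TRUE is added only to pin down the regex engine. Note that neither variant checks for the keyword, and the second one needs gsub() so that every non-matching character is removed, not just the first.

```r
str <- c(
  "Franklin Sports NHL Mini Hockey Goal Set of 2",
  "Bathroom, Green, 3-Pack",
  "inflow100 pack6 distance12"
)

# Variant 1: the greedy .* pushes the capture to the last number that starts
# at a word boundary. Numbers glued to letters (pack6) are never matched,
# so the third string comes back unchanged.
last_a <- sub(".*\\b(\\d+).*", "\\1", str, perl = TRUE)

# Variant 2: keep the final run of digits and delete every other character.
# For the third string this yields "12", not the "6" next to "pack".
last_b <- gsub("(\\d+)\\D*$|.", "\\1", str, perl = TRUE)

print(last_a)
print(last_b)
```

Both variants work for the product titles in the question, but they can pick up an unrelated number when the string contains one after the pack size.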

      

+1

