R regex: extract the number before or after a string
I'm trying to write a regex that checks whether a string contains "pack" / "pck" / "packs" / "Set" (case-insensitive) and, if the word is present, extracts the number that precedes or follows it. Examples:
"Fregon EcoClean Multipurpose Scrubber For Pots, Pans, Kitchen, and Bathroom, Green, 3-Pack" -> 3
Bathroom, Green, 3 Pack" -> 3
"Franklin Sports NHL Mini Hockey Goal Set of 2" -> 2
"Make: Electronics Components Pack 2" -> 2
"Make: Electronics Components Pack of 2" -> 2
I tried using the following expression:
sub(".*pack(\\d+).*", "\\1", "inflow100 pack6 distance12")
However, it does not fit all the cases mentioned above. Any ideas?
2 answers
The following regex matches all examples:
\b(?:(\d+)[-\s][Pp]ack|(?:[Pp]ack|[Ss]et)\s?(?:of\s)?(\d+))
See https://regex101.com/r/jZ4vE2/1
If you use it, you will notice that the number ends up in either \1 or \2, depending on which alternative matched. The only thing left to do is to strip the leading or trailing whitespace from the result.
> gsub(".*\\b(?:(\\d+)[-\\s][Pp]ack|(?:[Pp]ack|[Ss]et)\\s?(?:of\\s)?(\\d+)).*", "\\1 \\2", "inflow100 pack6 distance12", perl=TRUE)
[1] " 6"
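To apply this to a whole vector of titles, the same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name extract_pack_count is made up for illustration, (?i) is used in place of [Pp]/[Ss] to get case-insensitivity, and "pack" is loosened to "pa?cks?" so that "pck" and "packs" from the question also match (that last change is not part of the regex above). Because only one of the two groups is ever filled, "\\1\\2" without the space gives the bare number, so no trimming is needed.

```r
# Hypothetical helper: returns the pack/set count, or NA when no keyword is found.
# Pattern adapted from the answer above; "pa?cks?" and "(?i)" are additions.
pack_pattern <- "(?i).*\\b(?:(\\d+)[-\\s]pa?cks?|(?:pa?cks?|set)\\s?(?:of\\s)?(\\d+)).*"

extract_pack_count <- function(x) {
  # sub() returns the input unchanged when nothing matches,
  # so test for a match first and leave the rest as NA.
  hit <- grepl(pack_pattern, x, perl = TRUE)
  out <- rep(NA_integer_, length(x))
  # Exactly one of \\1 / \\2 is non-empty, so their concatenation is the number.
  out[hit] <- as.integer(sub(pack_pattern, "\\1\\2", x[hit], perl = TRUE))
  out
}

titles <- c(
  "Bathroom, Green, 3-Pack",
  "Franklin Sports NHL Mini Hockey Goal Set of 2",
  "Make: Electronics Components Pack of 2",
  "Plain green sponge"
)
extract_pack_count(titles)
```

For the four titles above this should give 3, 2, 2 and NA. Returning NA for non-matching strings is a design choice; the plain gsub call would hand back the original string instead.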