Scala pattern matching on a String with regex
I want to extract the portion of a string that matches one of the regex patterns I have defined:
//should match R0010, R0100,R0300 etc
val rPat="[R]{1}[0-9]{4}".r
// should match P.25.01.21 , P.27.03.25 etc
val pPat="[P]{1}[.]{1}[0-9]{2}[.]{1}[0-9]{2}[.]{1}[0-9]{2}".r
When I define my method to extract the items as:
val matcher = (s: String) => s match {
  case pPat(el) => println(el) // print the P.25.01.25
  case rPat(el) => println(el) // print R0100
  case _ => println("no match")
}
And test it, for example:
val pSt=" P.25.01.21 - Hello whats going on?"
matcher(pSt)//prints "no match" but should print P.25.01.21
val rSt= "R0010 test test 3,870"
matcher(rSt) //prints also "no match" but should print R0010
//check if regex is wrong
val pHead="P.25.01.21"
pHead.matches(pPat.toString)//returns true
val rHead="R0010"
rHead.matches(rPat.toString)//return true
I'm not sure whether the regex is wrong, since the matches method works on the isolated elements. So what's wrong with this approach?
3 answers
When you use pattern matching on a string with a regex, keep in mind that:
- A pattern created with .r must match the whole string, otherwise no match is returned (the solution is to make the pattern unanchored with .r.unanchored).
- Once the pattern is unanchored, watch out for unwanted matches: R[0-9]{4} will match R1234 in CSR123456 (the right fix depends on your actual requirements; usually word boundaries, \b, are sufficient, or negative lookarounds can be used).
- Inside a match block, the regex extractor requires a capturing group to be present if you want to bind a value (you bound it as el in pPat(el) and rPat(el)).
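The points above can be illustrated with a minimal sketch. The object name AnchoringDemo, the helper methods and the withGroup variant of the question's pattern are hypothetical and only serve the illustration; the functions return an Option instead of printing so the results are easy to compare.

```scala
object AnchoringDemo {
  // Same pattern as in the question: no capturing group.
  val noGroup = "[R]{1}[0-9]{4}".r
  // Hypothetical variant of the same pattern with one capturing group.
  val withGroup = "([R]{1}[0-9]{4})".r
  // Unanchored version: the pattern may match anywhere in the input.
  val loose = withGroup.unanchored

  // Anchored: the whole input must match the pattern.
  def anchored(s: String): Option[String] = s match {
    case withGroup(el) => Some(el)
    case _             => None
  }

  // Unanchored: the first occurrence inside the input is enough.
  def unanchored(s: String): Option[String] = s match {
    case loose(el) => Some(el)
    case _         => None
  }

  // No capturing group, so there is nothing to bind to el.
  def withoutGroup(s: String): Option[String] = s match {
    case noGroup(el) => Some(el)
    case _           => None
  }

  def main(args: Array[String]): Unit = {
    println(anchored("R0010"))                    // Some(R0010)
    println(anchored("R0010 test test 3,870"))    // None
    println(unanchored("R0010 test test 3,870"))  // Some(R0010)
    println(unanchored("CSR123456"))              // Some(R1234), an unwanted match
    println(withoutGroup("R0010"))                // None
  }
}
```

The last call shows why the question's matcher fails even on an exact input: without a capturing group, a pattern with one variable such as noGroup(el) never matches.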
So, I propose the following solution:
val rPat="""\b(R\d{4})\b""".r.unanchored
val pPat="""\b(P\.\d{2}\.\d{2}\.\d{2})\b""".r.unanchored
val matcher = (s: String) => s match {
  case pPat(el) => println(el) // print the P.25.01.25
  case rPat(el) => println(el) // print R0100
  case _ => println("no match")
}
Then
val pSt=" P.25.01.21 - Hello whats going on?"
matcher(pSt) // => P.25.01.21
val pSt2_bad=" CP.2334565.01124.212 - Hello whats going on?"
matcher(pSt2_bad) // => no match
val rSt= "R0010 test test 3,870"
matcher(rSt) // => R0010
val rSt2_bad = "CSR00105 test test 3,870"
matcher(rSt2_bad) // => no match
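The negative lookarounds mentioned earlier as an alternative to \b could look roughly like the sketch below. It assumes a hypothetical requirement that is not stated in the question: the code must not be preceded by a letter or digit and must not be followed by a digit, but an underscore right after it is acceptable (\b would reject that case, because an underscore counts as a word character). The names LookaroundDemo, rLoose and extract are made up for the example.

```scala
object LookaroundDemo {
  // (?<![A-Za-z0-9]) : not preceded by a letter or digit
  // (R\d{4})         : capturing group, R followed by exactly 4 digits
  // (?!\d)           : not followed by another digit
  val rLoose = """(?<![A-Za-z0-9])(R\d{4})(?!\d)""".r.unanchored

  def extract(s: String): Option[String] = s match {
    case rLoose(el) => Some(el)
    case _          => None
  }

  def main(args: Array[String]): Unit = {
    println(extract("R0010 test test 3,870"))    // Some(R0010)
    println(extract("R0010_v2"))                 // Some(R0010)
    println(extract("CSR00105 test test 3,870")) // None
  }
}
```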
Some notes on the patterns:
- \b - leading word boundary
- (R\d{4}) - capturing group matching R followed by exactly 4 digits
- \b - trailing word boundary
Because triple quotes are used to define the string literal, there is no need to escape the backslashes.
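As a small illustration of the triple-quote remark, the sketch below compares the same pattern written both ways. The object name QuoteDemo and the value names are hypothetical.

```scala
object QuoteDemo {
  // Regular string literal: every backslash has to be doubled.
  val escaped = "\\b(R\\d{4})\\b"
  // Triple-quoted literal: backslashes are taken as written.
  val raw = """\b(R\d{4})\b"""

  def main(args: Array[String]): Unit = {
    println(escaped == raw) // true
  }
}
```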
If the code is written like this, it produces the desired output. See the API documentation: http://www.scala-lang.org/api/2.12.1/scala/util/matching/Regex.html
object Main {
  // should match R0010, R0100, R0300 etc.
  val rPat = "[R]{1}[0-9]{4}".r
  // should match P.25.01.21, P.27.03.25 etc.
  val pPat = "[P]{1}[.]{1}[0-9]{2}[.]{1}[0-9]{2}[.]{1}[0-9]{2}".r

  def main(args: Array[String]): Unit = {
    val pSt = " P.25.01.21 - Hello whats going on?"
    // findAllIn searches inside the string, so no anchoring or capturing group is needed
    val pPatMatches = pPat.findAllIn(pSt)
    pPatMatches.foreach(println)
    val rSt = "R0010 test test 3,870"
    val rPatMatches = rPat.findAllIn(rSt)
    rPatMatches.foreach(println)
  }
}
Please let me know if this works for you.
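If only the first occurrence is needed and the result should be returned rather than printed, Regex.findFirstIn is one option, since it yields an Option[String]. The sketch below reuses the question's patterns unchanged; the object name FirstMatch and the extract helper are hypothetical, and trying pPat before rPat simply mirrors the case order in the question.

```scala
object FirstMatch {
  val rPat = "[R]{1}[0-9]{4}".r
  val pPat = "[P]{1}[.]{1}[0-9]{2}[.]{1}[0-9]{2}[.]{1}[0-9]{2}".r

  // Return the first pPat match if there is one, otherwise the first rPat match.
  def extract(s: String): Option[String] =
    pPat.findFirstIn(s).orElse(rPat.findFirstIn(s))

  def main(args: Array[String]): Unit = {
    println(extract(" P.25.01.21 - Hello whats going on?")) // Some(P.25.01.21)
    println(extract("R0010 test test 3,870"))               // Some(R0010)
    println(extract("nothing here"))                        // None
  }
}
```

Note that these patterns have no word boundaries, so this version would also pick up R1234 inside CSR123456.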