Scala Pattern Matching on a String with Regex

I want to extract the portion of a string that matches one of the regex patterns I have defined:

  //should match R0010, R0100,R0300 etc 
  val rPat="[R]{1}[0-9]{4}".r
  // should match P.25.01.21 , P.27.03.25 etc
  val pPat="[P]{1}[.]{1}[0-9]{2}[.]{1}[0-9]{2}[.]{1}[0-9]{2}".r 

      

When I now define my method to retrieve items as:

  val matcher= (s:String) => s match {case pPat(el)=> println(el) // print the P.25.01.25
                                        case rPat(el)=>println(el) // print R0100 
                                        case _ => println("no match")}

      

And check it for example:

  val pSt=" P.25.01.21 - Hello whats going on?"
  matcher(pSt)//prints "no match" but should print P.25.01.21
  val rSt= "R0010  test test 3,870" 
  matcher(rSt) //prints also "no match" but should print R0010
  //check if regex is wrong
  val pHead="P.25.01.21"
  pHead.matches(pPat.toString)//returns true
  val rHead="R0010"
  rHead.matches(rPat.toString)//return true

      

I'm not sure whether the regex is wrong, since the matches method returns true on the bare codes. So what's wrong with this approach?



3 answers


When you use regex pattern matching on a string, keep in mind that:

  • The .r pattern you pass must match the whole string, otherwise no match is returned (the solution is to make the pattern unanchored with .r.unanchored).
  • Once it is unanchored, watch out for unwanted matches: R[0-9]{4} will match R1234 in CSR123456 (the right fix depends on your actual requirements; usually word boundaries, \b, are sufficient, or negative lookarounds can be used).
  • Inside the match block, the regex extractor requires a capture group to be present if you want to bind a value (you bind it as el in pPat(el) and rPat(el)).
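The three points above can be checked with a minimal sketch. The object name AnchoringDemo and the helper extract are made up for this illustration and are not part of the question's code:

```scala
import scala.util.matching.Regex

object AnchoringDemo {
  // No capture group: the extractor has nothing to bind to a variable.
  val noGroup: Regex = """R\d{4}""".r
  // Anchored (the default): the whole input must match the pattern.
  val anchored: Regex = """(R\d{4})""".r
  // Unanchored: the pattern may match anywhere in the input.
  val loose: Regex = anchored.unanchored

  // Hypothetical helper: returns the captured code, if the regex matches.
  def extract(re: Regex, s: String): Option[String] = s match {
    case re(code) => Some(code)
    case _        => None
  }
}
```

Under these assumptions, extract(noGroup, "R0010") and extract(anchored, "R0010  test") both yield None, extract(loose, "R0010  test") yields Some("R0010"), and extract(loose, "CSR123456") yields the unwanted Some("R1234").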

So, I propose the following solution:

val rPat="""\b(R\d{4})\b""".r.unanchored
val pPat="""\b(P\.\d{2}\.\d{2}\.\d{2})\b""".r.unanchored

val matcher= (s:String) => s match {case pPat(el)=> println(el) // print the P.25.01.25
    case rPat(el)=>println(el) // print R0100 
    case _ => println("no match")
}

      

Then



val pSt=" P.25.01.21 - Hello whats going on?"
matcher(pSt) // => P.25.01.21
val pSt2_bad=" CP.2334565.01124.212 - Hello whats going on?"
matcher(pSt2_bad) // => no match
val rSt= "R0010  test test 3,870" 
matcher(rSt) // => R0010
val rSt2_bad = "CSR00105  test test 3,870" 
matcher(rSt2_bad) // => no match

      

Some notes on the patterns:

  • \b - leading word boundary
  • (R\d{4}) - capturing group matching R followed by exactly 4 digits
  • \b - trailing word boundary

Because triple quotes are used to define the string literal, there is no need to escape the backslashes.
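As a small illustration of the quoting remark, the sketch below writes the same pattern both ways and checks the word boundaries against a good and a bad input. The object and value names are invented for this example:

```scala
object QuotingDemo {
  // Ordinary string literal: every regex backslash has to be doubled.
  val escaped: String = "\\b(R\\d{4})\\b"
  // Triple-quoted literal: backslashes are taken literally.
  val raw: String = """\b(R\d{4})\b"""

  // Both literals denote the same regex source text.
  val sameSource: Boolean = escaped == raw

  // findFirstIn searches anywhere in the input, so anchoring is irrelevant here.
  val good: Option[String] = raw.r.findFirstIn("R0010  test test 3,870")
  val bad: Option[String]  = raw.r.findFirstIn("CSR00105  test test 3,870")
}
```

Here good should be Some("R0010"), while bad should be None, because there is no word boundary between S and R, nor between the fourth and fifth digit.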



Add capture groups to your patterns and let .* absorb the surrounding text:



val rPat=".*([R]{1}[0-9]{4}).*".r

val pPat=".*([P]{1}[.]{1}[0-9]{2}[.]{1}[0-9]{2}[.]{1}[0-9]{2}).*".r 

...

scala> matcher(pSt)
P.25.01.21

scala> matcher(rSt)
R0010
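One caveat with this style, sketched below under the assumption that an input may contain more than one code: the leading .* is greedy, so the group captures the last occurrence, whereas an unanchored pattern finds the first one. The names GreedyDemo, wrapped, loose and extract are illustrative only:

```scala
import scala.util.matching.Regex

object GreedyDemo {
  // Pattern wrapped in .* on both sides; still anchored to the whole string.
  val wrapped: Regex = ".*([R]{1}[0-9]{4}).*".r
  // Same group without the wrapping, made unanchored instead.
  val loose: Regex = """(R\d{4})""".r.unanchored

  // Hypothetical helper: returns the captured code, if the regex matches.
  def extract(re: Regex, s: String): Option[String] = s match {
    case re(code) => Some(code)
    case _        => None
  }
}
```

For the input "R0010 then R0200", extract(wrapped, ...) gives Some("R0200") and extract(loose, ...) gives Some("R0010"). Which of the two is wanted depends on the requirements.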

      



Written like this, the code produces the desired output. See the API documentation: http://www.scala-lang.org/api/2.12.1/scala/util/matching/Regex.html

  //should match R0010, R0100,R0300 etc
  val rPat="[R]{1}[0-9]{4}".r
  // should match P.25.01.21 , P.27.03.25 etc
  val pPat="[P]{1}[.]{1}[0-9]{2}[.]{1}[0-9]{2}[.]{1}[0-9]{2}".r


  def main(args: Array[String]): Unit = {
    val pSt = " P.25.01.21 - Hello whats going on?"
    // findAllIn searches anywhere in the string, so no anchoring or capture group is needed
    val pPatMatches = pPat.findAllIn(pSt)
    pPatMatches.foreach(println)
    val rSt = "R0010  test test 3,870"
    val rPatMatches = rPat.findAllIn(rSt)
    rPatMatches.foreach(println)
  }
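
If a value is wanted rather than printed output, the same original patterns can be used with findFirstIn, which returns an Option. This is only a sketch; FindDemo, firstCode and allR are names made up here:

```scala
import scala.util.matching.Regex

object FindDemo {
  // The question's original patterns, unchanged (no groups, no unanchored).
  val rPat: Regex = "[R]{1}[0-9]{4}".r
  val pPat: Regex = "[P]{1}[.]{1}[0-9]{2}[.]{1}[0-9]{2}[.]{1}[0-9]{2}".r

  // First P-code if present, otherwise the first R-code, otherwise None.
  def firstCode(s: String): Option[String] =
    pPat.findFirstIn(s).orElse(rPat.findFirstIn(s))

  // Every R-code in the input, in order of appearance.
  def allR(s: String): List[String] = rPat.findAllIn(s).toList
}
```

For example, firstCode(" P.25.01.21 - Hello whats going on?") should give Some("P.25.01.21"), and firstCode("no code here") should give None. Note that without word boundaries these searches will also find R1234 inside CSR123456.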

      

Please let me know if this works for you.
