How can I find the maximum overlap between two strings in Scala?
This first solution is easily the most concise and also more efficient than the recursive version as it uses lazy-evaluated iteration
s.tails.find(t.startsWith).get
It has now been discussed whether to tails
copy the entire line over and over. In this case, you can use toList
on s
and then the mkString
result.
s.toList.tails.find(t.startsWith(_: List[Char])).get.mkString
For some reason, type annotation is required to compile it. I'm not really trying to see which one is faster.
UPDATE - OPTIMIZATION
As som-snytt pointed out, t
it cannot start on any line that is longer than this, and so we could do the following optimization:
s.drop(s.length - t.length).tails.find(t.startsWith).get
source to share
If we need to find the common overlapping part, then we can recursively take the tail of the first line (which should intersect with the beginning of the second line) until the remainder is the one that the second line starts with.This also applies to the case where lines don't have overlap, because then an empty string will be returned.
scala> def findOverlap(s:String, t:String):String = {
if (s == t.take(s.size)) s else findOverlap (s.tail, t)
}
findOverlap: (s: String, t: String)String
scala> findOverlap("abcxyz", "xyz123")
res3: String = xyz
scala> findOverlap("one","two")
res1: String = ""
UPDATE: It has been pointed out that it tail
may not be implemented in the most efficient way (i.e. it creates a new row when it is called). If this becomes an issue, using substring(1)
instead tail
(or converting both strings to lists where the tail / head should be O (1)) may give better performance. And thus we can replace t.take(s.size)
with t.substring(0,s.size)
.
source to share