Split Unicode String with Ruby
How can I split a string at the boundary of a Unicode range in Ruby? I want to insert a comma between the characters below \u1000 and the characters at \u1000 and above. For example, I want to turn this line:
I love แป แแแน แแฌ
into this:
I love, แป แแแน แแฌ
The Unicode characters may not display correctly in my example. They are in the range \u1000 and above.
Thanks.
1 answer
It depends on which Ruby version you are using; here is a solution for 1.9. I suspect it gets ugly in 1.8.
It falls short on elegance, but it seems to work.
"I love แปแแแนแแฌ".gsub(/([\u0000-\u0FFF])([\u1000-\uFFFF])/, '\1,\2')
If this approach suits you, you will also need to handle the opposite case (going from high to low).
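As a rough sketch of handling both directions, the same substitution can be run twice, once for each order of the two ranges. The helper name `comma_at_range_boundaries` is made up for illustration, and the upper bound `\uFFFF` is an assumption that only covers the Basic Multilingual Plane.

```ruby
# Hypothetical helper (the name is not from the original answer).
# Inserts a comma wherever a character below U+1000 sits next to a
# character in U+1000..U+FFFF, in either order.
def comma_at_range_boundaries(text)
  text
    .gsub(/([\u0000-\u0FFF])([\u1000-\uFFFF])/, '\1,\2') # low followed by high
    .gsub(/([\u1000-\uFFFF])([\u0000-\u0FFF])/, '\1,\2') # high followed by low
end

# "\u1000\u1001" are two Myanmar letters, written as escapes so the
# example does not depend on the file's encoding.
puts comma_at_range_boundaries("I love \u1000\u1001")
puts comma_at_range_boundaries("\u1000\u1001 is nice")
```

Note that the comma lands exactly at the boundary, so in the first call it appears after the space rather than right after "love". Two passes are used instead of one combined pattern because each match consumes both characters, which could otherwise skip a boundary next to a single isolated character.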