Split Unicode String with Ruby

How can I split a string at a Unicode range boundary in Ruby? I want to separate the characters below \u1000 from those at \u1000 and above with a comma. For example, I want to turn this line ...

I love ျ မန္ မာ

into this:

I love, ျ မန္ မာ

You may not be able to see the Unicode characters in my example. They are in the range \u1000 and above.

Thanks.



1 answer


It depends on which Ruby version you are using; here is a solution for 1.9. I guess it might get ugly in 1.8.

It falls short on elegance, but it seems to work. Note that \u escapes are hexadecimal, so the ranges end at \u0FFF and \uFFFF rather than \u0999 and \u9999.

"I love ျမန္မာ".gsub(/([\u0000-\u0FFF])([\u1000-\uFFFF])/, '\1,\2')

      

If this approach suits you, you will need to add a separate case for the opposite transition (going from the high range to the low one).
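As a rough sketch of handling both directions in one pass, the lookaround-based version below inserts a comma wherever the text crosses the boundary, low to high or high to low. The constant `BOUNDARY` and the method name `comma_at_range_boundary` are hypothetical. It also assumes the intended boundary is U+1000, so the low range is written in hexadecimal as \u0000-\u0FFF. It needs Ruby 1.9 or later, since it relies on lookbehind and \u escapes.

```ruby
# Hypothetical helper, not part of the original answer.
# Assumption: "low" means U+0000..U+0FFF and "high" means everything else.
# The pattern matches the empty position between a low and a high character,
# in either order, using lookbehind and lookahead.
BOUNDARY = /(?<=[\u0000-\u0FFF])(?=[^\u0000-\u0FFF])|(?<=[^\u0000-\u0FFF])(?=[\u0000-\u0FFF])/

def comma_at_range_boundary(text)
  # Each match is zero-width, so gsub only inserts commas and removes nothing.
  text.gsub(BOUNDARY, ',')
end

# "\u1019\u102C" is written with escapes so the example does not depend on
# the source file's encoding.
puts comma_at_range_boundary("I love \u1019\u102C now")
# prints "I love ,", then the two Myanmar characters, then ", now"
```

Because the space before the Myanmar text belongs to the low range, the comma lands after that space. Strip or rearrange the whitespace afterwards if you want the comma directly after the preceding word.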
