Only 2 emoji return incorrect length when compared to a character set containing them

import Foundation

let myString = "☺️"

let emoji = "😀😁😂😃😄😅😆😇😈👿😉😊☺️😋😌😍😎😏😐😑😒😓😔😕😖😗😘😙😚😛😜😝😞😟😠😡😢😣😤😥😦😧😨😩😪😫😬😭😮😯😰😱😲😳😴😵😶😷🙂🙃🙄🤔🙁☹️🤒🤕🤑🤓🤗🤐🤠🤤🤥🤧🤢🤡🤣"

let characterSet = CharacterSet(charactersIn: emoji)

// Find the first character of myString that is in the set
let range = (myString as NSString).rangeOfCharacter(from: characterSet)
let substring = (myString as NSString).substring(with: range)
range.location
range.length                   // length of the matched range
(myString as NSString).length  // length of the whole string, in UTF-16 units

// Does the matched substring cover the whole emoji?
substring == myString

      

This code can be run in a playground. Try changing myString to any other face emoji.

I am using NSString and NSRange here because their values are easier to demonstrate, but the behavior is the same with Swift String and Range.

When I set myString to most of the face emoji, the range is returned with a length of 2, and the substring can be used as expected elsewhere. For only 2 face emoji, the "smiling face" emoji and the "frowning face" emoji, the range is returned with a length of 1. In all cases, the length of the string is returned as 2. The substring taken with the length-1 range is incomplete, as the final comparison shows: comparing it back to myString yields false. The range length for these 2 emoji should be 2.

Interestingly, looking at the Unicode spec, these 2 emoji have significantly different Unicode values from their neighbors.
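To make that difference concrete, here is a minimal sketch that prints the Unicode scalars and UTF-16 length of the smiling face next to one of its neighbors. The `describe` helper is a hypothetical name used only for illustration, and the strings are written with escapes so the exact code points are unambiguous.

```swift
// Escapes make the exact code points explicit.
let smiling = "\u{263A}\u{FE0F}"   // smiling face followed by variation selector-16
let grinning = "\u{1F600}"         // grinning face, a single scalar outside the BMP

// Hypothetical helper: lists the scalars and the UTF-16 length of a string.
func describe(_ s: String) -> String {
    let scalars = s.unicodeScalars.map {
        "U+" + String($0.value, radix: 16, uppercase: true)
    }
    return "scalars: \(scalars.joined(separator: " ")), UTF-16 units: \(s.utf16.count)"
}

print(describe(smiling))   // scalars: U+263A U+FE0F, UTF-16 units: 2
print(describe(grinning))  // scalars: U+1F600, UTF-16 units: 2
```

Both strings are 2 UTF-16 units long, but for different reasons: the grinning face is one scalar encoded as a surrogate pair, while the smiling face is two separate scalars of one unit each. Since CharacterSet holds Unicode scalars, this may be related to why a match on the smiling face covers only one unit.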

It seems like it might be an iOS bug. I can't think of anything I could be doing wrong here, as it works with all the other emoji.


1 answer


Hardly an answer, but too long to fit in a comment, so bear with me :)

I don't know if you've seen this, but I think your issue is addressed in the Platforms State of the Union from WWDC 2017 ( https://developer.apple.com/videos/play/wwdc2017/102/ ), in the section about what's new in Swift 4.

If you watch the video at about 23 minutes 12 seconds, you can see Ted Kremenek talking about how they implemented the separation of Unicode characters so that it works as expected in Swift 4, using Unicode 9 grapheme breaking.
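As a rough illustration of what grapheme breaking gives you, here is a sketch that matches on Swift `Character` values (grapheme clusters) instead of a CharacterSet. This is an alternative approach rather than anything shown in the session, the shortened emoji list is an assumption made for brevity, and it needs a reasonably recent Swift toolchain.

```swift
// A shortened emoji list, written with escapes so the code points are explicit:
// grinning face, smiling face + VS16, frowning face + VS16, thinking face.
let emoji = "\u{1F600}\u{263A}\u{FE0F}\u{2639}\u{FE0F}\u{1F914}"
let myString = "\u{263A}\u{FE0F}"

// Each Character is a whole grapheme cluster, so the variation selector
// stays attached to the face it modifies.
let emojiCharacters = Set(emoji)

// Find the first Character of myString that is in the set.
let match = myString.first(where: { emojiCharacters.contains($0) }).map { String($0) }

print(emojiCharacters.count)          // 4
print(match == myString)              // true
print(match?.utf16.count ?? 0)        // 2
```

Because the match is a whole `Character`, the result covers both UTF-16 units of the smiling face and compares equal to the original string.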



Also see this question and answer.

Yeah... don't ask me for the details of what this all means, but it seems like they are working on it :)
