Only 2 emoji return incorrect length when compared to a character set containing them

Question

import Foundation

let myString = "☺️"

let emoji = "😀😁😂😃😄😅😆😇😈👿😉😊☺️😋😌😍😎😏😐😑😒😓😔😕😖😗😘😙😚😛😜😝😞😟😠😡😢😣😤😥😦😧😨😩😪😫😬😭😮😯😰😱😲😳😴😵😶😷🙂🙃🙄🤔🙁☹️🤒🤕🤑🤓🤗🤐🤠🤤🤥🤧🤢🤡🤣"

let characterSet = CharacterSet(charactersIn: emoji)

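// Range of the first character in myString that belongs to the set
let range = (myString as NSString).rangeOfCharacter(from: characterSet)
// Keep the substring so it can be compared below
let substring = (myString as NSString).substring(with: range)
range.location
range.length
(myString as NSString).length

// false for the two problem emoji, true for the others
substring == myString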

This code can be run in a playground. Try changing myString to any emoticon face.

I am using NSString and NSRange here as their values are easier to demonstrate, but it has the same behavior with Swift String or Range.

When I set myString to most of the face emoji, the range is returned with a length of 2, and the substring can be used appropriately elsewhere. For only 2 face emoji, the smiling face (☺️) and the frowning face (☹️), the range is returned with a length of 1. In all cases, the length of the string is returned as 2. The substring taken with the length-1 range is incomplete, as the final comparison shows: comparing it to myString yields false. The range length for these 2 emoji should be 2.

Interestingly, looking at the Unicode spec, these 2 emoji have significantly different Unicode values from their neighbors in the list.
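To make that difference concrete, here is a minimal sketch that prints the Unicode scalars and UTF-16 length of one of the two problem emoji next to an ordinary face. It assumes the smiling face is stored as U+263A followed by the variation selector U+FE0F, which is consistent with the reported string length of 2; `describe` is a hypothetical helper, not part of any API.

```swift
import Foundation

// Assumption: the smiling face is U+263A plus variation selector-16 (U+FE0F).
let smiling = "\u{263A}\u{FE0F}"
// Grinning face: a single scalar outside the Basic Multilingual Plane.
let grinning = "\u{1F600}"

// Hypothetical helper: lists the scalars and the UTF-16 length of a string.
func describe(_ s: String) -> String {
    let scalars = s.unicodeScalars
        .map { "U+" + String($0.value, radix: 16, uppercase: true) }
        .joined(separator: " ")
    return "\(scalars), UTF-16 units: \(s.utf16.count)"
}

print(describe(smiling))   // U+263A U+FE0F, UTF-16 units: 2
print(describe(grinning))  // U+1F600, UTF-16 units: 2
```

Both strings have an NSString length of 2, but for different reasons: the grinning face is one scalar encoded as a surrogate pair, while the smiling face is two separate scalars. That may explain why a scalar-based character set match covers only the first half of the smiling face.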

It seems like it might be an iOS bug. I can't think of anything I could be doing wrong here, as it works with all other emoji.

string ios unicode swift emoji

Andrew 06 June 17 at 9:59 am

1 answer

pbodsk · Accepted Answer · 2017-06-06T13:11:53+0000

Hardly an answer, but too long to fit in a comment, so bear with me :)

I don't know if you've seen this, but I think your issue is addressed in the Platforms State of the Union from WWDC 2017 ( https://developer.apple.com/videos/play/wwdc2017/102/ ), in the section about what's new in Swift 4.

If you look at the video at about 23 minutes 12 seconds, you can see Ted Kremenek talking about how they fixed the separation of Unicode characters so that it works as expected in Swift 4, using Unicode 9 grapheme breaking.
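As a rough illustration of what working at the grapheme level looks like, the sketch below matches whole `Character` values against the set instead of individual UTF-16 units. It assumes Swift 4.2 or later (for `allSatisfy`) and that the smiling face is U+263A followed by U+FE0F; `firstCharacter(in:from:)` is a hypothetical helper, not something shown in the talk.

```swift
import Foundation

// Assumption: the smiling face is U+263A plus variation selector-16 (U+FE0F).
let smilingFace = "\u{263A}\u{FE0F}"
// A shortened stand-in for the question's emoji list.
let faces = "\u{1F600}\u{263A}\u{FE0F}\u{2639}\u{FE0F}"
let faceSet = CharacterSet(charactersIn: faces)

// Hypothetical helper: returns the first grapheme cluster whose scalars
// all belong to the set, so a base scalar and its selector stay together.
func firstCharacter(in text: String, from set: CharacterSet) -> Character? {
    return text.first { character in
        character.unicodeScalars.allSatisfy { set.contains($0) }
    }
}

// The smiling face is a single Character even though it has two scalars.
print(smilingFace.count)

if let match = firstCharacter(in: "abc" + smilingFace, from: faceSet) {
    // The match keeps the variation selector, so it equals the original.
    print(String(match) == smilingFace)
}
```

Because the match is a whole `Character`, the result can be compared with the original string without losing the trailing selector.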

Also see this question and answer.

Yes ... Don't ask me in detail what this all means, but it seems like they are working on it :)
