Java string representation of U+1F000

I have a bunch of Unicode characters from U+1F000 and up and am wondering how to represent them in Java. Java Unicode escapes have the form "\uXXXX", and the Java Language Specification says that "Representing supplementary characters requires two consecutive Unicode escapes." How does this apply to U+1F000?

String mahjongTile = "\u0001\uf000";

      

This does not work (I only get two empty squares), though I suppose that could be a font problem.



2 answers


Jon's answer should work, but you can also use the appendCodePoint method in StringBuilder or StringBuffer.

StringBuilder sb = new StringBuilder();
sb.appendCodePoint(0x1f000); // appends the surrogate pair for U+1F000
String mahjongTile = sb.toString();

      



Both methods convert the code point to a surrogate pair for you.

It sounds like your remaining problem is getting the characters to display properly. If you are trying to display them on the console, forget about it; the console is too limited on most machines. I suggest you either write your output to a file and use a good text editor to read it, or display the output in a Swing component such as a JTextPane.
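As a rough sketch of the write-to-a-file suggestion, the following builds the string with appendCodePoint and writes it out as UTF-8. The class name TileFileDemo, the helper names, and the temp-file location are made up for illustration; whether the tile then renders still depends on the editor and font you open the file with.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names are not from the original answers.
public class TileFileDemo {

    // Builds a one-character string for U+1F000 (stored as two UTF-16 chars).
    static String tile() {
        return new StringBuilder().appendCodePoint(0x1F000).toString();
    }

    // Writes the tile to the given path, encoded explicitly as UTF-8.
    static Path writeTile(Path target) throws IOException {
        Files.write(target, tile().getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temp file is good enough for a quick look.
        Path out = writeTile(Files.createTempFile("mahjong", ".txt"));
        System.out.println("Wrote " + out);
    }
}
```

Naming the encoding explicitly avoids relying on the platform default charset, which on some systems cannot represent characters outside the BMP.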



You will need to work out the appropriate surrogate pair if you want it in a string literal. (In C#, you can write "\U0001f000": \u is used for the BMP and \U for full Unicode.)

In Java, you can do:



String foo = new String(new int[]{0x1f000}, 0, 1);

      

if you want to be able to see "1f000" in the code. I confess I can't remember the high/low surrogate ranges off the top of my head :(
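For reference, high surrogates occupy U+D800 to U+DBFF and low surrogates U+DC00 to U+DFFF. If you would rather not do the arithmetic by hand, a small sketch like the one below lets Character.toChars work out the pair and prints it in escape form, ready to paste into a literal. The class name SurrogateDemo and the helper toEscapes are made up for illustration.

```java
// Hypothetical helper class; not part of the original answers.
public class SurrogateDemo {

    // Formats each UTF-16 code unit of the code point as a Java-style escape.
    static String toEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the escape sequence for U+1F000.
        System.out.println(toEscapes(0x1F000));

        // The literal form and the code-point constructor give equal strings.
        String literal = "\uD83C\uDC00";
        String built = new String(new int[]{0x1f000}, 0, 1);
        System.out.println(literal.equals(built));
    }
}
```

For U+1F000 the pair works out to D83C followed by DC00, so the literal in the question would become "\uD83C\uDC00".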
