Irreversible StreamTokenizer Characters

I am using Java's StreamTokenizer to tokenize source code read from a file. When escape sequences appear inside a string literal, the tokenizer resolves them into the actual characters, whereas I want the string to stay exactly as written.

For example:

Input: String str = "STRIN\tG";

StreamTokenizer Output: STRIN    G
Wanted Output: STRIN\tG

      

My code:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.StreamTokenizer;

BufferedReader reader = new BufferedReader(new FileReader("test.java"));
StreamTokenizer tokenizer = new StreamTokenizer(reader);

boolean eof = false;
do {
    int type = tokenizer.nextToken();
    switch (type) {
        case StreamTokenizer.TT_EOF:
            eof = true;
            break;
        case '"':
            // sval holds the string literal's contents with escape sequences already resolved
            System.out.println(tokenizer.sval);
            break;
    }
} while (!eof);

      

EDIT
I prefer to work with StreamTokenizer because its built-in comment handling removes comments for me.
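For example, this is the kind of setup I mean (a minimal sketch, using the same reader as above):

// Skip // and /* ... */ comments entirely instead of returning them as tokens.
tokenizer.slashSlashComments(true);
tokenizer.slashStarComments(true);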



2 answers


The StreamTokenizer constructor's JavaDoc states:

    All byte values '\u0000' through '\u0020' are always considered to be white space.

and \t is '\u0009', so it falls into that range. You can use the whitespaceChars() method (and its counterpart ordinaryChars()) to change this behavior.
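For example, making the tokenizer treat the tab character as an ordinary character rather than white space looks like this (a minimal sketch of those two methods; whether that alone is enough for your quoted strings is something you would have to verify):

// By default '\u0000'..'\u0020' (which includes '\t' = '\u0009') count as white space.
tokenizer.ordinaryChar('\t');          // stop treating the tab as white space
tokenizer.whitespaceChars(' ', ' ');   // and, if needed, re-declare a range as white space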



Note: when you println() a line that contains an actual tab character, most (if not all) terminals will move the cursor to the next tab stop instead of printing a literal \t.
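If the goal is just to see whether sval really still contains the tab, one option (purely a debugging aid, not part of StreamTokenizer) is to re-escape it before printing:

// Turn real control characters back into visible escape sequences for inspection.
String visible = tokenizer.sval.replace("\t", "\\t").replace("\n", "\\n");
System.out.println(visible);   // prints STRIN\tG instead of a real tab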

Regards,



Add a default case and handle the character the way you want:



    switch (type) {
        case StreamTokenizer.TT_EOL:
            System.out.println("End of Line encountered.");
            break;
        case StreamTokenizer.TT_WORD:
            System.out.print(tokenizer.sval);
            break;
        case StreamTokenizer.TT_EOF:
            eof = true;
            break;
        case '"':
            System.out.println(tokenizer.sval);
            break;
        default:
            // any other token: for ordinary characters, nextToken() returns the character itself
            System.out.print((char) type);
    }
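The default branch is where ordinary single-character tokens end up: for those, nextToken() returns the character code itself, so casting type to char echoes the symbol back unchanged (for instance the ';' at the end of the declaration in the question). Note that TT_EOL is only ever returned if you call tokenizer.eolIsSignificant(true) beforehand; by default line ends are treated as plain white space.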

      
