StreamTokenizer Escape Characters
I am using Java's StreamTokenizer to tokenize source code read as text.
When escape sequences appear inside a string literal, the tokenizer converts them to the characters they stand for, whereas I want the string exactly as it is written in the source.
For example:
Input: String str = "STRIN\tG";
StreamTokenizer Output: STRIN G
Wanted Output: STRIN\tG
My code:
    BufferedReader reader = new BufferedReader(new FileReader("test.java"));
    StreamTokenizer tokenizer = new StreamTokenizer(reader);
    boolean eof = false;
    do {
        int type = tokenizer.nextToken();
        switch (type) {
            case StreamTokenizer.TT_EOF:
                eof = true;
                break;
            case '"':
                System.out.println(tokenizer.sval);
                break;
        }
    } while (!eof);
EDIT
I would prefer to keep using StreamTokenizer because it handles comments well: it removes them for me.
The Javadoc for the StreamTokenizer constructor states:
All byte values '\u0000' through '\u0020' are considered to be white space.
and \t is '\u0009', which falls inside that range. You can use methods such as whitespaceChars() to change this behavior.
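As a rough sketch of changing that treatment: whitespaceChars() only adds characters to the whitespace set, so the example below uses ordinaryChar('\t') to take the tab out of it and have it returned as its own token. The class name TabAsToken and the method tokens are made up for illustration. Keep in mind this only affects tabs between tokens, not escape sequences inside quoted strings.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class TabAsToken {
    // Tokenizes the input and returns a readable description of each token.
    static List<String> tokens(String input) throws IOException {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        // By default '\t' (\u0009) is whitespace and is skipped silently.
        // Marking it as ordinary makes nextToken() return it as a token.
        tokenizer.ordinaryChar('\t');

        List<String> result = new ArrayList<>();
        int type;
        while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
            if (type == StreamTokenizer.TT_WORD) {
                result.add(tokenizer.sval);
            } else if (type == StreamTokenizer.TT_NUMBER) {
                result.add(String.valueOf(tokenizer.nval));
            } else if (type == '\t') {
                result.add("<TAB>");
            } else {
                // Any other ordinary character is reported as itself.
                result.add(String.valueOf((char) type));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokens("foo\tbar")); // prints [foo, <TAB>, bar]
    }
}
```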
Note: if you println() a string containing \t, most (if not all) terminals will move the cursor to the next tab stop instead of actually printing \t.
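For the quoted-string case in the question: according to the quoteChar() Javadoc, the usual escape sequences such as \n and \t are recognized and converted to single characters while the string is parsed, so sval really does contain a tab. One possible workaround, sketched here on the assumption that only the common escapes matter, is to re-escape sval before printing it. ReEscape, escape and firstQuoted are hypothetical names, not part of any library.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

class ReEscape {
    // Hypothetical helper: turns special characters back into Java escape sequences.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\t': out.append("\\t"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Returns the first double-quoted string found in the source, re-escaped,
    // or null if the source contains no quoted string.
    static String firstQuoted(String source) throws IOException {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(source));
        int type;
        while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
            if (type == '"') {
                return escape(tokenizer.sval);
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // The source text contains a backslash followed by 't', as in the question.
        System.out.println(firstQuoted("String str = \"STRIN\\tG\";")); // prints STRIN\tG
    }
}
```

This is lossy: a literal tab typed inside the string and a \t escape both come back as \t, and octal escapes are not restored to their original spelling.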
Greetings,
Add a default case and handle the character the way you want:
    switch (type) {
        case StreamTokenizer.TT_EOL:
            System.out.println("End of Line encountered.");
            break;
        case StreamTokenizer.TT_WORD:
            System.out.print(tokenizer.sval);
            break;
        case StreamTokenizer.TT_EOF:
            eof = true;
            break;
        case '"':
            System.out.println(tokenizer.sval);
            break;
        default:
            System.out.print((char) type);
            break;
    }