Java StreamTokenizer
I am using the method quoteChar('"') to process strings. Normal escape sequences such as "\n" and "\t" are recognized and converted to single characters when the string is parsed. Is there a way to get the string exactly as it appears in the input? That is, if I have the string:
Hello \tworld
I want to receive
Hello \tworld
and not:
Hello     world (with the \t turned into an actual tab character)
Thanks.
2 answers
Looking at the source of StreamTokenizer, it looks like the escape behavior for quoted strings is hardcoded. I can only think of a few ways to get around this:
- Re-escape the string once it has been returned. The problem here is that the result won't match exactly what was in the file: \t will be converted back, but \040 won't.
- Insert your own Reader between the source Reader and the StreamTokenizer. Store all characters read for the last token in a buffer. Trim the whitespace from the beginning of this buffer to get the raw token.
- If your tokenization rules are simple enough, implement your own tokenizer.
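The second option could look roughly like the sketch below. It is only an illustration: RawTokenDemo, RecordingReader, takeRecorded and rawQuoted are made-up names, and it assumes the input starts (after optional whitespace) with a double-quoted string. It relies on StreamTokenizer pulling characters one at a time through Reader.read() and not reading past the closing quote.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class RawTokenDemo {
    /** Hypothetical helper: remembers every character the tokenizer pulls through it. */
    static class RecordingReader extends FilterReader {
        private final StringBuilder recorded = new StringBuilder();

        RecordingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            if (c >= 0) {
                recorded.append((char) c);
            }
            return c;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                recorded.append(buf, off, n);
            }
            return n;
        }

        /** Returns what was read since the last call and clears the buffer. */
        String takeRecorded() {
            String s = recorded.toString();
            recorded.setLength(0);
            return s;
        }
    }

    /** Returns the raw text between the quotes of the first token, or null if it is not a quoted string. */
    static String rawQuoted(String input) {
        try {
            RecordingReader rr = new RecordingReader(new StringReader(input));
            StreamTokenizer st = new StreamTokenizer(rr);
            st.quoteChar('"');
            if (st.nextToken() != '"') {
                return null;
            }
            // Recorded text is: leading whitespace, opening quote, raw body, closing quote.
            String raw = rr.takeRecorded().trim();
            return raw.substring(1, raw.length() - 1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The input is the 15 characters "Hello \tworld" including the quotes.
        System.out.println(rawQuoted("\"Hello \\tworld\"")); // prints Hello \tworld
    }
}
```

One caveat: for word and number tokens StreamTokenizer reads one character of lookahead, so that character ends up in the buffer of the previous token. A full solution would have to carry it over to the next token.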
What worked for me:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class MyReader extends BufferedReader {
    // Choose any replacement you like, as long as it never occurs in your text.
    public static final char TAB_REPLACEMENT = '\u0000';

    public MyReader(Reader in) {
        super(in);
    }

    // StreamTokenizer reads one character at a time through read(),
    // so each literal tab is swapped for the placeholder before it is tokenized.
    @Override
    public int read() throws IOException {
        int charVal = super.read();
        if (charVal == '\t') {
            return TAB_REPLACEMENT;
        }
        return charVal;
    }
}
and then create the tokenizer:
StreamTokenizer myTokenizer = new StreamTokenizer(new MyReader(new FileReader(file)));
and restore the original sval with
myTokenizer.sval.replace(MyReader.TAB_REPLACEMENT, '\t')
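The same placeholder trick can be pointed at the backslash instead of the tab, so that StreamTokenizer never sees an escape sequence inside a quoted string. The sketch below is a variant of the approach above, not the code from this answer; BackslashMaskDemo, MaskingReader, PLACEHOLDER and rawQuoted are made-up names, and it assumes the placeholder character never occurs in the input.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class BackslashMaskDemo {
    // Hypothetical placeholder; assumed never to appear in the real input.
    static final char PLACEHOLDER = '\u0000';

    /** Hands the tokenizer a placeholder wherever the source has a backslash. */
    static class MaskingReader extends FilterReader {
        MaskingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            return c == '\\' ? PLACEHOLDER : c;
        }
    }

    /** Tokenizes the first quoted string and puts the backslashes back. */
    static String rawQuoted(String input) {
        try {
            StreamTokenizer st =
                    new StreamTokenizer(new MaskingReader(new StringReader(input)));
            st.quoteChar('"');
            st.nextToken();
            // sval is null unless the token was a word or a quoted string.
            return st.sval == null ? null : st.sval.replace(PLACEHOLDER, '\\');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The input is the 15 characters "Hello \tworld" including the quotes.
        System.out.println(rawQuoted("\"Hello \\tworld\"")); // prints Hello \tworld
    }
}
```

The trade-off is that an escaped quote (\") inside the string now ends the token, because the tokenizer no longer recognizes the backslash in front of it.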