Java StreamTokenizer

I am using quoteChar('"') to process strings. Standard escape sequences such as "\n" and "\t" are recognized and converted to single characters when the string is parsed. Is there a way to get the string exactly as it was written? That is, if I have the string:

Hello\tworld

I want to receive

Hello\tworld

and not:

Hello    world (with an actual tab character between the two words)

Thanks.
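For reference, a minimal reproduction of the behavior described above. The class and method names (EscapeDemo, firstQuoted) are made up for illustration; only StreamTokenizer and its quoteChar, nextToken and sval members come from the JDK.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

// Hypothetical demo class, not part of the original question.
class EscapeDemo {
    // Tokenizes the input and returns sval of the first token.
    static String firstQuoted(String input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.quoteChar('"');
        st.nextToken();
        return st.sval;
    }

    public static void main(String[] args) throws IOException {
        // The input text is a quoted string containing a backslash followed by 't'.
        String parsed = firstQuoted("\"Hello\\tworld\"");
        System.out.println(parsed.indexOf('\t') >= 0); // true: a real tab was produced
        System.out.println(parsed.indexOf('\\') >= 0); // false: the backslash is gone
    }
}
```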


2 answers


Looking at the source of StreamTokenizer, it looks like the escape behavior for strings is hardcoded. I can only think of a couple of ways to get around this:



  • Re-escape the string after the tokenizer returns it. The problem here is that the result won't exactly match what was in the file: \t will be converted back, but an octal escape such as \040 won't (it comes back as a plain space).
  • Insert your own Reader between the source Reader and the StreamTokenizer. Store all characters read for the last token in a buffer, then trim the whitespace from the beginning of this buffer to get the raw token.
  • If your tokenization rules are simple enough, implement your own tokenizer.
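The second option can be sketched roughly as follows. RecordingReader, drain and RawTokenDemo are hypothetical names, and the sketch assumes the token of interest is a quoted string preceded by whitespace or the start of input. As far as I can tell, StreamTokenizer reads one character of lookahead after word and number tokens, so for those the buffer would contain one extra character that a fuller implementation would have to account for.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

// Hypothetical helper: remembers every character handed to the tokenizer.
class RecordingReader extends FilterReader {
    private final StringBuilder recorded = new StringBuilder();

    RecordingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c >= 0) {
            recorded.append((char) c);
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            recorded.append(buf, off, n);
        }
        return n;
    }

    // Returns everything read since the previous call and clears the buffer.
    String drain() {
        String s = recorded.toString();
        recorded.setLength(0);
        return s;
    }
}

class RawTokenDemo {
    // Returns the first quoted token exactly as written, quotes included.
    static String rawQuoted(String input) throws IOException {
        RecordingReader reader = new RecordingReader(new StringReader(input));
        StreamTokenizer st = new StreamTokenizer(reader);
        st.quoteChar('"');
        st.nextToken();
        // Drop the whitespace that was skipped before the token started.
        return reader.drain().replaceFirst("^\\s+", "");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(rawQuoted("  \"Hello\\tworld\" rest"));
    }
}
```

Calling drain() after every nextToken() keeps the buffer aligned with the most recent token, which is the bookkeeping the bullet above describes.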


What worked for me:

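import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class MyReader extends BufferedReader {
    // Choose any replacement you like, as long as it won't occur in your text.
    // Public so that the calling code can map it back to a tab afterwards.
    public static final char TAB_REPLACEMENT = '\u0000';

    public MyReader(Reader in) {
        super(in);
    }

    // StreamTokenizer reads one character at a time, so overriding read() is enough here.
    @Override
    public int read() throws IOException {
        int charVal = super.read();
        if (charVal == '\t') {
            return TAB_REPLACEMENT;
        }
        return charVal;
    }
}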

      

and then create a tokenizer:



myTokenizer = new StreamTokenizer(new MyReader(new FileReader(file)));

      

and recover the original string from sval (TAB_REPLACEMENT has to be accessible from the calling code):

myTokenizer.sval.replace(MyReader.TAB_REPLACEMENT, '\t')
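The reader above masks real tab characters. The same placeholder idea could, as an untested-in-production sketch, be pointed at the backslash itself so that the tokenizer never sees an escape sequence at all. BackslashMaskingReader, PLACEHOLDER and MaskingDemo are hypothetical names, and the placeholder character is assumed never to occur in the input.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

// Hypothetical variant of the reader above: hides backslashes from the tokenizer.
class BackslashMaskingReader extends FilterReader {
    // Assumption: this character never occurs in the real input.
    static final char PLACEHOLDER = '\u0001';

    BackslashMaskingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == '\\' ? PLACEHOLDER : c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = off; i < off + n; i++) {
            if (buf[i] == '\\') {
                buf[i] = PLACEHOLDER;
            }
        }
        return n;
    }
}

class MaskingDemo {
    // Returns the first quoted token with its backslashes left intact.
    static String firstQuotedRaw(String input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(
                new BackslashMaskingReader(new StringReader(input)));
        st.quoteChar('"');
        st.nextToken();
        // Map the placeholder back to the backslash it replaced.
        return st.sval.replace(BackslashMaskingReader.PLACEHOLDER, '\\');
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstQuotedRaw("\"Hello\\tworld\"")); // prints Hello\tworld
    }
}
```

One caveat with this variant: because the tokenizer no longer recognizes escapes, an escaped quote (\") inside a string would end the string early.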

      
