A programming idiom for parsing a string in multiple passes

I am working on a braille translation library and I need to translate a line of text to braille. I plan to do this in a few passes, but I need a way to keep track of which parts of the string have already been translated and which have not, so that I don't translate them again.

I could always create a class that would keep track of the ranges of positions in the processed string and then design a find / replace algorithm to ignore them on subsequent passes, but I'm wondering if there isn't a more elegant way to do the same.

I would guess that multi-pass string parsers are not that unusual; I'm just not sure what the options are.

+2




2 answers


A more common approach would be to tokenize your input and then work with the tokens. For example, start by tokenizing the string into one token per character. Then, in the first pass, produce a direct braille mapping, token for token. In subsequent passes you can make further substitutions, for example replacing a sequence of input tokens with a single output token.



Since your tokens are objects or structures, not simple characters, you can attach additional information to them, such as the source token from which the current token was translated (more precisely, transliterated).
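As a rough illustration of the token idea, here is a minimal Python sketch. The names `Token`, `tokenize`, and `replace_sequences`, the `done` flag, and the two small lookup tables are all made up for this example, and the braille cells are a few illustrative entries rather than a complete table. Each token remembers the source text and offset it came from, and a pass only touches tokens that no earlier pass has translated.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str           # current output text for this token
    source: str         # original input text this token was made from
    start: int          # offset of that source text in the input line
    done: bool = False  # True once some pass has translated this token


def tokenize(line):
    """Start with one untranslated token per input character."""
    return [Token(ch, ch, i) for i, ch in enumerate(line)]


def replace_sequences(tokens, table):
    """One pass: replace runs of untranslated tokens whose source text
    matches a key of `table` with a single translated token."""
    # Try longer keys first; ignore empty keys so the loop always advances.
    keys = sorted((k for k in table if k), key=len, reverse=True)
    out = []
    i = 0
    while i < len(tokens):
        for key in keys:
            run = tokens[i:i + len(key)]
            if (len(run) == len(key)
                    and all(not t.done for t in run)
                    and "".join(t.source for t in run) == key):
                out.append(Token(table[key], key, run[0].start, done=True))
                i += len(key)
                break
        else:
            # No key matched here: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


if __name__ == "__main__":
    # Illustrative tables only (assumed for this sketch).
    contractions = {"the": "\u282e"}
    letters = {"c": "\u2809", "a": "\u2801", "t": "\u281e"}

    tokens = tokenize("the cat")
    tokens = replace_sequences(tokens, contractions)  # pass 1: sequences
    tokens = replace_sequences(tokens, letters)       # pass 2: single letters
    print("".join(t.text for t in tokens))
```

In this sketch the multi-character pass runs before the per-letter pass so that later passes only see what is left over; that ordering is a choice made for the example, not something the approach requires. Because every token keeps its `source` and `start`, a later pass could also match on the original text even after an earlier pass has changed `text`.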

+4




Check out basic compiler theory.



0








