How do I write a parser that doesn't consume space?

I am writing a program that modifies source code files. I need to parse a file (for example with megaparsec), modify its abstract syntax tree (for example with Uniplate), and regenerate the file with as few changes as possible (preserving whitespace, comments, etc.).

So the AST must store the whitespace, for example:

data Identifier = Identifier String String

      

where the first String is the name of the identifier and the second is the whitespace that follows it. The same applies to every other token in the language.

How do I write a parser for such an identifier?



1 answer


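I ended up writing parseLexeme to replace the lexeme function from this tutorial. Instead of discarding the trailing whitespace, it stores it next to the parsed value: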

data Lexeme a = Lexeme a String -- String contains the spaces after the lexeme

whites :: Parser String
whites = many spaceChar

parseLexeme :: Parser a -> Parser (Lexeme a)
parseLexeme p = do
  value <- p
  w <- whites
  return $ Lexeme value w

instance PPrint a => PPrint (Lexeme a) where
  pprint (Lexeme value w) = (pprint value) ++ w
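
For anyone who wants to compile the snippet above on its own, here is a minimal sketch. It assumes a recent megaparsec (7 or later) with a plain String input stream. The Parser alias, the PPrint class definition and the String instance are not shown in the answer, so the versions below are guesses at what they might look like.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Data.Void (Void)
import Text.Megaparsec (Parsec, many, parseMaybe, some)
import Text.Megaparsec.Char (letterChar, spaceChar)

-- Assumption: String input and no custom error component.
type Parser = Parsec Void String

-- Assumption: the answer does not define PPrint; this is one plausible shape.
class PPrint a where
  pprint :: a -> String

-- Assumption: plain strings print as themselves.
instance PPrint String where
  pprint = id

-- The String holds the whitespace that follows the lexeme.
data Lexeme a = Lexeme a String
  deriving (Eq, Show)

-- Collect whitespace instead of skipping it.
whites :: Parser String
whites = many spaceChar

-- Run p, then capture the trailing whitespace alongside its result.
parseLexeme :: Parser a -> Parser (Lexeme a)
parseLexeme p = Lexeme <$> p <*> whites

instance PPrint a => PPrint (Lexeme a) where
  pprint (Lexeme value w) = pprint value ++ w

main :: IO ()
main = do
  let input = "foo \t\n"
  case parseMaybe (parseLexeme (some letterChar)) input of
    Nothing -> putStrLn "parse failed"
    -- Printing the lexeme should reproduce the input exactly.
    Just lx -> print (pprint lx == input)
```

The point of the check in main is the round trip: since nothing is thrown away during parsing, pretty-printing gives back the original text.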

      



The parser for the identifier becomes:

data Identifier = Identifier (Lexeme String)

parseIdentifier :: Parser Identifier
parseIdentifier = do
  v <- parseLexeme $ (:) <$> letterChar <*> many (alphaNumChar <|> char '_')
  return $ Identifier v

instance PPrint Identifier where
  pprint (Identifier l) = pprint l
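
To illustrate the "modify the AST, keep the layout" part of the question, here is a rough sketch built on the same idea, again assuming megaparsec 7 or later. The names rename, parseIdentifiers and render are hypothetical helpers, not part of the answer, and the PPrint class is a guess at the original. Note that parseLexeme only captures whitespace after a token, so whitespace at the very start of the input is captured separately here.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Data.Char (toUpper)
import Data.Void (Void)
import Text.Megaparsec (Parsec, many, parseMaybe, (<|>))
import Text.Megaparsec.Char (alphaNumChar, char, letterChar, spaceChar)

-- Assumption: String input and no custom error component.
type Parser = Parsec Void String

-- Assumption: a guess at the PPrint class used in the answer.
class PPrint a where
  pprint :: a -> String

instance PPrint String where
  pprint = id

data Lexeme a = Lexeme a String
  deriving (Eq, Show)

data Identifier = Identifier (Lexeme String)
  deriving (Eq, Show)

instance PPrint a => PPrint (Lexeme a) where
  pprint (Lexeme value w) = pprint value ++ w

instance PPrint Identifier where
  pprint (Identifier l) = pprint l

whites :: Parser String
whites = many spaceChar

parseLexeme :: Parser a -> Parser (Lexeme a)
parseLexeme p = Lexeme <$> p <*> whites

parseIdentifier :: Parser Identifier
parseIdentifier =
  Identifier <$> parseLexeme ((:) <$> letterChar <*> many (alphaNumChar <|> char '_'))

-- Hypothetical helper: change the name, leave the trailing whitespace alone.
rename :: (String -> String) -> Identifier -> Identifier
rename f (Identifier (Lexeme name w)) = Identifier (Lexeme (f name) w)

-- Hypothetical top-level parser: leading whitespace, then identifiers.
parseIdentifiers :: Parser (String, [Identifier])
parseIdentifiers = (,) <$> whites <*> many parseIdentifier

-- Hypothetical printer for the pair produced by parseIdentifiers.
render :: (String, [Identifier]) -> String
render (leading, ids) = leading ++ concatMap pprint ids

main :: IO ()
main = do
  let input = "  foo   bar_1\n\tbaz  "
  case parseMaybe parseIdentifiers input of
    Nothing -> putStrLn "parse failed"
    Just (leading, ids) -> do
      -- Unmodified tree: the output should equal the input.
      print (render (leading, ids) == input)
      -- Modified tree: names change, spacing stays where it was.
      putStr (render (leading, map (rename (map toUpper)) ids))
```

Comments could be handled the same way, for example by widening whites so that it also collects comment text, but that depends on the language being parsed.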

      
