Boost Spirit Qi: expectation failure reported at the wrong location

I am relatively new to Spirit Qi and am trying to parse an assembly-like language.

For example, I would like to parse:

Func Ident{
    Mov name, "hello"
    Push 5
    Exit
}

      

So far so good. I can parse it correctly. However, the error handler sometimes reports strange error locations. Take the following code for example:

Func Ident{
    Mov name "hello" ; <-- comma is missing here
    Push 5
    Exit
}

      

Here are the rules involved in parsing this:

    gr_function = lexeme["Func" >> !(alnum | '_')] // Ensure whole words
                    > gr_identifier
                    > "{"
                    > *( gr_instruction
                       | gr_label
                       | gr_vardecl
                       | gr_paramdecl)
                    > "}";

    gr_instruction = gr_instruction_names
                     > gr_operands;

    gr_operands = -(gr_operand % ',');

      

The parser does notice the error, but it complains about a missing "}" after the Mov. I have a feeling that the problem is in the definition of "Func", but I cannot pinpoint it. I would like the parser to complain about the missing ",". It would be fine if it also complained about consequential errors, but it should definitely identify the missing comma as the culprit.

I've tried variations such as:

gr_operands = -(gr_operand 
                >> *(','
                     > gr_operand)
                );

      

And others, but those produce other strange errors.

Does anyone have an idea how to say "OK, you may have an instruction with no operands, but if you find one operand and there is no comma before the next one, fail at the comma"?

UPDATE

Thank you for your responses. gr_operand is defined as follows:

    gr_operand = ( gr_operand_intlit
                  |gr_operand_flplit
                  |gr_operand_strlit
                  |gr_operand_register
                  |gr_operand_identifier);

    gr_operand_intlit = int_;

    gr_operand_flplit = double_;

    gr_operand_strlit = '"'
                        > strlitcont
                        > '"'
                    ;

    gr_operand_register = gr_register_names;

    // TODO: Must also not accept the keywords from the statement grammar
    gr_operand_identifier = !(gr_instruction_names | gr_register_names)
                            >> raw[
                                    lexeme[(alpha | '_') >> *(alnum | '_')]
                                  ];

    escchar.name("\\\"");
    escchar     = '\\' >> char_("\"");

    strlitcont.name("String literal content");
    strlitcont  = *( escchar | ~char_('"') );

      

1 answer


You want to make explicit what can be an operand. I guessed:

gr_operand    = gr_identifier | gr_string;
gr_string     = lexeme [ '"' >> *("\"\"" | ~char_("\"")) >> '"' ];

      

Unrelated, but you could make it explicit that a newline starts a new statement (using blank_type as the skipper):

        >> "{"
        >> -(
                  gr_instruction
                | gr_label
                | gr_vardecl
                | gr_paramdecl
            ) % eol
        > "}";

      



Now the parser will be able to complain that it expects a newline at the point where parsing fails.

I put together a fully working sample using your sketches in the original post.

See it live on Coliru:

#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi    = boost::spirit::qi;

template <typename It, typename Skipper = qi::blank_type>
    struct parser : qi::grammar<It, Skipper>
{
    parser() : parser::base_type(start)
    {
        using namespace qi;

        start = lexeme["Func" >> !(alnum | '_')] > function;
        function = gr_identifier
                    >> "{"
                    >> -(
                              gr_instruction
                            //| gr_label
                            //| gr_vardecl
                            //| gr_paramdecl
                        ) % eol
                    > "}";

        gr_instruction_names.add("Mov", unused);
        gr_instruction_names.add("Push", unused);
        gr_instruction_names.add("Exit", unused);

        gr_instruction = lexeme [ gr_instruction_names >> !(alnum|"_") ] > gr_operands;
        gr_operands = -(gr_operand % ',');

        gr_identifier = lexeme [ alpha >> *(alnum | '_') ];
        gr_operand    = gr_identifier | gr_string;
        gr_string     = lexeme [ '"' >> *("\"\"" | ~char_("\"")) >> '"' ];

        BOOST_SPIRIT_DEBUG_NODES((start)(function)(gr_instruction)(gr_operands)(gr_identifier)(gr_operand)(gr_string));
    }

  private:
    qi::symbols<char, qi::unused_type> gr_instruction_names;
    qi::rule<It, Skipper> start, function, gr_instruction, gr_operands, gr_identifier, gr_operand, gr_string;
};

int main()
{
    typedef boost::spirit::istream_iterator It;
    std::cin.unsetf(std::ios::skipws);
    It f(std::cin), l;

    parser<It, qi::blank_type> p;

    try
    {
        bool ok = qi::phrase_parse(f,l,p,qi::blank);
        if (ok)   std::cout << "parse success\n";
        else      std::cerr << "parse failed: '" << std::string(f,l) << "'\n";

        if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
        return ok ? 0 : 1; // exit status 0 means success
    } catch(const qi::expectation_failure<It>& e)
    {
        std::string frag(e.first, e.last);
        std::cerr << e.what() << "'" << frag << "'\n";
    }

    return 1; // an expectation failure is a parse error
}

      
