Parsing into two vectors of strings using Boost.Spirit Qi
I am new to Qi and have run into a difficulty. I want to parse input such as:
X + Y + Z, A + B
into two vectors of strings.
I have code that works, but only when each element is a single character. Ideally, the following line should parse as well:
Xi + Ye + Zou, Ao + Bi
A simple replacement such as elem = +(char_ - '+') % '+' does not work, because the first element list then also consumes the ',', and I haven't found an easy way around this.
Here is my single-character code, for reference:
#include <iostream>
#include <string>
#include <vector>
#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;

typedef std::vector<std::string> element_array;

struct reaction_t
{
    element_array reactants;
    element_array products;
};

BOOST_FUSION_ADAPT_STRUCT(reaction_t, (element_array, reactants)(element_array, products))

template<typename Iterator>
struct reaction_parser : qi::grammar<Iterator,reaction_t(),qi::blank_type>
{
    reaction_parser() : reaction_parser::base_type(reaction)
    {
        using namespace qi;
        // Each element is a single character; elements are separated by '+'.
        elem = char_ % '+';
        // Reactants and products are separated by ','.
        reaction = elem >> ',' >> elem;
        BOOST_SPIRIT_DEBUG_NODES((reaction)(elem));
    }
    qi::rule<Iterator, reaction_t(), qi::blank_type> reaction;
    qi::rule<Iterator, element_array(), qi::blank_type> elem;
};

int main()
{
    const std::string input = "X + Y + Z, A + B";
    auto f = begin(input), l = end(input);
    reaction_parser<std::string::const_iterator> p;
    reaction_t data;
    bool ok = qi::phrase_parse(f, l, p, qi::blank, data);
    if (ok) std::cout << "success\n";
    else std::cout << "failed\n";
    if (f!=l)
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
"Using a simple replacement like elem = +(char_ - '+') % '+' does not work because it consumes the ',' in the first element list, but I have not found an easy way around this."
Well, a completely (braindead) simple solution would be to use +(char_ - '+' - ',') or +~char_("+,").
Really, though, I would make the rule for an element more specific, for example:
elem = qi::lexeme [ +alpha ] % '+';
See "Boost spirit skipper issues" for background on lexeme and skippers.
#include <iostream>
#include <string>
#include <vector>
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;

typedef std::vector<std::string> element_array;

struct reaction_t
{
    element_array reactants;
    element_array products;
};

BOOST_FUSION_ADAPT_STRUCT(reaction_t, (element_array, reactants)(element_array, products))

template<typename Iterator>
struct reaction_parser : qi::grammar<Iterator,reaction_t(),qi::blank_type>
{
    reaction_parser() : reaction_parser::base_type(reaction) {
        using namespace qi;
        // An element is a run of letters; lexeme keeps the skipper out of it.
        elem = qi::lexeme [ +alpha ] % '+';
        reaction = elem >> ',' >> elem;
        BOOST_SPIRIT_DEBUG_NODES((reaction)(elem));
    }
    qi::rule<Iterator, reaction_t(), qi::blank_type> reaction;
    qi::rule<Iterator, element_array(), qi::blank_type> elem;
};

int main()
{
    reaction_parser<std::string::const_iterator> p;
    for (std::string const input : {
            "X + Y + Z, A + B",
            "Xi + Ye + Zou , Ao + Bi",
        })
    {
        std::cout << "----- " << input << "\n";
        auto f = begin(input), l = end(input);
        reaction_t data;
        bool ok = qi::phrase_parse(f, l, p, qi::blank, data);
        if (ok) {
            std::cout << "success\n";
            for (auto const& r : data.reactants) { std::cout << "reactant: " << r << "\n"; }
            for (auto const& prod : data.products) { std::cout << "product: " << prod << "\n"; }
        }
        else
            std::cout << "failed\n";
        if (f != l)
            std::cout << "Remaining unparsed: '" << std::string(f, l) << "'\n";
    }
}
Prints:
----- X + Y + Z, A + B
success
reactant: X
reactant: Y
reactant: Z
product: A
product: B
----- Xi + Ye + Zou , Ao + Bi
success
reactant: Xi
reactant: Ye
reactant: Zou
product: Ao
product: Bi