Stringstream with multiple delimiters

This is another question that I cannot find an answer to because every example I can find uses vectors and my teacher will not let us use vectors for this class.

I need to read in the text version of a book one word at a time, using (any number of) spaces ' ' and (any number of) non-letter characters as delimiters; that is, any run of spaces or punctuation marks must separate the words. Here's how I did it when I only needed spaces as the delimiter:

while(getline(inFile, line)) {
    istringstream iss(line);

    while (iss >> word) {
        table1.addItem(word);
    }
}

      

EDIT: An example of the input text and how I need it split.

"If they knew you wanted it, then the entertainment would be."

Here's how that first line should be split:

If
they
knew
you
wanted
it
then
the
entertainment
would
be

The text will contain at least all the standard punctuation marks, but also things like ellipses (...), double dashes (--), etc.

As always, thanks in advance.

EDIT:

So using the second string stream would look something like this?

while(getline(inFile, line)) {
    istringstream iss(line);

    while (iss >> word) {
        istringstream iss2(word);

        while(iss2 >> letter)  {
            if(!isalpha(letter))
                // do something?
        }
        // do something else?
        table1.addItem(word);
    }
}

      



2 answers


I haven't tested this because I don't have a g++ compiler at hand right now, but it should work (apart from minor C++ syntax errors):



while (getline(inFile, line))
{
    istringstream iss(line);

    while (iss >> word)
    {
        // strip every non-alphanumeric character from the word
        // (needs <algorithm> and <cctype>; note that "a.b" becomes "ab", not two words)
        word.erase(std::remove_if(word.begin(), word.end(),
                                  [](unsigned char c){ return !isalnum(c); }),
                   word.end());
        if (!word.empty())
            table1.addItem(word);
    }
}
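
If punctuation has to act as a separator rather than simply be deleted, one possible variation is to overwrite every non-letter in the line with a space before handing the line to the string stream. The sketch below is only an illustration: `splitWords` is a hypothetical helper name, and it writes to an output stream where the question's code would call `table1.addItem(word)`.

```cpp
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes each word of `line` to `out`, one per line.
// Every character that is not a letter is treated as a separator.
void splitWords(std::string line, std::ostream& out)
{
    // Turn every non-letter into a space so that operator>> splits on it.
    for (char& c : line) {
        if (!std::isalpha(static_cast<unsigned char>(c)))
            c = ' ';
    }

    std::istringstream iss(line);
    std::string word;
    while (iss >> word)
        out << word << '\n';  // in the question: table1.addItem(word);
}
```

Calling `splitWords(line, std::cout)` inside the `getline` loop prints one word per line. Because apostrophes are non-letters too, a word like "don't" would come out as "don" and "t"; whether that is acceptable depends on the assignment.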

      



If you can use Boost, you can do the following. Given this input file:

$ cat kk.txt
If they had known;; you ... wished it, the entertainment.would have

      

You can customize the behavior of tokenizer if needed, but the default should be sufficient.



#include <iostream>
#include <fstream>
#include <string>

#include <boost/tokenizer.hpp>

int main()
{
  std::ifstream is("./kk.txt");
  std::string line;

  while (std::getline(is, line)) {
    boost::tokenizer<> tokens(line);

    for (const auto& word : tokens)
      std::cout << word << '\n';
  }

  return 0;
}

      

And the result:

$ ./a.out
If
they
had
known
you
wished
it
the
entertainment
would
have
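
If Boost is not an option, roughly the same splitting can be done by hand with a single pass over each line, collecting consecutive letters into a word. This is a minimal sketch, not a drop-in replacement for the tokenizer: `forEachWord` and its `handle` callback are hypothetical names, and it treats everything that is not a letter as a separator, as the question asks.

```cpp
#include <cctype>
#include <iostream>
#include <string>

// Hypothetical helper: calls handle(word) for each maximal run of letters in `line`.
template <typename Handler>
void forEachWord(const std::string& line, Handler handle)
{
    std::string word;
    for (char c : line) {
        if (std::isalpha(static_cast<unsigned char>(c))) {
            word += c;            // still inside a word
        } else if (!word.empty()) {
            handle(word);         // a separator ends the current word
            word.clear();
        }
    }
    if (!word.empty())
        handle(word);             // last word when the line ends in a letter
}
```

Inside the `getline` loop it could be used as `forEachWord(line, [](const std::string& w) { std::cout << w << '\n'; });`, with the lambda body swapped for whatever should happen to each word.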

      
