Stringstream with multiple delimiters
This is another question that I cannot find an answer to because every example I can find uses vectors and my teacher will not let us use vectors for this class.
I need to read in the text version of a book one word at a time, using any number of spaces ' ' and any number of non-letter characters as separators; in other words, any run of spaces or punctuation marks must separate the words. Here's how I did it when I only needed spaces as separators:
while(getline(inFile, line)) {
    istringstream iss(line);
    while (iss >> word) {
        table1.addItem(word);
    }
}
EDIT: An example of the text being read and how I need to split it.
"If they knew you wanted it, then the entertainment would be."
Here's how that first line should be split:
If
they
knew
you
wanted
it
then
the
entertainment
would
be
The text will contain at least all the standard punctuation marks, but also things like ellipses (...), double dashes (--), etc.
As always, thanks in advance.
EDIT:
So using a second stringstream would look something like this?
while(getline(inFile, line)) {
    istringstream iss(line);
    while (iss >> word) {
        istringstream iss2(word);
        while(iss2 >> letter) {
            if(!isalpha(letter))
                // do something?
        }
        // do something else?
        table1.addItem(word);
    }
}
I haven't tested this because I don't have a g++ compiler at hand right now, but it should work (apart from minor C++ syntax errors):
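// Requires <algorithm> (std::remove_if), <cctype> (std::isalnum), <sstream>, <string>.
while (getline(inFile, line))
{
    istringstream iss(line);
    while (iss >> word)
    {
        // Erase every character that is not alphanumeric (erase-remove idiom).
        // Taking the character as unsigned char keeps std::isalnum well-defined.
        word.erase(std::remove_if(word.begin(), word.end(),
                       [](unsigned char c) { return !std::isalnum(c); }),
                   word.end());
        // A token made only of punctuation is empty after the erase; skip it.
        if (!word.empty())
            table1.addItem(word);
    }
}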
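One thing to watch with the loop above: it deletes punctuation inside each whitespace-delimited token instead of splitting on it, so entertainment.would ends up as the single word entertainmentwould. Here is a minimal sketch of one way around that, assuming plain ASCII text and that every non-letter (digits and apostrophes included) should act as a separator: turn each non-letter into a space first, then let operator>> do the splitting. lettersOnly and forEachWord are names I made up for illustration; they are not from the question.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: returns a copy of `line` in which every non-letter
// has been replaced by a space, so operator>> will split on it.
std::string lettersOnly(std::string line)
{
    for (char& c : line) {
        // Cast to unsigned char: std::isalpha is undefined for negative values.
        if (!std::isalpha(static_cast<unsigned char>(c)))
            c = ' ';
    }
    return line;
}

// Hypothetical helper: calls handle(word) once for each word in `line`,
// in order. No container is used; each word is handed over as it is read.
template <typename Handler>
void forEachWord(const std::string& line, Handler handle)
{
    std::istringstream iss(lettersOnly(line));
    std::string word;
    while (iss >> word)   // skips any run of spaces between words
        handle(word);
}
```

Inside the getline loop this could be called as forEachWord(line, [&](const std::string& w) { table1.addItem(w); }); so no vector is needed. Under the stated assumption a contraction such as don't comes out as the two words don and t; if that is not wanted, the apostrophe would have to be special-cased.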
If you can use Boost, you can do the following:
$ cat kk.txt
If they had known;; you ... wished it, the entertainment.would have
You can customize the behavior of tokenizer if needed, but the default should be sufficient.
#include <iostream>
#include <fstream>
#include <string>
#include <boost/tokenizer.hpp>

int main()
{
    std::ifstream is("./kk.txt");
    std::string line;
    while (std::getline(is, line)) {
        boost::tokenizer<> tokens(line);
        for (const auto& word : tokens)
            std::cout << word << '\n';
    }
    return 0;
}
The result:
$ ./a.out
If
they
had
known
you
wished
it
the
entertainment
would
have