Erasing elements from a vector if they are also in another vector

Suppose I have vector a = {"the", "of"}

and vector b = {"oranges", "the", "of", "apples"}

.

I want to compare both vectors and remove elements from a

that are also in b

. This is what I came up with:

for (int i = 0; i < a.size(); i++) {
    for (int j =0; j < b.size(); j++) {
       if (a[i] == b[j]) {
          a.erase(a.begin() + i);
       }
    }
}

      

But this loop does not remove the last item in a

. Weird!

+3


source to share


5 answers


The problem is that when the first element is removed, the a

index increases from 0 to 1. At the next iteration of the loop, the size of the vector 1

that satisfies the condition of the outer loop forcing it to terminate. You can avoid any trickery that might be required to fix this by simply using std::remove_if

, std::find

and lambda.

#include <iostream>
#include <algorithm>
#include <vector>
#include <string>

int main()
{
    std::vector<std::string> a{ "the", "of" };
    std::vector<std::string> b{ "oranges", "the", "of", "apples" };

    auto pred = [&b](const std::string& key) ->bool
    {
        return std::find(b.begin(), b.end(), key) != b.end();
    };

    a.erase(std::remove_if(a.begin(), a.end(), pred), a.end());

    std::cout << a.size() << "\n";
}

      



A better test would be to switch content a

and b

. This will remove "the" and "of", leaving you with "oranges" and "apples".

+6


source


Try to run

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <cassert>

int main()
{
    std::vector<std::string> a = { "the", "of" };
    std::vector<std::string> b = { "oranges", "the", "of", "apples" };

    for ( auto it = a.begin(); it != a.end(); )
    {
        if ( std::find( b.begin(), b.end(), *it ) != b.end() )
        {
            it = a.erase( it ); 
        }
        else
        {
            ++it;
        }
    }

    assert( a.empty() );
}

      



Of course, it would be better if the vectors were ordered.

+5


source


In general, instead of moving around the vector content "manually" and selectively erasing its elements, I would suggest using the already built algorithms STL by combining them correctly.

Using erase idiom removal

In particular, to erase items that satisfy some property from std::vector

, you might consider using erase-remove idiom .
This Q&A on Stackoverflow discusses some options for erasing items from STL containers, including case std::vector

.

You can find the commented code below, live here :

#include <algorithm>    // for std::remove_if()
#include <iostream>     // for std::cout, std::endl
#include <string>       // for std::string
#include <vector>       // for std::vector
using namespace std;

void print(const char* name, const vector<string>& v);

int main() 
{
    // Input vectors
    vector<string> a = {"the", "of"};
    vector<string> b = {"oranges", "the", "of", "apples"};

    print("a", a);
    print("b", b);

    // Use the erase-remove idiom
    a.erase(
        remove_if(
            a.begin(), 
            a.end(), 

            // This lambda returns true if current string 's'
            // (from vector 'a') is in vector 'b'. 
            [&b](const string& s) 
            {
                auto it = find(b.begin(), b.end(), s);
                return (it != b.end());
            }
        ), 

        a.end()
    );

    cout << "\nAfter removing:\n";
    print("a", a);
}


void print(const char* name, const vector<string>& v) 
{
    cout << name << " = {";
    bool first = true;
    for (const auto& s : v) 
    {
        if (first) 
        {
            first = false;
            cout << s;
        } 
        else 
        {
            cout << ", " << s;
        }
    }
    cout << "}" << endl;
}

      

Output:

a = {the, of}
b = {oranges, the, of, apples}

After removing:
a = {}

      

PS
Please also check out this very similar question on Stackoverflow .


Using std::set_difference()

An alternative approach could be to use std::set_difference()

eg. something like the following code, live here .
(Note that in this case, according to the precondition, the set_difference()

input vectors must already be sorted.)

#include <algorithm>    // for std::set_difference(), std::sort()
#include <iostream>     // for std::cout, std::endl
#include <iterator>     // for std::inserter
#include <string>       // for std::string
#include <vector>       // for std::vector
using namespace std;

void print(const char* name, const vector<string>& v);

int main() 
{
    // Input vectors
    vector<string> a = {"the", "of"};
    vector<string> b = {"oranges", "the", "of", "apples"};

    print("a", a);
    print("b", b);

    // Sort the vectors before calling std::set_difference().
    sort(a.begin(), a.end());
    sort(b.begin(), b.end());

    // Resulting difference vector
    vector<string> c;
    set_difference(a.begin(), a.end(),
                   b.begin(), b.end(),
                   inserter(c, c.begin()));

    print("difference(a,b)", c);
}


void print(const char* name, const vector<string>& v) 
{
    cout << name << " = {";
    bool first = true;
    for (const auto& s : v) 
    {
        if (first) 
        {
            first = false;
            cout << s;
        } 
        else 
        {
            cout << ", " << s;
        }
    }
    cout << "}" << endl;
}

      

+1


source


The problem you are having is because you are removing elements from a

when you iterate over it, but you do not compensate for that. This is a common problem when trying to write a loop with erasures in it.

If it doesn't matter what order the contents of your vectors are in, and you can store the result in another vector, one of the best approaches is to sort both vectors and call std::set_difference

.

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> a = { "the", "of" };
    std::vector<std::string> b = { "oranges", "the", "of", "apples" };
    std::vector<std::string> res;

    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
        std::back_inserter(res));
}

      

res

will contain all elements a

that were not in b

, which in this case will be empty.

If order matters, or if it needs to be done, you can use erase-remove idiom . It costs nothing, it will probably be slower for large vectors, as this is inevitably an O (n ^ 2) algorithm.

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

struct Pred
{
    const std::vector<std::string>& filter;
    Pred(const std::vector<std::string>& x)
        :filter(x){}

    bool operator()(const std::string& str) const
    {
        return std::find(filter.begin(), filter.end(), str) != filter.end();
    }
};

int main()
{
    std::vector<std::string> a = { "the", "of" };
    std::vector<std::string> b = { "oranges", "the", "of", "apples" };

    Pred pred(b);

    a.erase(std::remove_if(a.begin(), a.end(), pred), a.end());
}

      

If you don't have access to a C ++ 11 compiler, the structure Pred

should be good enough for a lambda. Otherwise, this lambda will do the job:

auto pred = [&b](const std::string& str)
    {
        return std::find(b.begin(), b.end(), str) != b.end();
    };

      

+1


source


this is the correct syntax for erasing things from a vector:

myvector.erase (myvector.begin()+5);

      

Second, after you delete it, your index of that vector will be invalid.

Therefore, I recommend that you do a two-line scan. In the first round, you mark the items you want to remove. In the second round, you can remove them.

By the way, your algorithm is O (n ^ 2) time complexity. If you can, I recommend that you sort your vector first. Then you can use a much faster algorithm to deal with it.

0


source







All Articles