Pandas: how to delete rows with a value ending with a specific string?

I have a pandas DataFrame created like this:

import pandas as pd
mail = pd.DataFrame({'mail' : ['adv@gmail.com', 'fhngn@gmail.com', 'foinfo@yahoo.com', 'njfjrnfjrn@yahoo.com', 'nfjebfjen@hotmail.com', 'gnrgiprou@hotmail.com', 'jfei@hotmail.com']})

which prints as follows:

                    mail
0          adv@gmail.com
1        fhngn@gmail.com
2       foinfo@yahoo.com
3   njfjrnfjrn@yahoo.com
4  nfjebfjen@hotmail.com
5  gnrgiprou@hotmail.com
6       jfei@hotmail.com

      

What I want to do is filter out (exclude) all rows where the value in the mail column ends with "@gmail.com".



2 answers


You can use str.endswith and negate the resulting boolean Series with ~:

mail[~mail['mail'].str.endswith('@gmail.com')]

      

Which produces:



                    mail
2       foinfo@yahoo.com
3   njfjrnfjrn@yahoo.com
4  nfjebfjen@hotmail.com
5  gnrgiprou@hotmail.com
6       jfei@hotmail.com
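
Note that the expression above returns a new DataFrame and leaves mail itself unchanged. A minimal sketch of actually dropping the rows, assuming you want to overwrite mail and that renumbering the index is desirable (the reset_index step is optional and not part of the original question):

```python
import pandas as pd

mail = pd.DataFrame({'mail': ['adv@gmail.com', 'fhngn@gmail.com', 'foinfo@yahoo.com',
                              'njfjrnfjrn@yahoo.com', 'nfjebfjen@hotmail.com',
                              'gnrgiprou@hotmail.com', 'jfei@hotmail.com']})

# Boolean indexing returns a new DataFrame; assign it back to "delete" the rows.
mail = mail[~mail['mail'].str.endswith('@gmail.com')]

# Optional: renumber the remaining rows from 0 instead of keeping labels 2..6.
mail = mail.reset_index(drop=True)
```

If other code relies on the original row labels, skip the reset_index call.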

      

Pandas has many other vectorized string operations, accessible through the .str accessor. Many of them will be instantly familiar from Python's own string methods, but they come with built-in handling of NaN values.
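
To illustrate the NaN handling, here is a small sketch with one missing value added to a shortened, hypothetical version of the column. Depending on the pandas version and the column dtype, the mask returned by str.endswith may itself contain a missing value for that row, which ~ may not be able to negate; passing na=False treats missing values as "does not match".

```python
import numpy as np
import pandas as pd

# Hypothetical data: a few addresses from the question plus one missing value.
mail = pd.DataFrame({'mail': ['adv@gmail.com', 'foinfo@yahoo.com',
                              np.nan, 'jfei@hotmail.com']})

# na=False fills the mask with False wherever the value is missing,
# so the mask is plain boolean and can be negated safely.
mask = mail['mail'].str.endswith('@gmail.com', na=False)

# Rows with a missing address are kept, since they do not match the suffix.
filtered = mail[~mask]
```

Use na=True instead if rows with a missing address should be dropped along with the gmail ones.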



A column of type str has a .str attribute through which you can access the standard functions defined for a single str:

In [6]: mail['mail'].str.endswith('gmail.com')
Out[6]:
0     True
1     True
2    False
3    False
4    False
5    False
6    False
Name: mail, dtype: bool

      

Then you can filter the DataFrame with the negation of this Series:



In [7]: mail[~mail['mail'].str.endswith('gmail.com')]
Out[7]:
                    mail
2       foinfo@yahoo.com
3   njfjrnfjrn@yahoo.com
4  nfjebfjen@hotmail.com
5  gnrgiprou@hotmail.com
6       jfei@hotmail.com

      

A similar .dt attribute exists for accessing date/time-related properties of a column if it contains datetime data.
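
As a rough sketch of the same filtering pattern with .dt, assuming a hypothetical signup column of dates (it is not part of the question's data):

```python
import pandas as pd

# Hypothetical column of dates, parsed into datetime64 values.
df = pd.DataFrame({'signup': pd.to_datetime(['2015-01-15', '2015-02-20', '2016-03-05'])})

# .dt exposes date/time components element-wise, much like .str does for strings.
years = df['signup'].dt.year

# Exclude all rows whose date falls in 2015, again by negating a boolean mask.
not_2015 = df[~(years == 2015)]
```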
