How to slice continuous and discontinuous index in pandas?

Question


pandas iloc can slice data in two ways, such as df.iloc[:,2:5] and df.iloc[:,[6,10]]. If I want to select columns 2:5, 6 and 10 together, how can I use iloc to slice df?

+3

pandas

Cobin 04 Apr 17 at 12:06


2 answers

I think you need numpy.r_ to concatenate the indices, and then iloc to select by position:

import numpy as np
import pandas as pd

ds = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3],
                   'G':[1,3,5],
                   'H':[5,3,6],
                   'I':[4,4,3],
                   'J':[6,4,3],
                   'K':[9,4,3]})

print (ds)
   A  B  C  D  E  F  G  H  I  J  K
0  1  4  7  1  5  7  1  5  4  6  9
1  2  5  8  3  3  4  3  3  4  4  4
2  3  6  9  5  6  3  5  6  3  3  3

print (np.r_[2:5, 6,10])
[ 2  3  4  6 10]

print (ds.iloc[:, np.r_[2:5, 6,10]])
   C  D  E  G  K
0  7  1  5  1  9
1  8  3  3  3  4
2  9  5  6  5  3
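For comparison, the same list of positions can also be built without NumPy by joining a range and a list. This is only a minimal sketch: the frame below is stand-in data with the same column names A to K, not the values from the example above.

```python
import pandas as pd

# Stand-in frame with 11 columns named A..K (values are placeholders).
ds = pd.DataFrame({col: [1, 2, 3] for col in "ABCDEFGHIJK"})

# Positions 2, 3, 4 from the range, plus the single positions 6 and 10.
positions = list(range(2, 5)) + [6, 10]

subset = ds.iloc[:, positions]
print(subset.columns.tolist())  # ['C', 'D', 'E', 'G', 'K']
```

np.r_ is shorter when several slices are mixed with scalars; the plain-list form avoids the extra import.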

On the ix vs. iloc discussion: the main issue is that ix will be deprecated in pandas 0.20.0. The new version appears to be coming soon, in April, so it is better to use iloc.

+1

jezrael 04 Apr 17 at 12:09


MaxU · Accepted Answer · 2017-04-04T12:08:56+0000

Use numpy.r_:

From the docs:

Translates slice objects to concatenation along the first axis.

This is an easy way to quickly create arrays. There are two use cases.

If the index expression contains comma separated arrays, then stack them along their first axis.

If the index expression contains slice notation or scalars, then create a one-dimensional array with the range specified by the slice notation.
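The two use cases quoted above can be illustrated with a short sketch; the variable names here are only for illustration.

```python
import numpy as np

# Use case 1: comma-separated arrays are concatenated along the first axis.
stacked = np.r_[np.array([1, 2, 3]), np.array([4, 5])]

# Use case 2: slice notation and scalars are expanded into one 1-D array.
positions = np.r_[2:5, 6, 10]

print(stacked)    # [1 2 3 4 5]
print(positions)  # [ 2  3  4  6 10]
```

The second form is the one that matters for iloc, since it yields a flat array of integer positions.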

Demo:

In [16]: df = pd.DataFrame(np.random.rand(3, 12))

In [17]: df.iloc[:, np.r_[2:5, 6, 10]]
Out[17]:
         2         3         4         6         10
0  0.760201  0.378125  0.707002  0.310077  0.375646
1  0.770165  0.269465  0.419979  0.218768  0.832087
2  0.253142  0.737015  0.652522  0.474779  0.094145

In [18]: df
Out[18]:
         0         1         2         3         4         5         6         7         8         9         10        11
0  0.668062  0.581268  0.760201  0.378125  0.707002  0.249094  0.310077  0.336708  0.847258  0.705631  0.375646  0.830852
1  0.521096  0.798405  0.770165  0.269465  0.419979  0.455890  0.218768  0.833776  0.862483  0.817974  0.832087  0.958174
2  0.211815  0.747482  0.253142  0.737015  0.652522  0.274231  0.474779  0.256119  0.110760  0.224096  0.094145  0.525201

UPDATE: since pandas 0.20.1 the .ix indexer has been deprecated in favor of the stricter .iloc and .loc indexers.

So I updated my answer to replace the deprecated indexer: .ix[] → df.iloc[...]
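As a rough sketch of how the two stricter indexers relate, the same positions can be used directly with .iloc or translated to labels for .loc. The frame and its letter column names below are assumptions made for illustration, not data from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 12 labelled columns, a..l.
df = pd.DataFrame(np.arange(36).reshape(3, 12), columns=list("abcdefghijkl"))

pos = np.r_[2:5, 6, 10]

by_position = df.iloc[:, pos]          # purely positional selection
labels = df.columns[pos]               # translate positions to column labels
by_label = df.loc[:, labels]           # purely label-based selection

print(labels.tolist())                 # ['c', 'd', 'e', 'g', 'k']
print(by_position.equals(by_label))    # True
```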
