How to slice continuous and discontinuous index in pandas?

pandas iloc can select data in two ways, such as df.iloc[:,2:5] and df.iloc[:,[6,10]]. If I want to select columns 2:5, 6 and 10 together, how do I use iloc to slice df?



2 answers


Use numpy.r_:

From the docs:

Translates slice objects to concatenation along the first axis.

This is a simple way to build up arrays quickly. There are two use cases.

If the index expression contains comma separated arrays, then stack them along their first axis.

If the index expression contains slice notation or scalars then create a 1-D array with a range indicated by the slice notation.
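The two use cases quoted above can be illustrated with a minimal sketch. The variable names `stacked` and `positions` are chosen here for illustration and do not come from the answer:

```python
import numpy as np

# Use case 1: comma-separated arrays are stacked along their first axis.
stacked = np.r_[np.array([1, 2, 3]), np.array([4, 5])]
print(stacked)  # [1 2 3 4 5]

# Use case 2: slice notation and scalars are expanded into one 1-D array.
positions = np.r_[2:5, 6, 10]
print(positions)  # [ 2  3  4  6 10]
```

The second form is the one that matters for this question: it yields a flat array of integer positions, which iloc accepts as a column indexer.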

Demo:



In [16]: df = pd.DataFrame(np.random.rand(3, 12))

In [17]: df.iloc[:, np.r_[2:5, 6, 10]]
Out[17]:
         2         3         4         6         10
0  0.760201  0.378125  0.707002  0.310077  0.375646
1  0.770165  0.269465  0.419979  0.218768  0.832087
2  0.253142  0.737015  0.652522  0.474779  0.094145

In [18]: df
Out[18]:
         0         1         2         3         4         5         6         7         8         9         10        11
0  0.668062  0.581268  0.760201  0.378125  0.707002  0.249094  0.310077  0.336708  0.847258  0.705631  0.375646  0.830852
1  0.521096  0.798405  0.770165  0.269465  0.419979  0.455890  0.218768  0.833776  0.862483  0.817974  0.832087  0.958174
2  0.211815  0.747482  0.253142  0.737015  0.652522  0.274231  0.474779  0.256119  0.110760  0.224096  0.094145  0.525201

      


UPDATE: as of pandas 0.20.1 the .ix indexer is deprecated in favor of the stricter .iloc and .loc indexers, so I have updated my answer to replace the deprecated .ix[] with .iloc[...].



I think you need numpy.r_ to concatenate the indexes, and then iloc to select by position:

import numpy as np
import pandas as pd

ds = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3],
                   'G':[1,3,5],
                   'H':[5,3,6],
                   'I':[4,4,3],
                   'J':[6,4,3],
                   'K':[9,4,3]})

print (ds)
   A  B  C  D  E  F  G  H  I  J  K
0  1  4  7  1  5  7  1  5  4  6  9
1  2  5  8  3  3  4  3  3  4  4  4
2  3  6  9  5  6  3  5  6  3  3  3

print (np.r_[2:5, 6,10])
[ 2  3  4  6 10]

print (ds.iloc[:, np.r_[2:5, 6,10]])
   C  D  E  G  K
0  7  1  5  1  9
1  8  3  3  3  4
2  9  5  6  5  3
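If importing numpy only for this is undesirable, the same list of positions can probably be built with plain Python. This is a sketch of an alternative, not something either answer proposes; the frame below is a hypothetical stand-in with columns labelled A to K, and the name `positions` is an assumption:

```python
import pandas as pd

# Hypothetical frame with 11 columns labelled A..K, mirroring the example above.
ds = pd.DataFrame({name: [1, 2, 3] for name in "ABCDEFGHIJK"})

# Build the positions with plain Python: 2, 3, 4 plus 6 and 10.
positions = list(range(2, 5)) + [6, 10]

subset = ds.iloc[:, positions]
print(list(subset.columns))  # ['C', 'D', 'E', 'G', 'K']
```

np.r_ stays more compact when several slices have to be combined, which is likely why both answers reach for it.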

      



For the discussion of ix vs iloc: the main issue is that ix will be deprecated in pandas 0.20.0. The new version seems to be coming soon, in April, so it is better to use iloc.
