.loc indexing changes the type of the returned value

If I have a pandas.DataFrame with columns of different types (like int64 and float64), getting a single item from the int column with .loc converts the output to float:

import pandas as pd
df_test = pd.DataFrame({'ints':[1,2,3], 'floats': [4.5,5.5,6.5]})

df_test['ints'].dtype
>>> dtype('int64')

df_test.loc[0,'ints']
>>> 1.0

type(df_test.loc[0,'ints'])
>>> numpy.float64

If I use .at for indexing, it doesn't:

type(df_test.at[0,'ints'])
>>> numpy.int64

This also doesn't happen when all columns are int:

df_test = pd.DataFrame({'ints':[1,2,3], 'ints2': [4,5,6]})
df_test.loc[0,'ints']
>>> 1
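
For what it's worth, the whole row keeps int64 in this case as well; a quick extra check on the all-int df_test defined just above:

df_test.loc[0].dtype
>>> dtype('int64')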

      

Is this a consequence of some basic property of pandas indexing? In other words, is this a bug? :)

Update: it turns out that this is a bug and will be fixed in pandas 0.20.0.
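
For reference, a minimal sketch for checking which behaviour a given pandas installation shows, rebuilding the mixed-dtype df_test from the first example:

import pandas as pd

df_test = pd.DataFrame({'ints': [1, 2, 3], 'floats': [4.5, 5.5, 6.5]})

# On affected versions this prints numpy.float64; after the fix described
# above it should print numpy.int64 instead.
print(pd.__version__)
print(type(df_test.loc[0, 'ints']))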

1 answer


The problem here is that loc implicitly tries to build a Series for the row first, even though you are only returning a single column, and hence the scalar value from that row is upcast to a dtype that can hold all the dtypes in that row. If you select that particular column first and then use loc, it won't convert:

In [83]:
df_test['ints'].loc[0]

Out[83]:
1

You can see what happens when you don't select the column first:

In [84]:
df_test.loc[0]

Out[84]:
floats    4.5
ints      1.0
Name: 0, dtype: float64

This might not be desirable, and I think there is a related GitHub issue regarding this.
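
In the meantime, if you want the scalar without the upcast, any of the following should keep the original dtype; a quick sketch against the mixed-dtype df_test from the question, where .at reads a single cell directly and selecting the column first avoids building the mixed-dtype row Series:

df_test.at[0, 'ints']          # single cell by label -> numpy.int64
df_test['ints'].iloc[0]        # select the int64 column first, then by position -> numpy.int64
df_test['ints'].loc[0]         # or by label, as shown above -> numpy.int64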
