Creating a new pandas DataFrame from specific columns of an existing DataFrame

I have read a CSV file into a pandas DataFrame and want to do some simple DataFrame manipulation. I cannot figure out how to create a new DataFrame from selected columns of my original DataFrame. My attempt:

import pandas

names = ['A','B','C','D']
dataset = pandas.read_csv('file.csv', names=names)
new_dataset = dataset['A','D']

      

I would like to create a new DataFrame with columns A and D from the original DataFrame.



1 answer


This is called subsetting: pass a list of column names to []:

dataset = pandas.read_csv('file.csv', names=names)

new_dataset = dataset[['A','D']]

      

which is equivalent to:

new_dataset = dataset.loc[:, ['A','D']]
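
To make the difference from the attempt in the question concrete, here is a minimal sketch. The small in-memory DataFrame is a hypothetical stand-in for file.csv, and the variable names by_brackets and by_loc are only for illustration. A bare 'A','D' inside [] is treated as a single tuple key, which fails on ordinary columns, while a list selects both columns:

```python
import pandas as pd

# Hypothetical stand-in for the data read from file.csv
dataset = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9],
    'D': [10, 11, 12],
})

# The original attempt: the key is the tuple ('A', 'D'), not two columns
try:
    dataset['A', 'D']
    tuple_key_failed = False
except KeyError:
    tuple_key_failed = True

# Passing a list selects the listed columns
by_brackets = dataset[['A', 'D']]
by_loc = dataset.loc[:, ['A', 'D']]

print(tuple_key_failed)
print(by_brackets.equals(by_loc))
```

Both forms return a DataFrame holding only columns A and D, in the order given in the list.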

      

If you only need the selected columns, pass the usecols parameter to read_csv:

new_dataset = pandas.read_csv('file.csv', names=names, usecols=['A','D'])
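
A self-contained sketch of the usecols approach follows. Since file.csv is not available here, it assumes a headerless CSV with four columns and reads it from an in-memory string (csv_text is a made-up sample, not the asker's data):

```python
import io
import pandas as pd

# Made-up sample standing in for file.csv (no header row, four columns)
csv_text = "1,2,3,4\n5,6,7,8\n"

names = ['A', 'B', 'C', 'D']

# Only columns A and D are parsed; B and C are skipped at read time
new_dataset = pd.read_csv(io.StringIO(csv_text), names=names, usecols=['A', 'D'])

print(new_dataset)
```

This avoids loading the unwanted columns at all, which can matter for large files.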

      

EDIT:



If you use only:

new_dataset = dataset[['A','D']]

      

and then modify new_dataset, pandas may issue this warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

If you change values in new_dataset later, the changes are not propagated to the original data (dataset), and pandas warns about it.

As pointed out by EdChum, add .copy() to remove the warning:

new_dataset = dataset[['A','D']].copy()
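
A short sketch of what the explicit copy gives you, again using a hypothetical in-memory DataFrame in place of file.csv: the new DataFrame can be modified freely, and the original is left untouched.

```python
import pandas as pd

# Hypothetical stand-in for the data read from file.csv
dataset = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9],
    'D': [10, 11, 12],
})

# An explicit copy makes new_dataset independent of dataset
new_dataset = dataset[['A', 'D']].copy()

# Modify the copy only
new_dataset.loc[0, 'A'] = 100

print(new_dataset['A'].tolist())
print(dataset['A'].tolist())
```

The copy costs some extra memory, but it makes the intent explicit: new_dataset is a separate object, not a view into dataset.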

      
