Creating a new pandas DataFrame from specific columns of an existing DataFrame
I have read a CSV file into a pandas DataFrame and want to do some simple DataFrame manipulation. I cannot figure out how to create a new DataFrame from selected columns of my original DataFrame. My attempt:
names = ['A','B','C','D']
dataset = pandas.read_csv('file.csv', names=names)
new_dataset = dataset['A','D']
I would like to create a new DataFrame with columns A and D from the original DataFrame.
This is called subsetting: pass a list of columns to []:
dataset = pandas.read_csv('file.csv', names=names)
new_dataset = dataset[['A','D']]
which is the same as:
new_dataset = dataset.loc[:, ['A','D']]
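As a minimal sketch of the two selections above, the following uses a small in-memory CSV in place of file.csv (the sample values are made up for illustration). It also shows why the original attempt fails: dataset['A','D'] passes the tuple ('A', 'D') as a single column label, which does not exist.

```python
import io
import pandas as pd

# Hypothetical stand-in for file.csv: four columns without a header row.
csv_text = "1,2,3,4\n5,6,7,8\n"
names = ['A', 'B', 'C', 'D']
dataset = pd.read_csv(io.StringIO(csv_text), names=names)

# A list inside [] selects several columns and returns a DataFrame.
by_brackets = dataset[['A', 'D']]
# The same selection written with .loc: all rows, columns A and D.
by_loc = dataset.loc[:, ['A', 'D']]

# Without the inner list, pandas looks up one column labelled ('A', 'D').
try:
    dataset['A', 'D']
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True
```

Both by_brackets and by_loc hold the same two columns, while the tuple lookup raises a KeyError.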
If you only need the selected columns, pass the usecols parameter to read_csv:
new_dataset = pandas.read_csv('file.csv', names=names, usecols=['A','D'])
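A small sketch of the usecols approach, again assuming an in-memory CSV with made-up values instead of file.csv. Because names is supplied, the strings in usecols refer to those names.

```python
import io
import pandas as pd

# Hypothetical stand-in for file.csv: four columns without a header row.
csv_text = "1,2,3,4\n5,6,7,8\n"
names = ['A', 'B', 'C', 'D']

# Only columns A and D are parsed; B and C never enter the DataFrame.
new_dataset = pd.read_csv(io.StringIO(csv_text), names=names, usecols=['A', 'D'])
```

This avoids loading columns you do not need, which can matter for large files.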
EDIT:
If you use only:
new_dataset = dataset[['A','D']]
and then modify new_dataset, you may get this warning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
If you change values in new_dataset later, the changes are not propagated to the original data (dataset), and pandas issues this warning.
As EdChum pointed out, add copy() to avoid the warning:
new_dataset = dataset[['A','D']].copy()
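To illustrate the effect of copy(), here is a short sketch that builds the DataFrame directly rather than reading file.csv (the values are made up). After the explicit copy, assignments to new_dataset leave dataset untouched.

```python
import pandas as pd

# Hypothetical data standing in for the contents of file.csv.
dataset = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

# An explicit copy makes new_dataset independent of dataset.
new_dataset = dataset[['A', 'D']].copy()

# Modify the copy; the original DataFrame keeps its values.
new_dataset.loc[0, 'A'] = 100
```

Here new_dataset holds the changed value in column A, while dataset still holds the original one.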