How do I add columns to a data frame?
I have the following data:
Example:
DRIVER_ID;TIMESTAMP;POSITION
156;2014-02-01 00:00:00.739166+01;POINT(41.8836718276551 12.4877775603346)
I want to create a pandas DataFrame with 4 columns: id, time, longitude and latitude. So far I have:
cur_cab = pd.DataFrame.from_csv(
path,
sep=";",
header=None,
parse_dates=[1]).reset_index()
cur_cab.columns = ['cab_id', 'datetime', 'point']
path specifies a .txt file containing the data. I have already written a function that returns the longitude and latitude values from a point string.
How can I expand the DataFrame with extra columns and split the values into them?
Once the data is loaded, and if you are using a recent version of pandas, you can use the vectorized str methods to parse the column:
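In [87]:
# Assign both new columns in one step. Tuple-unpacking the split result
# would iterate over its column labels (0 and 1) instead of its values.
df[['pos_x', 'pos_y']] = df['point'].str[6:-1].str.split(expand=True)
df
Out[87]:
   cab_id                   datetime  \
0     156 2014-01-31 23:00:00.739166

                                      point             pos_x             pos_y
0  POINT(41.8836718276551 12.4877775603346)  41.8836718276551  12.4877775603346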
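The slice-and-split approach leaves the coordinates as strings. As an alternative, here is a minimal sketch that uses str.extract with a regular expression and converts the result to floats. The one-row frame and the pattern are assumptions made for illustration, not part of the original answer:

```python
import pandas as pd

# Hypothetical one-row frame mirroring the sample data from the question.
df = pd.DataFrame({
    "cab_id": [156],
    "point": ["POINT(41.8836718276551 12.4877775603346)"],
})

# Capture the two numbers inside "POINT(...)", tolerating optional whitespace.
# The named groups become the column names of the extracted frame.
pattern = (
    r"POINT\s*\(\s*(?P<pos_x>-?\d+(?:\.\d+)?)"
    r"\s+(?P<pos_y>-?\d+(?:\.\d+)?)\s*\)"
)
coords = df["point"].str.extract(pattern).astype(float)

# Attach the numeric columns to the original frame (aligned on the index).
df = df.join(coords)
print(df[["cab_id", "pos_x", "pos_y"]])
```

A regular expression is a bit more forgiving than fixed slicing if some rows contain a space after POINT, although rows that do not match at all would end up as NaN.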
Also, you should stop using from_csv; it is no longer maintained. Use the top-level read_csv instead, so your loading code becomes:
cur_cab = pd.read_csv(
path,
sep=";",
header=None,
parse_dates=[1],
names=['cab_id', 'datetime', 'point'],
skiprows=1)
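To try the loading step without a file on disk, the sketch below feeds the sample row from the question through read_csv via an in-memory buffer and then splits the point column. The io.StringIO buffer stands in for path and is an assumption made only for this illustration:

```python
import io

import pandas as pd

# In-memory stand-in for the .txt file (assumed layout: header row + data).
sample = io.StringIO(
    "DRIVER_ID;TIMESTAMP;POSITION\n"
    "156;2014-02-01 00:00:00.739166+01;"
    "POINT(41.8836718276551 12.4877775603346)\n"
)

cur_cab = pd.read_csv(
    sample,
    sep=";",
    header=None,
    parse_dates=[1],
    names=["cab_id", "datetime", "point"],
    skiprows=1,  # skip the file's own header row
)

# Strip "POINT(" and ")" by position, split on whitespace, convert to float.
cur_cab[["pos_x", "pos_y"]] = (
    cur_cab["point"].str[6:-1].str.split(expand=True).astype(float)
)
print(cur_cab.dtypes)
```

How the timestamp's +01 offset is represented after parsing may differ between pandas versions, so it is worth checking the dtype of the datetime column on your installation.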