How do I add columns to a data frame?

I have the following data:

Example:

DRIVER_ID;TIMESTAMP;POSITION

156;2014-02-01 00:00:00.739166+01;POINT(41.8836718276551 12.4877775603346)

I want to create a pandas DataFrame with 4 columns: id, time, longitude and latitude. So far I have:

cur_cab = pd.DataFrame.from_csv(
            path,
            sep=";",
            header=None,
            parse_dates=[1]).reset_index()
cur_cab.columns = ['cab_id', 'datetime', 'point']

      

Here path specifies a .txt file containing the data. I already wrote a function that returns the longitude and latitude values from a given point. How can I expand the dataframe with the extra columns and split the values?



1 answer


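Once the data is loaded, and if you are using a recent version of pandas, you can use the vectorized str methods to parse the column. Slicing with str[6:-1] drops the leading "POINT(" and the trailing ")", and str.split(expand=True) splits the rest on whitespace into separate columns: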

In [87]:
pos = df['point'].str[6:-1].str.split(expand=True)
df['pos_x'], df['pos_y'] = pos[0], pos[1]
df

Out[87]:
   cab_id                   datetime  \
0     156 2014-01-31 23:00:00.739166   

                                      point             pos_x             pos_y  
0  POINT(41.8836718276551 12.4877775603346)  41.8836718276551  12.4877775603346  
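If numeric columns are wanted rather than strings, a regular expression with str.extract is one possible alternative. This is only a minimal sketch and not part of the original answer: the sample frame is built by hand from the single row in the question, and the column names pos_x and pos_y simply follow the snippet above.

```python
import pandas as pd

# Sample frame built by hand from the row shown in the question.
df = pd.DataFrame({
    "cab_id": [156],
    "point": ["POINT(41.8836718276551 12.4877775603346)"],
})

# Capture the two numbers inside POINT(...). The \s* after POINT tolerates
# an optional space; the named groups become the output column names.
pattern = r"POINT\s*\(\s*(?P<pos_x>-?[\d.]+)\s+(?P<pos_y>-?[\d.]+)\s*\)"
coords = df["point"].str.extract(pattern).astype(float)

# Attach the new float columns; join aligns on the shared row index.
df = df.join(coords)
print(df[["cab_id", "pos_x", "pos_y"]])
```

Rows that do not match the pattern would come out as NaN in both columns, which makes malformed points easy to spot.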

      



You should also stop using from_csv: it is no longer maintained (it was deprecated and later removed from pandas). Use the top-level read_csv instead, so your loading code becomes:

cur_cab = pd.read_csv(
            path,
            sep=";",
            header=None,
            parse_dates=[1],
            names=['cab_id', 'datetime', 'point'],
            skiprows=1)
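To try the loading and splitting steps together without the original file, the sketch below reads the sample row from an in-memory buffer. It rests on assumptions: io.StringIO stands in for path, the header name TIMESTAMP is a guess at the real column name, and the conversion to float is an addition that the answer does not include.

```python
import io
import pandas as pd

# In-memory stand-in for the .txt file: a header line plus the sample row.
sample = io.StringIO(
    "DRIVER_ID;TIMESTAMP;POSITION\n"
    "156;2014-02-01 00:00:00.739166+01;POINT(41.8836718276551 12.4877775603346)\n"
)

cur_cab = pd.read_csv(
    sample,
    sep=";",
    header=None,
    names=["cab_id", "datetime", "point"],
    parse_dates=[1],
    skiprows=1,  # skip the file's own header line
)

# Strip "POINT(" and ")", then split on whitespace into two columns.
parts = cur_cab["point"].str[6:-1].str.split(expand=True)
cur_cab["pos_x"] = parts[0].astype(float)
cur_cab["pos_y"] = parts[1].astype(float)
print(cur_cab.dtypes)
```

Assigning each new column from parts separately avoids tuple-unpacking the split result, which would iterate over its column labels rather than its values.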

      
