Pandas: milliseconds removed when writing data to MySQL

I am trying to write a DataFrame with millisecond timestamps to a MySQL database. However, when doing so, the millisecond portion seems to be discarded. I've created a working example to show what's going on:

import numpy as np
import pandas as pd
from sqlalchemy import create_engine  # database connection

# Generate a date_time column with millisecond resolution and a price column
df = pd.DataFrame({'date_time': pd.date_range('1/1/2000 09:00:00', freq="5ms", periods=100),
                   'price': np.random.random_sample(100)})

# Connect to an empty MySQL database (which I simply created using CREATE DATABASE trading_db;)
disk_engine = create_engine("mysql+mysqldb://root:" + 'MYPASSWORD' + "@localhost/trading_db")

# Write the DataFrame to a table called trading_data
df.to_sql('trading_data', disk_engine, if_exists='replace', index=False)

# When I read this back from MySQL, the milliseconds seem to disappear
df_sql = pd.read_sql_query('SELECT * '
                           'FROM trading_data '
                           'LIMIT 20', disk_engine)

      

Compare the datetimes in the DataFrame created in pandas with those loaded from MySQL:

df.head()

    date_time                   price
0   2000-01-01 09:00:00         0.371986
1   2000-01-01 09:00:00.005000  0.625551
2   2000-01-01 09:00:00.010000  0.631182
3   2000-01-01 09:00:00.015000  0.625316
4   2000-01-01 09:00:00.020000  0.522437

df_sql.head()

    date_time           price
0   2000-01-01 09:00:00 0.371986
1   2000-01-01 09:00:00 0.625551
2   2000-01-01 09:00:00 0.631182
3   2000-01-01 09:00:00 0.625316
4   2000-01-01 09:00:00 0.522437

      

As you can see, milliseconds are discarded. Is there a way to change the code to save the millisecond part?

Edit: I am using MySQL Workbench 6.2 and pandas 0.14.1


1 answer


As noted in the comments, you need MySQL v5.6.4+ for fractional seconds support (docs). But as the docs explain, you have to request this explicitly by declaring the column as DATETIME(fsp), where fsp is the fractional seconds precision to store in the datetime column.
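To see what DATETIME(fsp) looks like in the table definition, here is a minimal sketch that compiles a CREATE TABLE statement for the MySQL dialect without connecting to a server. The table layout mirrors the question's trading_data table; the Float type for price is an assumption made only for illustration.

```python
from sqlalchemy import Column, Float, MetaData, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.schema import CreateTable

# Describe a table like the one in the question (price type is an assumption).
metadata = MetaData()
trading_data = Table(
    "trading_data",
    metadata,
    Column("date_time", mysql.DATETIME(fsp=6)),  # 6 fractional digits
    Column("price", Float),
)

# Render the DDL as MySQL would receive it; no database connection is needed.
ddl = str(CreateTable(trading_data).compile(dialect=mysql.dialect()))
print(ddl)
```

The printed statement should declare date_time as DATETIME(6), which is the column type that keeps the fractional part.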

By default, to_sql should use DateTime (the generic SQLAlchemy datetime type) for such a column. However, you can override this default with the dtype argument and use the MySQL-specific DATETIME type, which accepts a precision:



In [11]: from sqlalchemy.dialects.mysql import DATETIME

In [12]: df.to_sql('trading_data', engine, dtype={'date_time': DATETIME(fsp=6)}, if_exists='replace', index=False)

In [13]: df_sql = pd.read_sql_query('SELECT * FROM trading_data', engine)

In [14]: df_sql.head()
Out[14]:
                   date_time     price
0        2000-01-01 09:00:00  0.152087
1 2000-01-01 09:00:00.005000  0.927375
2 2000-01-01 09:00:00.010000  0.540021
3 2000-01-01 09:00:00.015000  0.499529
4 2000-01-01 09:00:00.020000  0.797420
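If a DataFrame has several datetime columns, the dtype mapping can be built rather than typed out by hand. The following is only a sketch: datetime_dtypes is a hypothetical helper (not part of pandas or SQLAlchemy), and it assumes timezone-naive datetime64 columns.

```python
import pandas as pd
from sqlalchemy.dialects.mysql import DATETIME


def datetime_dtypes(df, fsp=6):
    """Return a to_sql dtype mapping: every datetime64 column -> DATETIME(fsp).

    Hypothetical helper, not part of pandas or SQLAlchemy.
    """
    datetime_columns = df.select_dtypes(include=["datetime64"]).columns
    return {column: DATETIME(fsp=fsp) for column in datetime_columns}


df = pd.DataFrame({
    "date_time": pd.date_range("1/1/2000 09:00:00", freq="5ms", periods=5),
    "price": [0.1, 0.2, 0.3, 0.4, 0.5],
})
dtype_map = datetime_dtypes(df)
print(dtype_map)

# With a live connection this would be passed straight to to_sql, e.g.:
# df.to_sql('trading_data', engine, dtype=dtype_map, if_exists='replace', index=False)
```

Only date_time ends up in the mapping; price keeps whatever type to_sql picks by default.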

      

Note: you need pandas 0.15.2+ for the dtype argument.
