Unsigned int for dataframe to_sql using sqlalchemy types
I cannot assign an unsigned int type when using .to_sql() to write my dataframe to a MySQL database. I can use the other int types, but I just can't get it unsigned. A small representative sample of what I'm trying looks like this:
import pandas as pd
from sqlalchemy import create_engine
import sqlalchemy.types as sql_types
db_engine = create_engine('mysql://db_user:db_pass@db_host:db_port/db_schema')
d = {'id': [100,101,102], 'items': [6,10,20000], 'problems': [50,72,2147483649]} # Representative sample dictionary
df = pd.DataFrame(d).set_index('id')
This gives:
>>> df
     items    problems
id
100      6          50
101     10          72
102  20000  2147483649
I am writing to the database like this:
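# flavor='mysql' is omitted: it is unnecessary with an SQLAlchemy engine
# and was removed in later pandas versions.
df.to_sql('my_table',
          db_engine,
          if_exists='replace',
          index_label=['id'],
          dtype={'id': sql_types.SMALLINT,
                 'items': sql_types.INT,
                 'problems': sql_types.INT})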
But what happens is that the problems value on the last row (id == 102) is truncated to 2147483647 (which is 2^31-1) when written to the db.
There are no other problems with connecting or with writing other standard data types, including int. I could get away with using sql_types.BIGINT instead (raising the maximum to 2^63-1), but that would be unnecessary, since I know my values will stay below 4294967296 (2^32), and the unsigned int maximum is 4294967295 (2^32-1).
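The ranges involved can be checked with plain Python arithmetic. This is only a sketch of the limits under discussion; the constant names are made up for illustration, and the clamping line merely mimics the behaviour described above rather than reproducing what the database does internally.

```python
# Upper limits of the integer column types discussed in the question.
INT_SIGNED_MAX = 2**31 - 1      # signed 32-bit INT
INT_UNSIGNED_MAX = 2**32 - 1    # unsigned 32-bit INT
BIGINT_SIGNED_MAX = 2**63 - 1   # signed 64-bit BIGINT

value = 2147483649  # the 'problems' value for id == 102

fits_signed = value <= INT_SIGNED_MAX          # does it fit a signed INT?
fits_unsigned = 0 <= value <= INT_UNSIGNED_MAX # does it fit an unsigned INT?
clamped = min(value, INT_SIGNED_MAX)           # what a clamping write would store

print(fits_signed, fits_unsigned, clamped)
# prints: False True 2147483647
```

So the value overflows a signed INT by only two, while fitting comfortably in an unsigned INT.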
So the question is: how do I assign an unsigned int type to a field using the .to_sql() approach above?
I am using the generic types from sqlalchemy.types; the MySQL-specific types are listed separately in the SQLAlchemy documentation. I have seen a question that gets an unsigned int for MySQL, but it does not use the .to_sql() approach I would like to use. If I can create the table from a single .to_sql() statement, that would be ideal.
To get an unsigned int, pass unsigned=True to the constructor of SQLAlchemy's MySQL-specific INTEGER type (see the SQLAlchemy docs on MySQL types):
In [23]: from sqlalchemy.dialects import mysql
In [24]: mysql.INTEGER(unsigned=True)
Out[24]: INTEGER(unsigned=True)
So you can pass this in the dtype argument of to_sql instead of the more general sql_types.INT:
dtype={'problems': mysql.INTEGER(unsigned=True), ...}
Note: you need at least pandas 0.16.0 for this to work.
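As a rough way to check what the dtype mapping will turn into before touching a database, the types can be compiled against the MySQL dialect. This is a minimal sketch that assumes SQLAlchemy is installed; the names mysql_dialect and ddl_types are illustrative only, and no connection or MySQL driver is needed for this step.

```python
from sqlalchemy import types as sql_types
from sqlalchemy.dialects import mysql

# The dtype mapping from the question, with 'problems' switched to the
# MySQL-specific unsigned integer type.
dtype = {
    'id': sql_types.SMALLINT(),
    'items': sql_types.INT(),
    'problems': mysql.INTEGER(unsigned=True),
}

# Render each type as the column type string the MySQL dialect would emit.
mysql_dialect = mysql.dialect()
ddl_types = {col: t.compile(dialect=mysql_dialect) for col, t in dtype.items()}

for col, ddl in ddl_types.items():
    print(col, ddl)
```

The same dtype dictionary can then be passed unchanged as the dtype argument of df.to_sql().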