Unsigned int for dataframe to_sql using sqlalchemy types

I cannot assign an unsigned int type when using .to_sql() to write my dataframe to a MySQL database. I can use other int types, but I just can't get an unsigned one. A small representative sample of what I'm trying looks like this:

import pandas as pd
from sqlalchemy import create_engine
import sqlalchemy.types as sql_types

db_engine = create_engine('mysql://db_user:db_pass@db_host:db_port/db_schema')

d = {'id': [100,101,102], 'items': [6,10,20000], 'problems': [50,72,2147483649]} # Representative sample dictionary
df = pd.DataFrame(d).set_index('id')

      

This gives:

>>> df
         items    problems
id
100          6          50
101         10          72
102      20000  2147483649

      

I am writing to the database like this:

df.to_sql('my_table',
          db_engine,
          flavor='mysql',  # ignored when an SQLAlchemy engine is passed
          if_exists='replace',
          index_label=['id'],
          dtype={'id': sql_types.SMALLINT,
                 'items': sql_types.INT,
                 'problems': sql_types.INT})

      

But the problems value in the last row (id==102) is truncated to 2147483647 (which is 2^31-1) when written to the database.

There are no other issues with connecting or with writing other standard data types, including int. I could get away with using sql_types.BIGINT instead (which raises the maximum to 2^63-1), but that would be wasteful, since I know my values will stay below 4294967296 (2^32), which is within the unsigned int range (maximum 2^32-1).
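The ranges involved can be checked with a few lines of plain Python. This is only a minimal sketch: the constant names and the fits helper are made up here for illustration, and the clamping with min() merely mimics the truncation observed above rather than reproducing what MySQL does internally.

```python
# Integer limits for the MySQL column types discussed above.
INT_SIGNED_MAX = 2**31 - 1      # 2147483647, maximum of a signed INT
INT_UNSIGNED_MAX = 2**32 - 1    # 4294967295, maximum of INT UNSIGNED
BIGINT_SIGNED_MAX = 2**63 - 1   # maximum of a signed BIGINT

value = 2147483649  # the 'problems' value for id 102


def fits(value, maximum, minimum=0):
    """Return True if value lies within [minimum, maximum]."""
    return minimum <= value <= maximum


print(fits(value, INT_SIGNED_MAX))    # False: too large for a signed INT
print(fits(value, INT_UNSIGNED_MAX))  # True: fits in INT UNSIGNED
print(min(value, INT_SIGNED_MAX))     # 2147483647, the truncated value seen in the table
```

So the value is only 2 above the signed limit, and an unsigned 4-byte column is enough to hold it.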

So the question is: how do I assign an unsigned int type to a field using the .to_sql() approach above?

I used the types from the sqlalchemy documentation, and I have also looked at the MySQL-specific types. I saw another question that gets an unsigned int for MySQL, but it does not use the .to_sql() approach I would like to use. If I could create the table from a single .to_sql() statement, that would be ideal.


1 answer


To get an unsigned int, pass unsigned=True to the constructor of sqlalchemy's MySQL-specific INTEGER type (see the sqlalchemy docs on MySQL types):

In [23]: from sqlalchemy.dialects import mysql

In [24]: mysql.INTEGER(unsigned=True)
Out[24]: INTEGER(unsigned=True)

      

So you can pass this to the dtype argument of to_sql instead of the more general sql_types.INT:



dtype={'problems': mysql.INTEGER(unsigned=True), ...}

      

Note: you need at least pandas 0.16.0 for this to work.
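Putting it together, the full dtype mapping for the question's table might look like the sketch below. It assumes sqlalchemy is installed, and write_frame is a hypothetical helper name used only for illustration. The types are compiled against the MySQL dialect so the resulting column definition can be inspected without a database connection.

```python
from sqlalchemy import types as sql_types
from sqlalchemy.dialects import mysql

# Column types for the example table; only 'problems' needs to be unsigned.
dtype = {
    'id': sql_types.SMALLINT(),
    'items': sql_types.INT(),
    'problems': mysql.INTEGER(unsigned=True),
}

# Render each type the way the MySQL dialect would emit it in CREATE TABLE.
rendered = {name: col_type.compile(dialect=mysql.dialect())
            for name, col_type in dtype.items()}
print(rendered['problems'])  # INTEGER UNSIGNED


def write_frame(df, engine):
    """Hypothetical helper: write df to 'my_table' using the types above."""
    df.to_sql('my_table', engine, if_exists='replace',
              index_label='id', dtype=dtype)
```

Calling write_frame(df, db_engine) with the dataframe and engine from the question should then create the problems column as INTEGER UNSIGNED, so 2147483649 is stored without truncation.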
