Python blaze (pandas): cannot safely convert passed user dtype of <i8
I want to read the file uk.txt from the UK NGA GeoNames download using Python blaze, and then use odo to insert it into a PostgreSQL database.
Code:
import blaze as bz
from odo import odo
dataPath = 'uk.txt'
myData = bz.Data(dataPath, sep='\t')
out = odo(myData, 'postgresql://postgres:postgres@localhost:5432/blaze_test::uk_geonames')
I am getting this error:
ValueError: cannot safely convert passed user dtype of <i8 for object dtyped data in column 0
I read this as "the data type cannot be converted for insertion into the database". Should I force the dtype to some specific value? How can I fix this?
Sample input from file:
RC UFI UNI LAT LONG DMS_LAT DMS_LONG MGRS JOG FC DSG PC CC1 ADM1 POP ELEV CC2 NT LC SHORT_FORM GENERIC SORT_NAME_RO FULL_NAME_RO FULL_NAME_ND_RO SORT_NAME_RG FULL_NAME_RG FULL_NAME_ND_RG NOTE MODIFY_DATE DISPLAY NAME_RANK NAME_LINK TRANSL_CD NM_MODIFY_DATE
1 380952 475802 54.086111 -6.655556 540510 -63920 29UPV5334795644 NN29-06 H STM EI,UK EI,UK N Clarebane CLAREBANERIVER Clarebane River Clarebane River CLAREBANERIVER Clarebane River Clarebane River 2014-06-27 1,2,3 2
For some reason, the header is not being inferred correctly. You can pass the keyword argument has_header like this:
In [12]: from blaze import Data
In [13]: from odo import CSV, odo
In [14]: d = Data(CSV('uk.txt', sep='\t', has_header=True))
In [15]: d.head(5)
Out[15]:
RC UFI UNI LAT LONG DMS_LAT DMS_LONG \
0 1 380952 475802 54.086111 -6.655556 540510 -63920
1 1 380952 475801 54.086111 -6.655556 540510 -63920
2 1 380954 475805 54.104722 -6.648889 540617 -63856
3 1 380955 475806 54.098056 -6.644167 540553 -63839
4 1 380958 475810 54.040556 -6.614444 540226 -63652
MGRS JOG FC ... SORT_NAME_RG \
0 29UPV5334795644 NN29-06 H ... CLAREBANERIVER
1 29UPV5334795644 NN29-06 H ... CLAREBANE
2 29UPV5371497729 NN29-06 H ... ALINA LOUGH
3 29UPV5404796997 NN29-06 H ... CORLISSLOUGH
4 29UPV5620690667 NN29-06 H ... DRUMBOYLOUGH
FULL_NAME_RG FULL_NAME_ND_RG NOTE MODIFY_DATE DISPLAY NAME_RANK \
0 Clarebane River Clarebane River NaN 2014-06-27 1,2,3 2
1 Clarebane Clarebane NaN 2014-06-27 1,2,3 1
2 Alina, Lough Alina, Lough NaN 2014-06-27 1,2,3 1
3 Corliss Lough Corliss Lough NaN 2014-06-27 1,2,3 1
4 Drumboy Lough Drumboy Lough NaN 2014-06-27 1,2,3 1
NAME_LINK TRANSL_CD NM_MODIFY_DATE
0 NaN NaN 2014-06-27
1 NaN NaN 2014-06-27
2 NaN NaN 2014-06-27
3 NaN NaN 2014-06-27
4 NaN NaN 2014-06-27
[5 rows x 34 columns]
After that, just odo it into the desired table:
In [16]: t = odo(d, 'postgresql://localhost::uk')
In [17]: uk = Data(t)
In [19]: uk.head(5)
Out[19]:
RC UFI UNI LAT LONG DMS_LAT DMS_LONG \
0 1 380952 475802 54.086111 -6.655556 540510 -63920
1 1 380952 475801 54.086111 -6.655556 540510 -63920
2 1 380954 475805 54.104722 -6.648889 540617 -63856
3 1 380955 475806 54.098056 -6.644167 540553 -63839
4 1 380958 475810 54.040556 -6.614444 540226 -63652
MGRS JOG FC ... SORT_NAME_RG \
0 29UPV5334795644 NN29-06 H ... CLAREBANERIVER
1 29UPV5334795644 NN29-06 H ... CLAREBANE
2 29UPV5371497729 NN29-06 H ... ALINA LOUGH
3 29UPV5404796997 NN29-06 H ... CORLISSLOUGH
4 29UPV5620690667 NN29-06 H ... DRUMBOYLOUGH
FULL_NAME_RG FULL_NAME_ND_RG NOTE MODIFY_DATE DISPLAY NAME_RANK \
0 Clarebane River Clarebane River NaN 2014-06-27 1,2,3 2
1 Clarebane Clarebane NaN 2014-06-27 1,2,3 1
2 Alina, Lough Alina, Lough NaN 2014-06-27 1,2,3 1
3 Corliss Lough Corliss Lough NaN 2014-06-27 1,2,3 1
4 Drumboy Lough Drumboy Lough NaN 2014-06-27 1,2,3 1
NAME_LINK TRANSL_CD NM_MODIFY_DATE
0 NaN NaN 2014-06-27
1 NaN NaN 2014-06-27
2 NaN NaN 2014-06-27
3 NaN NaN 2014-06-27
4 NaN NaN 2014-06-27
[5 rows x 34 columns]
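For intuition about why the header matters here, the following is a minimal standard-library sketch, not what blaze, odo, or pandas do internally. It assumes a three-column excerpt of the sample file, and column_is_int is a hypothetical helper written only for this illustration. When the header row is treated as data, column 0 contains the string RC and can no longer be read as an integer, which is consistent with the error complaining about object-dtyped data in column 0.

```python
import csv
import io

# Hypothetical three-column excerpt of uk.txt (tab-separated, header first).
sample = "RC\tUFI\tUNI\n1\t380952\t475802\n1\t380952\t475801\n"


def column_is_int(text, column, skip_header):
    """Return True if every value in the given column parses as an int."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if skip_header:
        rows = rows[1:]  # drop the header row before checking values
    try:
        for row in rows:
            int(row[column])
    except ValueError:
        return False
    return True


# Header treated as data: "RC" is not an integer, so the column is not int.
print(column_is_int(sample, 0, skip_header=False))  # False
# Header recognised and skipped: every remaining value is an integer.
print(column_is_int(sample, 0, skip_header=True))   # True
```

Passing has_header=True to CSV, as in the session above, tells the reader to treat the first line as column names so that it is not mixed in with the values.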