Python error while reading number from csv
I have a CSV file with the columns order_id, name, address.
When I try to insert data from the CSV into a PostgreSQL table via Python, the number is not read correctly.
For example, let the data be:
order_id | name | address
----------+--------+----------
5432548543| Manish | Dummy Address
It reads the order_id as 5.43E+9 instead of an integer. My code is:
filename.encode('utf-8')
with open(filename) as file:
    data = csv.DictReader(file)
    cur.executemany("""Insert into temp_unicom values(%(Order Id)s,%(Name)s,%(Address)s)""", data)
Here Order Id, Name and Address are the headers of my CSV file.
How do I format the data correctly?
EDIT: Link to the CSV file
When I turn the example you provided into a CSV file:
order_id,name,address
5432548543,Manish,Dummy Address
And just iterate over the lines, printing them out:
import csv

with open('./data.txt') as f:
    data = csv.DictReader(f)
    for row in data:
        print(row)
I get:
{'order_id': '5432548543', 'name': 'Manish', 'address': 'Dummy Address'}
Which suggests that the problem isn't csv parsing - but you should try the same on your dataset to double check.
The question then becomes: what is your postgres driver that might be causing the problem? Are you using psycopg2? Does it do some kind of automatic casting somewhere?
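Before looking at the driver, it may be worth checking whether the scientific notation is already present in the file itself (spreadsheet exports sometimes write large numbers that way). Here is a minimal sketch of such a check. The sample data and the helper name find_scientific are made up for illustration, and the column name "Order Id" is taken from the question:

```python
import csv
import io

# Hypothetical sample standing in for the real file; the second data row
# mimics a large number that was saved in scientific notation.
SAMPLE = (
    "Order Id,Name,Address\n"
    "5432548543,Manish,Dummy Address\n"
    "5.43E+09,Other,Elsewhere\n"
)

def find_scientific(fileobj, column="Order Id"):
    """Return (line_number, raw_value) pairs whose value is not plain digits."""
    suspicious = []
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(csv.DictReader(fileobj), start=2):
        value = row[column].strip()
        if not value.isdigit():
            suspicious.append((line_no, value))
    return suspicious

print(find_scientific(io.StringIO(SAMPLE)))  # [(3, '5.43E+09')]
```

If this reports rows for your real file, the csv module and the driver are only passing along what the file already contains.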
EDIT: So the problem is with the source data: some of the integers are stored in scientific notation. You need to clean the data before passing it to executemany:
data = csv.DictReader(f)
clean_data = []
for d in data:
    clean_data.append(d)
    try:
        # '5.43E+9' -> 5430000000.0 -> 5430000000 -> '5430000000'
        d['Order Id'] = str(int(float(d['Order Id'])))
    except ValueError:
        # Not a number at all; leave the value unchanged
        pass
cur.executemany("""Insert into temp_unicom values (%(Order Id)s, %(Name)s, %(Address)s)""", clean_data)
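Two caveats about this kind of cleaning. First, digits that were already rounded away in the file cannot be recovered: '5.43E+9' becomes 5430000000, not 5432548543. Second, going through float can change ids above 2**53, since a float cannot represent every integer beyond that. A possible variant that avoids the second issue uses decimal.Decimal; normalize_order_id is a hypothetical helper name, not something from the question:

```python
from decimal import Decimal, InvalidOperation

def normalize_order_id(raw):
    """Return raw as a plain integer string where possible, else unchanged."""
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        return raw  # not numeric at all; leave it unchanged
    if not value.is_finite() or value != value.to_integral_value():
        return raw  # NaN, infinity or a fractional value; leave it unchanged
    return str(int(value))

print(normalize_order_id("5.43E+9"))      # 5430000000
print(normalize_order_id("5432548543"))   # 5432548543
```

In the loop above it would replace the try/except, e.g. d['Order Id'] = normalize_order_id(d['Order Id']).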
Try formatting the float as a string before passing the dict to the query.
Example:
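# Format Order Id as a plain integer string in Python; the placeholder stays %s
cur.executemany("""Insert into temp_unicom values(%(Order Id)s,%(Name)s,%(Address)s)""", [dict((k, v) if k != "Order Id" else (k, '%.0f' % float(v)) for k, v in dict1.items()) for dict1 in data])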
Also, rename the dictionary to something else (for example, I renamed it to dict1), because otherwise it will shadow the built-in function dict.
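To show what the formatting step does on its own, here is a small sketch. format_order_id is a hypothetical helper name, and it assumes the value arrives as a string, which is what csv.DictReader yields:

```python
def format_order_id(raw):
    """Format a numeric string as a plain integer string with no exponent."""
    return '%.0f' % float(raw)

value = float("5.43E+9")
print('%f' % value)                # 5430000000.000000
print('%.0f' % value)              # 5430000000
print(format_order_id("5.43E+9"))  # 5430000000
```

Note that a bare '%f' keeps six decimal places, so '%.0f' is probably what you want for an integer column. Because this goes through float, very large ids (above 2**53) may not round-trip exactly.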