Can't convert pandas DataFrame to json using to_json
I know there are several other posts on Stack Overflow about this same problem; however, none of the solutions in those posts, or in any other post I found on the internet, worked for me. I followed numerous tutorials, videos, books and Stack Overflow posts on pandas, and all the solutions mentioned have failed.
The frustrating part is that all the solutions I have found are correct, or at least should be. I am new to pandas, so my only conclusion is that I am probably doing something wrong.
Here is the pandas documentation I started with: Pandas to_json Doc. I can't seem to get pandas to_json to convert the DataFrame to a JSON object or JSON string.
Basically, I want to convert a csv string to a DataFrame, and then convert that DataFrame to a JSON object or JSON string (I don't care which one). Then, once I have my JSON data structure, I am going to bind it to a D3.js histogram.
Here's an example of what I'm trying to do:
import StringIO  # Python 2 module; the snippet below is Python 2 code
import pandas
# Declare my csv string (Works):
csvStr = '"pid","dos","facility","a1c_val"\n"123456","2013-01-01 13:37:00","UOFU",5.4\n"65432","2014-01-01 14:32:00","UOFU",5.8\n"65432","2013-01-01 13:01:00","UOFU",6.4'
print (csvStr) # Just checking the variable's contents
# Read csv and convert to DataFrame (Works):
csvDf = pandas.read_csv(StringIO.StringIO(csvStr))
print (csvDf) # Just checking the variable's contents
# Convert DataFrame to json (Three of the ways I tried - None of them work):
myJSON = csvDf.to_json(path_or_buf = None, orient = 'record', date_format = 'epoch', double_precision = 10, force_ascii = True, date_unit = 'ms', default_handler = None) # Attempt 1
print (myJSON) # Just checking the variable's contents
myJSON = csvDf.to_json() # Attempt 2
print (myJSON) # Just checking the variable's contents
myJSON = pandas.io.json.to_json(csvDf) # Attempt 3
print (myJSON) # Just checking the variable's contents
The error I am getting:
argument 1 must be a string or read-only character buffer, not a DataFrame
This is misleading, since the documentation states that "A Series or DataFrame can be converted to a valid JSON string".
Regardless, I tried giving it a string anyway, and it resulted in the same error.
I tried to create a test case by following the exact steps from books, tutorials and other posts, and it just results in the same error. At this point, I need a simple solution as soon as possible. I am open to suggestions, but I must emphasize that I don't have time to spend learning a brand new library.
In your first attempt, the correct value for orient is 'records', not 'record'.
This worked for me:
myJSON = csvDf.to_json(path_or_buf = None, orient = 'records', date_format = 'epoch', double_precision = 10, force_ascii = True, date_unit = 'ms', default_handler = None) # Attempt 1
Printing gives:
[{"pid":123456,"dos":"2013-01-01 13:37:00","facility":"UOFU","a1c_val":5.4},
{"pid":65432,"dos":"2014-01-01 14:32:00","facility":"UOFU","a1c_val":5.8},
{"pid":65432,"dos":"2013-01-01 13:01:00","facility":"UOFU","a1c_val":6.4}]
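For readers on Python 3, here is a minimal sketch of the same round trip. It assumes a recent pandas install and uses io.StringIO in place of the Python 2 StringIO module from the question; the variable names csv_str, df, json_str and records are my own. Parsing the result with json.loads is just a quick way to confirm that to_json returned a valid JSON string.

```python
import io
import json

import pandas as pd

# Same CSV data as in the question, split across lines for readability.
csv_str = (
    '"pid","dos","facility","a1c_val"\n'
    '"123456","2013-01-01 13:37:00","UOFU",5.4\n'
    '"65432","2014-01-01 14:32:00","UOFU",5.8\n'
    '"65432","2013-01-01 13:01:00","UOFU",6.4'
)

# io.StringIO wraps the string in a file-like object that read_csv accepts.
df = pd.read_csv(io.StringIO(csv_str))

# orient="records" produces a JSON array with one object per row.
json_str = df.to_json(orient="records")

# Parse the string back into Python objects to check that it is valid JSON.
records = json.loads(json_str)
print(records[0])
```

The records layout (a list of row objects) is usually the easiest shape to hand to a D3.js data binding, which is why it is used here.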
It turns out this problem was caused by my own silly mistake. While checking my use of to_json, I had copied and pasted the example into my code and then moved on. Thinking I had commented that code out, I started using to_json with my test data. The error I was getting was actually thrown by the example code I had copied and pasted. Once I deleted everything and rewrote it using my test data, it worked.
However, as user667648 (Bair) pointed out, there was a second error in my code: the orient parameter had to be orient = 'records', NOT orient = 'record'.
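To see the difference between the two orient values for yourself, here is a small sketch. It assumes a recent pandas version, where I would expect an unrecognized orient value to raise a ValueError; older releases may behave differently. The helper to_json_or_error and the two-row DataFrame are made up for illustration.

```python
import pandas as pd

# Small stand-in DataFrame (illustrative values only).
df = pd.DataFrame({"pid": [123456, 65432], "a1c_val": [5.4, 5.8]})


def to_json_or_error(frame, orient):
    """Return (json_string, None) on success or (None, error_message) on failure."""
    try:
        return frame.to_json(orient=orient), None
    except ValueError as exc:
        # Assumption: recent pandas rejects unknown orient values with ValueError.
        return None, str(exc)


good_json, good_err = to_json_or_error(df, "records")  # valid spelling
bad_json, bad_err = to_json_or_error(df, "record")     # the typo from the question

print(good_json)
print(bad_err)
```

Checking a call in isolation like this also helps rule out the other problem described above, where the error came from a different line than the one being debugged.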