Pandas and Stata 13 files

I have pandas 0.13.1 installed, but pandas.read_stata() cannot read .dta files saved in Stata 13 format. It fails with the error:

TypeError: cannot concatenate 'str' and 'NoneType' objects

The same dataset saved in Stata 12 format reads without any problem.

I thought the latest version of pandas (0.13.1) could handle Stata 13 dataset files. Am I doing something wrong?



1 answer


You are probably not doing anything wrong; your version of pandas simply cannot handle Stata 13 dataset files. As described in help dta, the format of Stata .dta files changed with the release of Stata 13.
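If you want to confirm which format a given file uses, you can peek at its first bytes. This is only a rough sketch. It assumes, from my reading of help dta, that Stata 13 files (format 117) start with an XML-like <stata_dta><header><release> tag, while older formats store the format number in the very first byte. The function name dta_format_version is made up for this example.

```python
import re


def dta_format_version(header_bytes):
    """Guess the .dta format number from the first bytes of a file.

    Assumption: format 117 and later start with an XML-like release tag,
    and older formats keep the format number in the first byte.
    """
    match = re.match(rb"<stata_dta><header><release>(\d+)</release>", header_bytes)
    if match:
        return int(match.group(1))
    if header_bytes:
        # Indexing a bytes object yields an int in Python 3.
        return header_bytes[0]
    raise ValueError("empty header")
```

To use it, read a few dozen bytes with something like open(path, "rb").read(64) and pass them in. A result of 117 would suggest the Stata 13 format, and 114 or 115 an older one.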

Solution 1.

Update your pandas to v0.14.0 (May 31, 2014):

read_stata now accepts Stata 13 format (GH4291)

Source: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
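To make the version requirement explicit in a script, something like the following could work. It is a minimal sketch, not an official pandas recipe: parse_version, supports_stata13 and read_dta are hypothetical helper names, and the 0.14.0 threshold comes from the release note quoted above.

```python
import re


def parse_version(version):
    """Return the first three numeric components of a version string."""
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad short versions such as "0.14" so tuple comparison behaves.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def supports_stata13(pandas_version):
    """True if this pandas version should read Stata 13 files (0.14.0 or later)."""
    return parse_version(pandas_version) >= (0, 14, 0)


def read_dta(path):
    """Read a .dta file, failing early with a clear message on old pandas."""
    import pandas as pd  # imported here so the helpers above work without pandas

    if not supports_stata13(pd.__version__):
        raise RuntimeError(
            "pandas %s is too old for Stata 13 files; upgrade to 0.14.0 or later"
            % pd.__version__
        )
    return pd.read_stata(path)
```

Calling read_dta("mydata.dta") (the file name is a placeholder) on pandas 0.13.1 would then raise a readable error instead of the TypeError from the question.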

Solution 2.

If you have access to Stata, there are several ways to save the data in an earlier or different format that an older version of pandas can read. See the answers to:

Read Stata file 13 in R

Edit

Solution 3.

After some discussion on GitHub, the pandas issue seems to lie with the string variables in Stata 13 datasets. So another solution could be to convert the string variables to a numeric type. See help encode, which will create the corresponding value labels, or perhaps help real or help destring if the strings are just numbers stored as strings.
