Loading a STATA file: categorical values must be unique

I am trying to load the .dta file from this zip file into pandas. However, I get an error immediately. I also have Stata at my disposal, but since the error message doesn't tell me anything else, such as which column is at fault, I don't know how to proceed.

How can I load the file into pandas?


>>> df = pd.read_stata('cepr_org_2014.dta')

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/io/", line 69, in read_stata
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/io/", line 1315, in data
    cat_data.categories = categories
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/core/", line 442, in _set_categories
    categories = self._validate_categories(categories)
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas-0.15.2-py2.7-macosx-10.9-x86_64.egg/pandas/core/", line 437, in _validate_categories
    raise ValueError('Categorical categories must be unique')
ValueError: Categorical categories must be unique




1 answer

Load the file with pandas.read_stata('cepr_org_2014.dta', convert_categoricals=False, convert_missing=True)

and see what the data looks like. With convert_categoricals=False the value labels are not turned into a pandas Categorical, so the uniqueness check that raises the error is skipped. Optionally, stepping through with ipdb shows that there is a duplicate category in your data.
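To find out which label set contains the duplicate category, something along these lines may help. It is only a sketch: the helper names find_duplicate_labels and duplicate_labels_in_file are made up for this example, the sample dictionary is invented rather than taken from cepr_org_2014.dta, and the file-reading part assumes a recent pandas release in which pd.read_stata(..., iterator=True) returns a reader that works as a context manager and exposes value_labels().

```python
from collections import Counter


def find_duplicate_labels(value_labels):
    """Return {label_set_name: [label texts used by more than one code]}.

    `value_labels` is expected in the shape returned by
    StataReader.value_labels(): {label_set_name: {numeric_code: label_text}}.
    """
    duplicates = {}
    for name, mapping in value_labels.items():
        counts = Counter(mapping.values())
        repeated = sorted(text for text, n in counts.items() if n > 1)
        if repeated:
            duplicates[name] = repeated
    return duplicates


def duplicate_labels_in_file(path):
    """Read the value labels of a .dta file and report duplicated texts."""
    # Imported lazily so the pure helper above can be used without pandas.
    import pandas as pd

    # Assumption: recent pandas, where the reader is a context manager.
    with pd.read_stata(path, iterator=True) as reader:
        return find_duplicate_labels(reader.value_labels())


# Hypothetical label sets, invented for illustration only.
example = {
    "state": {1: "AL", 2: "AK", 3: "AL"},
    "female": {0: "Male", 1: "Female"},
}
print(find_duplicate_labels(example))  # {'state': ['AL']}
```

Calling duplicate_labels_in_file('cepr_org_2014.dta') should then list the label sets whose texts repeat. Those can be fixed in Stata, or the file can be loaded with convert_categoricals=False and the labels applied by hand.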


