Python 3.4: Pandas DataFrame not responding to ordered dictionary
I am filling a DataFrame from an ordered dictionary, but the resulting pandas DataFrame orders the row labels alphabetically.
code
labels = income_data[0:-1:4]
year1 = income_data[1:-1:4]
key = eachTicker
value = OrderedDict(zip(labels, year1))
full_dict[key] = value
df = pd.DataFrame(full_dict)
print(df)
As you can see below, full_dict is a nested dictionary zipped together from several lists, namely labels and year1.
Output of full_dict:
print(full_dict)
OrderedDict([('AAPL', OrderedDict([('Total Revenue', 182795000), ('Cost of Revenue', 112258000), ('Gross Profit', 70537000), ('Research Development', 6041000), ('Selling General and Administrative', 11993000), ('Non Recurring', 0), ('Others', 0), ('Total Operating Expenses', 0), ('Operating Income or Loss', 52503000), ('Total Other Income/Expenses Net', 980000), ('Earnings Before Interest And Taxes', 53483000), ('Interest Expense', 0), ('Income Before Tax', 53483000), ('Income Tax Expense', 13973000), ('Minority Interest', 0), ('Net Income From Continuing Ops', 39510000), ('Discontinued Operations', 0), ('Extraordinary Items', 0), ('Effect Of Accounting Changes', 0), ('Other Items', 0), ('Net Income', 39510000), ('Preferred Stock And Other Adjustments', 0), ('Net Income Applicable To Common Shares', 39510000)]))])
The output DataFrame is sorted alphabetically and I don't know why. I want it to be ordered as in full_dict.
code output
                                             AAPL      AMZN     LNKD
Cost of Revenue                         112258000  62752000   293797
Discontinued Operations                         0         0        0
Earnings Before Interest And Taxes       53483000     99000    31205
Effect Of Accounting Changes                    0         0        0
Extraordinary Items                             0         0        0
Gross Profit                             70537000  26236000  1924970
Income Before Tax                        53483000   -111000    31205
Income Tax Expense                       13973000    167000    46525
Interest Expense                                0    210000        0
Minority Interest                               0         0     -427
Net Income                               39510000   -241000   -15747
Net Income Applicable To Common Shares   39510000   -241000   -15747
Net Income From Continuing Ops           39510000   -241000   -15747
Non Recurring                                   0         0        0
Operating Income or Loss                 52503000    178000    36135
Other Items                                     0         0        0
Others                                          0         0   236946
Preferred Stock And Other Adjustments           0         0        0
Research Development                      6041000         0   536184
Selling General and Administrative       11993000  26058000  1115705
Total Operating Expenses                        0         0        0
Total Other Income/Expenses Net            980000    -79000    -4930
Total Revenue                           182795000  88988000  2218767
This looks like a bug in the DataFrame constructor, in the sense that it does not respect the order of the keys when the orientation is "columns". A workaround is to use from_dict with the orientation specified as "index" and then transpose the result:
In [31]:
df = pd.DataFrame.from_dict(d, orient='index').T
df
Out[31]:
                                             AAPL
Total Revenue                           182795000
Cost of Revenue                         112258000
Gross Profit                             70537000
Research Development                      6041000
Selling General and Administrative       11993000
Non Recurring                                   0
Others                                          0
Total Operating Expenses                        0
Operating Income or Loss                 52503000
Total Other Income/Expenses Net            980000
Earnings Before Interest And Taxes       53483000
Interest Expense                                0
Income Before Tax                        53483000
Income Tax Expense                       13973000
Minority Interest                               0
Net Income From Continuing Ops           39510000
Discontinued Operations                         0
Extraordinary Items                             0
Effect Of Accounting Changes                    0
Other Items                                     0
Net Income                               39510000
Preferred Stock And Other Adjustments           0
Net Income Applicable To Common Shares   39510000
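For anyone who wants to try the workaround in isolation, here is a minimal self-contained sketch. It assumes a cut-down full_dict holding only the first three AAPL values from the question. The second variant, reindexing the plain constructor result with the original key order, is my own suggested alternative and not part of the answer above.

```python
from collections import OrderedDict

import pandas as pd

# Cut-down stand-in for the question's full_dict (first three AAPL values only).
full_dict = OrderedDict()
full_dict['AAPL'] = OrderedDict([
    ('Total Revenue', 182795000),
    ('Cost of Revenue', 112258000),
    ('Gross Profit', 70537000),
])

# Workaround from the answer: build with the tickers as rows, then transpose.
df = pd.DataFrame.from_dict(full_dict, orient='index').T
row_labels = list(df.index)

# Alternative (an assumption, not from the answer): build normally, then
# force the row order explicitly with the original key order.
wanted_order = list(full_dict['AAPL'].keys())
df2 = pd.DataFrame(full_dict).reindex(wanted_order)

print(row_labels)
print(list(df2.index))
```

The explicit reindex has the advantage that it does not depend on how a particular pandas version orders dictionary keys internally.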
EDIT
The error is related to line 5746 in index.py:
def _union_indexes(indexes):
    if len(indexes) == 0:
        raise AssertionError('Must have at least 1 Index to union')
    if len(indexes) == 1:
        result = indexes[0]
        if isinstance(result, list):
            result = Index(sorted(result))  # <------ culprit
        return result
When it constructs the index, it fetches the keys using result = indexes[0], but then checks whether the result is a list and, if so, sorts it: result = Index(sorted(result)). That is why you get the alphabetical order.
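To make the effect concrete without depending on a specific pandas version, here is a pure-Python sketch of the single-index branch quoted above. The function name union_single_list is hypothetical, and a plain list stands in for pandas' Index, so this only illustrates the sorting step and is not the library code itself.

```python
def union_single_list(indexes):
    """Simplified stand-in for the one-index branch of _union_indexes."""
    if len(indexes) == 0:
        raise AssertionError('Must have at least 1 Index to union')
    if len(indexes) == 1:
        result = indexes[0]
        if isinstance(result, list):
            # The insertion order of the keys is discarded here.
            result = sorted(result)
        return result
    raise NotImplementedError('only the single-index case is sketched')


labels = ['Total Revenue', 'Cost of Revenue', 'Gross Profit']
sorted_labels = union_single_list([labels])
print(sorted_labels)
```

The keys go in as Total Revenue, Cost of Revenue, Gross Profit and come back alphabetically, which matches the ordering seen in the question's output.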