Pandas: writing aggregate function output to xlsx
I have SQLite queries that I turned into pandas DataFrames. I passed these DataFrames to functions to get aggregated information. How can I populate an Excel spreadsheet with the results of such a function? That is, how can I turn the function's output into a DataFrame? (Note: I am using openpyxl to create the workbook.)
Here is the code for the DataFrame and the function:
# Nationwide measure statistics
nationwide_measures = pd.read_sql_query("""select state,
measure_id,
measure_name,
score
from timely_and_effective_care___hospital;""", conn)
# Remove the non-numeric string values from 'score'
# (.copy() avoids a SettingWithCopyWarning on the assignment below)
nationwide_measures1 = nationwide_measures[nationwide_measures['score'].astype(str).str.isdigit()].copy()
# Change score to numeric
nationwide_measures1['score'] = pd.to_numeric(nationwide_measures1['score'])
# Function to grab measure values
def get_stats(group):
    return {'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()}
# Function output
nationwide_measures1['score'].groupby(nationwide_measures1['measure_id']).apply(get_stats).unstack()
I tried:
# Function to grab measure values
def get_stats(group):
    return pd.DataFrame({'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()})
but this raises "ValueError: If using all scalar values, you must pass an index"
I've also tried:
# Function to grab measure values
def get_stats(group):
    df = pd.DataFrame({'Measure Name': group.columns['measure_name'], 'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()}, index=[0])
    return df
But this gives an error: "AttributeError: 'Series' object has no attribute 'columns'"
1 answer
In the pd.DataFrame call of your function, you pass only scalar values and no iterables, so pandas has nothing to build an index from. If you add index=[0], you get a single-row DataFrame:
pd.DataFrame({'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()},index=[0])
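As an alternative to building a one-row frame per group, here is a minimal sketch that lets groupby produce the DataFrame directly and then writes it to a workbook. The sample rows, the sheet name "Measures", and the helper write_stats are assumptions made for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned query result in the question.
nationwide_measures1 = pd.DataFrame({
    "measure_id": ["ED_1b", "ED_1b", "OP_18b", "OP_18b"],
    "score": [100, 300, 90, 150],
})

# agg with a list of function names returns a DataFrame:
# one row per measure_id, one column per statistic.
stats = (
    nationwide_measures1.groupby("measure_id")["score"]
    .agg(["min", "max", "mean", "std"])
    .rename(columns={
        "min": "Minimum",
        "max": "Maximum",
        "mean": "Average",
        "std": "Standard Deviation",
    })
)


def write_stats(frame, path):
    """Write the statistics to an .xlsx file (hypothetical helper).

    Requires the openpyxl package to be installed.
    """
    frame.to_excel(path, sheet_name="Measures", engine="openpyxl")
```

Calling write_stats(stats, "measures.xlsx") should then produce a sheet with measure_id in the first column and the four statistics beside it. Since stats is an ordinary DataFrame, it can also be joined with other columns (for example measure_name) before writing.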