Writing the output of a pandas aggregate function to an xlsx file

I have SQLite queries that I turned into pandas DataFrames. I passed these DataFrames to functions to get aggregated information. How can I populate an Excel spreadsheet with the results of such a function? In other words, how can I turn a function's output into a DataFrame? (Note: I am using openpyxl to create the workbook.)

Here is the code for the DataFrame and the function:

# Nationwide measure statistics
nationwide_measures = pd.read_sql_query("""select state,
          measure_id,
          measure_name,
          score
from timely_and_effective_care___hospital;""", conn)

# Remove the non-numeric string values from 'score'
# (.copy() makes an independent frame, avoiding SettingWithCopyWarning on the assignment below)
nationwide_measures1 = nationwide_measures[nationwide_measures['score'].astype(str).str.isdigit()].copy()

# Change score to numeric
nationwide_measures1['score'] = pd.to_numeric(nationwide_measures1['score'])

# Function to grab measure values
def get_stats(group):
    return {'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()}

# Function output    
nationwide_measures1['score'].groupby(nationwide_measures1['measure_id']).apply(get_stats).unstack()

      

I tried:

# Function to grab measure values
def get_stats(group):
    return pd.DataFrame({'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()})

      

but this raises "ValueError: If using all scalar values, you must pass an index"

I've also tried:

# Function to grab measure values
def get_stats(group):
    df = pd.DataFrame({'Measure Name': group.columns['measure_name'],'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()}, index = [0])
    return df

      

But this raises an error: "AttributeError: 'Series' object has no attribute 'columns'"
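
For context on this second error: grouping the 'score' column alone hands each group to the function as a Series, which has no columns to look up. One way to carry the measure name through, sketched here with hypothetical sample rows and hypothetical names (measures, grouped_scores, summary), is to group the whole DataFrame by both key columns and aggregate 'score':

```python
import pandas as pd

# Hypothetical rows shaped like the query result in the question.
measures = pd.DataFrame({
    "measure_id": ["M1", "M1", "M2", "M2"],
    "measure_name": ["Name one", "Name one", "Name two", "Name two"],
    "score": [10, 20, 5, 15],
})

# Group by both identifying columns, then select the column to aggregate.
grouped_scores = measures.groupby(["measure_id", "measure_name"])["score"]

# Each group handed to a function is a Series, so it has no .columns attribute.
first_key, first_group = next(iter(grouped_scores))

# agg with a list of function names returns a DataFrame;
# reset_index turns the two group keys back into ordinary columns.
summary = grouped_scores.agg(["min", "max", "mean", "std"]).reset_index()
```

Here summary has one row per measure, with measure_id and measure_name alongside the four statistics.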

1 answer


In your DataFrame creation statement, every value you pass to pd.DataFrame is a scalar and none of them is an iterable, so pandas has nothing to build an index from. If you add index=[0], you get a single-row DataFrame.



pd.DataFrame({'Minimum': group.min(), 'Maximum': group.max(), 'Average': group.mean(), 'Standard Deviation': group.std()},index=[0]) 
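
To get from there to a spreadsheet, one option is to let groupby(...).agg(...) build the whole per-measure table, since its result is already a DataFrame. The following is a minimal sketch under stated assumptions: the sample data, the LABELS mapping, the write_stats helper, the file name measure_stats.xlsx and the sheet name are all hypothetical, and writing the file requires openpyxl to be installed.

```python
import pandas as pd

# Hypothetical sample standing in for the cleaned nationwide_measures1 frame.
scores = pd.DataFrame({
    "measure_id": ["A", "A", "B", "B", "B"],
    "score": [1, 3, 2, 4, 6],
})

# All-scalar values need an explicit index; this gives a one-row frame.
single_row = pd.DataFrame(
    {"Minimum": scores["score"].min(), "Maximum": scores["score"].max()},
    index=[0],
)

# Map pandas' aggregation names to the column labels used in the question.
LABELS = {"min": "Minimum", "max": "Maximum",
          "mean": "Average", "std": "Standard Deviation"}

# One row per measure_id, one column per statistic.
stats = (
    scores.groupby("measure_id")["score"]
    .agg(["min", "max", "mean", "std"])
    .rename(columns=LABELS)
)


def write_stats(frame, path="measure_stats.xlsx", sheet_name="Nationwide"):
    """Write the summary frame to an .xlsx file (assumes openpyxl is installed)."""
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        frame.to_excel(writer, sheet_name=sheet_name)
```

Calling write_stats(stats) would then write the table with measure_id as the first column of the sheet.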

      
