Seaborn Barplot - Displaying Values

I'm looking at how to do two things in Seaborn, using a bar plot to display values that are in the DataFrame but not in the graph.

1) I want to display the values of one field in a DataFrame while plotting another. For example, below I am plotting "tip", but I would like to place the value of "total_bill" centered above each of the bars (i.e. 325.88 above Friday, 1778.40 above Saturday, etc.)

2) Is there a way to scale the colors of the bars, with the lowest value of "total_bill" having the lightest color (in this case Friday) and the highest value of "total_bill" having the darkest? Obviously, I'd stick with one color (i.e. blue) when I do the scaling.

Thanks! I'm sure this is easy, but I'm missing it.

While I see that others think this is a duplicate of another question (or two), I am missing the part about how to use a value that is not in the graph as the basis for the label or the shading. How do I say: use total_bill as the basis? I'm sorry, but I just can't figure it out based on those answers.

Starting with the following code,

import pandas as pd
import seaborn as sns
%matplotlib inline
df=pd.read_csv("https://raw.githubusercontent.com/wesm/pydata-book/master/ch08/tips.csv", sep=',')
groupedvalues=df.groupby('day').sum().reset_index()
g=sns.barplot(x='day',y='tip',data=groupedvalues)

      

I am getting the following output:

enter image description here

Temporary solution:

for index, row in groupedvalues.iterrows():
    g.text(row.name,row.tip, round(row.total_bill,2), color='black', ha="center")

      

enter image description here

On the shading, using the example from the linked question, I tried the following:

import pandas as pd
import seaborn as sns
%matplotlib inline
df=pd.read_csv("https://raw.githubusercontent.com/wesm/pydata-book/master/ch08/tips.csv", sep=',')
groupedvalues=df.groupby('day').sum().reset_index()

pal = sns.color_palette("Greens_d", len(data))
rank = groupedvalues.argsort().argsort() 
g=sns.barplot(x='day',y='tip',data=groupedvalues)

for index, row in groupedvalues.iterrows():
    g.text(row.name,row.tip, round(row.total_bill,2), color='black', ha="center")

      

But it gave me the following error:

AttributeError: 'DataFrame' object has no attribute 'argsort'

So, I tried the modification:

import pandas as pd
import seaborn as sns
%matplotlib inline
df=pd.read_csv("https://raw.githubusercontent.com/wesm/pydata-book/master/ch08/tips.csv", sep=',')
groupedvalues=df.groupby('day').sum().reset_index()

pal = sns.color_palette("Greens_d", len(data))
rank=groupedvalues['total_bill'].rank(ascending=True)
g=sns.barplot(x='day',y='tip',data=groupedvalues,palette=np.array(pal[::-1])[rank])

      

and that leaves me with

IndexError: index 4 is out of bounds for axis 0 with size 4

+14




4 answers


Let's stick to the solution from the linked question (Changing color scale in seaborn bar plot). You want to use argsort to determine the order of the colors used to color the bars. In the linked question, argsort is applied to a Series object, which works fine, while here you have a DataFrame. Therefore, you need to select one column of that DataFrame and apply argsort to it.

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

df = sns.load_dataset("tips")
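# numeric_only=True skips the categorical columns (sex, smoker, time), which newer pandas versions refuse to sum
groupedvalues=df.groupby('day').sum(numeric_only=True).reset_index()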

pal = sns.color_palette("Greens_d", len(groupedvalues))
rank = groupedvalues["total_bill"].argsort().argsort() 
g=sns.barplot(x='day',y='tip',data=groupedvalues, palette=np.array(pal[::-1])[rank])

for index, row in groupedvalues.iterrows():
    g.text(row.name,row.tip, round(row.total_bill,2), color='black', ha="center")

plt.show()

      

enter image description here




The second attempt works fine as well; the only issue is that the rank returned by rank() starts at 1 and not at zero. So you need to subtract 1 from the array. We also need integer values for indexing, so we need to cast it to int.

rank = groupedvalues['total_bill'].rank(ascending=True).values
rank = (rank-1).astype(int)  # np.int was removed in NumPy 1.24; the builtin int works everywhere
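
To see why both fixes yield usable palette indices, here is a minimal sketch that needs only NumPy. The totals are illustrative (only the Friday and Saturday figures appear in the question), and `zero_based_ranks` and the four-entry `palette` list are hypothetical names introduced for this example, not part of the answer above.

```python
import numpy as np

def zero_based_ranks(values):
    """Return the rank of each value (0 = smallest) via a double argsort."""
    return np.asarray(values).argsort().argsort()

# Illustrative totals in bar order: Fri, Sat, Sun, Thur.
totals = [325.88, 1778.40, 1627.16, 1096.33]
ranks = zero_based_ranks(totals)

# Hypothetical light-to-dark palette; a real one would come from sns.color_palette.
palette = ["lightest", "light", "dark", "darkest"]
bar_colors = [palette[r] for r in ranks]

print(ranks.tolist())
print(bar_colors)
```

Because these ranks run from 0 to n-1, they can index a palette of length n directly. The values returned by rank() run from 1 to n, which is why index 4 fell outside a four-color palette.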

      

+20




This works with a single ax or with a matrix of axes (subplots):



from matplotlib import pyplot as plt
import numpy as np

def show_values_on_bars(axs):
    def _show_on_single_plot(ax):        
        for p in ax.patches:
            _x = p.get_x() + p.get_width() / 2
            _y = p.get_y() + p.get_height()
            value = '{:.2f}'.format(p.get_height())
            ax.text(_x, _y, value, ha="center") 

    if isinstance(axs, np.ndarray):
        for idx, ax in np.ndenumerate(axs):
            _show_on_single_plot(ax)
    else:
        _show_on_single_plot(axs)

fig, ax = plt.subplots(1, 2)
show_values_on_bars(ax)
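
The helper above labels each bar with its own height. If the label should come from a different column, as the question asks, one possible alternative is Matplotlib's `Axes.bar_label`, which accepts an explicit list of labels. This is only a sketch: it assumes Matplotlib 3.4 or newer, uses plain `ax.bar` rather than seaborn, and the numbers are illustrative stand-ins for the grouped tips data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Illustrative grouped values; one entry per bar.
days = ["Thur", "Fri", "Sat", "Sun"]
tips = [171.83, 51.96, 260.40, 247.39]              # bar heights
total_bills = [1096.33, 325.88, 1778.40, 1627.16]   # values shown as labels

fig, ax = plt.subplots()
bars = ax.bar(days, tips)

# Format the other column and attach one label per bar, centered above it.
labels = ["{:.2f}".format(v) for v in total_bills]
texts = ax.bar_label(bars, labels=labels, padding=2)

label_strings = [t.get_text() for t in texts]
print(label_strings)
```

With a seaborn bar plot, the bar container is usually reachable through `ax.containers`, though how many containers exist depends on the seaborn version and on whether `hue` is used.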

      

+10




Just in case anyone is interested in labeling a horizontal barplot, I modified Sharon's answer as shown below:

def show_values_on_bars(axs, h_v="v", space=0.4):
    def _show_on_single_plot(ax):
        if h_v == "v":
            for p in ax.patches:
                _x = p.get_x() + p.get_width() / 2
                _y = p.get_y() + p.get_height()
                value = int(p.get_height())
                ax.text(_x, _y, value, ha="center") 
        elif h_v == "h":
            for p in ax.patches:
                _x = p.get_x() + p.get_width() + float(space)
                _y = p.get_y() + p.get_height()
                value = int(p.get_width())
                ax.text(_x, _y, value, ha="left")

    if isinstance(axs, np.ndarray):
        for idx, ax in np.ndenumerate(axs):
            _show_on_single_plot(ax)
    else:
        _show_on_single_plot(axs)

      

Two parameters explained:

h_v - whether the barplot is horizontal or vertical. "h" represents a horizontal barplot, "v" represents a vertical barplot.

space - the space between the value text and the end of the bar. Only works in horizontal mode.

Example:

show_values_on_bars(sns_t, "h", 0.3)
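
One detail worth checking in the horizontal branch: `p.get_y() + p.get_height()` is the far edge of the bar, not its middle, so the text may sit off-center relative to the bar. A small sketch of the anchor arithmetic, assuming you want the label centered across the bar's thickness; `label_anchor` is a hypothetical helper that takes the rectangle geometry directly rather than a Matplotlib patch.

```python
def label_anchor(x, y, width, height, orientation="v", space=0.0):
    """Return the (x, y) point where a bar's value label should be anchored.

    x, y, width, height describe the bar rectangle, as returned by a
    patch's get_x(), get_y(), get_width() and get_height().
    """
    if orientation == "v":
        # Vertical bar: centered horizontally, just above the top edge.
        return (x + width / 2, y + height + space)
    if orientation == "h":
        # Horizontal bar: just past the right end, centered on the bar's thickness.
        return (x + width + space, y + height / 2)
    raise ValueError("orientation must be 'v' or 'h'")

print(label_anchor(1, 0, 0.5, 5))               # vertical bar
print(label_anchor(0, 2, 10, 0.5, "h", 0.25))   # horizontal bar
```

When drawing the text at this point for a horizontal bar, passing `va="center"` to `ax.text` keeps the label aligned with the middle of the bar.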

      

enter image description here

+3




Hope this helps for item #2: a) You can sort by total bill, then reset the index. b) Use palette="Blues" to scale the bars from light blue to dark blue (palette="Blues_d" gives a darker variant of the same ramp, and palette="Blues_r" reverses it).

import pandas as pd
import seaborn as sns
%matplotlib inline

df=pd.read_csv("https://raw.githubusercontent.com/wesm/pydata-book/master/ch08/tips.csv", sep=',')
groupedvalues=df.groupby('day').sum().reset_index()
groupedvalues=groupedvalues.sort_values('total_bill').reset_index()
g=sns.barplot(x='day',y='tip',data=groupedvalues, palette="Blues")
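
The sorting approach relies on the palette colors being handed out in bar order, so sorting the rows by total_bill makes the bar order match the light-to-dark ramp. A minimal pandas sketch of the sorting step, with illustrative numbers (only the Friday and Saturday totals appear in the question); `drop=True` is an optional tweak that avoids the extra "index" column a plain reset_index() would add.

```python
import pandas as pd

# Illustrative grouped values standing in for the summed tips data.
groupedvalues = pd.DataFrame({
    "day": ["Fri", "Sat", "Sun", "Thur"],
    "total_bill": [325.88, 1778.40, 1627.16, 1096.33],
    "tip": [51.96, 260.40, 247.39, 171.83],
})

# Sort ascending so the first bar gets the lightest color of the palette.
sorted_values = groupedvalues.sort_values("total_bill").reset_index(drop=True)

print(sorted_values["day"].tolist())
```

Note that this changes the order of the bars on the x-axis as well; if the original day order matters, the rank-based palette from the first answer keeps it intact.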

      

+2

