Plotting a heatmap for 3 columns in Python with seaborn

v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86


In the above dataframe, I want to plot a heatmap using v1 and v2 as the x and y axes and yy as the value. How can I do this in Python? I tried seaborn:

df = df.pivot('v1', 'v2', 'yy')
ax = sns.heatmap(df)


However, this doesn't work. Any other solution?



2 answers

A seaborn heatmap plots categorical data. This means that each value that occurs occupies the same space in the heatmap as any other value, no matter how far apart the values are numerically. This is usually undesirable for numeric data. Instead, one of the following techniques can be chosen.


A colored scatter plot can be as good as a heatmap. The colors of the points represent the value of yy:


ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")



u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on any whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, ax = plt.subplots()

sc = ax.scatter(df.v1, df.v2, c=df.yy,  cmap="copper")

fig.colorbar(sc, ax=ax)




You can take a look at a hexbin plot. The data is shown in hexagonal bins, and the values are aggregated as the mean within each bin. The advantage here is that if you choose a large gridsize the plot looks like a scatter plot, and if you choose a small one it looks like a heatmap, so you can easily adjust the plot to the desired resolution.

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")



u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on any whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, (ax, ax2) = plt.subplots(nrows=2)

h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")

fig.colorbar(h1, ax=ax)
fig.colorbar(h2, ax=ax2)
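
The per-bin averaging that hexbin performs can also be reproduced by hand on a square grid, which may help to see what the aggregation does. The following is only a sketch using NumPy's histogram2d; the choice of five equal-width bins between 0 and 100 and the names edges, sums, counts and mean_grid are assumptions made for this illustration, not part of the original answer.

```python
import numpy as np

# Sample data from the question.
v1 = np.array([15.25, 83.05, 96.61, 100.0, 100.0, 100.0, 100.0,
               100.0, 100.0, 100.0, 100.0, 0.0, 10.17, 23.73])
v2 = np.array([44.34, 59.78, 65.09, 75.47, 50.0, 68.87, 79.35,
               100.0, 63.21, 100.0, 68.87, 56.52, 52.83, 46.23])
yy = np.array([100.0] * 11 + [92.86] * 3)

# Assumed grid: five equal-width bins per axis covering 0..100.
edges = np.linspace(0.0, 100.0, 6)

# Sum of yy per bin, and number of points per bin.
sums, _, _ = np.histogram2d(v1, v2, bins=[edges, edges], weights=yy)
counts, _, _ = np.histogram2d(v1, v2, bins=[edges, edges])

# Mean of yy per bin; bins without any points stay NaN.
mean_grid = np.full(counts.shape, np.nan)
np.divide(sums, counts, out=mean_grid, where=counts > 0)

print(mean_grid)
```

The first axis of mean_grid runs along v1, so to draw it with v1 on the x axis the array has to be transposed, for example with ax.pcolormesh(edges, edges, mean_grid.T, cmap="copper").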




A tripcolor plot can be used to produce colored regions in the plot from the data points, which are interpreted as the vertices of triangles that are colored according to the data at those vertices. Such a plot would require more data to provide meaningful insight.

ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")



u = u"""v1      v2      yy
15.25   44.34   100.00
83.05   59.78   100.00
96.61   65.09   100.00
100.00  75.47   100.00
100.00  50.00   100.00
100.00  68.87   100.00
100.00  79.35   100.00
100.00  100.00  100.00
100.00  63.21   100.00
100.00  100.00  100.00
100.00  68.87   100.00
0.00    56.52   92.86
10.17   52.83   92.86
23.73   46.23   92.86"""

import pandas as pd
import matplotlib.pyplot as plt
import io

# sep=r"\s+" splits on any whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(u), sep=r"\s+")

fig, ax = plt.subplots()

tc = ax.tripcolor(df.v1, df.v2, df.yy,  cmap="copper")

fig.colorbar(tc, ax=ax)



Note that a tricontourf plot may be equally suitable if more data points are available across the whole grid.

ax.tricontourf(df.v1, df.v2, df.yy,  cmap="copper")




The problem is that your data has duplicate values like:

100.00  100.00  100.00
100.00  100.00  100.00


You need to drop the duplicate values, then pivot and plot like this:

import seaborn as sns
import pandas as pd

# fill data

df = pd.read_clipboard()
df.drop_duplicates(['v1','v2'], inplace=True)
pivot = df.pivot(index='v1', columns='v2', values='yy')
ax = sns.heatmap(pivot,annot=True)

print(pivot)




v2      44.34   46.23   50.00   52.83   56.52   59.78   63.21   65.09   \
0.00       NaN     NaN     NaN     NaN   92.86     NaN     NaN     NaN   
10.17      NaN     NaN     NaN   92.86     NaN     NaN     NaN     NaN   
15.25    100.0     NaN     NaN     NaN     NaN     NaN     NaN     NaN   
23.73      NaN   92.86     NaN     NaN     NaN     NaN     NaN     NaN   
83.05      NaN     NaN     NaN     NaN     NaN   100.0     NaN     NaN   
96.61      NaN     NaN     NaN     NaN     NaN     NaN     NaN   100.0   
100.00     NaN     NaN   100.0     NaN     NaN     NaN   100.0     NaN   

v2      68.87   75.47   79.35   100.00  
0.00       NaN     NaN     NaN     NaN  
10.17      NaN     NaN     NaN     NaN  
15.25      NaN     NaN     NaN     NaN  
23.73      NaN     NaN     NaN     NaN  
83.05      NaN     NaN     NaN     NaN  
96.61      NaN     NaN     NaN     NaN  
100.00   100.0   100.0   100.0   100.0  
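
If the duplicate rows should be combined rather than dropped, pivot_table may be an alternative, since it aggregates repeated (v1, v2) pairs instead of raising an error. The sketch below first shows that a plain pivot fails on this data and then averages the duplicates; averaging with aggfunc="mean" is an assumption made for the example, and the variable names data and pivot_failed are introduced only for illustration.

```python
import io
import pandas as pd

# Sample data from the question.
data = """v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# A plain pivot cannot reshape data with repeated (v1, v2) pairs.
try:
    df.pivot(index="v1", columns="v2", values="yy")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot failed:", err)

# pivot_table aggregates the duplicates instead (here: their mean).
pivot = df.pivot_table(index="v1", columns="v2", values="yy", aggfunc="mean")
print(pivot.shape)
```

The resulting table can be passed to sns.heatmap(pivot, annot=True) in the same way as the pivot from the snippet above.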



