XYZ array data for heatmap

I could not find a consensus answer to this question, or one that suits my needs. I have data in three columns of a text file: X, Y and Z. The columns are separated by tabs. I would like to make a heatmap of this data with Python, where each X and Y position is shaded by its value in Z, which ranges from 0 to 1 (the discrete probability of X and Y). I tried seaborn's heatmap and matplotlib's pcolormesh, but unfortunately both need two-dimensional datasets.

My data runs through X from 1 to 37 at constant y, and y then increases in steps of 0.1. The maximum y varies from dataset to dataset, but the minimum y is always 0.

[XYZ] row1 [1 ... 37 0.0000 Zvalue], row2 [1 ... 37 0.1000 Zvalue], etc.

import numpy as np
import pandas as pd
import seaborn as sns
sns.set()

# loadtxt returns a 2D NumPy array with one row per (X, Y, Z) triple
df = np.loadtxt("file.txt", delimiter="\t")

      

Any hints for the next steps?



1 answer


If I understood correctly, you have three columns, with X and Y giving the position of each Z value.

Consider the following example. There are three columns: X and Y contain positional information (categories in this case) and Z contains the values used to shade the heatmap.

import numpy as np
import pandas as pd
import seaborn as sns
x = np.array(['a','b','c','a','b','c','a','b','c'])
y = np.array(['a','a','a','b','b','b','c','c','c'])
z = np.array([0.3,-0.3,1,0.5,-0.25,-1,0.25,-0.23,0.25])

      

We then create a data frame from these arrays, transposing the stacked array so that x, y and z become columns. Assign column names and make sure Z_value is numeric, because stacking text and numbers in a single array turns every entry into a string.

df = pd.DataFrame(np.array([x, y, z]).T, columns=['X_value', 'Y_value', 'Z_value'])
# the stacked array holds strings, so convert Z back to numbers
df['Z_value'] = pd.to_numeric(df['Z_value'])

      

This results in the following data frame.

  X_value Y_value  Z_value
0       a       a     0.30
1       b       a    -0.30
2       c       a     1.00
3       a       b     0.50
4       b       b    -0.25
5       c       b    -1.00
6       a       c     0.25
7       b       c    -0.23
8       c       c     0.25
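The numeric conversion is needed because a single NumPy array holds only one dtype. The following minimal sketch illustrates this with a shortened version of the example arrays. It also shows one possible alternative, building the frame from a dict so that each column keeps its own dtype; the name df_alt is only illustrative.

```python
import numpy as np
import pandas as pd

x = np.array(['a', 'b', 'c'])
y = np.array(['a', 'a', 'a'])
z = np.array([0.3, -0.3, 1.0])

# Stacking text and numbers into one array forces a common string dtype.
stacked = np.array([x, y, z])
print(stacked.dtype.kind)  # 'U' means unicode text

# Building the frame from a dict keeps each column's own dtype,
# so no conversion step is needed afterwards.
df_alt = pd.DataFrame({'X_value': x, 'Y_value': y, 'Z_value': z})
print(df_alt.dtypes)
```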

      

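You cannot create a heatmap directly from this long-format data frame. However, by calling df.pivot(index='Y_value', columns='X_value', values='Z_value') you reshape the data frame into a two-dimensional grid that can be used for a heatmap. Recent pandas versions require these arguments to be passed by keyword.

pivotted = df.pivot(index='Y_value', columns='X_value', values='Z_value')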

      

The resulting data frame looks like this.

X_value a   b   c
Y_value         
a   0.30    -0.30   1.00
b   0.50    -0.25   -1.00
c   0.25    -0.23   0.25
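To make the "two-dimensional" point concrete, here is a small sketch that rebuilds the example frame and extracts the pivoted values as a plain NumPy array. The name grid is only illustrative. An array of this shape is the kind of input that functions expecting 2D data, such as matplotlib's pcolormesh, work with.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'X_value': ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
    'Y_value': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'Z_value': [0.3, -0.3, 1, 0.5, -0.25, -1, 0.25, -0.23, 0.25],
})
pivotted = df.pivot(index='Y_value', columns='X_value', values='Z_value')

# The underlying values form a 2D array: rows follow Y, columns follow X.
grid = pivotted.to_numpy()
print(grid.shape)

# Row and column labels are kept separately on the frame.
print(list(pivotted.index), list(pivotted.columns))
```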

      

You can then pass pivotted to sns.heatmap to create your heatmap.

sns.heatmap(pivotted,cmap='RdBu')

      

This produces the following heatmap.

[Image: the resulting heatmap]

You may need to adapt the code to your exact needs. Since you did not provide any sample data, I had to make up my own example.
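As a rough sketch of how the same steps might apply to the tab-separated file from the question, the code below builds a small synthetic stand-in for that file (X from 1 to 37, Y in steps of 0.1, random Z values) and pivots it. The column names X, Y and Z, the helper plot_heatmap and the choice of colormap are assumptions made for illustration, not part of the original data.

```python
import io

import numpy as np
import pandas as pd

# Synthetic stand-in for the tab-separated file (hypothetical values):
# X runs 1..37 for each Y, and Y increases in steps of 0.1 from 0.
xs = np.arange(1, 38)
ys = np.round(np.arange(0, 5) * 0.1, 4)
rng = np.random.default_rng(0)
lines = [f"{x}\t{y:.4f}\t{rng.random():.4f}" for y in ys for x in xs]
fake_file = io.StringIO("\n".join(lines))

# For a real file, pass its path instead of fake_file.
df = pd.read_csv(fake_file, sep="\t", header=None, names=["X", "Y", "Z"])

# Reshape long format (one row per X, Y pair) into a Y-by-X grid.
grid = df.pivot(index="Y", columns="X", values="Z")


def plot_heatmap(grid):
    """Draw the grid with seaborn; vmin/vmax pin the colour scale to [0, 1]."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(grid, vmin=0, vmax=1, cmap="viridis")
    ax.invert_yaxis()  # put Y = 0 at the bottom
    plt.show()


# plot_heatmap(grid)  # uncomment to display the figure
```

Note that df.pivot raises an error if the same (X, Y) pair occurs more than once. In that case df.pivot_table, which aggregates duplicates, may be a better fit.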
