Limit / mask matplotlib contour to data area

I have a pandas DataFrame with non-uniformly spaced data points given by columns x, y and z, where x and y are pairs of independent variables and z is the dependent variable. For example:

import matplotlib.pyplot as plt
from matplotlib.mlab import griddata
import numpy as np
import pandas as pd

df = pd.DataFrame({'x':[0, 0, 1, 1, 3, 3, 3, 4, 4, 4], 
                   'y':[0, 1, 0, 1, 0.2, 0.7, 1.4, 0.2, 1.4, 2], 
                   'z':[50, 40, 40, 30, 30, 30, 20, 20, 20, 10]})

x = df['x']
y = df['y']
z = df['z']

      

I want to make a contour plot of the dependent variable z over x and y. To do this, I create a new grid to interpolate the data using the matplotlib.mlab griddata function.

xi = np.linspace(x.min(), x.max(), 100)
yi = np.linspace(y.min(), y.max(), 100)
z_grid = griddata(x, y, z, xi, yi, interp='linear')
plt.contourf(xi, yi, z_grid, 15)
plt.scatter(x, y, color='k') # The original data points
plt.show()

      

While this works, the result is not what I want. I don't want griddata to interpolate outside the boundaries given by the min and max values of the x and y data. The following plots show what is displayed after calling plt.show(), and then, highlighted in purple, the data region I want to have interpolated and contoured. The contour outside the purple line should be blank. How can I mask the outlying data?

[Figures: the plot created by matplotlib; the plot as it should be]

The related question unfortunately does not answer my question, as I have no clear mathematical way of defining the conditions for a triangulation. Is it possible to define a condition to mask the data based on the data alone, taking the above DataFrame as an example?



1 answer


As seen in the answer to this question, one may introduce a condition to mask the values.

The sentence from the question, "I don't want griddata to interpolate outside the boundaries given by the min and max values of the x and y data", implies that there is some min/max condition that can be used.
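As a minimal sketch of such a condition, assume (as is the case for the sample data) that the points fall into columns of constant x. The lowest and highest y in each column can then be joined into a lower and an upper envelope, and every grid node outside that band flagged. The helper name `envelope_mask` is hypothetical; it is not a Matplotlib or NumPy function.

```python
import numpy as np

def envelope_mask(x, y, xi, yi):
    """Boolean array of shape (len(yi), len(xi)); True marks grid nodes that lie
    below the lowest or above the highest data point, column by column.

    Assumption: the data fall into columns of constant x (true for the sample).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xs = np.unique(x)                                  # sorted distinct x values
    y_low = np.array([y[x == v].min() for v in xs])    # lower envelope nodes
    y_high = np.array([y[x == v].max() for v in xs])   # upper envelope nodes
    # Linearly interpolate both envelopes onto the grid's x coordinates.
    low = np.interp(xi, xs, y_low)
    high = np.interp(xi, xs, y_high)
    Yi = np.asarray(yi, dtype=float)[:, np.newaxis]    # rows are y, columns are x
    return (Yi < low) | (Yi > high)

# Sample data from the question
x = [0, 0, 1, 1, 3, 3, 3, 4, 4, 4]
y = [0, 1, 0, 1, 0.2, 0.7, 1.4, 0.2, 1.4, 2]
xi = np.linspace(min(x), max(x), 100)
yi = np.linspace(min(y), max(y), 100)
outside = envelope_mask(x, y, xi, yi)
```

With the `z_grid` from the question, `np.ma.masked_where(outside, z_grid)` gives a masked array, and `contourf` leaves the masked nodes blank. For the sample data the band has the same outline as the first of the three paths in the code further down.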

If no such condition exists, one may clip the contour using a path. The points of this path need to be specified, as there is no generic way of knowing which points should form the edges. The code below does this for three different possible paths.



import matplotlib.pyplot as plt
from matplotlib.path import Path
from matplotlib.patches import PathPatch
from matplotlib.mlab import griddata
import numpy as np
import pandas as pd

df = pd.DataFrame({'x':[0, 0, 1, 1, 3, 3, 3, 4, 4, 4], 
                   'y':[0, 1, 0, 1, 0.2, 0.7, 1.4, 0.2, 1.4, 2], 
                   'z':[50, 40, 40, 30, 30, 30, 20, 20, 20, 10]})

x = df['x']
y = df['y']
z = df['z']

xi = np.linspace(x.min(), x.max(), 100)
yi = np.linspace(y.min(), y.max(), 100)
z_grid = griddata(x, y, z, xi, yi, interp='linear')

clipindex = [ [0,2,4,7,8,9,6,3,1,0],
              [0,2,4,7,5,8,9,6,3,1,0],
              [0,2,4,7,8,9,6,5,3,1,0]]

fig, axes = plt.subplots(ncols=3, sharey=True)
for i, ax in enumerate(axes):
    cont = ax.contourf(xi, yi, z_grid, 15)
    ax.scatter(x, y, color='k') # The original data points
    ax.plot(x[clipindex[i]], y[clipindex[i]], color="crimson")

    clippath = Path(np.c_[x[clipindex[i]], y[clipindex[i]]])
    patch = PathPatch(clippath, facecolor='none')
    ax.add_patch(patch)
    for c in cont.collections:
        c.set_clip_path(patch)

plt.show()
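On recent Matplotlib versions the code above needs two adjustments: `matplotlib.mlab.griddata` was removed in Matplotlib 3.1, and `ContourSet.collections` was deprecated in 3.8, where the contour set itself became a single artist. The following is a sketch of the first of the three paths under those assumptions, substituting `scipy.interpolate.griddata` (a different function with a different signature) for the removed one and using plain NumPy arrays instead of the DataFrame.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.artist import Artist
from matplotlib.patches import PathPatch
from matplotlib.path import Path
from scipy.interpolate import griddata  # SciPy's griddata, not the removed mlab one

x = np.array([0, 0, 1, 1, 3, 3, 3, 4, 4, 4], dtype=float)
y = np.array([0, 1, 0, 1, 0.2, 0.7, 1.4, 0.2, 1.4, 2])
z = np.array([50, 40, 40, 30, 30, 30, 20, 20, 20, 10], dtype=float)

xi = np.linspace(x.min(), x.max(), 100)
yi = np.linspace(y.min(), y.max(), 100)
Xi, Yi = np.meshgrid(xi, yi)
# Linear interpolation; grid nodes outside the convex hull of the data become NaN.
z_grid = griddata((x, y), z, (Xi, Yi), method="linear")

# Indices of the data points that form the outline, in drawing order.
clipindex = [0, 2, 4, 7, 8, 9, 6, 3, 1, 0]
clippath = Path(np.c_[x[clipindex], y[clipindex]])

fig, ax = plt.subplots()
cont = ax.contourf(xi, yi, z_grid, 15)
ax.scatter(x, y, color="k")  # the original data points
patch = PathPatch(clippath, facecolor="none", edgecolor="crimson")
ax.add_patch(patch)  # adding the patch to the axes puts it in data coordinates

if isinstance(cont, Artist):
    # Newer Matplotlib: the ContourSet is itself an artist and can be clipped.
    cont.set_clip_path(patch)
else:
    # Older Matplotlib: clip each collection of the contour set individually.
    for c in cont.collections:
        c.set_clip_path(patch)
```

Calling `plt.show()` afterwards displays the clipped contour. Note that clipping only hides what is drawn outside the path; the values in `z_grid` are unchanged.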

      
