Compute gradient in masked area only

I have a very large array with several small ROIs. I need to compute the gradient of this array, but for performance reasons I need this calculation to be limited to those areas of interest.

I cannot do something like this:

phi_grad0[mask] = np.gradient(phi[mask], axis=0)


Because of how fancy indexing works, phi[mask] just becomes a 1D array of the masked pixels, losing all spatial information, which makes the gradient calculation useless.
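
For example, with arrays like the ones in the benchmark below, a quick check of the shapes shows the problem (a minimal sketch):

import numpy as np

phi = np.random.randint(low=-100, high=100, size=[100, 100])
phi_mask = np.random.randint(low=0, high=2, size=phi.shape, dtype=bool)

print(phi.shape)            # (100, 100)
print(phi[phi_mask].shape)  # something like (5012,): flat 1D, the 2D neighbourhood structure is gone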

np.gradient does accept np.ma.masked_array inputs, but the performance is an order of magnitude worse:

import numpy as np
from timeit_context import timeit_context

phi = np.random.randint(low=-100, high=100, size=[100, 100])
phi_mask = np.random.randint(low=0, high=2, size=phi.shape, dtype=bool)

with timeit_context('full array'):
    for i2 in range(1000):
        phi_masked_grad1 = np.gradient(phi)

with timeit_context('masked_array'):
    phi_masked = np.ma.masked_array(phi, ~phi_mask)
    for i1 in range(1000):
        phi_masked_grad2 = np.gradient(phi_masked)


This produces the following output:

[full array] finished in 143 ms
[masked_array] finished in 1961 ms


I think this is because the operations performed on a masked_array are not vectorized, but I'm not sure.

Is there a way to restrict np.gradient to the regions of interest and get better performance?

In case anyone is interested, timeit_context is a handy little timer that works like this:

from contextlib import contextmanager
import time

@contextmanager
def timeit_context(name):
    """
    Use it to time a specific code snippet
    Usage: 'with timeit_context('Testcase1'):'
    :param name: Name of the context
    """
    start_time = time.time()
    yield
    elapsed_time = time.time() - start_time
    print('[{}] finished in {} ms'.format(name, int(elapsed_time * 1000)))

1 answer


Not really an answer, but this is what I was able to put together for my situation, which works really well:

First, I get the 1D pixel indices where the condition is true (in this case the example condition is < 5):

def get_indices_1d(image, band_thickness):
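    # Note: band_thickness is not used here; the threshold (5) is hard-coded for this example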
    return np.where(image.reshape(-1) < 5)[0]


This gives me a 1D array with these indices.

Then I manually compute the gradient at these positions in different ways:

def gradient_at_points1(image, indices_1d):
    width = image.shape[1]
    size = image.size

    # Using this instead of ravel() is more likely to produce a view instead of a copy
    raveled_image = image.reshape(-1)

    res_x = 0.5 * (raveled_image[(indices_1d + 1) % size] - raveled_image[(indices_1d - 1) % size])
    res_y = 0.5 * (raveled_image[(indices_1d + width) % size] - raveled_image[(indices_1d - width) % size])

    return [res_y, res_x]


def gradient_at_points2(image, indices_1d):
    indices_2d = np.unravel_index(indices_1d, image.shape)

    # Even without doing the actual deltas this is already slower, and we'll have to check boundary conditions, etc
    res_x = 0.5 * (image[indices_2d] - image[indices_2d])
    res_y = 0.5 * (image[indices_2d] - image[indices_2d])

    return [res_y, res_x]


def gradient_at_points3(image, indices_1d):
    width = image.shape[1]

    raveled_image = image.reshape(-1)

    res_x = 0.5 * (raveled_image.take(indices_1d + 1, mode='wrap') - raveled_image.take(indices_1d - 1, mode='wrap'))
    res_y = 0.5 * (raveled_image.take(indices_1d + width, mode='wrap') - raveled_image.take(indices_1d - width, mode='wrap'))

    return [res_y, res_x]


def gradient_at_points4(image, indices_1d):
    width = image.shape[1]

    raveled_image = image.ravel()

    res_x = 0.5 * (raveled_image.take(indices_1d + 1, mode='wrap') - raveled_image.take(indices_1d - 1, mode='wrap'))
    res_y = 0.5 * (raveled_image.take(indices_1d + width, mode='wrap') - raveled_image.take(indices_1d - width, mode='wrap'))

    return [res_y, res_x]


My test arrays look like this:

a = np.random.randint(-10, 10, size=[512, 512])

# Force edges to not pass the condition
a[:, 0] = 99
a[:, -1] = 99
a[0, :] = 99
a[-1, :] = 99

indices = get_indices_1d(a, 5)

mask = a < 5
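
As a sanity check (a minimal sketch, not part of the original timing code), the point-wise results can be compared against np.gradient at the ROI positions; since the edges are excluded from the condition, the central differences should agree:

# np.gradient returns [d/d(axis0), d/d(axis1)], matching the [res_y, res_x]
# ordering of the point-wise functions above.
grad_full_y, grad_full_x = np.gradient(a)
grad_pts_y, grad_pts_x = gradient_at_points4(image=a, indices_1d=indices)

rows, cols = np.unravel_index(indices, a.shape)
assert np.allclose(grad_pts_y, grad_full_y[rows, cols])
assert np.allclose(grad_pts_x, grad_full_x[rows, cols])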




Then I can run these tests:

with timeit_context('full gradient'):
    for i in range(100):
        grad1 = np.gradient(a)

with timeit_context('With masked_array'):
    for im in range(100):
        ma = np.ma.masked_array(a, mask)
        grad6 = np.gradient(ma)

with timeit_context('gradient at points 1'):
    for i1 in range(100):
        grad2 = gradient_at_points1(image=a, indices_1d=indices)

with timeit_context('gradient at points 2'):
    for i2 in range(100):
        grad3 = gradient_at_points2(image=a, indices_1d=indices)

with timeit_context('gradient at points 3'):
    for i3 in range(100):
        grad4 = gradient_at_points3(image=a, indices_1d=indices)

with timeit_context('gradient at points 4'):
    for i4 in range(100):
        grad5 = gradient_at_points4(image=a, indices_1d=indices)


Which gives the following results:

[full gradient] finished in 576 ms
[With masked_array] finished in 3455 ms
[gradient at points 1] finished in 421 ms
[gradient at points 2] finished in 451 ms
[gradient at points 3] finished in 112 ms
[gradient at points 4] finished in 102 ms


As you can see, method 4 is by far the best (no matter how much memory it consumes).

This is only possible because my 2D array is relatively small (512x512). This may not be the case with much larger arrays.

Another caveat is that ndarray.take(indices, mode='wrap') does some weird things around the image edges (one row "wraps" into the next, and so on) in order to keep the performance up, so if the edges matter for your application you might want to pad the input array with a 1-pixel border.
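
For instance, a minimal sketch of that padding idea (np.pad with edge replication and the index shift are illustrative choices, not something from the original code):

# Pad by one pixel so the +/-1 and +/-width neighbour lookups of the
# point-wise gradient never wrap around to the opposite side of the image.
a_padded = np.pad(a, pad_width=1, mode='edge')  # replicate border values

# Find ROI positions in the original image, then shift them into the
# coordinates of the padded image.
rows, cols = np.where(a < 5)
width_padded = a_padded.shape[1]
indices_padded = (rows + 1) * width_padded + (cols + 1)

grad_y, grad_x = gradient_at_points4(image=a_padded, indices_1d=indices_padded)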

I'm still surprised at how slow masked_array is. Pulling the constructor ma = np.ma.masked_array(a, mask) out of the loop does not change the timing, since the masked_array itself just stores references to the array and its mask.
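
A quick way to see that (a minimal check, assuming np.shares_memory is available, i.e. NumPy >= 1.11):

ma = np.ma.masked_array(a, mask)

# The masked array wraps the original data buffer instead of copying it,
# so constructing it is cheap; the cost is in the masked operations themselves.
print(np.shares_memory(ma.data, a))  # True

a[1, 1] = 42
print(ma.data[1, 1])  # 42: the change to 'a' is visible through the masked array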
