Find related elements (runs) in a 3D array for a given direction
I have a 3D array, for example:
array = np.array([[[ 1., 1., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]]])
I need to find purlins for every possible size for different directions. For example, in direction (1,0,0), the horizontal direction, the output should be:
> **Runs of size 1: 2
> **Runs of size 2: 1
> **Runs of size 3: 0
In the direction (0, 1, 0) along the vertical axis:
> **Runs of size 1: 4
> **Runs of size 2: 0
> **Runs of size 3: 0
And in the direction (0, 0, 1) along the z-axis:
> **Runs of size 1: 1
> **Runs of size 2: 0
> **Runs of size 3: 1
Is there an efficient way to accomplish this?
EDIT:
I am attaching the solution I am working on:
dire = np.array(([1, 0, 0], [0, 1, 0], [0, 0, 1])) # Possible directions
results = np.zeros((array.shape[0], len(dire))) # Maximum runs will be 3
# Find all indices
indices = np.argwhere(array == 1)
# Loop over each direction
for idire, dire in enumerate(dire):
results[0, idire] = np.count_nonzero(array) # Count all 1 (conection 0)
indices_to_compare = indices
# Now we compare the list of indices with a displaced list of indices
for irun in range(1, array.shape[0]):
indices_displaced = indices + dire*irun
aset = set([tuple(x) for x in indices_displaced])
bset = set([tuple(x) for x in indices_to_compare])
indices_to_compare = (np.array([x for x in aset & bset]))
results[irun, idire] = len(indices_to_compare)
# Rest the counts off bigger runs to the smaller ones, for duplicates
for indx in np.arange(array.shape[0]-2,-1,-1):
results[indx, idire] -=
np.sum(results[np.arange(array.shape[0]-1,indx,-1), idire])
print(results) # Each column is a direction, each row is the number of bins.
> [[ 2. 4. 3.]
[ 1. 0. 1.]
[ 1. 0. 0.]]
So far this code doesn't work, only for direction (0,1,0), which has no connection and is trivial. With these notations, the expected result should be:
> [[ 2. 4. 1.]
[ 1. 0. 0.]
[ 0. 0. 1.]]
Comparison of arrays from this link .
EDIT 2:
I was wrong in my interpretation of the coordinates, it seems that the complement to the results np.argwhere
, (1,0,0) is above the z axist, so the expected result should be:
> [[ 1. 4. 2.]
[ 0. 0. 1.]
[ 1. 0. 0.]]
source to share
Here's a vectorial solution that works along the axial direction:
def run_lengths(arr, axis):
# convert to boolean, for speed - no need to store 0 and 1 as floats
arr = arr.astype(bool)
# move the axis in question to the start, to make the slicing below easy
arr = np.moveaxis(arr, axis, 0)
# find positions at the start and end of a run
first = np.empty(arr.shape, dtype=bool)
first[:1] = arr[:1]
first[1:] = arr[1:] & ~arr[:-1]
last = np.zeros(arr.shape, dtype=bool)
last[-1:] = arr[-1]
last[:-1] = arr[:-1] & ~arr[1:]
# count the number in each run
c = np.cumsum(arr, axis=0)
lengths = c[last] - c[first]
# group the runs by length. Note the above gives 0 for a run of length 1,
# so result[i] is the number of runs of length (i + 1)
return np.bincount(lengths, minlength=len(arr))
for i in range(3):
print(run_lengths(array, axis=i))
[1 0 1]
[4 0 0]
[2 1 0]
source to share
Here's a solution that works in any direction and combination of directions. Basically, we are using an array to create a graph in which True elements correspond to nodes and edges if there is a node in any allowed direction. We then calculate the related components and their sizes. Uses networkx to find connected components in the graph, which can be easily installed via pip.
import numpy as np
import networkx
def array_to_graph(bool_array, allowed_steps):
"""
Arguments:
----------
bool_array -- boolean array
allowed_steps -- list of allowed steps; e.g. given a 2D boolean array,
[(0, 1), (1, 1)] signifies that from coming from element
(i, j) only elements (i, j+1) and (i+1, j+1) are reachable
Returns:
--------
g -- networkx.DiGraph() instance
position_to_idx -- dict mapping (i, j) position to node idx
idx_to_position -- dict mapping node idx to (i, j) position
"""
# ensure that bool_array is boolean
assert bool_array.dtype == np.bool, "Input array has to be boolean!"
# map array indices to node indices and vice versa
node_idx = range(np.sum(bool_array))
node_positions = zip(*np.where(bool_array))
position_to_idx = dict(zip(node_positions, node_idx))
# create graph
g = networkx.DiGraph()
for source in node_positions:
for step in allowed_steps: # try to step in all allowed directions
target = tuple(np.array(source) + np.array(step))
if target in position_to_idx:
g.add_edge(position_to_idx[source], position_to_idx[target])
idx_to_position = dict(zip(node_idx, node_positions))
return g, idx_to_position, position_to_idx
def get_connected_component_sizes(g):
component_generator = networkx.connected_components(g)
component_sizes = [len(component) for component in component_generator]
return component_sizes
def test():
array = np.array([[[ 1., 1., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]]], dtype=np.bool)
directions = [(1,0,0), (0,1,0), (0,0,1)]
for direction in directions:
graph, _, _ = array_to_graph(array, [direction])
sizes = get_connected_component_sizes(graph.to_undirected())
counts = np.bincount(sizes)
print direction
for ii, c in enumerate(counts):
if ii > 1:
print "Runs of size {}: {}".format(ii, c)
return
NOTE : "Runs" size 1 doesn't count (correctly), but I'm guessing you're not interested in them anyway. Otherwise, you can easily compute them ex-post by calculating the total number of True elements in a specific direction and subtracting the number of runtime runs.
source to share
I add a variant of the code I was working on, now it seems to work:
import numpy as np
array = np.array([[[ 1., 1., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]]])
dire = np.array(([1, 0, 0], [0, 1, 0], [0, 0, 1])) # Possible directions
results = np.zeros((array.shape[0], len(dire))) # Maximum runs will be 3
# Find all indices
indices = np.argwhere(array == 1)
# Loop over each direction
for idire, dire in enumerate(dire):
results[0, idire] = len(indices) # Count all 1 (conection 0)
indices_to_compare = indices
for irun in range(1, array.shape[0]):
print('dir:',dire, 'run', irun)
indices_displaced = indices + dire*irun
aset = set([tuple(x) for x in indices_displaced])
bset = set([tuple(x) for x in indices_to_compare])
indices_to_compare = (np.array([x for x in aset & bset]))
results[irun, idire] = len(indices_to_compare)
for iindx in np.arange(array.shape[0]-2,-1,-1):
for jindx in np.arange(iindx+1, array.shape[0]):
print(iindx,jindx, results[jindx, idire])
results[iindx, idire] -= results[jindx, idire] *(jindx-iindx+1)
print('\n',results)
source to share