Is there a way to reshape an array that does not preserve the original size (or a convenient workaround)?

Question

As a simplified example, let's say I have a dataset of 40 sorted values. The values in this example are integers, although this is not necessary for the actual dataset.

import numpy as np
data = np.linspace(1,40,40)

I am trying to find the maximum value within the dataset for certain window sizes. The formula for calculating the window sizes yields a pattern that is best handled with arrays (in my opinion). For simplicity, say the indices denoting the window sizes are the list [1,2,3,4,5]; this corresponds to window sizes of [2,4,8,16,32] (the pattern is 2**index).

## this code looks long because I've provided docstrings
## just in case the explanation was unclear

def shapeshifter(num_col, my_array=data):
    """
    This function reshapes an array to have 'num_col' columns, where 
    'num_col' corresponds to index.
    """
    return my_array.reshape(-1, num_col)

def looper(num_col, my_array=data):
    """
    This function calls 'shapeshifter' and returns a list of the 
    MAXimum values of each row in 'my_array' for 'num_col' columns. 
    The length of each row (or the number of columns per row if you 
    prefer) denotes the size of each window.
    EX:
        num_col = 2
        ==> window_size = 2
        ==> check max( data[1], data[2] ),
                  max( data[3], data[4] ),
                  max( data[5], data[6] ), 
                               .
                               .
                               .
                  max( data[39], data[40] )
            for k rows, where k = len(my_array)//num_col
    """
    my_array = shapeshifter(num_col=num_col, my_array=my_array)
    rows = [my_array[index] for index in range(len(my_array))]
    res = []
    for index in range(len(rows)):
        res.append( max(rows[index]) )
    return res

So far, the code is fine. I tested it with the following:

check1 = looper(2)
check2 = looper(4)
print(check1)
>> [2.0, 4.0, ..., 38.0, 40.0] 
print(len(check1))
>> 20
print(check2)
>> [4.0, 8.0, ..., 36.0, 40.0] 
print(len(check2))
>> 10

So far so good. Now here's my problem.

def metalooper(col_ls, my_array=data):
    """
    This function calls 'looper' - which calls
    'shapeshifter' - for every 'col' in 'col_ls'.

    EX:
        j_list = [1,2,3,4,5]
        ==> col_ls = [2,4,8,16,32]
        ==> looper(2), looper(4),
            looper(8), ..., looper(32)
        ==> shapeshifter(2), shapeshifter(4),
            shapeshifter(8), ..., shapeshifter(32)
                such that looper(2^j) ==> shapeshifter(2^j)
                for j in j_list
    """
    res = []
    for col in col_ls:
        res.append(looper(num_col=col, my_array=my_array))
    return res

j_list = [2,4,8,16,32]
check3 = metalooper(j_list)

Running the above code provides this error:

ValueError: total size of new array must be unchanged

An array of 40 data points can be reshaped into 2 columns by 20 rows, or 4 columns by 10 rows, or 8 columns by 5 rows, BUT with 16 columns the array cannot be reshaped without clipping data, since 40/16 is not an integer. I believe this is a problem with my code, but I don't know how to fix it.

I am hoping there is a way to cut off the last values in each row that do not fit in a full window. If that is not possible, I am hoping I can pad with zeros to fill the missing entries so the array keeps a compatible size, and then remove the zeros afterwards. Or maybe even some complicated if / try / break block. What are some ways around this problem?

+3

arrays python-3.x numpy error-handling reshape

mikey Mar 30 17 at 9:57

2 answers

Here's a generalized way to reshape with truncation:

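def reshape_and_truncate(arr, shape):
    # product of the explicitly given dimensions (the -1 placeholder is skipped);
    # dtype=int keeps the result an integer even if no dimension is explicit
    desired_size_factor = np.prod([n for n in shape if n != -1], dtype=int)
    if -1 in shape:  # implicit array size
        # largest multiple of the explicit dimensions that fits in arr
        desired_size = arr.size // desired_size_factor * desired_size_factor
    else:
        desired_size = desired_size_factor
    return arr.flat[:desired_size].reshape(shape)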

Your shapeshifter could use this instead of reshape.
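As a rough sketch of how that substitution might look, the snippet below repeats a compact version of the truncating reshape so that it runs on its own, then plugs it into a replacement shapeshifter. The `window_maxima` dictionary and the compact helper body are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def reshape_and_truncate(arr, shape):
    # Compact restatement of the truncating reshape, so this snippet is self-contained.
    factor = int(np.prod([n for n in shape if n != -1]))
    size = arr.size // factor * factor if -1 in shape else factor
    return arr.flat[:size].reshape(shape)

data = np.linspace(1, 40, 40)

def shapeshifter(num_col, my_array=data):
    # Hypothetical replacement for the question's shapeshifter:
    # trailing values that do not fill a whole row are dropped.
    return reshape_and_truncate(my_array, (-1, num_col))

# Row-wise maximum for each window size from the question.
window_maxima = {w: shapeshifter(w).max(axis=1) for w in [2, 4, 8, 16, 32]}
print(window_maxima[16])  # [16. 32.]
print(window_maxima[32])  # [32.]
```

Note that with truncation the last 8 values (33 to 40) are silently ignored for window sizes 16 and 32, so the overall maximum of 40 never appears in those results.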

+2

Eric Mar 30 17 at 11:01

Daniel F · Accepted Answer · 2017-03-30T11:01:30+0000

I think this will give you what you want in one step:

def windowFunc(a, window, f = np.max):
    return np.array([f(i) for i in np.split(a, range(window, a.size, window))])

With the default f, this gives you an array of the maximums of your windows.

More generally, using np.split and range lets you split the data into a (possibly ragged) list of arrays:

def shapeshifter(num_col, my_array=data):    
    return np.split(my_array, range(num_col, my_array.size, num_col))

You need a list of arrays because a 2D array cannot be ragged (every row needs the same number of columns).
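To make the ragged result concrete, here is a small check on the 40-value example with a window of 16; the variable names are only illustrative.

```python
import numpy as np

data = np.linspace(1, 40, 40)
window = 16

# Split points are 16 and 32, so the last chunk holds the 8 leftover values.
chunks = np.split(data, range(window, data.size, window))

print([len(c) for c in chunks])         # [16, 16, 8]
print([float(c.max()) for c in chunks]) # [16.0, 32.0, 40.0]
```

Unlike truncation, the shorter final chunk is kept, so its maximum (40) still shows up in the result.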

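If you really want to pad with zeros, you can use np.pad (older NumPy versions also exposed it as np.lib.pad):

def shapeshifter(num_col, my_array=data):
    # number of zeros needed to reach the next multiple of num_col (0 if it already divides evenly)
    pad_width = -my_array.size % num_col
    return np.pad(my_array, (0, pad_width), 'constant', constant_values = 0).reshape(-1, num_col)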
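A quick check of the zero-padding idea on the 40-value example, assuming `np.pad` and a pad length computed as `(-size) % num_col` so that no extra row is added when the size already divides evenly. Keep in mind that zero padding is only harmless for a maximum when the data are non-negative, as they are here.

```python
import numpy as np

data = np.linspace(1, 40, 40)
num_col = 16

# Zeros needed to reach the next multiple of num_col: (-40) % 16 == 8.
pad = (-data.size) % num_col
padded = np.pad(data, (0, pad), 'constant', constant_values=0).reshape(-1, num_col)

print(padded.shape)        # (3, 16)
print(padded.max(axis=1))  # [16. 32. 40.]
```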

Note:

It is also technically possible to use, for example, a.resize(32,2), which creates an ndarray padded with zeros (as you requested). But there are some big caveats:

You would need to calculate the second axis yourself, because the -1 trick doesn't work with resize.

If the original array a is referenced by anything else, a.resize will fail with the following error:

ValueError: cannot resize an array that references or is referenced
by another array in this way.  Use the resize function

The function resize (i.e. np.resize(a, new_shape)) is not equivalent to a.resize: instead of padding with zeros, it wraps around to the beginning and repeats the data.

Since you seem to want to reuse a for several window sizes, a.resize (which modifies the array in place) isn't very useful. But it's a rabbit hole that's easy to fall into.
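The difference between the two kinds of resize can be seen in a short sketch. It uses fresh arrays and passes `refcheck=False` to the method only so that the snippet also runs in interactive sessions that hold extra references; that choice is an assumption for the demo, not a recommendation.

```python
import numpy as np

a = np.arange(1, 41)

# Function form: returns a new array and repeats the data from the start.
wrapped = np.resize(a, (3, 16))
print(wrapped[2, 8:])  # [1 2 3 4 5 6 7 8]

# Method form: enlarges the array in place and fills the new entries with zeros.
b = np.arange(1, 41)
b.resize((3, 16), refcheck=False)
print(b[2, 8:])        # [0 0 0 0 0 0 0 0]
```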

EDIT:

Iterating through a list is slow. If your input is long and the windows are small, the windowFunc above will bog down in the for loop. This should be more efficient:

def windowFunc2(a, window, f = np.max):
    tail = - (a.size % window)
    if tail == 0:
        return f(a.reshape(-1, window), axis = -1)
    else:
        body = a[:tail].reshape(-1, window)
        return np.r_[f(body, axis = -1), f(a[tail:])]
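As a sanity check, the two versions can be compared on the question's data for every window size. Both functions are repeated here so that the snippet runs on its own; the loop over window sizes is an illustrative stand-in for the question's metalooper.

```python
import numpy as np

def windowFunc(a, window, f=np.max):
    # List-based version: one Python-level call of f per window.
    return np.array([f(i) for i in np.split(a, range(window, a.size, window))])

def windowFunc2(a, window, f=np.max):
    # Vectorized version: reshape the evenly divisible part, handle the tail separately.
    tail = -(a.size % window)
    if tail == 0:
        return f(a.reshape(-1, window), axis=-1)
    body = a[:tail].reshape(-1, window)
    return np.r_[f(body, axis=-1), f(a[tail:])]

data = np.linspace(1, 40, 40)

for window in [2, 4, 8, 16, 32]:
    # Both implementations should agree on every window size.
    assert np.array_equal(windowFunc(data, window), windowFunc2(data, window))

print(windowFunc2(data, 16))  # [16. 32. 40.]
print(windowFunc2(data, 32))  # [32. 40.]
```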
