Removing rows at specific indices from a numpy array

Question

My dataset has close to 200 rows, but as a minimal working example, consider the following array:

import numpy as np

arr = np.array([[1,2,3,4], [5,6,7,8],
               [9,10,11,12], [13,14,15,16],
               [17,18,19,20], [21,22,23,24]])

I can take a random sample of 3 rows like this:

indexes = np.random.choice(np.arange(arr.shape[0]), int(arr.shape[0]/2), replace=False)

Using these indices, I can select my test cases like this:

testing = arr[indexes]

I want to delete the rows at these indices so that I can use the remaining rows as my training set.

From the post here, it seems that training = np.delete(arr, indexes) should work, but I end up with a 1d array instead.

I also tried the suggestion here, using training = arr[indexes.astype(np.bool)], but it did not give a clean separation. The row [5,6,7,8] ends up in both the training and test sets:

training = arr[indexes.astype(np.bool)]

testing
Out[101]: 
array([[13, 14, 15, 16],
       [ 5,  6,  7,  8],
       [17, 18, 19, 20]])

training
Out[102]: 
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

Any idea what I am doing wrong? Thanks.

+3

python arrays numpy

sedeh May 20 '15 at 5:01

2 answers

One approach is to get the remaining row indices with np.setdiff1d and then use those row indices to get the desired result -

out = arr[np.setdiff1d(np.arange(arr.shape[0]), indexes)]

Or use np.in1d to build a boolean mask and apply boolean indexing -

out = arr[~np.in1d(np.arange(arr.shape[0]), indexes)]
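Both variants can be checked against the array from the question. The sketch below is illustrative only: the fixed indexes value [3, 1, 4] is an assumption chosen so the result is reproducible (it happens to match the testing output shown in the question), and np.isin is used in place of np.in1d, which newer NumPy releases deprecate.

```python
import numpy as np

arr = np.arange(1, 25).reshape(6, 4)   # same 6x4 array as in the question
indexes = np.array([3, 1, 4])          # assumed fixed sample instead of np.random.choice

# Variant 1: row indices that are NOT in `indexes` (result is sorted and unique).
remaining = np.setdiff1d(np.arange(arr.shape[0]), indexes)
training_a = arr[remaining]

# Variant 2: boolean mask with one entry per row, True for rows to keep.
mask = ~np.isin(np.arange(arr.shape[0]), indexes)
training_b = arr[mask]

testing = arr[indexes]

print(training_a)
print(np.array_equal(training_a, training_b))
```

Note that the mask has length arr.shape[0], one entry per row, which is what boolean indexing along the first axis expects.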

+2

Divakar May 20 '15 at 5:10

farhawa · Accepted Answer · 2015-05-20T05:10:49+0000

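To remove the rows at given indices from a numpy array, pass axis=0 to np.delete. Without an axis argument, np.delete works on the flattened array, which is why the result was 1d: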

arr = np.delete(arr, indexes, axis=0)
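A minimal sketch of the difference, using the array from the question. The fixed indexes value [3, 1, 4] is an assumption made for reproducibility. It also shows why indexes.astype(bool) is not a usable row mask: it converts the three index values themselves to booleans rather than producing one flag per row.

```python
import numpy as np

arr = np.arange(1, 25).reshape(6, 4)   # same 6x4 array as in the question
indexes = np.array([3, 1, 4])          # assumed fixed sample instead of np.random.choice

# No axis: arr is flattened first, then elements 3, 1 and 4 are dropped.
flat = np.delete(arr, indexes)

# axis=0: whole rows 3, 1 and 4 are dropped, the result stays 2d.
training = np.delete(arr, indexes, axis=0)
testing = arr[indexes]

print(flat.shape)      # (21,)
print(training.shape)  # (3, 4)

# Casting indices to bool gives one flag per *index value* (nonzero -> True),
# so its length is 3, not the 6 needed to mask the rows of arr.
bad_mask = indexes.astype(bool)
print(bad_mask)        # [ True  True  True]
```

Because the sample was drawn with replace=False, the indices are unique, so training and testing together contain every row exactly once.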
