Slice 1D Array in Numpy without loop

Question


I have an array x, as shown below:

x=np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
            "8383835F6260127314A0127C078E07090705023846C59F",
            "83838384817E14231D700FAC09BC096808881E1C1BC68F",
            "8484835C535212600F860A1612B90FCF0FCF012A2AC6BF",
            "848484787A7A1A961BAC1E731086005D005D025408C6CF",
            "8484845050620C300D500A9313E613E613012A2A5CC4BF",
            "838383757C7CF18F02192653070D03180318080101BE6F",
            "8584845557570F090E830F4309E5080108012A2A2AC6DF",
            "85858453536B07D608B3124C102A102A1026010101C61F",
            "83838384848411A926791C162048204820484D4444C3BF"], dtype=object)

These are concatenated hex values that I need to split up, convert to integers, and then scale by conversion factors. For example, I need an array like:

[83,83,83,84,84,84,83,85,85,83]

This would be the equivalent of x[:,0:2], but I cannot slice this array that way because its shape is (10,). I am trying to do something similar to what a character array would do in MATLAB. I will be doing this on millions of lines, so I am trying to avoid a loop.


python arrays vectorization numpy slice

user3338505 20 oct. 14 at 17:54


2 answers

Here is a Pythonic way to do it.

Consider part of one of your strings:

x = "83838374747412E61E4C202C004D004D004D020202C3CF8383835F626012"

You can combine map, join, zip and iter to split it into two-character chunks:

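import numpy as np
xArray = np.array(list(map(''.join, zip(*[iter(x)]*2))))  # list() is needed on Python 3, where map is lazy

Then you can convert the hex values to integers using a vectorized form of int:

intHex   = np.vectorize(int)   # wraps int so it is applied element by element
xIntForm = intHex(xArray, 16)  # parse each two-character chunk as base 16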

I'm not sure about the performance of vectorize, even though it is part of NumPy (the NumPy documentation describes it as a convenience that is essentially a for loop).
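As a possible alternative to splitting the string into pairs by hand, the sketch below decodes the hex text with the built-in bytes.fromhex and reinterprets the result with np.frombuffer. This is a different technique from the one in this answer, and it assumes every row has the same even length, as in the question's data. The names raw, byte_matrix and first_bytes are only illustrative.

```python
import numpy as np

# Two sample rows copied from the question; every row has the same length.
x = np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
              "8383835F6260127314A0127C078E07090705023846C59F"], dtype=object)

# Join all rows into one long hex string and decode it to raw bytes in one call.
raw = bytes.fromhex("".join(x))

# Reinterpret the bytes as uint8 and restore one row per original string.
byte_matrix = np.frombuffer(raw, dtype=np.uint8).reshape(len(x), -1)

# Column 0 holds the integer value of the first two hex digits of each row.
first_bytes = byte_matrix[:, 0]
```

Each column of byte_matrix is one two-digit hex field, so fields that span several bytes would still need to be combined afterwards.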

Greetings


mrcl 21 oct. '14 at 3:19


Alex Riley · Accepted Answer · 2014-10-20T19:29:24+0000

If you're just after the first two characters of each hex value, one option is to cast your array to the dtype '|S2':

>>> x.astype('|S2')
array(['83', '83', '83', '84', '84', '84', '83', '85', '85', '83'], 
  dtype='|S2')

This idea can be generalized to return the first n characters of each string.
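A minimal sketch of that generalization, assuming Python 3: there, '|S2' produces bytes values, so the example uses a fixed-width unicode dtype ('U' followed by the length) to keep ordinary strings. The value n = 4 and the name prefixes are chosen only for illustration.

```python
import numpy as np

# Three sample rows copied from the question.
x = np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
              "8383835F6260127314A0127C078E07090705023846C59F",
              "8484835C535212600F860A1612B90FCF0FCF012A2AC6BF"], dtype=object)

n = 4  # illustrative prefix length

# Casting to a fixed-width unicode dtype truncates each string to n characters.
prefixes = x.astype(f"U{n}")
```

This only ever keeps a prefix; it cannot start the slice at a later position.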

Arbitrary slicing of string arrays is much harder in NumPy. The answers on this page explain why NumPy is not the best tool for working with strings, but show what is possible.
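The linked answers are not reproduced here, so the following is only one commonly used trick and not necessarily what they describe: view a fixed-width string array as single characters, slice the resulting 2D character matrix, and view the result as strings again. The names width, chars, start, stop and sliced are illustrative, and strings shorter than width would be padded with empty characters.

```python
import numpy as np

# Two sample rows copied from the question.
x = np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
              "8383835F6260127314A0127C078E07090705023846C59F"], dtype=object)

# Convert to a fixed-width unicode array wide enough for the longest string.
width = max(len(s) for s in x)
fixed = x.astype(f"U{width}")

# View the same memory as one character per element: shape (rows, width).
chars = fixed.view("U1").reshape(len(fixed), width)

# Slice columns like a MATLAB character array, then glue each row back together.
start, stop = 2, 9
sub = np.ascontiguousarray(chars[:, start:stop])
sliced = sub.view(f"U{stop - start}").ravel()
```

The copy made by np.ascontiguousarray is needed because a column slice is not contiguous in memory, and the final view requires contiguous rows.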

Alternatively, the pandas library (which is built on top of NumPy) provides fast vectorized operations. It has a number of very useful string methods that make slicing much easier than in plain NumPy:

>>> import pandas as pd
>>> s = pd.Series(x)
>>> s.str.slice(2, 9)
0    8383747
1    83835F6
2    8383848
3    84835C5
4    8484787
5    8484505
6    8383757
7    8484555
8    8584535
9    8383848
dtype: object
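To get from sliced strings to the integers the question asks for, one option is to follow the slice with a hex conversion. The sketch below uses Series.map with a small lambda, which calls Python's int once per element, so the conversion step itself is not vectorized. The names first_two and as_int are illustrative.

```python
import pandas as pd

# Two sample rows copied from the question.
s = pd.Series(["83838374747412E61E4C202C004D004D004D020202C3CF",
               "8383835F6260127314A0127C078E07090705023846C59F"])

# Take the first two characters of every string.
first_two = s.str.slice(0, 2)

# Parse each two-character chunk as a base-16 integer.
as_int = first_two.map(lambda h: int(h, 16))
```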
