Slice 1D Array in Numpy without loop
I have an array x as shown below:
x=np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
"8383835F6260127314A0127C078E07090705023846C59F",
"83838384817E14231D700FAC09BC096808881E1C1BC68F",
"8484835C535212600F860A1612B90FCF0FCF012A2AC6BF",
"848484787A7A1A961BAC1E731086005D005D025408C6CF",
"8484845050620C300D500A9313E613E613012A2A5CC4BF",
"838383757C7CF18F02192653070D03180318080101BE6F",
"8584845557570F090E830F4309E5080108012A2A2AC6DF",
"85858453536B07D608B3124C102A102A1026010101C61F",
"83838384848411A926791C162048204820484D4444C3BF"], dtype=object)
These are concatenated hex values that I need to slice, convert to integers, and then scale by conversion factors. For example, I need an array like:
[83,83,83,84,84,84,83,85,85,83]
This would be the equivalent of x[:,0:2], but I cannot slice this array that way because its shape is (10,). I am trying to do something similar to what a character array would do in MATLAB. I will be doing this on millions of lines, so I am trying to avoid a loop.
If you're only after the first two characters of each hex value, one option is to convert your array to dtype '|S2':
>>> x.astype('|S2')
array(['83', '83', '83', '84', '84', '84', '83', '85', '85', '83'],
dtype='|S2')
This idea can be generalized to return the first n characters of each string.
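As a rough sketch of that generalization, the helper below casts to a fixed-width string dtype of length n. The name first_chars is hypothetical (it is not a NumPy function), and the sketch assumes Python 3, where '|S2' yields byte strings, so it uses the unicode dtype 'U' instead.

```python
import numpy as np

# Three of the strings from the question, kept short for illustration.
x = np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
              "8484835C535212600F860A1612B90FCF0FCF012A2AC6BF",
              "85858453536B07D608B3124C102A102A1026010101C61F"], dtype=object)

def first_chars(arr, n):
    """Return the first n characters of every string in arr (hypothetical helper)."""
    # Casting to a fixed-width unicode dtype truncates each string to n characters.
    return arr.astype(f'U{n}')

two = first_chars(x, 2)
six = first_chars(x, 6)
```

This only works for prefixes; it cannot take a slice that starts in the middle of the string.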
Arbitrary slicing of string arrays is much harder in NumPy. The answers on this page explain why NumPy is not the best tool for string work, but show what might be possible.
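One technique that may be possible here, offered as a sketch and not as what the linked answers do: view the fixed-width byte strings as a 2D array of ASCII codes and decode the hex digits arithmetically. It assumes every string has the same length, an even number of characters, and only uppercase hex digits (0-9, A-F).

```python
import numpy as np

x = np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
              "8383835F6260127314A0127C078E07090705023846C59F"], dtype=object)

# All rows are assumed to share the width of the first one.
width = len(x[0])
raw = x.astype(f'S{width}')

# One ASCII code per character, shape (n_rows, width).
# int16 avoids unsigned wrap-around in the subtractions below.
codes = raw.view(np.uint8).reshape(len(x), width).astype(np.int16)

# Map each hex digit to its value 0-15 (uppercase digits assumed).
nibbles = np.where(codes >= ord('A'), codes - ord('A') + 10, codes - ord('0'))

# Combine each pair of digits into one byte value.
byte_values = nibbles[:, 0::2] * 16 + nibbles[:, 1::2]

# Ordinary 2D slicing now works, e.g. the first byte of every row.
first_bytes = byte_values[:, 0]
```

Once the data is in this 2D integer form, column slices such as byte_values[:, 3:6] play the role that x[:, 0:2] would have played on a character array.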
Alternatively, the Pandas library (which is built on top of NumPy) provides fast vectorized operations. It has a number of very useful string methods that make slicing much easier than in plain NumPy:
>>> import pandas as pd
>>> s = pd.Series(x)
>>> s.str.slice(2, 9)
0    8383747
1    83835F6
2    8383848
3    84835C5
4    8484787
5    8484505
6    8383757
7    8484555
8    8584535
9    8383848
dtype: object
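To get from a slice to the integers the question asks for, one possible follow-up is to parse each slice as base 16. This is a sketch: the three strings are taken from the question, and the map call still runs a Python-level function per element, so it may not be the fastest option for millions of rows.

```python
import pandas as pd

s = pd.Series(["83838374747412E61E4C202C004D004D004D020202C3CF",
               "8484835C535212600F860A1612B90FCF0FCF012A2AC6BF",
               "85858453536B07D608B3124C102A102A1026010101C61F"])

# Slice the first two characters, then interpret each slice as a hex number.
first_byte = s.str.slice(0, 2).map(lambda h: int(h, 16))
```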
Here is a Pythonic way to do it.
Consider part of your string:
x = "83838374747412E61E4C202C004D004D004D020202C3CF8383835F626012"
You can combine map, join, zip and iter to split it into two-character chunks:
from numpy import array, vectorize
xArray = array(list(map(''.join, zip(*[iter(x)]*2))))  # list() is needed on Python 3, where map is lazy
Then you can convert the hex values to integers using a vectorized form of int:
intHex = vectorize(int)
xIntForm = intHex(xArray,16)
I'm not sure about the performance of the vectorize function, although it is part of NumPy.
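On the performance question: the NumPy documentation describes vectorize as a convenience that is essentially a for loop. A different route that may be faster, sketched here as an alternative to the approach above, is to let Python decode the whole hex string with bytes.fromhex and read the result with np.frombuffer. It assumes the string has an even number of valid hex digits.

```python
import numpy as np

x = "83838374747412E61E4C202C004D004D004D020202C3CF8383835F626012"

# Decode every pair of hex digits into one raw byte.
raw = bytes.fromhex(x)

# Interpret the bytes as unsigned 8-bit integers without copying.
values = np.frombuffer(raw, dtype=np.uint8)
```

The resulting array is read-only because it shares memory with the bytes object; call values.copy() if it needs to be modified.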
Greetings