What is the difference between types <type 'numpy.string_'> and <type 'str'>?
numpy.string_ is the NumPy data type used for arrays containing fixed-width byte strings. On the other hand, str is a native Python type and cannot be used as a data type for NumPy arrays*.
If you create a NumPy array containing strings, the array will use the numpy.string_ type (or the numpy.unicode_ type in Python 3). More precisely, the array will use a sub-dtype of np.string_:
>>> import numpy as np
>>> a = np.array(['abc', 'xy'])
>>> a
array(['abc', 'xy'], dtype='<S3')
>>> np.issubdtype('<S3', np.string_)
True
In this case, the data type is '<S3': < denotes the byte order (little-endian), S denotes the string type, and 3 indicates that each value in the array holds up to three characters (or bytes).
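As a minimal sketch of how these pieces can be read back from an array, the dtype object exposes them as attributes. This assumes Python 3, where byte strings are written with a b prefix, and it uses np.bytes_, the name that np.string_ was an alias for (newer NumPy releases only keep np.bytes_).

```python
import numpy as np

# Bytes literals produce a fixed-width byte-string array on Python 3.
a = np.array([b'abc', b'xy'])

dt = a.dtype
kind = dt.kind          # type code: 'S' means byte string
itemsize = dt.itemsize  # width of each element in bytes

# The concrete dtype is a sub-dtype of the generic byte-string type.
is_bytes = np.issubdtype(dt, np.bytes_)

print(dt, kind, itemsize, is_bytes)
```

The width is taken from the longest element, so the shorter value b'xy' is padded internally to three bytes.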
One property that np.string_ and str share is immutability. Attempting to increase the length of a Python str object will create a new object in memory. Likewise, if you want a fixed-width NumPy array to hold more characters, a new, larger array must be created in memory.
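The fixed width can be illustrated with a small sketch, again assuming Python 3 byte literals. Writing a value that is too long into an existing element does not grow the array; the value is silently truncated. To store it in full, a new array with a wider dtype has to be made, here with astype.

```python
import numpy as np

a = np.array([b'abc', b'xy'])   # each element is 3 bytes wide

# The array cannot grow in place: longer values are cut to 3 bytes.
a[1] = b'hello'
truncated = a[1]

# astype allocates a new array whose elements are 5 bytes wide.
wider = a.astype('S5')
wider[1] = b'hello'

print(truncated, wider[1], np.shares_memory(a, wider))
```

The original array a is left with its three-byte elements; only the new array can hold the full value.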
* Note that it is possible to create a NumPy array of object dtype that contains references to Python str objects, but such arrays behave differently from regular arrays.
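For comparison, a short sketch of such an object array: each slot holds a reference to an ordinary Python str, so there is no fixed width and no truncation on assignment.

```python
import numpy as np

# dtype=object stores references to Python objects, not raw bytes.
obj = np.array(['abc', 'xy'], dtype=object)

# Any Python str fits, whatever its length.
obj[1] = 'a much longer string'

print(obj.dtype, type(obj[0]), obj[1])
```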