Convert character array that includes asterisks to numeric number in MATLAB

Question

I am trying to convert character arrays containing asterisks ('*') to numbers.

I have a cell array of character vectors based on data imported from a .dat file. For example, a cell array C contains a column of cells (for example C{1,1}, C{2,1}, ... C{n,1}), each containing a character vector. For example, C{1,1} contains:

'23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719'

When I try to convert C{1,1} to a numeric double, MATLAB returns an empty array:

new_double = str2num(C{1,1})

new_double =

     []

When I remove the asterisks manually, the code works:

 new_double = str2num(C{1,1})

 new_double =

   23.0000    1.0000    1.0000    1.0000   34.0000    5.0650    6.7190

All I want to do is read the data into a double array for further processing. I don't care whether the command ignores the asterisks or replaces them with NaNs; the data with asterisks is not important to me. What matters is that I can read the data from the last two columns, for example 5.065 and 6.719. Unfortunately, I cannot index them because they are embedded in a character vector.

I have also tried using:

c2 = C{1,1};
new_double = sscanf(c2,'%f%');

But it stops reading at the first asterisk, for example:

new_double =

    23

I have searched everywhere, and the only helpful post is: https://uk.mathworks.com/matlabcentral/answers/127847-how-to-read-csv-file-with-asterix However, I cannot use that method because I am working with a character vector, not a delimited file.

+3

string arrays double matlab character

PajamaNinja May 19 '17 at 17:55


2 answers

Let's do both. In the first case, where you want to ignore the asterisks, you can remove them from the string and run str2num as usual. Defining your data:

C{1,1} = '23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719';

... you can use regular expressions to remove the asterisks, including several in a row (for example ** or ***), by replacing them with an empty string using regexprep:

out = regexprep(C, '\*+', '');

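This says that, for every string in the cell array C, any run of one or more * characters is replaced with an empty string. The asterisk has to be escaped as \* because a bare * is a quantifier in a regular expression.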

In this case, we get:

>> out = regexprep(C, '\*+', '')

out =

  cell

    '23.000                          1.000      1.000      1.000     34.000      5.065      6.719'

You can then call str2num on the result. If you would rather replace the asterisks with NaN, use regexprep again with NaN as the replacement instead of an empty string:

out = regexprep(C, '\*+', 'NaN');

We get:

>> out = regexprep(C, '\*+', 'NaN')

out =

  cell

    '23.000          NaN          NaN      1.000      1.000      1.000     34.000      5.065      6.719'

The point is to replace the offending parts of your string with something else, and regexprep can definitely help.
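To connect this back to the question, here is a minimal sketch of the full round trip: replace the asterisks with NaN, convert with str2num, then index the last two columns. The variable names vals and lastTwo are made up for illustration, and the sketch assumes every row has the same layout as the sample row in the question.

```matlab
% Sample row from the question, stored in a 1x1 cell array
C = {'23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719'};

% Replace each run of asterisks with NaN (note the escaped asterisk)
out = regexprep(C, '\*+', 'NaN');

% str2num expects a character vector, so index into the cell first
vals = str2num(out{1});

% The values of interest are the last two columns
lastTwo = vals(end-1:end);
```

Using the NaN replacement rather than the empty string keeps every value in its original column, which matters if you later need columns by position counted from the left.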

+2

rayryeng May 19 '17 at 18:09


Luis mendo · Accepted Answer · 2017-05-19T18:12:27+0000

Here's another way:

C{1,1} = '23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719';
result = str2double(strsplit(C{1}));

This gives

result =
   23.0000       NaN       NaN    1.0000    1.0000    1.0000   34.0000    5.0650    6.7190

It works like this:

strsplit splits the string at whitespace. This gives a cell array of substrings, each formed by adjacent non-space characters.

str2double converts each of the cells to a number and returns a numeric vector, with NaN for entries that cannot be interpreted as numbers.

The advantage of using str2double over str2num is that the former does not use eval internally, so it cannot run potentially dangerous code.
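The same idea can be applied to the whole column of cells from the question. The following is only a sketch under stated assumptions: the second row of C is invented for illustration, the names rows, data and lastTwo are hypothetical, and every row is assumed to contain the same number of whitespace-separated fields so that the rows can be stacked into one matrix.

```matlab
% Column cell array of character vectors; the second row is made-up sample data
C = {'23.000          *          *      1.000      1.000      1.000     34.000      5.065      6.719';
     '24.000      2.000          *      1.000      1.000      1.000     35.000      5.100      6.800'};

% Convert each row; strtrim avoids an empty first token if a row starts with spaces
rows = cellfun(@(s) str2double(strsplit(strtrim(s))), C, 'UniformOutput', false);

% Stack the row vectors into a numeric matrix (requires equal field counts per row)
data = vertcat(rows{:});

% The last two columns are the values the question is after
lastTwo = data(:, end-1:end);
```

If the rows can differ in length, vertcat will raise an error, so in that case it is safer to keep the results in the cell array rows and index each one separately.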
