MATLAB - extract numbers from (odd) string

I have a series of lines in a CSV file; they all look like the two below:

7336598,"[4125420656L, 2428145712L, 1820029797L, 1501679119L, 1980837904L, 380501274L]"
7514340,"[507707719L, 901144614L, 854823005L]"
....

How can I extract the numbers from these lines? That is, I want to recover 7336598, 4125420656, etc.

I tried textscan and regexp, but without much success ...

Sorry for the newbie question ... and thanks for watching! :)

Edit: The size of each line is variable.



2 answers


You can use just textread and regexp to extract the numbers from the CSV file:

C = textread('file.csv', '%s', 'delimiter', '\n');
C = regexp(C, '\d+', 'match'); 

      

The regex is pretty simple. In a MATLAB regexp pattern, \d denotes a digit, and + indicates that the digit must occur at least once. The 'match' option tells regexp to return the matched strings.
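
To see what the pattern and the 'match' option do on a single line, here is a small sketch that works on a string in memory rather than on a file. The variable names sampleLine, tokens and values are only illustrative, and str2double is used here as one possible way to convert the matches.

```matlab
% One sample line from the question, held in memory instead of read from a file.
sampleLine = '7514340,"[507707719L, 901144614L, 854823005L]"';

% '\d+' matches each run of consecutive digits; 'match' returns the matched text.
% The trailing L after each number is not a digit, so it is never captured.
tokens = regexp(sampleLine, '\d+', 'match');   % 1x4 cell array of strings

% str2double accepts a cell array of strings and returns a numeric array.
values = str2double(tokens);
```

All the values shown in the question are small enough to be represented exactly as doubles, so no precision is lost in the conversion.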

The result is a cell array with one cell per line, each holding that line's matched strings. You can then convert the strings to numeric values:



C = cellfun(@(x)str2num(sprintf('%s ', x{:})), C, 'UniformOutput', false)

      

The result is stored in a cell array. If you can guarantee that every line yields the same number of numeric values, you can convert the cell array to a matrix:

A = cell2mat(C);
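
Since the question says the lines vary in length, cell2mat will most likely fail on the real data. One possible workaround, sketched here as an assumption and not as part of the original answer, is to pad the shorter rows with NaN. The small cell array C below is a hypothetical stand-in for the numeric cell array produced by the cellfun call above.

```matlab
% Hypothetical stand-in for the numeric cell array C:
% one row vector per line, with differing lengths.
C = {[7336598 4125420656 2428145712]; [7514340 507707719]};

% Find the length of each row, then allocate a NaN-filled matrix
% as wide as the longest row.
lens = cellfun(@numel, C);
A = NaN(numel(C), max(lens));

% Copy each row into the leading columns; unused trailing columns stay NaN.
for k = 1:numel(C)
    A(k, 1:lens(k)) = C{k};
end
```

If the NaN padding is unwanted, keeping the data in the cell array and indexing it as C{k} is the simpler choice.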

      



I don't have MATLAB at hand to test this, but wouldn't the pattern [0-9]+ get the job done?

This works for me outside MATLAB:



echo '7336598,"[4125420656L, 2428145712L, 1820029797L, 1501679119L, 1980837904L, 380501274L]"' | grep -o '[0-9]\+'
7336598
4125420656
2428145712
1820029797
1501679119
1980837904
380501274

      
