Is it possible to ignore comment lines when reading a CSV file in Octave / MATLAB?
I have a data file that looks like this:
# data file
# blah
# blah
0.000000, 0.0, 24.198, 6.864,NaN,NaN,NaN,NaN
0.020000, 0.0, 24.198, 6.864,NaN,NaN,NaN,NaN
0.040000, 0.0, 24.198, 6.864,NaN,NaN,NaN,NaN
0.060000, 0.0, 24.198, 6.864,NaN,NaN,NaN,NaN
0.080000, 0.0, 24.198, 6.864,NaN,NaN,NaN,NaN
0.100000, 0.0, 24.198, 6.864,NaN,NaN,NaN,NaN
0.120000, 0.0, 24.198, 6.864,NaN,NaN,NaN,NaN
and I would like to read it in Octave.
csvread(file, 3, 0) works fine in this case, but I would rather not have to work out the 3 manually.
Is there a way to say "throw away all lines starting with # and any blank lines before doing csvread"?
In Octave, you can do
d = load("yourfile")
which should ignore lines starting with #.
Edit:
The above relies on automatic file type detection; you can also force the format with d = load("-ascii", "yourfile").
Quoting from help load:
'-ascii'
Force Octave to assume the file contains columns of numbers in
text format without any header or other information. Data in
the file will be loaded as a single numeric matrix with the
name of the variable derived from the name of the file.
Unfortunately, the help doesn't mention that lines starting with % or # are ignored. To find that out, you need to look at the source code (which is fortunately available, since GNU Octave is free software): see get_mat_data_input_line in the Octave source.
There you will see that all characters after a % or # are skipped.
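As a quick check of the behaviour described above, here is a minimal Octave sketch. The file contents and the temporary file name are made up for illustration; the only calls used are tempname, fopen, fprintf, fclose, load and delete.

```octave
% Write a small sample file with '#' header lines (hypothetical contents).
fname = tempname();
fid = fopen(fname, "w");
fprintf(fid, "# data file\n# blah\n0.00, 0.0, 24.198\n0.25, 0.0, 24.198\n");
fclose(fid);

% Force plain-text matrix parsing; the '#' lines are skipped by load.
d = load("-ascii", fname);
disp(size(d));

delete(fname);
```

This relies on Octave's load; MATLAB's load treats % (not #) as the comment character, so the same call is not expected to work there.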
csvread does not offer this option. You can use textscan instead, but then you need to know how many columns (or rows) your CSV file has.
For example:
fid = fopen('csvFile.csv','r');
c = textscan(fid,'%f','commentStyle','#','delimiter',',');
fclose(fid); %# close the file as soon as we don't need it anymore
% textscan returns all values as one column in file order; 7 is the number of data rows
array = reshape([c{:}],[],7)';
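The reshape above needs the row or column count up front. One way to avoid hardcoding it is to peek at the first data line and count the delimiters. This is only a sketch: the sample file is made up, and it assumes the file contains at least one data line and that every data line has the same number of fields.

```octave
% Build a small sample file (hypothetical contents).
fname = tempname();
fid = fopen(fname, 'w');
fprintf(fid, '# data file\n# blah\n0.0, 1.5, 3.5\n0.5, 2.5, 4.5\n');
fclose(fid);

fid = fopen(fname, 'r');
% Find the first line that is neither blank nor a '#' comment.
txt = fgetl(fid);
while ischar(txt) && (isempty(strtrim(txt)) || strncmp(strtrim(txt), '#', 1))
  txt = fgetl(fid);
end
ncols = numel(strsplit(txt, ','));   % number of fields per data line

% Go back to the start and read all numbers as one long column.
frewind(fid);
c = textscan(fid, '%f', 'CommentStyle', '#', 'Delimiter', ',');
fclose(fid);

% Each column of the reshaped matrix is one file row, so transpose.
data = reshape(c{1}, ncols, []).';
delete(fname);
```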
Here's a way to skip header lines that start with the comment character. The csvread call can be replaced with a call to dlmread for delimiters other than ','. Both of these functions are much faster than textscan in Octave 3.8.2.
fid = fopen('csvFile.csv','r');
comment = '#';
while strcmp(fgets(fid, length(comment)), comment)
% line begins with a comment, skip it
fskipl(fid);
endwhile
% step back: the last length(comment) characters read
% belong to the first data line, not to a comment
fseek(fid, -length(comment), SEEK_CUR);
c = csvread(fid);
fclose(fid);
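A variant of the same idea, sketched under assumptions: count the leading comment and blank lines, then pass that count to dlmread as the row offset, so the 3 from the question never has to be worked out by hand. The sample file is made up, and this only handles header lines at the top of the file, not comments interleaved with data.

```octave
% Build a small sample file (hypothetical contents, with one blank line).
fname = tempname();
fid = fopen(fname, 'w');
fprintf(fid, '# data file\n\n# blah\n0.0, 1.5, NaN\n0.5, 2.5, NaN\n');
fclose(fid);

% Count leading lines that are blank or start with '#'.
fid = fopen(fname, 'r');
nskip = 0;
txt = fgetl(fid);
while ischar(txt) && (isempty(strtrim(txt)) || strncmp(strtrim(txt), '#', 1))
  nskip = nskip + 1;
  txt = fgetl(fid);
end
fclose(fid);

% Read the data, starting at the first non-header row (zero-based offset).
data = dlmread(fname, ',', nskip, 0);
delete(fname);
```

Because it uses only fgetl and dlmread with a row offset, this form does not depend on Octave-only functions such as fskipl.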