How to import Excel data into SAS
I solved a similar problem in SAS 9.2 by importing in two strokes: one to explore the worksheet and one to extract the data.
This is a generalization of what I did there, but excuse me for posting code that I haven't tested: I don't have SAS installed on my PC. Let's assume your data looks like this (when saved as a tab-delimited file):
Some title that does not interest us
Author Dirk Horsten
Date 01-Jan-15
Other Irrelevant thing
Bar Foo Val Remark
A Alfa 1 This is the first line
B Beta 2 This is the second line
C Gamma 3 This is the last line
So the actual data starts in cell C6 with the column heading "Bar". Suppose also that we know we will find the columns "Foo", "Bar" and "Val", and perhaps some other columns we are not interested in, in an unknown order, and that we do not know in advance how many rows of data there are.
Now we naively import the sheet a first time and query sasHelp to see what was read in:
/** First stroke import, to explore the content of the sheet **/
proc import datafile="&file_name" out=temp_out dbms=excelcs replace;
    sheet="&sheet_name";
run;
/** Find out what SAS read in **/
proc sql;
    /* number of columns in the imported dataset */
    select count(*) into :nrCols
        from sashelp.vcolumn where libName = 'WORK' and memName = 'TEMP_OUT';
    /* space-separated list of the column names */
    select name into :tempCols separated by ' '
        from sashelp.vcolumn where libName = 'WORK' and memName = 'TEMP_OUT';
quit;
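Before going further, it may be worth checking how the columns were typed, because the array in the following data step needs them all to be character. A minimal, untested sketch, assuming the dataset is still called temp_out; the macro variable name nrNumCols is my own:

```sas
/* Sketch: count the numeric columns in WORK.TEMP_OUT */
proc sql noprint;
    select count(*) into :nrNumCols
        from sashelp.vcolumn
        where libName = 'WORK' and memName = 'TEMP_OUT' and type = 'num';
quit;
%let nrNumCols = &nrNumCols.;   /* strips the leading blanks left by INTO */
%put NOTE: temp_out has &nrNumCols. numeric column(s).;
```

If the count is not zero, the array statement in the next step will fail, because an array cannot mix character and numeric variables.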
Next, we look for the headers and the data, so we know how to read it correctly. This works if all columns have been interpreted as character values, but unfortunately Excel cannot be forced to do so.
data _null_;
    set temp_out end=last;
    array temp {*} &tempCols.;
    retain foo_col bar_col val_col range_bottom 0;
    if not (foo_col and bar_col and val_col) then do;
        range_left = 0;
        range_right = 0;
        /* Find out if we finally found the headers */
        do col = 1 to &nrCols.;
            select (upcase(temp(col)));
                when ('FOO') do;
                    foo_col = col;
                    if not range_left then range_left = col;
                    range_right = col;
                end;
                when ('BAR') do;
                    bar_col = col;
                    if not range_left then range_left = col;
                    range_right = col;
                end;
                when ('VAL') do;
                    val_col = col;
                    if not range_left then range_left = col;
                    range_right = col;
                end;
                otherwise;
            end;
        end;
        if (foo_col and bar_col and val_col) then do;
            /** remember where the headers were found **/
            range_top = _N_ + 1;
            /* symputx strips the blanks that symput would leave around a number */
            call symputx ('rangeTop', range_top);
            rangeLeft = byte(rank('A') + range_left - 1);
            call symputx ('rangeLeft', rangeLeft);
            rangeRight = byte(rank('A') + range_right - 1);
            call symputx ('rangeRight', rangeRight);
        end;
    end;
    else do;
        /** find out if there is data on this line **/
        if (temp(foo_col) ne '' and temp(bar_col) ne '' and temp(val_col) ne '')
            then range_bottom = _N_ + 1;
    end;
    /** remember where the last data was found **/
    if last then call symputx ('rangeBottom', range_bottom);
run;
To calculate rangeTop and rangeBottom, we take into account that the _N_th observation in SAS comes from row _N_ + 1 in Excel, because the first row in Excel is interpreted as headers.
To calculate rangeLeft and rangeRight, we take the positions of the leftmost and rightmost columns of the range we will read and convert them to letters. Note that byte(rank('A') + n - 1) only covers columns A to Z.
Now we read only the relevant data:
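The single-letter conversion stops working once the range extends past column 26. A minimal, untested sketch of a more general conversion, assuming an ASCII system; the variable names col, n, r and letters are mine and not part of the code above:

```sas
/* Sketch: convert a 1-based column number to Excel column letters */
data _null_;
    col = 28;                  /* example input: the 28th column */
    length letters $3;
    letters = '';
    n = col;
    do while (n > 0);
        r = mod(n - 1, 26);    /* 0 = A, 1 = B, ..., 25 = Z */
        letters = cats(byte(rank('A') + r), letters);
        n = (n - 1 - r) / 26;  /* move on to the next, more significant letter */
    end;
    put col= letters=;
run;
```

For col = 28 the loop first yields B and then A, so letters ends up as AB. The same loop could replace the two byte(...) expressions if wider sheets have to be supported.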
/** Second stroke import, to read in the actual data **/
proc import datafile="&file_name" out=&out_ds dbms=excelcs replace;
    sheet="&sheet_name";
    /* an Excel range needs a colon between its two corners, e.g. C6:F9 */
    range="&rangeLeft.&rangeTop.:&rangeRight.&rangeBottom.";
run;
Success. Feel free to test this code if you have SAS on your machine and fix it.
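The code relies on the macro variables file_name, sheet_name and out_ds, which are never defined in this post. A minimal sketch of how they might be set before the first import; the path, sheet name and dataset name are placeholders of mine:

```sas
/* Hypothetical values: replace them with your own workbook, sheet and target dataset */
%let file_name  = C:\somefile.xlsx;
%let sheet_name = Sheet1;
%let out_ds     = work.my_data;
```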
The following should work no matter how many lines precede your data, as long as the lines preceding your data are completely empty.
libname xl excel 'C:\somefile.xlsx';
data sheet;
    set xl.'Sheet1$'n;
run;
libname xl clear;
This sets up your Excel workbook as a database and the sheets link directly to tables. I should note that my setup is 64-bit SAS 9.4 with 64-bit Excel; I understand that this approach may not work as expected if, for example, you have 64-bit SAS and 32-bit Excel.
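Since the sheets show up as tables of the library, you can also list them before deciding which one to read. An untested sketch, assuming the same workbook path as above and that the libname assignment succeeds on your setup:

```sas
libname xl excel 'C:\somefile.xlsx';
/* Sketch: list the members (sheets and named ranges) that SAS sees in the workbook */
proc sql;
    select memname
        from dictionary.tables
        where libname = 'XL';
quit;
libname xl clear;
```

Sheet names normally appear with a trailing $, which is why the data step refers to the sheet as 'Sheet1$'n.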