Difference between call missing on an array and setting each array element to zero
Consider:
data want;
    set have;
    array ay1{7};
    retain ay1:;
    if first.tcol then do;
        call missing(of ay1{*});
    end;
    if ay1{7} > a then do;
        [other statements to determine value of ay1]
    end;
    else do;
        [other statements to determine value of ay1]
    end;
    by tcol;
run;
Since the array ay1 will be automatically retained between observations, if I want the program to do some by-group processing, I need to reset the ay1 values when it encounters a new tcol value. Otherwise it will inherit the values from the last observation of the previous tcol value. That affects if ay1{7} > a, so all subsequent statements are affected as well.
In my current code, I reset the array with:
do i = 1 to dim(ay1);
    ay1{i} = 0;
end;
This works fine. For the first observation of each tcol value, it first resets all ay1 values to 0 and then executes [other statements] to update ay1 in that observation.
If I use call missing(of ay1{*}); instead, then for the first observation of each tcol value it sets every ay1 value to missing (as expected). But the following [other statements] will NOT update ay1. (I put a few put statements in [other statements] as a debugging step to make sure this part is executed. It does all the other work except updating the values of ay1.)
If first.tcol is false, everything appears to be normal (there is no error, but the output data set is incorrect, since the first observation in each group has unexpected values). So I think there is something wrong with using call missing here.
Your statement that the array ay1 is automatically retained between observations is incorrect. Arrays declared as _TEMPORARY_ are retained automatically. Arrays of permanent (data set) variables are not.
You can check this with:
data want;
    set have;
    array ay1{7};
    by tcol;
    if first.tcol then do;
        do i=1 to 7;
            ay1[i] = 1;
        end;
    end;
run;
You will see that after the first observation of each tcol value, the array values are missing within each group, i.e. they were not retained between observations.
Add
retain ay:;
to your data step and it should work as you expect.
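For contrast, here is a minimal sketch of the _TEMPORARY_ case mentioned above. The data set have_demo and the names tmp, seventh and want_temp are hypothetical and only serve the illustration; the array is filled on the first row of each group only, yet the copied element should be 1 on every row, because temporary array elements are not reset between observations.

```sas
/* Hypothetical demo data: two groups of two rows each, already sorted by tcol */
data have_demo;
    do tcol = 1 to 2;
        do a = 1 to 2;
            output;
        end;
    end;
run;

data want_temp;
    set have_demo;
    by tcol;
    array tmp{7} _temporary_;   /* temporary array: elements are retained automatically */
    if first.tcol then do;
        do i = 1 to 7;
            tmp[i] = 1;         /* assigned on the first row of each group only */
        end;
    end;
    seventh = tmp[7];           /* copy one element to a data set variable so it is visible */
    drop i;
run;
```

Removing _temporary_ (and not adding a retain statement) would leave the array values missing on the second row of each group, which is the behaviour the check above demonstrates.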
EDIT: Added this to show that it works as described.
data have;
    tcol = 1;
    a = 1;
    output;
    tcol = 1;
    a = 10;
    output;
run;
data want;
    set have;
    array ay1{7};
    retain ay1:;
    by tcol;
    if first.tcol then do;
        call missing(of ay1{*});
    end;
    if ay1{7} > a then do;
        do i=1 to 7;
            ay1[i] = -a;
        end;
    end;
    else do;
        do i=1 to 7;
            ay1[i] = 999;
        end;
    end;
run;
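One difference between the two resets that may be worth checking: a missing numeric value compares lower than any number, and arithmetic with the + operator on a missing value yields missing. So if the elided statements in the question compare against or add to the array elements (an assumption, since they are not shown), a reset to missing and a reset to 0 can take different paths. A small sketch with hypothetical variable names and a hypothetical negative value for a:

```sas
data _null_;
    a = -5;                  /* hypothetical value, chosen to be negative */
    x_missing = .;           /* what call missing leaves in an element */
    x_zero = 0;              /* what the explicit loop leaves in an element */

    /* missing sorts below every number, so this comparison is false */
    if x_missing > a then put 'missing: IF branch';
    else put 'missing: ELSE branch';

    /* 0 > -5 is true, so the other branch is taken */
    if x_zero > a then put 'zero: IF branch';
    else put 'zero: ELSE branch';

    /* + propagates missing; the SUM function ignores missing arguments */
    y = x_missing + 1;
    z = sum(x_missing, 1);
    put y= z=;               /* y is missing, z is 1 */
run;
```

With the have data above (a is 1 and 10) both resets take the same branch on the first observation, so the difference only shows up for negative a or for statements that accumulate into the array with +.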