How can I discard observations based on their observation (row) number within a group?

In Stata, I can do this:

bysort group_var: drop if _n > 6

      

This keeps only the first six observations within each group defined by group_var. How do I do this in SAS?

I tried:

proc sort data=indata out=sorted_data;
    by group_var;
run;

data outdata;
    set sorted_data;
    by group_var;
    if (_n_ > 6) then delete;
run;

      

but that drops everything except the first six observations of the entire dataset, leaving me with just six observations in total.



1 answer


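You need to count the observations within each group yourself. The automatic variable _n_ counts iterations of the DATA step, so it keeps increasing across the whole dataset and is not reset at the start of each BY group. Keep your own counter and reset it whenever first.group_var is true: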



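data outdata;
   set sorted_data;
   by group_var;          /* sorted_data must already be sorted by group_var */
   retain count;          /* keep the value of count across observations */

   /* restart the counter on the first observation of each group */
   if first.group_var then
      count = 0;

   count = count + 1;
   if count > 6 then delete;   /* drop the 7th and later rows of the group */

   drop count;            /* do not write the helper variable to outdata */
run;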
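
To see the counter at work, here is a minimal self-contained sketch of the same approach. The sample data are hypothetical (made up for illustration, as is the variable value), and it uses a sum statement (count + 1;), which retains count automatically, together with a subsetting IF instead of retain and delete. Treat it as one possible variant, not the only way to write it.

```sas
/* Hypothetical sample data: group A has 8 rows, group B has 3 rows */
data indata;
    input group_var $ value;
    datalines;
A 1
A 2
A 3
A 4
A 5
A 6
A 7
A 8
B 1
B 2
B 3
;
run;

/* BY-group processing requires the data to be sorted by group_var */
proc sort data=indata out=sorted_data;
    by group_var;
run;

data outdata;
    set sorted_data;
    by group_var;
    if first.group_var then count = 0;  /* restart at each new group */
    count + 1;                          /* sum statement: count is retained implicitly */
    if count <= 6;                      /* subsetting IF: keep rows 1-6 of each group */
    drop count;
run;
```

With this sample, outdata ends up with nine observations: the first six rows of group A and all three rows of group B.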

      
