Most Efficient Way to Order Columns in SAS

I have a dataset temp containing variables A1, A2, ... Amax. I want to change its internal column order so that, when opened, it shows A2, A5, .....

I know a couple of ways to do this. I usually use the retain statement.

If the number of observations is large (N > 1,000,000), what is the most efficient way to do this? A data step with retain, proc sql, or something else?

The most efficient method means the least processing time for me. I would appreciate it if you can also provide an analysis of the memory and disk space required for each method.



1 answer


Several years ago I attended a SAS conference at one of their headquarters in the UK. They ran a workshop very similar to your question, where they compared the speed of different methods for reordering and merging / joining datasets.

The three methods SAS presented were:

  • Traditional data step (retain)

  • Proc SQL (create table)

  • Hash tables (especially for merging tables that don't have to be reordered)

The interesting result was that, unless you are talking about a very large dataset, retain and create table perform about the same.
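To check this on your own data, a minimal sketch along the following lines may help. It assumes a hypothetical test dataset temp with variables A1 to A5 (the real variable list and target order would differ) and uses the fullstimer system option so the log reports real time, CPU time, and memory for each step.

```sas
options fullstimer;  /* log real time, CPU time, and memory per step */

/* Hypothetical test data: 1,000,000 rows with variables A1-A5 */
data temp;
  array a{5} A1-A5;
  do i = 1 to 1000000;
    do j = 1 to 5;
      a{j} = i + j;
    end;
    output;
  end;
  drop i j;
run;

/* Method 1: data step; retain placed before set fixes the column order */
data temp_retain;
  retain A2 A5 A1 A3 A4;
  set temp;
run;

/* Method 2: proc sql; columns are written in the order they are selected */
proc sql;
  create table temp_sql as
  select A2, A5, A1, A3, A4
  from temp;
quit;
```

Comparing the log notes for the two steps gives the processing time and memory each method used. Both methods write a full copy of the data, so each needs disk space for a second dataset of roughly the same size.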



Obviously, if you want to merge / join and reorder at the same time, proc sql is the way to go, because merging in a data step requires you to sort the inputs first and proc sql does not. And if the data is really big, hash tables can save 90% of the processing time on merges / joins.
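As an illustration of the hash approach, here is a minimal sketch of a join on unsorted data. The datasets big and small and the variables id, amount, and region are hypothetical names chosen for the example, not taken from the workshop.

```sas
/* Hypothetical lookup table */
data small;
  input id region $;
  datalines;
1 North
2 South
;
run;

/* Hypothetical main table, deliberately not sorted by id */
data big;
  input id amount;
  datalines;
2 10
3 20
1 30
;
run;

data joined;
  if 0 then set small;              /* define id and region without reading any rows */
  if _n_ = 1 then do;
    declare hash h(dataset: 'small'); /* load the lookup table into memory once */
    h.defineKey('id');
    h.defineData('region');
    h.defineDone();
  end;
  set big;
  if h.find() = 0 then output;      /* keep only rows whose id has a match */
run;
```

Neither input is sorted here. The trade-off is that the lookup table has to fit in memory, so this tends to suit a large table joined to a smaller one.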

Another result from the group discussion was that, for large datasets, using views improves I/O performance when reordering:

proc sql noprint;
  create view set2 as
  select title, *
  from set1;
quit;

** OR;

data set2 / view=set2;
  retain title salary name;
  set set1;
run;

      

(Link here: http://www2.sas.com/proceedings/sugi27/p019-27.pdf )
