# SAS Date Difference by Group

Consider the dataset ` test `

as follows:

```
Group Date
1 05JAN2014
1 08JAN2014
1 14JAN2014
2 05JAN2013
2 10FEB2015
2 27FEB2015
```

I want to calculate the difference in dates by group. The code below takes into account the difference between every two dates:

```
data test;
datediff = dif(Date);
run;
```

How can I only distinguish between dates in one group? Also, is there a way to change the difference between the last and first dates in each group?

source to share

Starting from this:

```
data test;
datediff = dif(Date);
run;
```

Let me address your numbers one at a time. Firstly, by adding an operator `set`

and an operator `by`

, we can also add `first`

and `last`

so that you can determine where you are in the group. This assumes that it is already sorted `by group`

.

```
data test2;
set test;
by group;
datediff=dif(date);
run;
```

It doesn't work any differently (assuming you had the set assignment originally, anyway). But now you have new possibilities.

First, if you can use `dif`

, I recommend the method for that `retain`

. You can more easily see what it does and avoid some common pitfalls: in particular, `lag`

and `dif`

not actually being compared to the previous entry - they create a queue and are compared to what can lead to complications when using conditional statements.

```
data test2;
set test;
by group;
retain last_date;
if first.group then last_date=0;
datediff = date - last_date;
output;
last_date = date;
run;
```

This does the same thing as before - it compares the previous value to the current value, but makes it easier to see, and we add an option to reset the variable `last_date`

when `first.group`

true - that means we are in the first line of the new group value. I won't discard any of these intermediate variables, but in production code, you can and should. `retain`

means the value will be stored across lines (instead of resetting every time you get a new line).

Now that you have one variable keeping track of the previous string value `date`

, it's pretty easy to hope to see how we can do this for the first and last difference.

```
data test2;
set test;
by group;
retain last_date orig_date;
if first.group then do;
last_date=0;
orig_date=date; **new;
end;
datediff = date - last_date;
if last.group then group_datediff = date-orig_date; **new;
output;
last_date = date;
run;
```

Now we have done the same as before, but we restart orig_date every time we see `first.group`

and calculate the group_datediff when we click `last.group`

.

source to share

Here is a method using the `lag`

new feature `ifn`

as well ( `ifn`

included in SAS 9.2 or later). Be careful with the delay function as it can sometimes lead to unexpected results. See this article for details .

```
*Data must be sorted to use BY groups;
proc sort data=have; by group date;run;
data want;
set have;
by group;
dateDiff = ifn(first.group, . , dif(date));
retain firstDate;
if first.group then firstDate = date;
if last.group then dateDiff_all = date - firstDate;
run;
```

source to share

Here is a solution using proc sql and a group statement using the min and max pivot functions.

This will give you the difference in each group for the first and last dates.

Technically this will give you the difference in the smallest and largest dates, but you are date in chronological order, so this will work for that data.

```
proc sql;
create table want as select
group,
max(date) - min(date) as datediff
from have
group by group;
quit;
```

source to share