Script the next row data for current

2019-03-03 10:27发布

问题:

What can I do if I want to copy the data from the next row. For example customer A started his current trip on 01JAN2015 and next trip on 15JAN2015. Therefore, his end trip date for his current trip will be on 14JAN2015, which is a day before his next trip starts. What can I script for the end trip date?

回答1:

As there is no lead() function in SAS, you can either sort your data into descending date order and use lag() then re-sort it back again, as per Vasilij's answer, or you can do a 'look-ahead merge'.

Example:

proc sort data=have ;
  by customer date_start ;
run ;

data want ;
  merge have
        have (firstobs=2 rename=(date_start=next_date customer=next_customer)) ;
  if customer = next_customer then do ;
    date_end = next_date ;
  end ;

  format date_end date7. ;
  drop next_: ;
run ;


回答2:

Here is the code that would do what you are asking.

It sorts the data in descending order in order to use LAG() function. That way, any previous record is actually your future record and you can use it to work out the data points you need. Last PROC SORT sorts the data in the original order.

NOTE: I didn't take into account different customers. You might want to introduce some BY GROUP processing to make sure you don't take the next trip date for another customer.

data have; 
input customer $ date_start date7.; 
format date_start date7.;
datalines; 
A 01JAN15
A 15JAN15
;

PROC SORT data=have;
by customer Descending date_start ;
RUN; 

data want; 
set have; 
by customer Descending date_start ; 
format date_end date7.;
date_end = lag(date_start)-1; 
RUN; 

PROC SORT data=want;
by customer date_start ;
RUN;


回答3:

lag() is a horrible misnomer-ed function that has nothing to do with 'previous row' and should be almost always be avoided. It often creates buggy, very hard to spot mistakes. There are some rare cases where it makes sense to use it. This is not one of them. I really wish people would stop recommending its use. [/end rant].

Instead, consider using one of the below methods.

1) The point= method (not sure if there's a name for this). Some notes, be sure to keep just those variables you need on the second set statement and no more. Rename them so they don't overwrite the existing variable values.

data want;

  set sashelp.class end=last;  

  * GET THE NAME FROM THE NEXT ROW OF DATA;
  if not last then do;
    recno=_n_+1;
    set sashelp.class(keep=name rename=(name=next_name)) point=recno;
  end;
  else do;
    call missing(next_name);
  end;

run;

2) The retain method:

* REVERSE THE ORDER OF THE DATA;
proc sort data=sashelp.class out=have;
  by descending name;
run;

* KEEP TRACK OF THE PRIOR RECORDS NAME AS WE ITERATE ACROSS OBSERVATIONS;
data have2;
  set have;
  length next_name $8;
  retain next_name '';
  output; 
  next_name = name;
run;

* SORT THE DATA BACK TO ITS ORIGINAL ORDER;
proc sort data=have2 out=want;
  by name;
run;

3) The look-ahead-merge method as suggested in Chris J's answer.



标签: sas