Here is a sample dataset. The scenario below represents a list of account transaction history:
- Two customers A and B;
- Customer A has 5 accounts 1, 2, 3, 4, 5 at Bank BA and Bank UB;
- Customer B has 2 accounts 1, 6 at Bank WF;
Note that Bank WF has issued an account ID to B that's same as A has at Bank BA (which could be true in reality).
data have; input Customer $ Account_id Bank_id $ Year; datalines; A 1 BA . A 2 UB . A 3 UB 2012 A 3 UB . A 4 UB . A 4 UB 2013 A 5 UB . B 1 WF 2014 B 1 WF . B 6 WF . ;
Some of the account open dates are missing, but for the same account, the open date should be the same. What I'm trying to do is to replace the missing date of an account with its previous/later value, if there's any; otherwise, keep it null.
data want;
input Customer $ Account_id Bank_id $ Year;
datalines;
A 1 BA .
A 2 UB .
A 3 UB 2012
A 3 UB 2012
A 4 UB 2013
A 4 UB 2013
A 5 UB .
B 1 WF 2014
B 1 WF 2014
B 6 WF .
;
I looked at this example: How to write first non-missing value to first missing observations which was helpful, but I can't tweak it to work for my case, which has multiple groups.