SAS 9.4
Example dataset
╔═════════╦═══════╦════════════╦══════════╗
║ subject ║ event ║ start_date ║ end_date ║
╠═════════╬═══════╬════════════╬══════════╣
║ s1 ║ e1 ║ 2-Mar-19 ║ 1-Jun-19 ║
║ s2 ║ e1 ║ 1-Mar-19 ║ 1-May-19 ║
╚═════════╩═══════╩════════════╩══════════╝
Expected Output:
╔═════════╦═══════╦════════════╦══════════╦════════════╗
║ subject ║ event ║ start_date ║ end_date ║ query_text ║
╠═════════╬═══════╬════════════╬══════════╬════════════╣
║ s1 ║ e1 ║ 1-Mar-19 ║ 1-Jun-19 ║ error? ║
║ s2 ║ e1 ║ 1-Mar-19 ║ 1-May-19 ║ error? ║
╚═════════╩═══════╩════════════╩══════════╩════════════╝
I want to return all that have the same 'event' and where the 'end_date' of one with the earlier 'start_date' is greater than (>) the 'start_date' of the other one with the later start_date.
Here, subject 's2' has an earlier 'start_date' than 's1' but 's2' has a later 'end_date' than "s1's" 'start_date', so these two would get returned
This is what I have so far, but I'm not sure how to make sure what I'm comparing have the same 'event':
data Out_2;
/* set relevant dataset */
if input(compress(end_date,"/"), ??date9.) > input(compress(start_date,"/"), ??date9.);
/* do something */
run;
Thanks
The way to process rows in groups in a DATA step is to sort by the key, then use
BY key_field
(event
in your example) in the DATA step,retain
ing the data from the previous rows in a group, while identifying the group's first and last row usingif first.key_field
andif last.key_field
).That would be easy if you have a maximum of two subjects per event. If you have more you'd have to write duplicate code to retain the values from the first, second, etc rows, or retain the data into an array.
So I'd ditch the DATA step in favor of SQL, which lets you do this easily:
The output I get with this SQL on your input data (not in SAS, but I'm pretty sure it shouldn't give different results!):
View on DB Fiddle