I am studying data merges in SAS and found the following example:
data newdata;
merge yourdata (in=a) otherdata (in=b);
by permno date;
I do not know what "(in=a)" and "(in=b)" mean. Thanks.
yourdata(in=a)
creates a temporary flag variable called 'a' in the program data vector. It contains 1 if yourdata contributed to the current record and 0 if it didn't. IN= variables are not written to the output data set, but you can use them within the step to perform conditional operations based on the source of the record.
It might be easier to understand with more descriptive flag names:
data newdata;
merge yourdata(in=ThisRecordIsFromYourData) otherdata(in=ThisRecordIsFromOtherData);
by permno date;
run;
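To see the flag values, here is a minimal sketch that copies them into ordinary variables, since IN= variables themselves are dropped from the output. The sample rows and the names ret, vol, from_your and from_other are made up for illustration; only yourdata, otherdata, permno and date come from the question.

```sas
/* Hypothetical sample data, already sorted by permno and date */
data yourdata;
  input permno date :yymmdd10. ret;
  format date yymmdd10.;
  datalines;
10001 2020-01-31 0.02
10001 2020-02-29 -0.01
10002 2020-01-31 0.05
;
run;

data otherdata;
  input permno date :yymmdd10. vol;
  format date yymmdd10.;
  datalines;
10001 2020-01-31 1500
10002 2020-01-31 900
10003 2020-01-31 400
;
run;

data newdata;
  merge yourdata(in=a) otherdata(in=b);
  by permno date;
  from_your  = a;  /* 1 if yourdata contributed to this row, else 0  */
  from_other = b;  /* 1 if otherdata contributed to this row, else 0 */
run;

proc print data=newdata;
run;
```

With this sample data, the row for permno 10001 on 2020-02-29 should have from_your=1 and from_other=0, and the row for permno 10003 should have from_your=0 and from_other=1. The two rows present in both inputs have both flags set to 1.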
Suppose records from yourdata need to be manipulated in this step, but not those from otherdata. You could then do something like:
data newdata;
merge yourdata(in=ThisRecordIsFromYourData) otherdata(in=ThisRecordIsFromOtherData);
by permno date;
if ThisRecordIsFromYourData then do;
* some operation here for yourdata records only ;
end;
run;
An obvious use for these variables is to control what kind of 'merge' occurs, using a subsetting if statement. For example,
if ThisRecordIsFromYourData and ThisRecordIsFromOtherData;
makes SAS keep only the rows whose by-variable values appear in both input data sets (like an inner join, at least for one-to-one and one-to-many matches).
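The same idea covers the other common cases. The sketch below uses the short flag names a and b from the question and assumes yourdata and otherdata already exist and are sorted by permno and date. The output data set names are made up for illustration.

```sas
/* Like a left join: keep every row that yourdata contributed to */
data left_join;
  merge yourdata(in=a) otherdata(in=b);
  by permno date;
  if a;
run;

/* Keep only rows from yourdata that have no match in otherdata */
data only_in_yourdata;
  merge yourdata(in=a) otherdata(in=b);
  by permno date;
  if a and not b;
run;

/* Keep only rows from otherdata that have no match in yourdata */
data only_in_otherdata;
  merge yourdata(in=a) otherdata(in=b);
  by permno date;
  if b and not a;
run;
```

In left_join, variables that exist only in otherdata are missing on rows where b is 0. Leaving out the if statement altogether keeps every row from both inputs, like a full outer join.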