Is there a way to control the proportion of a variable's values when drawing a random sample in SAS?
Let's say I have a table of 1000 people (500 male and 500 female).
If I draw a random sample of 100 stratified by gender, I get 50 males and 50 females in my output.
Is there a way to get a desired proportion of gender values instead?
Can I have a random sample of 100 with 70 males and 30 females?
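PROC SURVEYSELECT is the way to do this, passing a dataset to n= or samprate= instead of a single number. For n=, the dataset has one row per stratum, with the sample size in a variable named _NSIZE_, and its strata must appear in the same order as in the input data.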
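/* One row per stratum; _NSIZE_ is the sample size to draw from it */
data strata_to_sample;
    length sex $1;
    input sex $ _NSIZE_;
    datalines;
M 70
F 30
;
run;

/* The strata dataset must be in the same order as the input data */
proc sort data=strata_to_sample;
    by sex;
run;

/* Demo population: every row of sashelp.class repeated 1e5 times */
data to_sample;
    set sashelp.class;
    do _i = 1 to 1e5;
        output;
    end;
run;

proc sort data=to_sample;
    by sex;
run;

/* Draw 70 M and 30 F, as specified per stratum in strata_to_sample */
proc surveyselect data=to_sample n=strata_to_sample out=sample;
    strata sex;
run;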
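For only two strata, the sizes can also be listed inline rather than stored in a dataset. The following is a minimal sketch under some assumptions: the dataset names population and sample_list are made up for illustration, the population is the 500/500 table from the question, and the seed is arbitrary. The values in n=( ) are matched to the strata in the order they appear in the sorted input, so F comes before M.

```sas
/* Hypothetical population matching the question: 500 M and 500 F */
data population;
    do i = 1 to 500;
        do sex = 'M', 'F';
            output;
        end;
    end;
run;

/* STRATA requires the input to be sorted by the strata variable */
proc sort data=population;
    by sex;
run;

/* Sizes are listed in stratum order: F = 30, then M = 70 */
proc surveyselect data=population method=srs n=(30 70)
                  seed=12345 out=sample_list;
    strata sex;
run;

/* Check the resulting counts per sex */
proc freq data=sample_list;
    tables sex;
run;
```

The inline list is convenient for a handful of strata; with many strata, the dataset form is easier to maintain and less prone to ordering mistakes.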
Generally that is what PROC SURVEYSELECT is for.
But for a quick and dirty datastep solution:
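/* Example population: 500 M and 500 F */
data in_data;
    do i = 1 to 500;
        sex = 'M'; output;
        sex = 'F'; output;
    end;
run;

/* Attach a random number to each row; 12345 is the seed */
data in_data;
    set in_data;
    rannum = ranuni(12345);
run;

/* Sorting by the random number shuffles the rows */
proc sort data=in_data; by rannum; run;

/* Walk the shuffled rows, keeping the first 70 M and the first 30 F */
data sample_data;
    set in_data;
    retain count_m count_f 0;
    if sex = 'M' and count_m lt 70 then do; count_m + 1; output; end;
    else if sex = 'F' and count_f lt 30 then do; count_f + 1; output; end;
run;

/* Verify the counts per sex */
proc freq data=sample_data;
    table sex;
run;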
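A possible variation on the same idea, offered as a sketch rather than a tested drop-in: it swaps ranuni for the newer rand('uniform') with call streaminit (the seed value here is an arbitrary assumption), generates the random key while building the data, and stops reading once both quotas are filled instead of scanning the whole table.

```sas
/* Build the 500/500 population and attach a random sort key in one step */
data in_data;
    call streaminit(12345);      /* assumed seed; any positive integer works */
    do i = 1 to 500;
        do sex = 'M', 'F';
            rannum = rand('uniform');
            output;
        end;
    end;
run;

/* Shuffle the rows by sorting on the random key */
proc sort data=in_data;
    by rannum;
run;

/* Keep the first 70 M and first 30 F, then stop reading */
data sample_data;
    set in_data;
    /* The sum statements below initialize the counters to 0 and retain them */
    if sex = 'M' and count_m < 70 then do; count_m + 1; output; end;
    else if sex = 'F' and count_f < 30 then do; count_f + 1; output; end;
    if count_m = 70 and count_f = 30 then stop;
run;
```

The early stop makes little difference on 1000 rows, but it avoids reading the rest of a large table once the sample is complete.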