Lets suppose we have the following dataset:
ID Stress_Level Heart_Rate
1 5 10
2 7 12
3 9 16
And the code one would use to rename a variable would be:
data test1;
set test0;
rename Stress_Level=A Heart_Rate=B;
run;
However, what I would like to do is to rename the 2 columns without using their names. Is there an "internal" SAS command that addresses the variable depending on which column it is? So for instance Stress_Level which is the 2nd column could be addressed as "COL2 " or something similar. Thus the code would be:
data test1;
set test0;
rename COL2=A COL3=B;
run;
Where "COL2" would always refer to the second column in the dataset regardless of its name. Is there a direct or maybe an indirect way to achieve that?
I think the easiest way is to build up a rename statement string from the metadata table DICTIONARY.COLUMNS (the view of this is SASHELP.VCOLUMN). This holds the column names and position for all tables in active libraries.
I've taken advantage of the ASCII sequence (the byte
function) to rename the columns A, B etc, obviously you'd run into problems if there are more than 26 columns to be renamed in the table!
You'll also need to tweak the varnum+63
calculation if you wanted to start from a different column than 2.
proc sql noprint;
select cats(name,"=",byte(varnum+63)) into :newvars separated by ' '
from dictionary.columns
where libname = 'WORK' and memname='HAVE' and varnum>=2;
quit;
data want;
set have;
rename &newvars.;
run;
/* or */
/*
proc datasets lib=work nolist nodetails;
modify have;
rename &newvars.;
quit;
*/
There are a couple of ways you can do this.
The shortest approach is probably to use an array. The only drawbacks are that you need to know the types of the variables in advance and the name of the first variable.
If they are all numeric as in your example the following could be used:
data test1;
set test0;
array vars[*] _numeric_;
A = vars[2];
B = vars[3];
keep ID A B;
run;
You can only have one type of variable in an array, so it's slightly more complicated if they are not all numeric or all character. Additionally you will need to know the name of the first variable and any other variables that you wish to keep if you don't want to have the duplicates of the second and third variables.
A more robust approach is to use information from a dictionary table and a macro variable to write your rename statement:
proc sql;
/* Write the individual rename assignments */
select strip(name) || " = " || substr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", varnum - 1, 1)
/* Store them in a macro variable and separate them by spaces */
into :vars separated by " "
/* Use a sas dictionary table to find metadata about the dataset */
from sashelp.vcolumn
where
libname = "WORK" and
memname = "TEST0" and
2 <= varnum <= 3;
quit;
data test1;
set test0;
rename &vars.;
run;
SAS stores information about datasets in dictionary tables, which have views available in the sashelp library. Take a look in some of the sashelp.v*
tables to see what kind of information is available. The proc sql
colon is used to store values in a macro variable, which can then be used in the rename
statement.
I'd recommend the second approach as it is considerably more flexible and less dependent on the exact structure of your data. It also expands better when you have more than a couple of variables to rename.
Finally, if you want to make the changes to a dataset in place you may want to take a look at using proc datasets
(in combination with the dictionary table approach) to do the renaming, as this can change the variable names without having to read and write every line of data.