Is there a simple way in SAS to convert a string to a SAS-safe name that would be used as a column name?
ie.
Rob Penridge ----> Rob_Penridge
$*@'Blah@* ----> ____Blah__
I'm using a proc transpose
and then want to work with the renamed columns after the transpose.
proc transpose
will take those names without any modification, as long as you set options validvarname=any;
If you want to work with the columns afterwards, you can use the NLITERAL function to construct named literals that can be used to refer to them:
options validvarname=any;
/* Create dataset and transpose it */
data zz;
var1 = "Rob Penridge";
var2 = 5;
output;
var1 = "$*@'Blah@*";
var2 = 100;
output;
run;
proc transpose
data = zz
out = zz_t;
id var1;
run;
/* Refer to the transposed columns in the dataset using NLITERAL */
data _null_;
set zz;
call symput(cats("name", _n_), nliteral(var1));
run;
data blah;
set zz_t;
&name1. = &name1. + 5;
&name2. = &name2. + 200;
run;
May try perl regular expression function.
Since for column name, the first character should not be numerical, it's more complicated then.
data _null_;
name1 = "1$*@' Blah1@*";
name2 = prxchange("s/[^A-Za-z_]/_/",1,prxchange("s/[^A-Za-z_0-9]/_/",-1,name1));
put name2;
run;
Take a look at the VALIDVARNAME System Option. It might allow you to accept non-valid SAS names.
Also the NOTNAME function could facilitate in helping find invalid characters.
How about using SAS's regular expression
functionality? For example:
data names;
set name;
name_cleaned = prxchange('s/[^a-z0-9 ]/_/i', -1, name);
run;
This will convert anything that isn't a letter, number, or space into a _
. You can add other characters that you want to allow to the list after the 9
. Just be aware that some characters are "special" and must be preceded by a \
.
You could also use the IDLABEL
statement in the transpose to add labels that match the original values. Then use the VARLABEL
function to retrieve the labels and work with them that way.