Simple question.
PROC IMPORT OUT= braw.address
DATAFILE= "&path.\address_data.csv"
DBMS=csv REPLACE;
GETNAMES=YES;
RUN;
This statement will create the dataset columns as character or numeric depending on the values, which is smart, but not what I want.
I want to import them all as character, to make for easier regex evaluation.
Is there a simple way to do this?
If you do not want to write a SAS macro to read all the columns as character, you could try a "cheat": manually edit the file and duplicate the first row (the one containing the column headers). Since those values will most likely all be character strings, SAS should import every column as character. You would then need to delete the duplicated header row from the resulting data set.
Of course, a macro to do this would not be that difficult. You can try something like this:
%macro readme(dsn,fn);
/* Macro to read all columns of a CSV as character */
/* Parameters: */
/* DSN - The name of the SAS data set to create */
/* FN - The external file to read (quoted) */
/* Example: */
/* %readme(want, 'c:\temp\tempfile.csv'); */
data _null_;
infile &fn;
input;
i = 1;
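length headers inputstr $32767 newvar $32;
headers = compress(_infile_,"'");
newvar = scan(headers,1,',');
do until (newvar = ' ');
/* ':$200.' reads up to 200 characters (an assumed width); a bare '$' would truncate values to 8 */
inputstr = trim(inputstr) || ' ' || trim(newvar) || ' :$200.';
i + 1;
newvar = scan(headers,i,',');
end;
/* symputx strips the trailing blanks before storing the macro variable */
call symputx('inputstr',inputstr);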
stop;
run;
data &dsn;
infile &fn firstobs=2 dsd dlm=',' truncover;
input &inputstr.;
run;
%mend;
%readme(want, 'c:\temp\tempfile.csv');
I would generally just write my own INPUT statement for the CSV; then you can make the columns whatever type you want.
For example:
data braw.address;
infile "&path.\address_data.csv" dlm=',' dsd missover firstobs=2; /* firstobs=2 skips the header row */
input
field1 $
field2 $
....
;
run;
You can use the DATA step that PROC IMPORT writes to the log as a starting point the first time, and just edit it so that every variable is read as character ($).
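One thing to watch with a hand-written INPUT statement: with list input, a bare $ gives a character variable a default length of 8, so longer values are truncated. A minimal sketch of one way around that, declaring the lengths up front with a LENGTH statement. The names field1 and field2 are placeholders and the width of 100 is an assumption; adjust both to your data.

```sas
data braw.address;
  /* field1 and field2 are placeholder names; $100 is an assumed width */
  length field1 field2 $100;
  /* firstobs=2 skips the header row; truncover keeps short lines from wrapping */
  infile "&path.\address_data.csv" dlm=',' dsd truncover firstobs=2;
  /* the variables are already defined as character, so no $ is needed here */
  input field1 field2;
run;
```

Because the variables are defined as character before the INPUT statement runs, list input reads them as text regardless of what the values look like.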
Here is my macro to read a delimited file with all variables as character:
%MACRO ImportText(file,dsn,dlm);
* Use PROC IMPORT to get the variable names and lengths;
PROC IMPORT DATAFILE="&file" OUT=temp DBMS=dlm REPLACE;
DELIMITER = &dlm;
GETNAMES = YES;
GUESSINGROWS = 32767;
RUN;
* Put variable names into macro variable;
PROC CONTENTS DATA=temp out=vars NOPRINT; RUN;
PROC SQL NOPRINT;
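/* Columns guessed as numeric (type=1) report a storage length of 8, so read those with an assumed width of 32 instead */
SELECT CATT(name,' : $',CASE WHEN type=1 THEN 32 ELSE length END,'.') INTO :vars SEPARATED BY ' ' FROM vars ORDER BY varnum;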
QUIT;
* Read real data;
DATA &dsn;
INFILE "&file" DELIMITER=&dlm MISSOVER DSD FIRSTOBS=2 LRECL=32767;
INPUT &vars;
RUN;
%MEND;
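A possible call for the macro above, using the file from the question, followed by a quick check of the column types. This is a sketch: the argument values are assumptions based on the question, the file path is passed unquoted because the macro wraps it in double quotes itself, and the delimiter is passed quoted because it is substituted directly into DELIMITER=.

```sas
/* Assumed arguments: path from the question, output data set, quoted delimiter */
%ImportText(&path.\address_data.csv, braw.address, ',');

/* List each column with its type; every row should show type = 'char' */
proc sql;
  select name, type, length
    from dictionary.columns
    where libname = 'BRAW' and memname = 'ADDRESS';
quit;
```

Note that the macro leaves its helper data sets temp and vars behind in the WORK library.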