OK I'll start with the problem: I have product tables being created every week which are named in the format:
products_20130701
products_20130708
.
.
.
I'm trying to automate some campaign analysis so that I don't have to manually change the table name in the code every week to use whichever product table is the first one after the maximum end date of my campaign.
e.g
%put &max_enddate.;
/*20130603*/
my product tables in June are:
products_20130602
*products_20130609*
products_20130616
products_20130623
in this instance i would like to use the second table in the list, ignoring over 12 months worth of product tables and just selecting the table who's date is just after my max_enddate macro.
I've been Googling all day and I'm stumped so ANY advice would be much appreciated.
Thanks!
First, get all possible tables:
Next, sort it by date, easily done due to the format of your dataset names.
Finally, you just need to get out the first record where the date is large enough.
Now you can just put the macro variable when you want to use the appropriate dataset, e.g.:
A SQL solution:
Now you have &datasetname that you can use in the set statement, so
data my_analysis; set &datasetname; (whatever you are doing); run;
Modify 'WORK' to the appropriate libname, and if there are any other restrictions add those as well. You might get some warnings about invalid dates if you have product_somethingnotadate, but that shouldn't matter.
The way this works - the dictionary.tables is a list of all tables in all libnames you have accessed (same as sashelp.vtable, but only available in PROC SQL). First this selects all rows that have a name with a date greater than or equal to your campaign end date; then it takes the
min(memname)
from that. Memname is of course a string, but in strings that are identical except for a number, you can still usemin
and get the expected result.This is probably not suitable for your application, however I find it very useful for the datasets I have as they absolutely must exist for each Sunday and I evaluate the existence of the dataset at the beginning of my code. If they don't exist then it sends an email to our IT guys that tells them that the file is missing and needs to be re-created\restored.
With the other suggestions above they will only give you results for datasets that exist, therefore if the one you should have been using has been deleted then it will grab the next one and run the job regardless.