What is the most efficient way to drop a table in SAS?
I have a program that loops and drops a large number of tables, and would like to know if there is a performance difference between PROC SQL; and PROC DATASETS; for dropping a single table at a time..
Or if there is another way perhaps???
We are discussing tables or datasets?
Tables implies database tables. To get rid of these in a fast way, using proc SQL pass-through facility would be the fastest. Specifically if you can connect to the database once and drop all of the tables, then disconnect.
If we are discussing datasets in SAS, I would argue that both proc sql and proc datasets are extremely similar. From an application standpoint, they both go through the same deduction to create a system command that deletes a file. All testing I have seen from SAS users groups or presentations have always suggested that the use of one method over the other is marginal and based on many variables.
If it is imperative that you have the absolute fastest way to drop the datasets / tables, you may just have to test it. Each install and setup of SAS is different enough to warrant testing.
Simple Solution for temporary tables that are named similarly:
If all of your tables start with the same prefix, for example p1_table1 and p1_table2, then the following code will delete any table with that starts with p1
proc datasets; delete p1: ; run;
If it is reasonable to outsource to the OS, that might be fastest. Otherwise, my unscientific observations seem to suggest that drop table in
proc sql
is fastest. This surprised me as I expectedproc datasets
to be fastest.In the code below, I create 4000 dummy data sets then try deleting them all with different methods. The first is with sql and on my system took about 11 seconds to delete the files.
The next two both use
proc datasets
. The first creates a delete statement for each data set and then deletes. The second just issues a blanket kill command to delete everything in the work directory. (I had expected this technique to be the fastest). Both proc datasets routines reported about 20 seconds to delete all 4000 files.proc delete is another, albeit undocumented, solution..
http://www.sascommunity.org/wiki/PROC_Delete
In terms of which is faster, excluding extremely large data, I would wager that there is little difference between them. When handling permanent SAS datasets, however, I like to use PROC DATASETS rather than PROC SQL, simply because I feel better manipulating permanent datasets using the SAS-designed method, and not the SQL implementation
I tried to fiddle with the OS-delete approach.
Deleting with the X-command can not be recommended. It took forever!
I then tried with the system command in a datastep:
As you can see, I had to split my deletes into 9 separate delete commands. The reason is, I'm using wildcards, "*", and the underlying operating system (AIX) expands these to a list, which then becomes too large for it to handle...
The program basically constructs a delete command for each of the nine filegroups "temp[1-9]*.sas7*" and issues the command.
Using the create macro function from cmjohns answer to create 4000 data tables, I can delete those in only 5 seconds using this approach.
So a direct operating system delete is the fastest way to mass-delete, as I expected.