SAS Hash Table (Right Join/Union)

2019-09-04 11:42发布

I'm looking to do a lookup that is sort of a hybrid between a join and union. I have a large number of records in my main dataset, so I'm looking to do something that wouldn't be a "brute force" method of a many-to-many matrix.

Here is my main dataset, called 'All', which already contains price for each of the products listed.

product date        price   
apple   1/1/2011    1.05    
apple   1/3/2011    1.02
apple   1/4/2011    1.07

pepper  1/2/2011    0.73
pepper  1/3/2011    0.75
pepper  1/6/2011    0.79

My other data dataset ('Prices' - not shown here, but contains the same two keys, product and date) contains prices for all products, on each possible date. The hash table look up I would like to create would essentially look up every date in the 'All' table, and output prices for ALL products for that date, resulting in a table such as this:

product date        price
apple   1/1/2011    1.05    
pepper  1/1/2011    0.71 *
apple   1/2/2011    1.04 *
pepper  1/2/2011    0.73
apple   1/3/2011    1.02
pepper  1/3/2011    0.75
apple   1/4/2011    1.07
pepper  1/4/2011    0.76 *
apple   1/6/2011    1.10 *
pepper  1/6/2011    0.79

That is, as long as one product has a date and price specified 'All' table, all other products should pull that in from the lookup table as welll. The asterisks indicate that the price was looked up from the prices table, and new rows containing prices for products were essentially inserted into the new table.

If hash table are not a great way to go about this, please let me know alternative methods.

标签: sas hashtable
1条回答
仙女界的扛把子
2楼-- · 2019-09-04 12:35

Well this is far from elegant, but curious if the below gives you the desired result? Since you have multiple records per key in ALL (which I assume you want to maintain), I basically unioned ALL with the records in PRICES that have a date in All, but I added an Except so as to excluded records that were already in ALL. No idea if this makes sense, or is doing what you want. Certainly doesn't qualify as 'elegant'.

data all;
  input product $7. date mmddyy10. price;
  Y=1;
  format date mmddyy10.;
  cards;
apple  01/01/2011  1.05
apple  01/01/2011  1.05
apple  01/03/2011  1.02
pepper 01/02/2011  0.73
pepper 01/03/2011  0.75
pepper 01/06/2011  0.79
;
run;
data prices;
  input product $7. date mmddyy10. price;
  format date mmddyy10.;
  cards;
apple  01/01/2011  1.05
apple  01/02/2011  1.04
apple  01/03/2011  1.02
apple  01/04/2011  1.07
apple  01/05/2011  1.01
pepper 01/01/2011  0.70
pepper 01/02/2011  0.73
pepper 01/03/2011  0.75
pepper 01/04/2011  0.76
pepper 01/05/2011  0.77
pepper 01/06/2011  0.79
;
run;

proc sql;
  create table want as 
  select * from all 
  union corr all
  ( (select product,date,price from
      prices
      where date IN (select distinct date from all)
    )
    except corr
    select product,date,price from all
  )
  ;
quit;
查看更多
登录 后发表回答