I have a query using a hash in the data step as follows:
if _n_ = 1 then do;
if 0 then
set TableA;
declare hash A(dataset: "TableA");
A.definekey ("UniqueKeyA" );
A.definedata ("DataA" );
A.definedone();
end;
if _n_ = 1 then do;
if 0 then
set TableB;
declare hash B(dataset: "TableB");
B.definekey ("UniqueKeyB" );
B.definedata ("DataB" );
B.definedone();
end;
rcA = A.find(key:UniqueKeyA);
rcB = B.find(key:UniqueKeyB);
When I run the query, TableA
returns rcA
and the data as expected, but the hash for TableB
for some reason brings every column from TableB
with it. These queries look identical to me and I can't think of anything qualitatively different between the tables that would cause them to act differently, but I don't know much about SAS.
Explicitly, if my original table is
UniqueKeyA UniqueKeyB
1 A
2 B
3 C
and TableA is
UniqueKeyA [...Other TableA columns...] DataA
1 53
3 ... 46
and TableB is
UniqueKeyB [...Other TableB columns...] DataB
B .45
C ... .22
my resulting table is
UniqueKeyA UniqueKeyB rcA DataA rcB [...Other TableB columns...] DataB
1 A 0 53 21
2 B 513 0 .45
3 C 0 46 0 ... .22
This doesn't make sense to me, since the hashes seem very basic, and although theyre identical only one of them returns every column.
Please let me know if you need more information, thanks!
Also, I have another question about these same hashes entering values into the data column even though, for example, rcA
doesn't equal 0. That question is here: Hash find method entering values despite returning non-zero - SAS
This line:
is where the columns come from. This is a 'lazy' way of avoiding the uninitialized variable message that would appear from the hash creation if there isn't something technically initializing the variables before the hash. It's fine if you are bringing every variable in, but of course if you're only bringing 2 and you have 40, you get 38 extras that you don't much want. They're empty, but still unneeded.
Add
(keep=uniquekey datab)
to the set TableB statement and it should get rid of those other columns.