I am coding a Manager in Excel-VBA with several buttons.
One of them is to generate a tab using another Excel file (let me call it T
) as input.
Some properties of T
:
~90MB
size~350K
lines- Contains sales data of the last
14
months (unordered). - Relevant columns:
year/month
- total-money
- seller-name
- family-product
- client-name
- There is not id columns (like: cod-client, cod-vendor, etc.)
Main relation:
- Sellers sells many Products to many Clients
I am generating a new Excel tab with data from T
of the last year/month
grouped by Seller.
Important notes:
T
is the only available input/source.- If two or more Sellers sells the same Product to the same Client, the
total-money
should be counted to all of those Sellers.
This is enough, now you know what I have already coded.
My code works, but, it takes about 4 minutes of runtime.
I have already coded some other buttons using smaller sources (not greater than 2MB
) which runs in 5 seconds.
Considering T
size, 4 minutes runtime could be acceptable.
But I'm not proud of it, at least not yet.
My code is mainly based on Scripting.Dictionary
to map data from T
, and then I use for each key in obj ... next key
to set the grouped data to the new created tab.
I'm not sure, but here are my thoughts:
- If
N
is the total keys in aScripting.Dictionary
, and I need to check forobj.Exists(str)
before aggregatingtotal-money
. It will runN
string compares to returnfalse
. - Similarly it will run maximun
N
string compares when I doSet seller = obj(seller_name)
.
I want to be wrong with my thoughts. But if I'm not wrong, my next step (and last hope) to reduce the runtime of this function is to code my own class object with Tries
.
I will only start coding tomorrow, what I want is just some confirmation if I am in the right way, or some advices if I am in the wrong way of doing it.
Do you have any suggestions? Thanks in advance.
Memory Limit Exceeded
In short:
~ 13 seconds
.There are things we learn the hard way. But I'm glad I found the answer.
Scripting.Dictionary
reached 100% really faster.cls_trie
implementation also reached 100%, but later than the first.~4-5 min
compared to~2-3 min
total runtime of above attempts.2%
.Solution was simple, I had to balance CPU and Memory usages.
for-loops
withif-conditions
.~15%
.~65%
.15 seconds
now.I created one GitHub repository with my cls_trie implementation and added one excel file with an example usage.
I'm new to the excel-vba world (4 months working with it right now). There might probably have some ways to improve my cls_trie implementation, I'm openned to suggestions:
You can see in the above code:
[a-z]
letters and numbers, and the probability that two text get mapped to the same branch node tends zero.My conclusion
Scripting.Dictionary
, even if it is proven that somehow it could be better than mycls_trie
implementation.Thank you all for the help.