Let's take a small example Python dictionary, where the values are lists of integers.
example_dict1 = {'key1': [367, 30, 847, 482, 887, 654, 347, 504, 413, 821],
                 'key2': [754, 915, 622, 149, 279, 192, 312, 203, 742, 846],
                 'key3': [586, 521, 470, 476, 693, 426, 746, 733, 528, 565]}
Let's say I need to transform the values of these lists, which I've implemented in the following function:
def manipulate_values(input_list):
    return_values = []
    for i in input_list:
        new_value = i ** 2 - 13
        return_values.append(new_value)
    return return_values
Now, I can easily transform the values of this dictionary as follows:
for key, value in example_dict1.items():
    example_dict1[key] = manipulate_values(value)
resulting in the following:
example_dict1 = {'key1': [134676, 887, 717396, 232311, 786756, 427703, 120396, 254003, 170556, 674028],
                 'key2': [568503, 837212, 386871, 22188, 77828, 36851, 97331, 41196, 550551, 715703],
                 'key3': [343383, 271428, 220887, 226563, 480236, 181463, 556503, 537276, 278771, 319212]}
That works very well for small dictionaries.
My problem is that I have a massive dictionary, with millions of keys and long lists of values. If I were to apply the above approach, it would be prohibitively slow.
How could I optimize the above?
(1) Multithreading: are there more efficient options for parallelizing this for loop over the dictionary than the traditional threading module? (See the sketch after this list.)
(2) Would a better data structure be appropriate?
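To make question (1) concrete, this is roughly the kind of parallel version I have in mind: a minimal sketch using multiprocessing.Pool rather than threading, on the assumption that the work is CPU-bound and plain threads would be limited by the GIL. The stand-in data and the chunksize value are guesses on my part, not measured choices.

from multiprocessing import Pool

def manipulate_values(input_list):
    # same per-element transformation as above
    return [i ** 2 - 13 for i in input_list]

if __name__ == '__main__':
    # stand-in data; the real dictionary has millions of keys
    example_dict1 = {'key1': [367, 30, 847, 482],
                     'key2': [754, 915, 622, 149]}
    keys = list(example_dict1)
    with Pool() as pool:
        # one task per value list; chunksize batches tasks to cut
        # inter-process overhead (100 is a guess)
        results = pool.map(manipulate_values,
                           (example_dict1[k] for k in keys),
                           chunksize=100)
    example_dict1 = dict(zip(keys, results))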
I'm asking this question because I'm quite stuck on how best to proceed in this case. I don't see a better data structure than a dictionary, but the for loops across the dictionary (and then across the value lists) are quite slow. There may be something here which has been designed to be faster.
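One idea along these lines, relevant to question (2): if the values were stored as NumPy arrays instead of Python lists, the inner loop could be vectorized. This is only a sketch, and it assumes the real function can be expressed as array operations, which may not hold for my more complicated case.

import numpy as np

# values stored as NumPy arrays instead of Python lists
example_dict1 = {'key1': np.array([367, 30, 847, 482]),
                 'key2': np.array([754, 915, 622, 149])}

for key, value in example_dict1.items():
    # one vectorized expression replaces the inner Python loop
    example_dict1[key] = value ** 2 - 13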
EDIT: As you can imagine, this is somewhat of a toy example---the function in question is a bit more complicated than x**2-13.
I'm more interested in how best to work with a dictionary with millions of keys and long lists of values.
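For a sense of scale, synthetic data of roughly that shape can be generated for benchmarking like this; the sizes here are illustrative (and scaled down so the snippet runs quickly), not my actual numbers.

import random

n_keys = 100_000   # illustrative; the real case is in the millions
list_len = 100     # illustrative list length

big_dict = {f'key{i}': [random.randint(0, 1000) for _ in range(list_len)]
            for i in range(n_keys)}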