I have a list of tuples, where each tuple contains a string and a number, in the form of:
[(string_1, num_a), (string_2, num_b), ...]
The strings are non-unique, and so are the numbers, e.g. (string_1, num_m) or (string_9, num_b) are likely to exist in the list.
I'm attempting to create a dictionary with the string as the key and a set of all numbers occurring with that string as the value:
dict = {string_1: {num_a, num_m}, string_2: {num_b}, ...}
I've done this somewhat successfully with the following dictionary comprehension with nested set comprehension:
# st_id_list = [(string_1, num_a), ...]
# st_dict = {string_1: {num_a, num_m}, ...}
st_dict = {
    st[0]: set(
        st_[1]
        for st_ in st_id_list
        if st_[0] == st[0]
    )
    for st in st_id_list
}
There's only one issue: st_id_list is 18,000 items long. This snippet of code takes less than ten seconds to run for a list of 500 tuples, but over twelve minutes to run for the full 18,000 tuples. I have to think this is because I've nested a set comprehension inside a dict comprehension.
Is there a way to avoid this, or a smarter way to do it?
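For reference, the nested version rescans the whole list once for every tuple in it, so the work grows roughly with the square of the list length. One possible way around that is a single pass that adds each number to the set for its string. This is only a minimal sketch: the sample data is made up, and it assumes every item is a (string, number) pair with hashable values.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the real 18,000-item list.
st_id_list = [("a", 1), ("b", 2), ("a", 3), ("c", 2), ("a", 1)]

# Single pass: each string maps to the set of numbers seen alongside it.
# defaultdict(set) creates an empty set the first time a key is accessed.
grouped = defaultdict(set)
for name, num in st_id_list:
    grouped[name].add(num)

# Convert to a plain dict if the auto-creating behaviour is not wanted later.
st_dict = dict(grouped)
print(st_dict)
```

The same loop can be written without the import by using `st_dict.setdefault(name, set()).add(num)` on an ordinary dict.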