I'm reading the following article: Elements of Scale: Composing and Scaling Data Platforms
I'm stuck on the following sentences:
A secondary index is an index that isn’t on the primary key. This means the data will not be partitioned by the values in the index. Directed routing via a hash function is no longer an option. We have to broadcast requests to all machines.
Can anyone explain why this is the case? I am a beginner with data platforms, but I have understood the article up to this point.
Specifically, why can't we use the secondary index to find the primary keys of the matching rows, then locate each row by applying the hash function to its primary key? Why do we have to broadcast requests to all machines?
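To make my mental model concrete, here is a minimal Python sketch of how I picture the setup the article describes. Everything in it is my own assumption for illustration, not something taken from the article: the node count, the `city` field, the choice of MD5 as the hash, and all the names (`node_for`, `Node`, `find_by_city`). Rows are partitioned by a hash of the primary key, and each node keeps a secondary index only over the rows it holds.

```python
import hashlib

NUM_NODES = 4  # assumed cluster size, purely illustrative


def node_for(primary_key: str) -> int:
    """Directed routing: hash the primary key to pick exactly one node."""
    digest = hashlib.md5(primary_key.encode()).hexdigest()
    return int(digest, 16) % NUM_NODES


class Node:
    def __init__(self):
        self.rows = {}     # primary key -> row
        self.by_city = {}  # local secondary index: city -> set of primary keys

    def put(self, pk, row):
        self.rows[pk] = row
        self.by_city.setdefault(row["city"], set()).add(pk)

    def find_by_city(self, city):
        # Only sees rows stored on this node.
        return [self.rows[pk] for pk in sorted(self.by_city.get(city, ()))]


nodes = [Node() for _ in range(NUM_NODES)]


def put(pk, row):
    nodes[node_for(pk)].put(pk, row)


def get(pk):
    # Lookup by primary key touches a single node.
    return nodes[node_for(pk)].rows.get(pk)


def find_by_city(city):
    # Lookup by secondary key: the city says nothing about which node
    # holds the row, so every node is asked (the broadcast).
    results = []
    for node in nodes:
        results.extend(node.find_by_city(city))
    return results


put("user1", {"name": "Ann", "city": "London"})
put("user2", {"name": "Bob", "city": "Paris"})
put("user3", {"name": "Cat", "city": "London"})

print(get("user2"))
print(sorted(row["name"] for row in find_by_city("London")))
```

In this sketch `get` contacts one node, while `find_by_city` loops over all of them. My question is why `find_by_city` cannot instead consult a single index that maps a city to primary keys and then call `get` for each key.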
Thank you for your time.