Introduction
According to several sources (1, 2, 3), HDFS's Location Awareness is about knowing the physical location of nodes and replicating data across different racks to reduce the impact of rack-level failures caused by, e.g., power supply and/or switch issues.
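To make the idea concrete, here is a minimal sketch of rack-aware replica selection. It is not HDFS's actual implementation: the function name `choose_replica_nodes`, the node names, and the rack paths are all hypothetical. The node-to-rack mapping is simply assumed to be given, so the sketch does not show where that mapping comes from.

```python
def choose_replica_nodes(node_to_rack, writer, replicas=3):
    """Pick replica targets so that copies span more than one rack when possible.

    node_to_rack: dict mapping a node name to its rack path (assumed to be known).
    writer: the node that writes the block; it holds the first replica.
    """
    if writer not in node_to_rack:
        raise ValueError(f"unknown node: {writer}")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")

    chosen = [writer]
    local_rack = node_to_rack[writer]

    # Second replica: a node on a different rack, so a rack failure cannot
    # destroy every copy. Sorting keeps the sketch deterministic.
    remote = sorted(n for n, r in node_to_rack.items() if r != local_rack)
    if len(chosen) < replicas and remote:
        chosen.append(remote[0])

    # Remaining replicas: prefer the rack of the most recently chosen node,
    # then fall back to any unused node.
    last_rack = node_to_rack[chosen[-1]]
    remaining = sorted(
        (n for n in node_to_rack if n not in chosen),
        key=lambda n: (node_to_rack[n] != last_rack, n),
    )
    chosen.extend(remaining[: replicas - len(chosen)])
    return chosen


# Hypothetical four-node cluster spread over two racks.
topology = {
    "dn1": "/rack1",
    "dn2": "/rack1",
    "dn3": "/rack2",
    "dn4": "/rack2",
}
targets = choose_replica_nodes(topology, "dn1")
racks_used = {topology[n] for n in targets}
```

With this topology the three replicas end up on two racks, so losing either rack still leaves at least one copy available.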
Question
How does HDFS know the physical location of nodes and racks and subsequently decide to replicate data to nodes located on other racks?