Here is an answer from a bit of an abstract, graph theoretic angle:
Let's simplify the problem by only looking at (directed) dependency graphs between stateful objects.
An extremely simple answer can be illustrated by considering two limiting cases of dependency graphs:
The 1st limiting case: a cluster graphs .
A cluster graph is the most perfect realisation of a high cohesion and low coupling (given a set of cluster sizes) dependency graph.
The dependence between clusters is maximal (fully connected), and inter cluster dependence is minimal (zero).
This is an abstract illustration of the answer in one of the limiting cases.
The 2nd limiting case is a fully connected graph, where everything depends on everything.
Reality is somewhere in between, the closer to the cluster graph the better, in my humble understanding.
From another point of view: when looking at a directed dependency graph, ideally it should be acyclic, if not then cycles form the smallest clusters/components.
One step up/down the hierarchy corresponds to "one instance" of loose coupling, tight cohesion in a software but it is possible to view this loose coupling/tight cohesion principle as a repeating phenomena at different depths of an acyclic directed graph (or on one of its spanning tree's).
Such decomposition of a system into a hierarchy helps to beat exponential complexity (say each cluster has 10 elements). Then at 6 layers it's already 1 million objects:
10 clusters form 1 supercluster, 10 superclusters form 1 hypercluster and so on ... without the concept of tight cohesion, loose coupling, such a hierarchical architecture would not be possible.
So this might be the real importance of the story and not just the high cohesion low coupling within two layers only. The real importance becomes clear when considering higher level abstractions and their interactions.