I'm building a system that comprises a multiple heterogeneous services that talk to each other over a network, although in the standard deployment model they are all on the same machine. The UI client for managing the entities within that complex system should be able to display aggregated data from all comprising services while enabling search across that aggregated data.
I'm wondering how to design the data retrieval within this system so that it is scalable as the amount of data to be searched is already high and increases?
I'm thinking about two approaches:
The client queries data from all services on demand and aggregates the results in its layer. In many cases it will have to do joins between data coming from multiple services, so I'm concerned about performance here.
Denormalize the services data in a way so that it is convenient for the client queries and even store aggregations between the multiple services data so that the client doesn't have to do joins on demand. Probably, it would be better to store each service's denormalized data in its own database or cache as thus it would be easier to keep all denormalized data up-to-date. However, I'll need to put the aggregated views across multiple services' data in some other place and I'm concerned about the overhead of keeping this remote cache up-to-date.
Any examples or references to existing architectures that solve similar problems would be highly appreciated. Thanks!