Here is my problem:
My application is a distributed real-time message broker for web applications. Clients connect from web browsers to one of the application nodes. The nodes are connected to each other through the ZeroMQ PUB/SUB mechanism. When a client sends a message, its node publishes it to the PUB socket; the other nodes receive that message on their SUB sockets and forward it to their own connected clients.
But now I need presence and history functionality. Presence: provide a list describing all clients connected to any node. History: provide a list of the last several messages sent. In other words, I need to be able to get the entire state of the application. I am considering several ways to achieve this:
1) Send all information about connected clients to a central server. When a client asks for presence, the node queries the central server and returns its response to the client.
2) Keep all information on every node. When a client connects to any node, that node sends the client's information to the other nodes using the PUBLISH operation. When a client asks for presence, the node can return a response immediately from its local copy.
3) Gather the information on demand from all nodes. I really can’t imagine how to program this at the moment, but it avoids duplicating the information on every node, which reduces memory consumption. In this case I don’t need to worry about fitting all the information in one node's memory.
4) Use a distributed data store, something like Doozerd. But I don’t like this idea because it adds an extra dependency.
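To make option 2 more concrete, here is a rough sketch of what I have in mind. It is not real code from my application: the event format and the names PresenceTable and make_event are made up for illustration, and the ZeroMQ part is left out entirely. I assume each node publishes the string returned by make_event to its PUB socket and calls apply() on every string it receives from its SUB socket (and on its own local events).

```python
import json


def make_event(kind, node, client_id, info=None):
    """Build a presence event as a JSON string (hypothetical wire format)."""
    return json.dumps({
        "type": kind,            # "join" or "leave"
        "node": node,            # id of the node the client is connected to
        "client_id": client_id,
        "info": info or {},
    })


class PresenceTable:
    """Per-node copy of cluster-wide presence, built from join/leave events."""

    def __init__(self):
        # client_id -> {"node": ..., "info": ...}
        self._clients = {}

    def apply(self, raw_event):
        """Apply one event, either local or received from another node."""
        event = json.loads(raw_event)
        if event["type"] == "join":
            self._clients[event["client_id"]] = {
                "node": event["node"],
                "info": event["info"],
            }
        elif event["type"] == "leave":
            # Ignore a leave for a client this node never saw join.
            self._clients.pop(event["client_id"], None)

    def snapshot(self):
        """Return the presence list to send to a client that asks for it."""
        return dict(self._clients)


table = PresenceTable()
table.apply(make_event("join", "node-1", "alice", {"name": "Alice"}))
table.apply(make_event("join", "node-2", "bob"))
table.apply(make_event("leave", "node-1", "alice"))
print(table.snapshot())
```

With this approach every node holds the full table, so memory grows with the total number of clients in the cluster, not with the number of clients on that node. A node that starts later would also miss earlier join events, which this sketch does not handle.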
A client needs presence information every time it connects to a node. Presence information changes on every client connect or disconnect, and history information changes on every message.
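Since history changes on every message, I imagine keeping it in a fixed-size buffer on each node, so that old messages fall off automatically. A minimal sketch, assuming each node appends every message it publishes or receives; the History class and the size of 3 are only for illustration:

```python
from collections import deque


class History:
    """Keeps only the last `size` messages; older ones are dropped."""

    def __init__(self, size=10):
        # A deque with maxlen discards the oldest item when it is full.
        self._messages = deque(maxlen=size)

    def add(self, message):
        self._messages.append(message)

    def last(self):
        """Return the stored messages, oldest first."""
        return list(self._messages)


history = History(size=3)
for text in ["m1", "m2", "m3", "m4", "m5"]:
    history.add(text)
print(history.last())  # ['m3', 'm4', 'm5']
```

This bounds the memory used for history regardless of message rate, but each node only knows the messages it has seen since it started.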
This is an open-source application, so I don't know how many connected clients it must support. Load tests will eventually give me that number.
There is no strong reliability requirement for the presence and history data.
I would really appreciate your advice: which of these options is the right way to solve my problem? Or is there another, better way?