-1

I have a use case where I need to analyze real-time data using Apache Spark, but I am still unsure which data store to choose for my application. The analysis mostly involves aggregation, KPI-based identity analysis, and machine learning tools to predict trends. Cassandra has good support, and large tech companies already use it in production. After some research, however, I found that Druid is faster than Cassandra and is well suited to OLAP queries, but its results are inconsistent for queries like Count Distinct.

Any help with this would be appreciated. Thanks.

4

1 Answer

1

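Since your use case is analyzing real-time data, I would suggest using Druid rather than Apache Cassandra. With Apache Cassandra, because of its asynchronous masterless replication, you may miss recently updated data in real-time analysis. Druid, on the other hand, is designed specifically for real-time analytics.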

Druid details: http://druid.io/druid.html
Apache Cassandra details: https://en.wikipedia.org/wiki/Apache_Cassandra
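
On the Count Distinct inconsistency mentioned in the question: as far as I know, Druid answers distinct counts with approximate sketches by default, so a result can differ slightly from the exact count. The sketch below only illustrates that trade-off in plain Python. It uses a K-minimum-values estimator, which is an assumption chosen for brevity and is not Druid's actual algorithm; the names `hash01` and `kmv_estimate` and the sample user IDs are hypothetical.

```python
import hashlib
import heapq


def hash01(value: str) -> float:
    """Map a string to a pseudo-uniform float in [0, 1) using SHA-1."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_estimate(items, k=1024):
    """Estimate the number of distinct items while keeping only k hash values.

    Illustrative K-minimum-values estimator (not Druid's implementation):
    if the k-th smallest hash is h_k, roughly (k - 1) / h_k distinct
    values were seen.
    """
    heap = []        # max-heap (negated values) holding the k smallest hashes
    members = set()  # hashes currently in the heap, to skip duplicates
    for item in items:
        h = hash01(str(item))
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct values seen: the count is exact.
        return len(heap)
    return int((k - 1) / -heap[0])


if __name__ == "__main__":
    # Hypothetical event stream: 20,000 events from 5,000 distinct users.
    events = [f"user-{i % 5000}" for i in range(20000)]
    print("exact :", len(set(events)))
    print("approx:", kmv_estimate(events))
```

The approximate figure is usually within a few percent of the exact one and needs only bounded memory, which is the usual reason OLAP stores accept this trade-off. If you need exact distinct counts, check whether your store lets you turn approximation off, at a cost in speed.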

Answered 2017-01-01T05:14:24.217