database-design - Time series data read performance

Question

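How should I store sensor time series data in Cassandra?

Here is the performance I measured.

For a Cassandra composite column family with a single row key holding 10000 time series data points, the query is: select * from deviceidcomposite where did='Dev001' limit 5000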

Case 1:

did (row key)

   20120702105554 colname1=value
   20120702105554 colname2=value
   20120702105554 colname3=value
   20120702105554 colname4=value
   20120703105555 colname1=value
   20120703105555 colname2=value
   20120703105555 colname3=value
   20120703105555 colname4=value



    Reading 5000 time series records from a single row key with CQL3 takes nearly 3 minutes for 4 columns.

Case 2:

Standard column family

    diviceidcolumname1(row key)

      20120703105552=value
      20120703105553=value
      20120703105554=value
      20120703105555=value
      ..
      ..
    diviceidcolumname2(row key)

      20120703105552=value
      20120703105553=value
      20120703105554=value
      20120703105555=value
      ..
      ..
    diviceidcolumname3(row key)

      20120703105552=value
      20120703105553=value
      20120703105554=value
      20120703105555=value
      ..
      ..
    diviceidcolumname4(row key)
      20120703105552=value
      20120703105553=value   
      20120703105554=value
      20120703105555=value
      ..
      ..

      (20120703105552 -> y/m/d/HH/MM/Sec)
  Using the Thrift API, I read either one particular column name's values or all column names' values
     for one day (5000 time series data points)
     and for one month.
     Compared with CQL this takes less time:
     nearly 2 minutes.
     With this method, reading a single column name for one month is very quick.
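
The column names in both layouts are timestamps packed as y/m/d/HH/MM/Sec, which means they sort chronologically even when compared as plain strings. A minimal Python sketch of converting between that 14-digit form and a datetime follows; the helper names are illustrative and not part of any Cassandra API.

```python
from datetime import datetime

# y/m/d/HH/MM/Sec packed into 14 digits, as in 20120703105552
TS_FORMAT = "%Y%m%d%H%M%S"


def parse_column_ts(name):
    """Turn a 14-digit column name such as '20120703105552' into a datetime."""
    return datetime.strptime(name, TS_FORMAT)


def format_column_ts(moment):
    """Turn a datetime back into the 14-digit column name."""
    return moment.strftime(TS_FORMAT)


first = parse_column_ts("20120703105552")
print(first.isoformat())  # 2012-07-03T10:55:52
```

Because the fields run from most to least significant, a range of column names between two such values corresponds to a contiguous time window.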

Which one is the better model for time series data?

Is there any better way to improve my read performance?

score 2 · Accepted Answer

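I don't think your problem is the data model (which I suggested in your previous question)..

Simple answer: don't use LIMIT!

LIMIT requires a coordinated effort to decide which 5000 rows are returned as the result set. That leads to severe performance degradation.

If you need to limit the number of results, use WHERE clauses (column slices). They can be evaluated by each node individually - the opposite of "LIMIT"!

Also, I think I already answered the previous question that this one follows up on. It would only be fair to mark answers accordingly if (and only if) you found them useful. Thanks.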
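
As a rough illustration of that suggestion, the sketch below builds a CQL3 statement that bounds the read with a time window instead of LIMIT. It assumes the timestamp is a clustering column; the column name `ts` and the helper `build_slice_query` are hypothetical, since the question does not show the real schema, and the `?` bind markers are meant for a prepared statement.

```python
def build_slice_query(table, time_column):
    """Return a CQL3 statement reading one device's rows in a time window.

    `time_column` is a hypothetical name for the clustering column that
    holds the timestamp; the real schema is not shown in the question.
    Table and column names cannot be bound, so they are formatted in here
    and must come from trusted code, never from user input.
    """
    return (
        "SELECT * FROM {t} WHERE did = ? AND {c} >= ? AND {c} < ?"
        .format(t=table, c=time_column)
    )


query = build_slice_query("deviceidcomposite", "ts")

# One day of data for Dev001, with bounds in the y/m/d/HH/MM/Sec form
# used by the question (assumes the column stores values in that form).
params = ("Dev001", "20120703000000", "20120704000000")

print(query)
print(params)
```

The lower bound is inclusive and the upper bound exclusive, so consecutive windows can be read one after another without overlapping.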
