
I have a web application (Java backend) that processes a large amount of raw data uploaded from a hardware platform containing a number of sensors.

Currently, the raw data is uploaded, decompressed, and stored as a 'text' field in a PostgreSQL database so that users can log in and generate various graphs and charts of the data (using a client-side JS charting library).

Example string:

[45,23,45,32,56,75,34....]

The arrays typically contain ~300,000 values, but this could be up to 1,000,000 depending on how long the sensors are recording, so the string being stored could be a few hundred kilobytes.
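As a rough check on that size, the sketch below counts the bytes such a string occupies when written as ASCII text. The class and method names are made up for illustration, and the value range is an assumption: with two-digit values like those in the example, n values take 3n + 1 bytes, so 300,000 values come to roughly 900 KB and 1,000,000 values to roughly 3 MB. Wider values would take more.

```java
final class ArraySizeEstimate {

    /**
     * Number of bytes needed to write the values as "[v0,v1,...]" in ASCII.
     * Hypothetical helper, not part of the application described above.
     */
    static long textBytes(int[] values) {
        long total = 2; // opening and closing bracket
        for (int i = 0; i < values.length; i++) {
            total += Integer.toString(values[i]).length(); // digits (and sign, if any)
            if (i > 0) {
                total += 1; // comma separator before every value except the first
            }
        }
        return total;
    }
}
```

For example, `ArraySizeEstimate.textBytes(new int[] {45, 23, 45})` counts the ten characters of `[45,23,45]`.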

This works fine for now, as there are only ~200 uploads per day, but as I look at the scalability of the application and the ability to back up the data, I am considering alternatives for storing it.

DynamoDB looked like a great option, as I could carry on storing the upload details in my SQL table and just save a URL endpoint to call to retrieve the arrays... but then I noticed that the item size is limited to 64 KB.

As I am sure there are a million and one ways to do this, I would like to put this out to the SO community to hear what others would recommend, whether web services or locally stored, considering performance, scalability, maintainability, etc.

Thanks in advance!

UPDATE:

Just to clarify: the data shown above is just the 'Y' values. As it is time-sampled, the X values are taken as the position in the array, so I don't think storing it as tuples would have any benefit.
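To illustrate why the X values need not be stored, here is a minimal sketch that rebuilds (x, y) points from the array index. It assumes a fixed sampling rate; the `sampleRateHz` parameter and the class name are hypothetical, since the actual rate is not stated here.

```java
final class SampleAxis {

    /** Time offset in milliseconds of the sample at the given array index. */
    static double timeMillis(int index, double sampleRateHz) {
        return index * 1000.0 / sampleRateHz;
    }

    /** Expands the stored Y values into {x, y} pairs, with x derived from the index. */
    static double[][] toPoints(int[] y, double sampleRateHz) {
        double[][] points = new double[y.length][2];
        for (int i = 0; i < y.length; i++) {
            points[i][0] = timeMillis(i, sampleRateHz); // X comes from the position
            points[i][1] = y[i];                        // Y is the stored value
        }
        return points;
    }
}
```

The same expansion could equally be done client-side just before charting, so only the Y array has to travel over the wire.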

4 Answers


If you are going to store strings like this, you may want to use S3 (one object per array string). In that case you get a "backup" by enabling bucket versioning.
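If each array is stored as one object, the object body can be a compressed blob rather than the raw text. The following is only a sketch of that idea using `java.util.zip` from the JDK; the class and method names are invented for illustration, gzip is an assumption rather than something the question specifies, and the upload call itself is left out because it depends on the client library in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class ArrayBlob {

    /** Gzips the array text so it can be stored as a single object body. */
    static byte[] compress(String arrayText) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(arrayText.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return out.toByteArray();
    }

    /** Restores the original array text from a blob produced by compress(). */
    static String decompress(byte[] blob) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(blob))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A round trip such as `ArrayBlob.decompress(ArrayBlob.compress("[45,23,45]"))` gives back the original string. How much space is saved depends on the sensor data, so it is worth measuring on real uploads.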

Answered 2013-11-11T09:44:52.143

I have just come across Google Cloud Datastore, which lets me store single-item strings of up to 1 MB (unindexed), so it looks like a good alternative to Dynamo.

Answered 2013-11-11T09:11:18.500

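You should probably use Redis or SSDB; both are designed for storing large lists (arrays) of data. The difference between the two databases is that Redis is in-memory only (disk is used for backup), whereas SSDB is disk-based and uses memory as a cache.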

Answered 2013-11-14T09:51:15.740

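You could try the combination of Couchbase and ElasticSearch. Couchbase is a very fast document-oriented NoSQL database. A few thousand insert operations is routine for Couchbase, and the item size limit is 20 MB. Throughput for "get" operations is in the tens of thousands. There is one drawback: you can only query data by id (there are "views", but I think they are hard to adapt for plotting). ElasticSearch makes up for this, as it can run arbitrary queries very quickly. In both Couchbase and ElasticSearch, data is stored as JSON documents.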

Answered 2013-11-11T02:30:34.150