I am building a MySQL table and writing an API to receive and store data from 1000+ devices every second. Each device pushes more than 100 data points to this PHP server. I am currently testing with 360 devices at 10 data points each; 3600 writes per second works fine, which is understandable. However, I have noticed that the write count per second keeps climbing as the number of devices grows. I have tried to find the saturation point for writes per second but cannot find anything. Is there a maximum number of writes per second? How does the system perform when the write count reaches 100K per second? Could a MySQL expert point me in the right direction? Thanks.

2 Answers

You might be able to find a benchmark that shows some high number in a very limited test case. But there are too many factors that impact 'writes per second':

  • Spinning drive vs SSD, plus brand, etc
  • RAID
  • Batched insert / LOAD DATA / single-row inserts / MyISAM
  • Number of indexes
  • BEGIN...COMMIT / autocommit
  • Concurrency -- both of multiple writes, and also of simultaneous reads
  • Settings: innodb_flush_log_at_trx_commit, sync_binlog, etc (there is a quick inspection query after this list)
  • Version (5.6 made some improvements; 5.7 made more; MariaDB has some of those improvements, plus others)
  • Schema
  • Client and Server vying for resources
  • etc.
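
If you want to see where a given server stands on the settings and version items above, you can query them directly. A minimal sketch; nothing here is specific to any particular schema:

    -- Version and the variables from the list above that most affect write
    -- throughput, plus a running INSERT counter for rough benchmarking.
    SELECT VERSION();
    SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
    SHOW VARIABLES LIKE 'sync_binlog';
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
    SHOW GLOBAL STATUS LIKE 'Com_insert';   -- INSERT statements since server start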

I have heard of a benchmark showing a million "transactions" per second in 5.7.

But, to get 100K is quite a challenge. Here's what I recommend:

  • SSD (probably exists in AWS; get the max IOPs)
  • RAID striping (parity hurts some, but probably worth having)
  • MyISAM, because of table locking, may not be a good idea if you use multi-threaded inserting. (I am assuming InnoDB in the rest of this discussion.)
  • What will you be doing with the data? If you do not need SQL to look at individual values, store the 100 values in a JSON string and compress it into a BLOB. Now you are down to a leisurely 1000 writes/second.
  • FusionIO SSDs might do the compression for you. I don't like InnoDB's automatic compression. Doing it in the Client offloads the Server.
  • Indexes: Once you have a huge amount of data, the random updates of the indexes will kill you. Design the PRIMARY KEY so that the inserts can be "at the end of the table".
  • Insert 100-10K rows per batch -- less than that leads to overhead costs; more than that leads to inefficiencies in overrunning the undo log, etc. (A combined sketch follows this list.)
  • innodb_flush_log_at_trx_commit=2, sync_binlog may not matter because of the batching.
  • 5.7, possibly MariaDB 10.1
  • If necessary, move the Client(s) to separate servers.
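
To make a few of the items above concrete (compressed-BLOB rows, an "insert at the end" PRIMARY KEY, batching, and the flush setting), here is a minimal sketch. The names readings, device_id, ts, and payload are placeholders I made up, and COMPRESS() merely stands in for compression you would really do in the PHP client:

    -- Relax durability slightly in exchange for throughput:
    SET GLOBAL innodb_flush_log_at_trx_commit = 2;

    -- One row per device per second; the ~100 data points travel as one
    -- compressed JSON blob, so MySQL sees ~1000 writes/sec, not ~100,000.
    CREATE TABLE readings (
      device_id SMALLINT UNSIGNED NOT NULL,
      ts        TIMESTAMP         NOT NULL,
      payload   BLOB              NOT NULL,   -- compressed JSON (compress in the client)
      PRIMARY KEY (ts, device_id)             -- new rows land at the "end" of the table
    ) ENGINE=InnoDB;

    -- Batch 100-10K rows per INSERT statement:
    INSERT INTO readings (device_id, ts, payload) VALUES
      (1, NOW(), COMPRESS('{"temp":21.4,"rpm":900}')),
      (2, NOW(), COMPRESS('{"temp":19.8,"rpm":870}')),
      (3, NOW(), COMPRESS('{"temp":22.1,"rpm":910}'));

With ~1000 devices reporting once a second, one such multi-row INSERT per second (or a few) is enough.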

As for how to collect lots of data fast, possibly with multiple threads, read my "High speed ingestion" blog. It talks about ping-ponging a pair of tables -- one for receiving data, the other for processing (normalizing, compressing, summarizing) and shoveling into the Fact table.
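
As a rough illustration of the ping-pong idea (not the blog's exact code), using the hypothetical readings layout from above and the placeholder names staging and processing:

    -- The API always inserts into `staging`; a background job periodically
    -- flips the pair so it can work on a quiet table while inserts continue.
    CREATE TABLE staging    LIKE readings;   -- readings as sketched above
    CREATE TABLE processing LIKE readings;

    -- Atomic swap: clients writing to `staging` are never interrupted.
    RENAME TABLE staging TO tmp, processing TO staging, tmp TO processing;

    -- Normalize/compress/summarize rows from `processing` into the Fact table,
    -- e.g. INSERT INTO fact_table SELECT ... FROM processing ...;
    -- then empty it so it is ready for the next flip:
    TRUNCATE TABLE processing;

RENAME TABLE swaps all the names in one atomic statement, so the API never sees a missing table.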

Another issue... You are trying to push a few MB into a table every second; that adds up to nearly a terabyte per day. How long will you keep the data? How much disk space do you have? If you will be deleting 'old' data, then PARTITION BY RANGE is a must. My Partitioning blog goes into detail on how to do the DROP PARTITION and REORGANIZE PARTITION to do the deletes very cheaply.
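
A minimal sketch of that pattern, reusing the hypothetical readings table with made-up daily partitions:

    -- Partition by day so old data is discarded with DROP PARTITION,
    -- not with a massive DELETE.
    CREATE TABLE readings (
      device_id SMALLINT UNSIGNED NOT NULL,
      ts        TIMESTAMP         NOT NULL,
      payload   BLOB              NOT NULL,
      PRIMARY KEY (ts, device_id)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (UNIX_TIMESTAMP(ts)) (
      PARTITION p20160223 VALUES LESS THAN (UNIX_TIMESTAMP('2016-02-24 00:00:00')),
      PARTITION p20160224 VALUES LESS THAN (UNIX_TIMESTAMP('2016-02-25 00:00:00')),
      PARTITION pfuture   VALUES LESS THAN (MAXVALUE)
    );

    -- Nightly maintenance: drop the oldest day, carve the next day out of pfuture.
    ALTER TABLE readings DROP PARTITION p20160223;
    ALTER TABLE readings REORGANIZE PARTITION pfuture INTO (
      PARTITION p20160225 VALUES LESS THAN (UNIX_TIMESTAMP('2016-02-26 00:00:00')),
      PARTITION pfuture   VALUES LESS THAN (MAXVALUE)
    );

DROP PARTITION is essentially a file drop, which is why it is so much cheaper than deleting a day's worth of rows.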

That leads to another suggestion -- process the data, but don't save it. OK, maybe you need an hour's data to process. In this case all the above discussion still applies (except for INDEX restrictions). And my High speed ingestion is probably still worth doing. And you could ping-pong once an hour. One hour might be 10GB -- enough to keep in RAM, hence avoiding the I/O bottleneck.

Answered 2016-02-23T16:37:56.250

Also consider the size of the underlying EC2 instance that your RDS instance is provisioned on.

Answered 2016-02-24T15:13:17.007