java - Kinesis error when using a large number of connections

Question

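I am using Kinesis to store records sent from JMeter, which is installed on an EC2 server. When I start 7,200 threads against a Kinesis stream with 1 shard, everything works. If I start 9,000 threads, I get this error: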

Rate exceeded for shard shardId-000000000001 in stream Jmeter under account 769870455028. (Service: AmazonKinesis; Status Code: 400; Error Code: ProvisionedThroughputExceededException; Request ID: 98f687d9-ffbe-11e4-a897-357ee8c24764)

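So I increased the number of shards, setting it to 2 and then 3, but that did not help. I therefore think the problem is not the shard count but my Java code, or something else I am not aware of. This is my code: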

    public MyKinesisClient(String streamName, int partitionKey, String accessKey, String secretKey, String endpoint, String serviceName, String regionId) {
        this.streamName = streamName;
        this.partitionKey = partitionKey;
        AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
        kinesisClient = new AmazonKinesisClient(credentials);
        kinesisClient.setEndpoint(endpoint, serviceName, regionId);
    }

    /**
     * Sends a JSON object to Kinesis.
     * @param json the com.amazonaws.util.json.JSONObject to send to Kinesis
     * @throws UnsupportedEncodingException
     * @throws JSONException
     */
    public void sendJson(JSONObject json) throws UnsupportedEncodingException, JSONException {
        try {
            PutRecordRequest putRecordRequest = new PutRecordRequest();
            putRecordRequest.setStreamName(streamName);
            putRecordRequest.setData(ByteBuffer.wrap(json.toString().getBytes("utf-8")));
            //putRecordRequest.setData(ByteBuffer.wrap(String.format("testData-%d", createTime).getBytes()));
            // The partition key is fixed for the lifetime of this client instance.
            putRecordRequest.setPartitionKey(String.format("partitionKey-%d", partitionKey));
            kinesisClient.putRecord(putRecordRequest);
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }

Is there any guidance on how to make use of more shards? Thanks in advance.

score 2 · Accepted Answer

I solved it by using two partition keys. From Amazon's definition:

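The partition key is used to segregate and route records to different shards of a stream. A partition key is specified by your data producer while adding data to an Amazon Kinesis stream. For example, assume you have a stream with two shards (shard 1 and shard 2). You can configure your data producer to use two partition keys (key A and key B) so that all records with key A are added to shard 1 and all records with key B are added to shard 2.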
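As a rough illustration of a producer that uses more than one partition key, here is a minimal sketch using only the standard library. The class name `PartitionKeyRotator` and its `keyCount` parameter are hypothetical and do not come from the question; the "partitionKey-%d" format simply mirrors the one in the asker's code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper (not part of the AWS SDK): hands out partition keys
// in round-robin order so that records are not all sent with one key.
class PartitionKeyRotator {
    private final int keyCount;
    // AtomicLong keeps the rotation safe when many JMeter threads share one instance.
    private final AtomicLong counter = new AtomicLong();

    PartitionKeyRotator(int keyCount) {
        if (keyCount < 1) {
            throw new IllegalArgumentException("keyCount must be >= 1");
        }
        this.keyCount = keyCount;
    }

    // Returns "partitionKey-0", "partitionKey-1", ... and wraps around after keyCount keys.
    String nextKey() {
        long n = counter.getAndIncrement();
        return String.format("partitionKey-%d", Math.floorMod(n, (long) keyCount));
    }

    public static void main(String[] args) {
        PartitionKeyRotator rotator = new PartitionKeyRotator(2);
        for (int i = 0; i < 4; i++) {
            System.out.println(rotator.nextKey());
        }
    }
}
```

In the question's `sendJson`, the value returned by `nextKey()` could be passed to `putRecordRequest.setPartitionKey(...)` in place of the fixed key. Keep in mind that two distinct keys are not guaranteed to land on two distinct shards, so a larger `keyCount` is usually the safer choice.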

Clearly I need at least one partition key per shard, but this part is important:

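Because of this hashing mechanism, all data records with the same partition key map to the same shard within the stream. However, if the number of partition keys exceeds the number of shards, some shards necessarily contain records with different partition keys. From a design standpoint, to ensure that all your shards are well utilized, the number of shards (specified by the setShardCount method of CreateStreamRequest) should be substantially less than the number of unique partition keys, and the amount of data flowing to a single partition key should be substantially less than the capacity of the shard.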
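The hashing mechanism described in the quote can be approximated locally. Kinesis hashes each partition key with MD5 into a 128-bit integer and assigns it to the shard whose hash key range contains that value. The sketch below assumes the shards split the 128-bit space evenly, which is an assumption made for illustration; the class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Local approximation of partition-key routing; not an AWS SDK class.
class ShardRoutingSketch {
    // Size of the hash key space: 2^128.
    private static final BigInteger HASH_SPACE = BigInteger.ONE.shiftLeft(128);

    // Maps a partition key to a shard index in [0, shardCount),
    // ASSUMING the shards divide the hash key space into equal ranges.
    static int shardFor(String partitionKey, int shardCount) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            BigInteger hashKey = new BigInteger(1, digest); // unsigned 128-bit value
            return hashKey.multiply(BigInteger.valueOf(shardCount))
                          .divide(HASH_SPACE)
                          .intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Counts how many records each shard would receive, either with one
    // fixed key (as in the question) or with a distinct key per record.
    static int[] countPerShard(int records, int shardCount, boolean fixedKey) {
        int[] counts = new int[shardCount];
        for (int i = 0; i < records; i++) {
            String key = fixedKey ? "partitionKey-1" : "partitionKey-" + i;
            counts[shardFor(key, shardCount)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("fixed key:  " + Arrays.toString(countPerShard(9000, 3, true)));
        System.out.println("varied key: " + Arrays.toString(countPerShard(9000, 3, false)));
    }
}
```

With a fixed key every record lands on the same shard regardless of the shard count, which would explain why adding shards did not help in the question: the per-shard write limit (1,000 records per second or 1 MB per second) still applies to that one shard. With many distinct keys the records spread across all shards.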

score 0 · Accepted Answer

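@luca If you use 6 thread groups and 6 shards, you will get the rate-exceeded exception, because each shard is limited to 5 GetRecords requests per second. Please refer to read throttling.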
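The arithmetic behind that read limit can be sketched as follows. This assumes each thread group polls every shard, so that a shard sees one reader per thread group; that is an interpretation of the comment, not something it states. The class and method names are hypothetical.

```java
// Back-of-the-envelope pacing for readers that share a shard; not an AWS SDK class.
class ReadPacerSketch {
    // Per-shard read limit quoted in the answer above.
    static final int MAX_READS_PER_SECOND_PER_SHARD = 5;

    // Minimum pause (in milliseconds) each reader should leave between polls
    // of one shard so that all readers together stay within the limit.
    static long minPollIntervalMillis(int readersPerShard) {
        if (readersPerShard < 1) {
            throw new IllegalArgumentException("readersPerShard must be >= 1");
        }
        // Ceiling of (1000 ms * readers) / limit.
        return (1000L * readersPerShard + MAX_READS_PER_SECOND_PER_SHARD - 1)
                / MAX_READS_PER_SECOND_PER_SHARD;
    }

    public static void main(String[] args) {
        System.out.println(minPollIntervalMillis(1)); // one reader per shard
        System.out.println(minPollIntervalMillis(6)); // six readers per shard
    }
}
```

Under that assumption, a single reader can poll a shard every 200 ms, while six readers sharing a shard would each need to wait about 1,200 ms between polls.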
