cassandra - TFramedTransport errors with PHPCassa + Cassandra

Question

We are deleting a large number of records from Cassandra and get the following error. We also get this error when inserting a large number of records:

Error performing remove on 10.130.279.40:9160: exception 'TTransportException' with message 'TSocket: timed out reading 4 bytes from 10.130.279.40:9160' in /home/zonefiles/php/thrift/transport/TSocket.php:268
    Stack trace:
    0 /home/zonefiles/php/thrift/transport/TTransport.php(87): TSocket->read(4)
    1 /home/zonefiles/php/thrift/transport/TFramedTransport.php(135): TTransport->readAll(4)
    2 /home/zonefiles/php/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame()
    3 [internal function]: TFramedTransport->read(8192)
    4 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(691): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false)
    5 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(664): CassandraClient->recv_remove()
    6 [internal function]: CassandraClient->remove('CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1)
    7 /home/zonefiles/php/connection.php(230): call_user_func_array(Array, Array)
    8 /home/zonefiles/php/columnfamily.php(582): ConnectionPool->call('remove', 'CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1)
    9 /home/zonefiles/php/delete.php(34): ColumnFamily->remove('CUSTOMERSERVICE...')
    10 {main}
    Error connecting to 10.130.279.40:9160: exception 'TTransportException' with message 'TSocket: timed out reading 4 bytes from 10.130.279.40:9160' in /home/zonefiles/php/thrift/transport/TSocket.php:268
    Stack trace:
    0 /home/zonefiles/php/thrift/transport/TTransport.php(87): TSocket->read(4)
    1 /home/zonefiles/php/thrift/transport/TFramedTransport.php(135): TTransport->readAll(4)
    2 /home/zonefiles/php/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame()
    3 [internal function]: TFramedTransport->read(8192)
    4 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(1015): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false)
    5 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(992): CassandraClient->recv_describe_version()
    6 /home/zonefiles/php/connection.php(63): CassandraClient->describe_version()
    7 /home/zonefiles/php/connection.php(163): ConnectionWrapper->__construct('CDTMain1', '10.130.279.40:9...', NULL, true, 5000, 5000)
    8 /home/zonefiles/php/connection.php(254): ConnectionPool->make_conn()
    9 /home/zonefiles/php/connection.php(241): ConnectionPool->handle_conn_failure(Object(ConnectionWrapper), 'remove', Object(TTransportException), 1)
    10 /home/zonefiles/php/columnfamily.php(582): ConnectionPool->call('remove', 'CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1)
    11 /home/zonefiles/php/delete.php(34): ColumnFamily->remove('CUSTOMERSERVICE...')
    12 {main}

This is the PHP we use to produce the error:

<?php
set_time_limit(2000);
require 'connection.php';
require 'columnfamily.php';
$servers[0]['host'] = 'private ip';
$servers[0]['port'] = '9160';
$conn = new Connection('Server11', $servers);
$urlFamily = new ColumnFamily($conn, 'Domain'); // ColumnFamily

$start = microtime(true);

$limit = 100000000;

$rows = $urlFamily->get_range($key_start='', $key_finish='zzzzzzzzzzzzzzz',100000000);

$num = 0;
$delCount = 0;

foreach($rows as $key => $columns) {
   // Do stuff with $key or $columns
       if (strpos($key, ' .net') !== false) {
               //echo 'deleting ' . $key . "\n";
               $urlFamily->remove($key);
               $delCount++;
       }
       if ($num++ > 100000000) break;
       //$num++;
       if ($num % 100000 == 0) echo $num . "\n";
}

$end = microtime(true);

echo $num . " total\n";
echo $delCount . ' deleted in ' . ($end - $start) . " seconds\n";
echo $delCount / ($end - $start) . " deleted per second\n";

?>

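We are running PHP 5.3.5 on Fedora 14 (Laughlin) with Thrift 0.5.0.

One theory is that this happens because Cassandra cannot process the commands fast enough. Do you agree or disagree? Have you seen this before?

If you suggest deleting another way (for example, Truncate), how do we still prevent this problem when we use Cassandra for other operations?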

score 2 · Accepted Answer

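Are these just log messages, or is an exception actually being raised? phpcassa calls error_log() each time it catches an exception like this, before retrying with a different connection. Basically, this means you should keep an eye on the stack traces being logged, but you don't need to worry about them too much.

These are client-side socket timeouts, which means the call took longer than the default 5-second timeout. Why they happen depends largely on how Cassandra is behaving. Monitoring Cassandra is probably the best place to start.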
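To make "client-side socket timeout" concrete, here is a minimal sketch in plain PHP that reproduces the same kind of failure without Cassandra. It assumes, as far as I understand it, that Thrift's TSocket relies on PHP stream timeouts in a similar way. The helper name readWithTimeout, the local test server, and the 200 ms value are all made up for illustration and are not part of Thrift or phpcassa.

```php
<?php
// Hypothetical helper (not part of Thrift or phpcassa): read up to $length
// bytes with a receive timeout and report whether the read timed out.
function readWithTimeout($stream, $length, $timeoutMs)
{
    $seconds = (int) floor($timeoutMs / 1000);
    $microseconds = ($timeoutMs % 1000) * 1000;
    stream_set_timeout($stream, $seconds, $microseconds);

    $data = fread($stream, $length);
    $meta = stream_get_meta_data($stream);

    return array('data' => $data, 'timed_out' => $meta['timed_out']);
}

// A local server that accepts the connection but never replies,
// standing in for a node that is too busy to answer in time.
$server = stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr);
$address = stream_socket_get_name($server, false);

$client = stream_socket_client('tcp://' . $address, $errno, $errstr, 1);
$peer = stream_socket_accept($server, 1);

// Ask for a 4-byte frame header with a 200 ms receive timeout.
$result = readWithTimeout($client, 4, 200);

echo $result['timed_out'] ? "timed out\n" : "got data\n";

fclose($client);
fclose($peer);
fclose($server);
```

The connection itself succeeds here; only the read gives up. That matches the trace above, where the client was connected and waiting for the 4-byte frame length when the timeout expired.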

score 0 · Accepted Answer

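According to my programmer, we actually solved this by setting the timeouts to a very high value. We were trying to import a 5 GB file, so I guess the database needed more than 5 seconds per read.

Here are the specific timeouts that were set: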

$send_timeout=60000 $recv_timeout=60000
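Before settling on a very high value, it may help to measure how long individual calls actually take on the client. The following is only a sketch: timeCall is a hypothetical helper, the closure with usleep() stands in for a real call such as $urlFamily->remove($key), and the 80% threshold is an arbitrary choice.

```php
<?php
// Hypothetical helper: run $operation and return its result together
// with the elapsed wall-clock time in milliseconds.
function timeCall($operation)
{
    $start = microtime(true);
    $result = call_user_func($operation);
    $elapsedMs = (microtime(true) - $start) * 1000;

    return array('result' => $result, 'elapsed_ms' => $elapsedMs);
}

$recvTimeoutMs = 60000; // the receive timeout quoted in this answer

// Stand-in for a slow call such as $urlFamily->remove($key).
$timed = timeCall(function () {
    usleep(50000); // simulate 50 ms of work
    return 'ok';
});

// Log calls that use most of the timeout budget (threshold is arbitrary).
if ($timed['elapsed_ms'] > 0.8 * $recvTimeoutMs) {
    error_log('call used more than 80% of the receive timeout');
}
```

If most calls finish far below the limit and only a few come close, the slow ones are worth investigating on the Cassandra side rather than only hiding them behind a larger timeout.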
