My code is as follows:

for key, data in table.scan(columns=["raw:dataInfo"]):
    count += 1
    ...

The raw:dataInfo column can be as large as 50 MB. When I run the code above, happybase crashes and raises the following exception:

Traceback (most recent call last):
  File "happybasetestscan.py", line 8, in <module>
    for key,data in table.scan(columns=["raw:sample"],limit=10):
  File "/usr/lib/python2.6/site-packages/happybase/table.py", line 374, in scan
    self.name, scan, {})
.......
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes 

Any ideas on how to count rows when the column is this large? Thanks!

1 Answer

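My guess is that the Thrift server is not responding properly. happybase is reporting (via the Thrift library) that it could not read any data from the socket.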

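In any case, if you want to do a full table scan just to count rows (inefficient, but it works), use a filter in the scan so that only the keys are returned: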

# Scan, get only keys (data will be empty)
scanner = table.scan(
    row_start=b'aaa',
    row_stop=b'bbb',
    filter=b'KeyOnlyFilter() AND FirstKeyOnlyFilter()',
)

for row_key, data in scanner:
    pass  # do something with row_key

For more information, see https://github.com/wbolster/happybase/issues/12#issuecomment-12754400
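To turn the key-only scan into an actual count, it could be wrapped in a small helper. This is only a sketch: `count_rows` is a hypothetical name, not part of happybase, and the host and table name in the usage comment are placeholders. It assumes `table` is a `happybase.Table` and that the Thrift server supports filter strings.

```python
def count_rows(table, batch_size=1000):
    """Count rows by scanning keys only (hypothetical helper, not a happybase API).

    `table` is assumed to be a happybase.Table. The filter asks the server to
    return only the first cell of each row with its value stripped, so large
    cell values such as the 50 MB column are not sent to the client.
    """
    scanner = table.scan(
        filter=b'KeyOnlyFilter() AND FirstKeyOnlyFilter()',
        batch_size=batch_size,  # rows fetched per Thrift round trip
    )
    # Each item is a (row_key, data) tuple; only the number of items matters.
    return sum(1 for _ in scanner)


# Usage, assuming a reachable Thrift server (host and table name are placeholders):
#   import happybase
#   connection = happybase.Connection('localhost')
#   print(count_rows(connection.table('mytable')))
```

This is still a full table scan, so it will be slow on large tables, but the rows travelling over the Thrift socket carry no cell values.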

Answered 2016-02-02T10:56:27.410