
When I try to do an insert/update to HBase via Thrift (Python, specifically), mutateRow() requires a fourth argument, "attributes". The Thrift definition says that this argument is a string->string map. None of the examples or online discussions mention this fourth argument, and even the Thrift examples shipped with the exact same version of HBase don't use it.

If you can, please just include a full example of creating a table, defining a column family, inserting a row, and dumping the data.


2 Answers


No problem. Instead of just dumping the value of the created column, I dump the last three versions of the modified column, just because it's cool.

For completeness, this is roughly what I did to get Thrift working:

  • Downloaded and built Thrift (from SVN: 2012-11-15/1429368).
  • Ran "thrift -gen py <thrift file>" from the directory where I wanted the Python interface files created.
  • Installed the "thrift" package via pip.

I ran the following code from the root of the generated files.

from random import randrange
from pprint import pprint

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

# Modules generated by "thrift -gen py".
from hbase import Hbase
from hbase.ttypes import AlreadyExists, ColumnDescriptor, Mutation

# Connect to the HBase Thrift server.
socket = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(socket)
transport.open()
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)

table_name = 'test_table'
row_key = 'test_row1'
colfamily1 = 'test_colfamily1'
column1 = 'test_col1'
fullcol1 = '%s:%s' % (colfamily1, column1)
# A random value, so that each run stores a new version of the cell.
value = '%d' % randrange(1000, 9999)

num_versions = 3

# Create the table with one column family, unless it already exists.
try:
    desc = ColumnDescriptor(colfamily1)
    client.createTable(table_name, [desc])
except AlreadyExists:
    pass

# The last argument of both calls is the "attributes" map (string -> string).
# An empty dict is enough here.
client.mutateRow(table_name, row_key, [Mutation(column=fullcol1, value=value)], {})
results = client.getVer(table_name, row_key, fullcol1, num_versions, {})

pprint(results)

transport.close()

Output:

$ python test.py 
[TCell(timestamp=1357463438825L, value='9842')]
$ python test.py 
[TCell(timestamp=1357463439700L, value='9166'),
 TCell(timestamp=1357463438825L, value='9842')]
$ python test.py 
[TCell(timestamp=1357463440359L, value='2978'),
 TCell(timestamp=1357463439700L, value='9166'),
 TCell(timestamp=1357463438825L, value='9842')]
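The timestamps in that output are HBase's default cell timestamps, i.e. milliseconds since the Unix epoch. If you want them readable, something like the following sketch should work. Note that `TCell` here is only a namedtuple stand-in I made up so the snippet runs without an HBase server, and `cell_time` and `format_cells` are hypothetical helper names, not part of the Thrift API. It assumes Python 3.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Hypothetical stand-in for the Thrift-generated TCell, so this sketch runs
# without an HBase server. It only mirrors the two fields used below.
TCell = namedtuple('TCell', ['value', 'timestamp'])


def cell_time(cell):
    """Convert a cell's millisecond timestamp to an aware UTC datetime."""
    seconds, millis = divmod(cell.timestamp, 1000)
    # Integer arithmetic avoids float rounding in the sub-second part.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=millis * 1000)


def format_cells(cells):
    """Return one 'ISO-time  value' line per cell, in the order given."""
    return ['%s  %s' % (cell_time(c).isoformat(), c.value) for c in cells]


# The three versions from the last run above, newest first.
cells = [
    TCell(value='2978', timestamp=1357463440359),
    TCell(value='9166', timestamp=1357463439700),
    TCell(value='9842', timestamp=1357463438825),
]
for line in format_cells(cells):
    print(line)
```

With real results you would pass the list returned by getVer() straight to `format_cells`, since the generated TCell objects have `value` and `timestamp` attributes too.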
Answered 2013-01-06T09:26:30.563

Instead of messing with the low-level Thrift API, you should use HappyBase to work with HBase from Python. See https://github.com/wbolster/happybase and the tutorial in its documentation at http://happybase.readthedocs.org/en/latest/ for code samples. It includes samples for exactly the things you asked for.
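For a rough idea of what that looks like, here is a minimal sketch of the same steps (create a table with one column family, write a cell, read back the last few versions). The function name `store_and_dump` is made up for illustration. It assumes Python 3, where HappyBase returns table names as bytes, and it takes the connection as a parameter so the logic can be read without a running server. Check the HappyBase documentation for the authoritative API.

```python
def store_and_dump(connection, table_name, row_key, column, value, versions=3):
    """Create the table if needed, write one cell, return its latest versions.

    `connection` is expected to behave like a happybase.Connection.
    `row_key`, `column` ("family:qualifier") and `value` are bytes.
    """
    family = column.split(b':', 1)[0].decode()

    # tables() lists the existing table names (bytes under Python 3).
    if table_name.encode() not in connection.tables():
        connection.create_table(table_name, {family: dict(max_versions=versions)})

    table = connection.table(table_name)
    table.put(row_key, {column: value})

    # include_timestamp=True yields (value, timestamp) tuples, newest first.
    return table.cells(row_key, column, versions=versions, include_timestamp=True)
```

With HappyBase installed and the Thrift server running, you would call it as `store_and_dump(happybase.Connection('localhost', port=9090), 'test_table', b'test_row1', b'test_colfamily1:test_col1', b'1234')`.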

Answered 2013-01-25T23:27:43.190