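Thrift is a binary communication middleware developed and open-sourced by Facebook. With Thrift, we can take full advantage of each language's strengths and write efficient code.
Paper on Thrift: http://pan.baidu.com/share/link?shareid=234128&uk=3238841275
Installing Thrift: http://thrift.apache.org/docs/install/ubuntu/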
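After installation, go to the HBase directory and find Hbase.thrift. The file is located under
hbase-0.94.4/src/main/resources/org/apache/hadoop/hbase/thrift
Generate the Python bindings (the Thrift generator for Python is named py, and the file name is case-sensitive):
    thrift --gen py Hbase.thrift
This produces a gen-py folder; rename it to hbase so that the scripts below can import the hbase package.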
Install the Thrift library for Python:
    sudo pip install thrift
Start the HBase Thrift service (the default port is 9090):
    bin/hbase-daemon.sh start thrift
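Before running the scripts below, it can help to confirm that something is actually listening on the Thrift port. The following is a minimal sketch using only the standard library; the function name thrift_port_open is a hypothetical helper, and it only checks that a TCP connection succeeds, not that the listener is really the HBase Thrift server.

```python
import socket


def thrift_port_open(host="localhost", port=9090, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out
        return False
    sock.close()
    return True
```

Calling thrift_port_open() with no arguments checks localhost:9090, the default mentioned above.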
Create an HBase table:
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import ColumnDescriptor

    # Connect to the HBase Thrift server over a buffered binary transport
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Create table 'test' with a single column family 'cf', keeping one version
    contents = ColumnDescriptor(name='cf:', maxVersions=1)
    client.createTable('test', [contents])
    print(client.getTableNames())

    transport.close()
Run the code. Once it succeeds, open the HBase shell and run the list command; the newly created test table should appear.
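Every script in this post repeats the same connection setup, and it is easy to forget to close the transport. One way to guarantee the close is a small context manager. This is only a sketch: opened is a hypothetical helper name, and FakeTransport is a stand-in used so the example runs without an HBase server. It relies only on the transport having open() and close() methods.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a transport, yield it, and always close it afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


class FakeTransport(object):
    """Hypothetical stand-in for a Thrift transport, for demonstration only."""

    def __init__(self):
        self.events = []

    def open(self):
        self.events.append("open")

    def close(self):
        self.events.append("close")


fake = FakeTransport()
with opened(fake):
    fake.events.append("work")
print(fake.events)  # ['open', 'work', 'close']
```

With a real connection, the TBufferedTransport object would be passed to opened(...) in place of FakeTransport, and the protocol and client would be built inside the with block.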
Insert data:
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import Mutation

    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Write value "1" to column cf:a of row 'row-key1'
    row = 'row-key1'
    mutations = [Mutation(column="cf:a", value="1")]
    client.mutateRow('test', row, mutations, None)

    transport.close()
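When a row has several columns, the mutation list can be built from a plain dict. The sketch below assumes a hypothetical helper, build_mutations, that takes the mutation class as a parameter; in a real script that would be hbase.ttypes.Mutation, while FakeMutation here is only a stand-in so the example runs without the generated bindings.

```python
from collections import namedtuple


def build_mutations(family, values, mutation_cls):
    """Build one mutation per (qualifier, value) pair in a column family."""
    return [
        mutation_cls(column="%s:%s" % (family, qualifier), value=str(value))
        for qualifier, value in sorted(values.items())
    ]


# Hypothetical stand-in with the same two fields the script above sets
FakeMutation = namedtuple("FakeMutation", ["column", "value"])

mutations = build_mutations("cf", {"a": 1, "b": 2}, FakeMutation)
print(mutations)
```

The resulting list can be passed to client.mutateRow in the same way as the single-element list above.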
Fetch a single row:
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase

    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    tableName = 'test'
    rowKey = 'row-key1'

    # getRow returns a list of row results
    result = client.getRow(tableName, rowKey, None)
    print(result)
    for r in result:
        print('the row is %s' % r.row)
        print('the value is %s' % r.columns.get('cf:a').value)

    transport.close()
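The row results are nested objects, so it is often convenient to flatten them into plain dicts. The following sketch assumes only the fields the script above reads (row, columns, and value on each cell). rows_to_dict is a hypothetical helper, and FakeRow and FakeCell are stand-ins so the example runs without an HBase server.

```python
from collections import namedtuple

# Hypothetical stand-ins mirroring the fields read in the script above
FakeCell = namedtuple("FakeCell", ["value"])
FakeRow = namedtuple("FakeRow", ["row", "columns"])


def rows_to_dict(results):
    """Map each row key to a {column name: value} dict."""
    return {
        r.row: {col: cell.value for col, cell in r.columns.items()}
        for r in results
    }


sample = [FakeRow("row-key1", {"cf:a": FakeCell("1")})]
print(rows_to_dict(sample))  # {'row-key1': {'cf:a': '1'}}
```

Looking a column up in the resulting dict with .get('cf:a') returns None for a missing column, instead of failing on .value as the direct attribute access would.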
To return multiple rows, use a scan:
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import TScan

    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    scan = TScan()
    tableName = 'test'

    # Open a scanner over the whole table and fetch up to 10 rows at once
    scanner_id = client.scannerOpenWithScan(tableName, scan, None)
    result2 = client.scannerGetList(scanner_id, 10)
    print(result2)

    client.scannerClose(scanner_id)
    transport.close()
scannerGet, by contrast, fetches only one row per call:
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import TScan

    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    scan = TScan()
    tableName = 'test'
    scanner_id = client.scannerOpenWithScan(tableName, scan, None)

    # scannerGet returns an empty list once the scanner is exhausted
    result = client.scannerGet(scanner_id)
    while result:
        print(result)
        result = client.scannerGet(scanner_id)

    client.scannerClose(scanner_id)
    transport.close()
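The two scanning styles above can be combined into a generator that pulls rows in batches and yields them one at a time. This is a sketch under stated assumptions: scan_rows is a hypothetical helper, FakeClient is an in-memory stand-in so the example runs without an HBase server, and the loop assumes scannerGetList returns an empty list once the scanner is exhausted.

```python
def scan_rows(client, table_name, scan, batch_size=100):
    """Yield rows one by one, fetching them from the scanner in batches."""
    scanner_id = client.scannerOpenWithScan(table_name, scan, None)
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                break  # scanner exhausted
            for row in batch:
                yield row
    finally:
        # Release the server-side scanner even if the caller stops early
        client.scannerClose(scanner_id)


class FakeClient(object):
    """Hypothetical in-memory client, used only to exercise scan_rows."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.closed = False

    def scannerOpenWithScan(self, table_name, scan, attributes):
        return 1

    def scannerGetList(self, scanner_id, n):
        batch, self.rows = self.rows[:n], self.rows[n:]
        return batch

    def scannerClose(self, scanner_id):
        self.closed = True


fake_client = FakeClient(["r1", "r2", "r3"])
rows = list(scan_rows(fake_client, "test", None, batch_size=2))
print(rows)  # ['r1', 'r2', 'r3']
```

With the real client, scan would be a TScan() instance as in the scripts above. Batching reduces the number of round trips compared with calling scannerGet once per row.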