Example: Operating HBase from Python via Thrift

Posted by pythontab on 2013-01-22

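Thrift is a binary communication middleware (an RPC framework) developed and open-sourced by Facebook. With Thrift, services written in different languages can call one another, so we can play to the strengths of each language and write efficient code.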


The Thrift paper: http://pan.baidu.com/share/link?shareid=234128&uk=3238841275


Installing Thrift: http://thrift.apache.org/docs/install/ubuntu/


After installation, go to the HBase directory and find Hbase.thrift. The file is located in


hbase-0.94.4/src/main/resources/org/apache/hadoop/hbase/thrift


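Running thrift --gen py Hbase.thrift (the Python generator is named py) generates a gen-py directory; rename it to hbase. The scripts below import the generated hbase package, so make sure the directory that contains Hbase.py and ttypes.py is importable as hbase.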


Install the Python thrift library:


sudo pip install thrift


Start the HBase Thrift service: bin/hbase-daemon.sh start thrift. The default port is 9090.
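Before running any of the scripts, it can help to confirm that something is actually listening on the Thrift port. The following is a minimal sketch using only the standard library; the helper name is_port_open is hypothetical and not part of Thrift or HBase, and a successful connection only shows that the port is open, not that the listener is the HBase Thrift server.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, OSError):
        # Connection refused, timed out, or host unreachable.
        return False
    conn.close()
    return True


# Usage, assuming the default port mentioned above:
#     is_port_open('localhost', 9090)
```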


Create an HBase table:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
from hbase import Hbase
from hbase.ttypes import ColumnDescriptor
 
# Connect to the HBase Thrift server on its default port.
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
 
client = Hbase.Client(protocol)
transport.open()
 
# Hbase.thrift declares names as binary, so byte strings work on Python 2 and 3.
# One column family, 'cf', keeping a single version per cell.
contents = ColumnDescriptor(name=b'cf:', maxVersions=1)
client.createTable(b'test', [contents])
 
print(client.getTableNames())
 
transport.close()

Run the code. Once it succeeds, open the HBase shell and run the list command; the newly created test table should appear.
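Running the creation script a second time fails because the table already exists. One way around this is to check getTableNames first. The sketch below assumes a client object exposing the getTableNames and createTable calls used above; ensure_table is a hypothetical helper name, not part of the generated code.

```python
def ensure_table(client, table_name, column_descriptors):
    """Create table_name only if the Thrift server does not already list it.

    Returns True if the table was created, False if it already existed.
    table_name must have the same type (bytes or str) as the names that
    client.getTableNames() returns.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_descriptors)
    return True
```

With the script above this would be called as ensure_table(client, b'test', [contents]). The check and the create are two separate calls, so another client could still create the table in between; this is a convenience, not a guarantee.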

Insert data:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
from hbase import Hbase
from hbase.ttypes import Mutation
 
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
 
client = Hbase.Client(protocol)
transport.open()
 
row = b'row-key1'
 
# Write the value '1' to column 'a' of column family 'cf'.
mutations = [Mutation(column=b'cf:a', value=b'1')]
# The last argument is the optional attributes map; None means no attributes.
client.mutateRow(b'test', row, mutations, None)
 
transport.close()
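When a row has several columns, writing each mutation by hand gets repetitive. The sketch below builds the list from a dictionary of qualifier-to-value pairs. build_mutations and its mutation_factory parameter are hypothetical names; with the generated code the factory would be hbase.ttypes.Mutation, and all names and values are assumed to be byte strings.

```python
def build_mutations(family, values, mutation_factory):
    """Turn {qualifier: value} into a list of mutations for one column family.

    mutation_factory is called as mutation_factory(column=..., value=...).
    Column names take the 'family:qualifier' form used in the insert example.
    """
    return [
        mutation_factory(column=family + b':' + qualifier, value=value)
        for qualifier, value in sorted(values.items())
    ]
```

For example, build_mutations(b'cf', {b'a': b'1', b'b': b'2'}, Mutation) would yield mutations for cf:a and cf:b that can be passed to mutateRow in a single call.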

Get a single row:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
from hbase import Hbase
 
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
 
client = Hbase.Client(protocol)
transport.open()
 
tableName = b'test'
rowKey = b'row-key1'
 
# getRow returns a list of row results; it is empty if the row does not exist.
result = client.getRow(tableName, rowKey, None)
print(result)
for r in result:
    print('the row is %r' % (r.row,))
    # columns maps 'family:qualifier' to a cell; the column may be absent.
    cell = r.columns.get(b'cf:a')
    if cell is not None:
        print('the value is %r' % (cell.value,))
 
transport.close()
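The row results are easier to work with once flattened into a plain dictionary. The sketch below relies only on the shape the example reads (a columns mapping whose values carry a value attribute); row_to_dict and get_cell are hypothetical helper names, not part of the generated code.

```python
def row_to_dict(row_result):
    """Flatten one row result into {column_name: value}."""
    return dict((name, cell.value) for name, cell in row_result.columns.items())


def get_cell(row_result, column, default=None):
    """Return the value stored under column, or default if the column is absent."""
    cell = row_result.columns.get(column)
    return default if cell is None else cell.value
```

get_cell avoids the AttributeError that calling .value on a missing column would otherwise raise.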

To return multiple rows, use a scan:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
from hbase import Hbase
from hbase.ttypes import TScan
 
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
 
client = Hbase.Client(protocol)
transport.open()
 
# An empty TScan scans the whole table.
scan = TScan()
tableName = b'test'
scanner_id = client.scannerOpenWithScan(tableName, scan, None)
 
# Fetch up to 10 rows in one call.
result2 = client.scannerGetList(scanner_id, 10)
print(result2)
 
# Release the server-side scanner, then the connection.
client.scannerClose(scanner_id)
transport.close()
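A single scannerGetList call returns at most the requested number of rows, so reading a whole table means calling it repeatedly until it returns an empty list. The generator below is a sketch of that loop, assuming a client with the scannerOpenWithScan, scannerGetList and scannerClose calls of the generated Hbase client; iter_rows is a hypothetical helper name.

```python
def iter_rows(client, table_name, scan, batch_size=10):
    """Yield every row of a scan, fetching batch_size rows per round trip."""
    scanner_id = client.scannerOpenWithScan(table_name, scan, None)
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                break  # an empty list means the scanner is exhausted
            for row in batch:
                yield row
    finally:
        # Release the server-side scanner even if the caller stops early.
        client.scannerClose(scanner_id)
```

Usage would look like: for row in iter_rows(client, b'test', TScan()): print(row). A larger batch_size means fewer round trips at the cost of more memory per call.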

scannerGet, by contrast, fetches only one row per call:

from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
 
from hbase import Hbase
from hbase.ttypes import TScan
 
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
 
client = Hbase.Client(protocol)
transport.open()
 
scan = TScan()
tableName = b'test'
scanner_id = client.scannerOpenWithScan(tableName, scan, None)
 
# scannerGet returns a list holding the next row, or an empty list at the end.
result = client.scannerGet(scanner_id)
while result:
    print(result)
    result = client.scannerGet(scanner_id)
 
# Release the server-side scanner, then the connection.
client.scannerClose(scanner_id)
transport.close()
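Every script here repeats the same open step, and the transport should be closed once the work is done, including when a call raises. A small context manager is one way to guarantee that. This is a sketch assuming only that the transport has open() and close() methods, as the Thrift transports used above do; opened is a hypothetical helper name.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open transport on entry and always close it on exit."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()
```

With this helper, the body of a script becomes: with opened(transport): client.createTable(...), and the connection is closed even if the call fails.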

