Time-series databases related to Druid: inverted-index optimizations are used here too
Cattell [6] maintains a great summary of existing scalable SQL and NoSQL data stores. Hu [18] contributed another great summary of streaming databases. In terms of features, Druid sits somewhere between Google's Dremel [28] and PowerDrill [17]. Druid has most of the features implemented in Dremel (Dremel handles arbitrary nested data structures while Druid only allows for a single level of array-based nesting) and many of the interesting compression algorithms mentioned in PowerDrill.
Although Druid builds on many of the same principles as other distributed columnar data stores [15], many of these data stores are designed to be more generic key-value stores [23] and do not support computation directly in the storage layer. There are also other data stores designed for some of the same data warehousing issues that Druid is meant to solve. These systems include in-memory databases such as SAP's HANA [14] and VoltDB [43]. These data stores lack Druid's low-latency ingestion characteristics. Druid also has native analytical features baked in, similar to ParAccel [34]; however, Druid allows system-wide rolling software updates with no downtime.
Druid is similar to C-Store [38] and LazyBase [8] in that it has two subsystems: a read-optimized subsystem in the historical nodes and a write-optimized subsystem in the real-time nodes. Real-time nodes are designed to ingest a high volume of append-heavy data and do not support data updates. Unlike the two aforementioned systems, Druid is meant for OLAP transactions and not OLTP transactions.
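The split between a write-optimized and a read-optimized subsystem can be illustrated with a minimal sketch. This is not Druid's code: the class names `RealtimeBuffer` and `HistoricalStore`, the `hand_off` method, and the sample events are all hypothetical, chosen only to show an append-only ingest side that freezes its data into immutable, time-sorted segments for a query side.

```python
import bisect


class RealtimeBuffer:
    """Hypothetical write-optimized side: append-only, no updates."""

    def __init__(self):
        self._events = []

    def append(self, timestamp, value):
        # Appends are O(1); events may arrive out of order.
        self._events.append((timestamp, value))

    def hand_off(self):
        """Freeze buffered events into an immutable, time-sorted segment."""
        segment = tuple(sorted(self._events))
        self._events = []
        return segment


class HistoricalStore:
    """Hypothetical read-optimized side: serves immutable, sorted segments."""

    def __init__(self):
        self._segments = []

    def load(self, segment):
        # Split into parallel columns once, at load time, so queries
        # can binary-search the timestamp column directly.
        timestamps = [t for t, _ in segment]
        values = [v for _, v in segment]
        self._segments.append((timestamps, values))

    def sum_between(self, start, end):
        """Sum values with start <= timestamp < end."""
        total = 0
        for timestamps, values in self._segments:
            lo = bisect.bisect_left(timestamps, start)
            hi = bisect.bisect_left(timestamps, end)
            total += sum(values[lo:hi])
        return total


# Made-up events: (timestamp, value), deliberately out of order.
buffer = RealtimeBuffer()
buffer.append(3, 10)
buffer.append(1, 5)
buffer.append(2, 7)

store = HistoricalStore()
store.load(buffer.hand_off())
total = store.sum_between(1, 3)  # covers timestamps 1 and 2
```

The design choice shown here is that sorting cost is paid once at hand-off, so the ingest path stays cheap and the query path works on data that never changes.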
Druid's low-latency data ingestion features share some similarities with Trident/Storm [27] and Spark Streaming [45]; however, both systems are focused on stream processing whereas Druid is focused on ingestion and aggregation. Stream processors are great complements to Druid as a means of pre-processing the data before the data enters Druid.
There is a class of systems that specialize in queries on top of cluster computing frameworks. Shark [13] is such a system for queries on top of Spark, and Cloudera's Impala [9] is another system focused on optimizing query performance on top of HDFS. Druid historical nodes download data locally and only work with native Druid indexes. We believe this setup allows for faster query latencies.
Druid leverages a unique combination of algorithms in its architecture. Although we believe no other data store has the same set of functionality as Druid, some of Druid's optimization techniques, such as using inverted indices to perform fast filters, are also used in other data stores [26].
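To make the inverted-index idea concrete, here is a minimal sketch of filtering with one index per dimension. It is an illustration under stated assumptions, not Druid's implementation: it uses plain Python sets of row ids, and the function names `build_inverted_index` and `filter_rows`, the dimension names, and the sample rows are all made up for this example.

```python
from collections import defaultdict


def build_inverted_index(rows, dimension):
    """Map each value of `dimension` to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[dimension]].add(row_id)
    return index


def filter_rows(indexes, predicates):
    """Return sorted row ids matching every (dimension, value) predicate.

    Only the index entries are touched; the rows themselves are never scanned.
    Returns an empty list when no predicates are given.
    """
    matching = None
    for dimension, value in predicates:
        ids = indexes[dimension].get(value, set())
        matching = set(ids) if matching is None else matching & ids
    return sorted(matching) if matching is not None else []


# Made-up sample data with two dimensions.
rows = [
    {"page": "A", "city": "Paris"},
    {"page": "A", "city": "Tokyo"},
    {"page": "B", "city": "Lima"},
    {"page": "B", "city": "Paris"},
]
indexes = {dim: build_inverted_index(rows, dim) for dim in ("page", "city")}

# Equivalent to: WHERE page = 'B' AND city = 'Paris'
result = filter_rows(indexes, [("page", "B"), ("city", "Paris")])
```

An AND filter becomes a set intersection over precomputed row-id sets, which is why the cost depends on the size of the matching sets and not on the total number of rows.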
Druid white paper: http://static.druid.io/docs/druid.pdf
This article is reposted from 張昺華-sky's blog on cnblogs. Original link: http://www.cnblogs.com/bonelee/p/6433333.html. To repost it, please contact the original author.