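This article covers the most popular NoSQL databases today, describing the strengths, weaknesses, and suitable use cases of each.
The original was written by Kristof Kovacs, a technical architect from Germany; this post works from his article.
Kristof Kovacs appears to be a software architect working in a machinery-related field.
I have been reading a lot of English-language material lately and translated this along the way. Some passages may be translated inaccurately; corrections are welcome.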
NoSQL(MongoDB,Riak,CouchDB,Redis)
While SQL databases are insanely useful tools, their monopoly in the last decades is coming to an end. And it's just time: I can't even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.)
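Relational (SQL) databases are very useful tools, and they have held a monopoly for more than a decade, but that is about to end. It is only a matter of time: relational databases cannot fit every kind of requirement.
(That said, relational databases will always be the best choice for data that has relations.)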

But the differences between NoSQL databases are much bigger than the differences ever were between one SQL database and another. This means that it is a bigger responsibility on software architects to choose the appropriate one for a project right at the beginning.

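However, NoSQL databases differ from one another far more than one SQL database ever differed from another. This means software architects carry more responsibility for choosing a suitable NoSQL database at the very start of a project.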
In this light, here is a comparison of MongoDB, Riak, CouchDB, Redis, HBase, Cassandra, Neo4j, Hypertable, Accumulo, and others:

The most popular ones

MongoDB (2.2)

Written in: C++
Main point: Retains some friendly properties of SQL. (Query, index)
License: AGPL (Drivers: Apache)
Protocol: Custom, binary (BSON)
Master/slave replication (auto failover with replica sets)
Sharding built-in
Queries are JavaScript expressions
Run arbitrary JavaScript functions server-side
Better update-in-place than CouchDB
Uses memory mapped files for data storage
Performance over features
Journaling (with --journal) is best turned on
On 32-bit systems, limited to ~2.5 GB
An empty database takes up 192 MB
GridFS to store big data + metadata (not actually an FS)
Has geospatial indexing
Data center aware

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For most things that you would do with MySQL or PostgreSQL, but where having predefined columns really holds you back.
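
To make "dynamic queries" over documents without predefined columns a little more concrete, here is a minimal sketch in plain Python. It is a toy in-memory stand-in, not MongoDB's driver API: the `matches` and `find` helpers and the tiny operator table are assumptions made up for illustration, loosely named after MongoDB's `$gt`-style operators.

```python
import operator

# Hypothetical subset of comparison operators, named in MongoDB's style.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            # In this toy version a missing field never matches.
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gt": 30}}
            for op, operand in condition.items():
                if not OPERATORS[op](value, operand):
                    return False
        elif value != condition:
            # Plain form, e.g. {"city": "London"} means equality.
            return False
    return True


def find(collection, query):
    """Filter a list of dict documents with an ad-hoc query."""
    return [doc for doc in collection if matches(doc, query)]


# Documents in one collection do not need to share the same fields.
people = [
    {"name": "Ada", "age": 36, "city": "London"},
    {"name": "Linus", "age": 28},
    {"name": "Grace", "age": 45, "city": "New York", "rank": "Rear Admiral"},
]

over_30 = find(people, {"age": {"$gt": 30}})
in_london = find(people, {"city": "London"})
```

The point of the sketch is only that the query is data built at run time, and that the documents carry whatever fields they happen to have; a real server would use indexes rather than scanning every document.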

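Written in: C++
Main point: Retains some friendly properties of SQL (queries, indexes)
License: AGPL (drivers: Apache)
Protocol: Custom, binary (BSON) (Translator's note: I have not used this protocol)
Master/slave replication (automatic failover with replica sets)
Built-in sharding
Queries are JavaScript expressions
Can run arbitrary JavaScript functions server-side
Better update-in-place than CouchDB
Uses memory-mapped files for data storage
Favors performance over features
Journaling is best turned on (via the --journal option)
On 32-bit systems, database size is limited to about 2.5 GB
An empty database takes up about 192 MB
Uses GridFS to store big data plus metadata (not actually a file system)
Has geospatial indexing
Data center aware

Best used: If you need dynamic queries. If you prefer to define indexes rather than map/reduce functions. If you need good performance on a big database. If you wanted CouchDB, but your data changes so often that it would quickly fill up the disks.

For example: Most things that you would do with MySQL or PostgreSQL, where predefined columns would really hold you back.

Riak (V1.2)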

Written in: Erlang & C, some JavaScript
Main point: Fault tolerance
License: Apache
Protocol: HTTP/REST or custom binary
Stores blobs
Tunable trade-offs for distribution and replication
Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
Map/reduce in JavaScript or Erlang
Links & link walking: use it as a graph database
Secondary indices: but only one at once
Large object support (Luwak)
Comes in "open source" and "enterprise" editions
Full-text search, indexing, querying with Riak Search
In the process of migrating the storing backend from "Bitcask" to Google's "LevelDB"
Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: If you want Dynamo-like data storage, but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.
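
The "tunable trade-offs for distribution and replication" in Dynamo-style stores are usually described with three numbers: N replicas per key, R replicas that must answer a read, and W replicas that must acknowledge a write. The sketch below only does the arithmetic behind that trade-off. The function name and the returned dictionary keys are made up for illustration and are not part of Riak's API.

```python
def quorum_properties(n, r, w):
    """Summarise what a given (N, R, W) choice implies.

    n: number of replicas stored for each key
    r: replicas that must respond before a read succeeds
    w: replicas that must acknowledge before a write succeeds
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return {
        # If R + W > N, every read quorum shares at least one replica
        # with every write quorum, so a read can see the latest write.
        "read_overlaps_write": r + w > n,
        # Replicas that may be down while reads / writes still succeed.
        "read_failures_tolerated": n - r,
        "write_failures_tolerated": n - w,
    }


balanced = quorum_properties(n=3, r=2, w=2)
fast_but_loose = quorum_properties(n=3, r=1, w=1)
```

Lowering R and W buys availability and latency at the cost of possibly reading stale data; raising them does the opposite.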

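Written in: Erlang and C, with some JavaScript
Main point: Fault tolerance
License: Apache
Protocol: HTTP/REST or custom binary
Stores blobs
Tunable trade-offs for distribution and replication
Pre- and post-commit hooks in JavaScript or Erlang, for validation and security
Map/reduce in JavaScript or Erlang
Links and link walking: can be used as a graph database
Secondary indices: but only one at once
Large object support
Comes in open source and enterprise editions
Full-text search, indexing, and querying with Riak Search
The storage backend is being migrated from "Bitcask" to Google's "LevelDB"
Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: If you want Dynamo-like data storage but do not want to deal with the bloat and complexity. If you need very good single-site scalability, availability, and fault tolerance, and are prepared to pay for multi-site replication.
For example: Point-of-sale data collection and factory control systems; places with strict requirements on downtime. It could also serve as an easily updatable web server.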

CouchDB (V1.2)

Written in: Erlang
Main point: DB consistency, ease of use
License: Apache
Protocol: HTTP/REST
Bi-directional (!) replication,
continuous or ad-hoc,
with conflict detection,
thus, master-master replication. (!)
MVCC - write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via '_changes' (!)
Attachment handling
thus, CouchApps (standalone JS apps)

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
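
"Pre-defined queries" in CouchDB means views: a map function emits key/value pairs for each document, and an optional reduce function folds the values per key. Real CouchDB views are written in JavaScript and stored in design documents; the Python below is only a sketch of the idea, and `build_view`, `map_by_status`, and the ticket documents are made-up names for illustration.

```python
from collections import defaultdict


def build_view(documents, map_fn, reduce_fn):
    """Run map_fn over every document, group emitted values by key,
    then reduce each group to a single value."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Views are ordered by key, so sort the keys before reducing.
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


def map_by_status(doc):
    """Emit (status, 1) for ticket documents; ignore everything else."""
    if doc.get("type") == "ticket":
        yield doc["status"], 1


documents = [
    {"_id": "t1", "type": "ticket", "status": "open"},
    {"_id": "t2", "type": "ticket", "status": "closed"},
    {"_id": "t3", "type": "ticket", "status": "open"},
    {"_id": "c1", "type": "customer", "name": "Acme"},
]

tickets_per_status = build_view(documents, map_by_status, sum)
```

Because the query is defined up front, the database can keep its result incrementally up to date as documents accumulate, which is why this model suits slowly changing data better than ad-hoc querying.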

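Written in: Erlang
Main point: DB consistency, ease of use
License: Apache
Protocol: HTTP/REST
Bi-directional replication,
continuous or ad-hoc,
with conflict detection,
thus, master-master replication
MVCC: write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists and shows
Server-side document validation is supported
Authentication is supported
Real-time updates are supported
Attachment handling is supported
thus, CouchApps (standalone JS apps)

Best used: For applications whose data changes infrequently and that run pre-defined queries. Also for applications that need data versioning.

For example: CRM and CMS systems. Master-master replication makes multi-site deployments very simple.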

Redis (V2.8)

Written in: C
Main point: Blazing fast
License: BSD
Protocol: Telnet-like, binary safe
Disk-backed in-memory database,
Dataset size limited to computer RAM (but can span multiple machines' RAM with clustering)
Master-slave replication, automatic failover
Simple values or data structures by keys
but complex operations like ZREVRANGEBYSCORE.
INCR & co (good for rate limiting or statistics)
Bit operations (for example to implement bloom filters)
Has sets (also union/diff/inter)
Has lists (also a queue; blocking pop)
Has hashes (objects of multiple fields)
Sorted sets (high score table, good for range queries)
Lua scripting capabilities (!)
Has transactions (!)
Values can be set to expire (as in a cache)
Pub/Sub lets one implement messaging

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.
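
The note that INCR is "good for rate limiting" combines two of the features listed above: an atomic counter and keys that expire. The sketch below imitates that pattern without a Redis server. `TinyStore` is a made-up in-memory stand-in that only mimics the behaviour of INCR and key expiry, and the clock is passed in explicitly so the example is deterministic.

```python
class TinyStore:
    """In-memory stand-in for two Redis behaviours: INCR and key expiry."""

    def __init__(self):
        self._values = {}
        self._expires_at = {}

    def _drop_if_expired(self, key, now):
        deadline = self._expires_at.get(key)
        if deadline is not None and now >= deadline:
            self._values.pop(key, None)
            self._expires_at.pop(key, None)

    def incr(self, key, now):
        """Increment the counter at key (starting from 0) and return it."""
        self._drop_if_expired(key, now)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds, now):
        """Make key disappear once `seconds` have passed since `now`."""
        self._expires_at[key] = now + seconds


def allow_request(store, client_id, now, limit=3, window_seconds=60):
    """Fixed-window rate limiter: at most `limit` requests per window."""
    key = f"rate:{client_id}"
    count = store.incr(key, now)
    if count == 1:
        # First request in this window: start the countdown.
        store.expire(key, window_seconds, now)
    return count <= limit


store = TinyStore()
decisions = [allow_request(store, "alice", now=t) for t in (0, 1, 2, 3)]
after_window = allow_request(store, "alice", now=61)
```

With a real Redis server the same two steps would be the INCR and EXPIRE commands; the counter simply vanishes when the window ends, so no cleanup job is needed.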

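Written in: C
Main point: Blazing fast
License: BSD
Protocol: Telnet-like, binary safe
Disk-backed in-memory database
Dataset size is limited to the computer's RAM (but can span the RAM of multiple machines with clustering)
Master-slave replication, automatic failover
Simple values or data structures by keys
but also complex operations such as ZREVRANGEBYSCORE
INCR & co (good for rate limiting or statistics)
Bit operations
Has sets (also union/diff/inter)
Has lists (also a queue; blocking pop)
Has hashes (objects of multiple fields)
Sorted sets
Has transactions
Values can be set to expire
Pub/Sub lets one implement messaging

Best used: For applications with rapidly changing data and a small database (data that is mostly handled in memory).

For example: Stock prices, analytics, real-time data collection, real-time communication.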

Clones of Google's Bigtable

HBase (V0.92.0)

Written in: Java
Main point: Billions of rows X millions of columns
License: Apache
Protocol: HTTP/REST (also Thrift)
Modeled after Google's BigTable
Uses Hadoop's HDFS as storage
Map/reduce with Hadoop
Query predicate push down via server side scan and get filters
Optimizations for real time queries
A high performance Thrift gateway
HTTP supports XML, Protobuf, and binary
Jruby-based (JIRB) shell
Rolling restart for configuration changes and minor upgrades
Random access performance is like MySQL
A cluster consists of several different types of nodes

Best used: Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.

For example: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.
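
Scanning works well in BigTable-style stores because rows are kept sorted by row key, so a range of keys is a contiguous slice. The sketch below shows that idea with a sorted list and binary search. `SortedTable`, the `host#date` row-key layout, and the `log:status` column names are assumptions for illustration, not HBase's client API.

```python
import bisect


class SortedTable:
    """Toy table that keeps row keys sorted, so range scans are cheap."""

    def __init__(self):
        self._keys = []   # row keys, always kept in sorted order
        self._rows = {}   # row key -> {column name: value}

    def put(self, row_key, column, value):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def scan(self, start_key, stop_key):
        """Yield (row_key, columns) for start_key <= row_key < stop_key."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


logs = SortedTable()
logs.put("web01#2013-01-02", "log:status", "200")
logs.put("web01#2013-01-01", "log:status", "500")
logs.put("web02#2013-01-01", "log:status", "200")
logs.put("web01#2013-01-01", "log:path", "/index")

# "$" is the character right after "#", so this range covers exactly
# the rows whose key starts with "web01#".
web01_rows = list(logs.scan("web01#", "web01$"))
```

The design consequence is that the row key has to be chosen so that the data you want to scan together sorts together, since there are no joins to fall back on.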

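Written in: Java
Main point: Billions of rows by millions of columns
License: Apache
Protocol: HTTP/REST
Modeled after Google's BigTable
Uses Hadoop's HDFS as storage
Map/reduce with Hadoop
Query predicate push-down via server-side scan and get filters
Optimizations for real-time queries
HTTP supports XML, Protobuf, and binary
JRuby-based (JIRB) shell
Rolling restart for configuration changes and upgrades
Random access performance is like MySQL
A cluster consists of several different types of nodes

Best used: For very large tables that also need real-time access.

For example: Search engines. Analysing log data. Any place that requires huge two-dimensional tables.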

Cassandra (1.2)

Written in: Java
Main point: Best of BigTable and Dynamo
License: Apache
Protocol: Thrift & custom binary CQL3
Tunable trade-offs for distribution and replication (N, R, W)
Querying by column, range of keys (Requires indices on anything that you want to search on)
BigTable-like features: columns, column families
Can be used as a distributed hash-table, with an "SQL-like" language, CQL (but no JOIN!)
Data can have expiration (set on INSERT)
Writes can be much faster than reads (when reads are disk-bound)
Map/reduce possible with Apache Hadoop
All nodes are similar, as opposed to Hadoop/HBase
Very good and reliable cross-datacenter replication

Best used: When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")

For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is data analysis.
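
Two points from the list above, "distributed hash-table" and "all nodes are similar", can be illustrated with a hash ring: every node owns a position on the ring, a key is hashed to a position, and the next N nodes clockwise hold its replicas. The sketch below is a simplified model under those assumptions. The `Ring` class, the node names, and the use of MD5 as the hash are illustrative choices, not Cassandra's implementation.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: every node plays the same role, there is no master."""

    def __init__(self, nodes):
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [position for position, _ in self._ring]

    def replicas(self, key, n):
        """Return the n distinct nodes responsible for key."""
        if n > len(self._ring):
            raise ValueError("cannot place more replicas than there are nodes")
        # First node clockwise from the key's position, wrapping around.
        start = bisect.bisect_right(self._tokens, token(key)) % len(self._ring)
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(n)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("user:42", n=3)
```

Any node can compute `replicas` for any key, which is what lets a client talk to whichever node it likes and why adding a node only moves the keys adjacent to it on the ring.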

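Written in: Java
Main point: Best of BigTable and Dynamo
License: Apache
Protocol: Thrift & custom binary CQL3
Tunable trade-offs for distribution and replication (N, R, W)
Querying by column, range of keys
BigTable-like features: columns, column families
Can be used as a distributed hash-table, with an "SQL-like" language, CQL (but no JOIN!)
Data can have expiration
Writes can be faster than reads
All nodes are similar, as opposed to Hadoop/HBase
Very good and reliable cross-datacenter replication

Best used: When you write more than you read, or if every component of the system must be written in Java.
For example: Banking and the financial industry (though not necessarily for financial transactions; these industries are much bigger than that). Writes are faster than reads.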

Neo4j (V1.5M02)

Written in: Java
Main point: Graph database - connected data
License: GPL, some features AGPL/commercial
Protocol: HTTP/REST (or embedding in Java)
Standalone, or embeddable into Java applications
Full ACID conformity (including durable data)
Both nodes and relationships can have metadata
Integrated pattern-matching-based query language ("Cypher")
Also the "Gremlin" graph traversal language can be used
Indexing of nodes and relationships
Nice self-contained web admin
Advanced path-finding with multiple algorithms
Indexing of keys and relationships
Optimized for reads
Has transactions (in the Java API)
Scriptable in Groovy
Online backup, advanced monitoring and High Availability are AGPL/commercial licensed

Best used: For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense.

For example: For searching routes in social relations, public transport links, road maps, or network topologies.
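
"Searching routes in social relations" is the classic shortest-path question. A graph database answers it natively; the sketch below shows the underlying idea with a breadth-first search over an adjacency dictionary in plain Python. The `shortest_path` function and the small `friends` graph are made up for illustration and have nothing to do with Neo4j's Cypher or Java API.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest
    path from start to goal, or None if goal is unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}   # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue       # already visited
            previous[neighbour] = node
            if neighbour == goal:
                # Walk the chain of predecessors back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}

route = shortest_path(friends, "alice", "erin")
```

In a relational schema the same question needs one self-join per hop, which is the kind of interconnected query that makes a graph store worth considering.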

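Written in: Java
Main point: Graph database for connected data
License: GPL, some features AGPL/commercial
Protocol: HTTP/REST (or embedding in Java)
Standalone, or embeddable into Java applications
Both nodes and relationships can have metadata
Nice self-contained web admin
Advanced path-finding with multiple algorithms
Indexing of keys and relationships
Optimized for reads
Has transactions (in the Java API)
The "Gremlin" graph traversal language can be used
Scriptable in Groovy
Online backup, advanced monitoring, and High Availability are AGPL/commercial licensed

Best used: For graph-style data. This is the most notable difference between Neo4j and the other NoSQL databases.

For example: Social relations, public transport networks, maps, and network topologies.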

Hypertable (0.9.6.5)

Written in: C++
Main point: A faster, smaller HBase
License: GPL 2.0
Protocol: Thrift, C++ library, or HQL shell
Implements Google's BigTable design
Runs on Hadoop's HDFS
Uses its own, "SQL-like" language, HQL
Can search by key, by cell, or for values in column families.
Search can be limited to key/column ranges.
Sponsored by Baidu
Retains the last N historical values
Tables are in namespaces
Map/reduce with Hadoop

Best used: If you need a better HBase.

For example: Same as HBase, since it's basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables is a requirement.
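
One BigTable-style feature from the list above, "retains the last N historical values", is easy to picture as a bounded history per cell. The sketch below models it with a fixed-length deque. `VersionedCells` and its methods are made-up names for illustration, and it assumes writes arrive in timestamp order; it is not Hypertable's API.

```python
from collections import defaultdict, deque


class VersionedCells:
    """Toy cell store that keeps only the newest max_versions values
    for each (row, column) cell. Assumes puts arrive in time order."""

    def __init__(self, max_versions):
        # A deque with maxlen silently drops the oldest entry when full.
        self._cells = defaultdict(lambda: deque(maxlen=max_versions))

    def put(self, row, column, timestamp, value):
        self._cells[(row, column)].append((timestamp, value))

    def latest(self, row, column):
        """Return the newest value, or None if the cell was never written."""
        versions = self._cells.get((row, column))
        return versions[-1][1] if versions else None

    def history(self, row, column):
        """Return the retained (timestamp, value) pairs, newest first."""
        return list(reversed(self._cells.get((row, column), ())))


prices = VersionedCells(max_versions=3)
for timestamp, value in [(1, 10.0), (2, 10.5), (3, 9.8), (4, 11.2)]:
    prices.put("ACME", "quote:price", timestamp, value)

current = prices.latest("ACME", "quote:price")
retained = prices.history("ACME", "quote:price")
```

The oldest version (timestamp 1) falls off once the fourth value is written, so storage per cell stays bounded while recent history remains queryable.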

Written in: C++
Main point: A faster, smaller HBase
License: GPL 2.0
Protocol: Thrift, C++ library, or HQL shell
Implements Google's BigTable design
Runs on Hadoop's HDFS
