Basic Aggregation in MongoDB 2.1 with Python

Posted by jieforest on 2012-06-10

Why a new framework?

If you've been following along with this article series, you've been introduced to MongoDB's mapreduce command, which up until MongoDB 2.1 has been the go-to aggregation tool for MongoDB. (There's also the group() command, but it's really no more than a less-capable and un-shardable version of mapreduce(), so we'll ignore it here.)

So if you already have mapreduce() in your toolbox, why would you ever want something else?

Mapreduce is hard; let's go shopping

The first motivation behind the new framework is that, while mapreduce() is a flexible and powerful abstraction for aggregation, it's really overkill in many situations, as it requires you to re-frame your problem into a form that's amenable to calculation using mapreduce().

For instance, when I want to calculate the mean value of a property in a series of documents, trying to break that down into appropriate map, reduce, and finalize steps imposes some extra cognitive overhead that we'd like to avoid. So the new aggregation framework is (IMO) simpler.
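To make that overhead concrete, here is a minimal pure-Python sketch of the mean calculation split into map, reduce, and finalize steps. It does not talk to MongoDB at all; the documents and the field names `category` and `value` are hypothetical, chosen only for illustration. The last variable shows roughly what the same request looks like as an aggregation pipeline using the `$group` and `$avg` operators; it is left as plain data here rather than sent to a server.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample documents; the field names are illustrative only.
docs = [
    {"category": "a", "value": 1.0},
    {"category": "a", "value": 3.0},
    {"category": "b", "value": 10.0},
]

def map_doc(doc):
    # Emit (key, partial aggregate). The partial carries both a running
    # total and a count, because a mean of means is not the overall mean.
    return doc["category"], {"total": doc["value"], "count": 1}

def reduce_values(left, right):
    # Combine two partials into one of the same shape, so the function
    # can safely be applied to its own output.
    return {
        "total": left["total"] + right["total"],
        "count": left["count"] + right["count"],
    }

def finalize(value):
    # Only at the very end is the mean actually computed.
    return value["total"] / value["count"]

def mean_by_key(documents):
    # Group the emitted partials by key, then reduce and finalize each group.
    grouped = defaultdict(list)
    for doc in documents:
        key, value = map_doc(doc)
        grouped[key].append(value)
    return {
        key: finalize(reduce(reduce_values, values))
        for key, values in grouped.items()
    }

means = mean_by_key(docs)

# The same question phrased as an aggregation pipeline (plain data, not executed):
pipeline = [{"$group": {"_id": "$category", "mean": {"$avg": "$value"}}}]
```

Three functions and a carefully shaped intermediate value are needed for the mapreduce version, against a single pipeline stage for the aggregation version.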

The JavaScript global interpreter lock is evil

The MapReduce algorithm, the basis of MongoDB's mapreduce() command, is a great approach to solving embarrassingly parallel problems.

Each invocation of map, reduce, and finalize is completely independent of the others (though the map/reduce/finalize phases are order-dependent), so we should be able to dispatch these jobs to run in parallel without any problems.

Unfortunately, due to MongoDB's use of the SpiderMonkey JavaScript engine, each mongod process is restricted to running a single JavaScript thread at a time.

So in order to get any parallelism with a MongoDB mapreduce(), you must run it on a sharded cluster, and on a cluster with N shards, you're limited to N-way parallelism.
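The shape of that N-way parallelism can be sketched in plain Python. This is only a simulation under stated assumptions: the three lists stand in for shards, `run_on_shard` and `combine` are hypothetical helpers, and a thread pool plays the role of the cluster. It illustrates how the work is structured (serial inside each shard, parallel across shards, merged at the end), not how MongoDB implements it, and it says nothing about real speed.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def combine(left, right):
    # Merge two partial aggregates of the same shape.
    return {
        "total": left["total"] + right["total"],
        "count": left["count"] + right["count"],
    }

def run_on_shard(shard_docs):
    # Inside one shard the map and reduce calls run one after another,
    # mirroring the single JavaScript thread per mongod process.
    partials = [{"total": doc["value"], "count": 1} for doc in shard_docs]
    return reduce(combine, partials)

# Hypothetical data split across N = 3 non-empty shards.
shards = [
    [{"value": 1.0}, {"value": 2.0}],
    [{"value": 3.0}],
    [{"value": 4.0}, {"value": 5.0}, {"value": 6.0}],
]

# One worker per shard: at most N jobs are in flight at any time.
with ThreadPoolExecutor(max_workers=len(shards)) as pool:
    shard_results = list(pool.map(run_on_shard, shards))

# A final reduce merges the per-shard partials into one result.
overall = reduce(combine, shard_results)
mean = overall["total"] / overall["count"]
```

However many documents a shard holds, it contributes only one worker, so adding data without adding shards does not add parallelism.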

Source: ITPUB Blog, link: http://blog.itpub.net/301743/viewspace-732370/. Please credit the source when reposting; otherwise legal liability will be pursued.

