MongoDB學習之聚合操作

小勇勇發表於2021-12-02

MongoDB學習之基本操作

mongo的聚合操作和mysql的查詢類比

SQL 操作/函式mongodb聚合操作
where$match
group by$group
having$match
select$project
order by$sort
limit$limit
sum()$sum
count()$sum
join$lookup
下面一些例子和sql做對比

下面是使用資料庫的一個基本結構

資料連結(提取碼:gqh2)

{ 
  _id: ObjectId("5dbe7a545368f69de2b4d36e"),
  street: '493 Hilll Curve',
  city: 'Champlinberg',
  state: 'Texas',
  country: 'Malaysia',
  zip: '24344-1715',
  phone: '425.956.7743 x4621',
  name: 'Destinee Schneider',
  userId: 3573,
  orderDate: 2019-03-26T03:20:08.805Z,
  status: 'created',
  shippingFee: Decimal128("8.00"),
  orderLines: [ 
     { 
       product: 'Refined Fresh Tuna',
       sku: '2057',
       qty: 25,
       price: Decimal128("56.00"),
       cost: Decimal128("46.48") },
     {
       product: 'Refined Concrete Ball',
       sku: '1732',
       qty: 61,
       price: Decimal128("47.00"),
       cost: Decimal128("47") 
     },
  ],
  total: Decimal128("407") 
}

先來一些操作案例

select sum(total) from orders

db.orders.aggregate({$group:{_id:null,total:{$sum:"$total"}}})
結果:{ _id: null, 'total': Decimal128("44019609") }
select count(1) from orders

db.orders.aggregate({$group:{_id:null,total:{$sum:1}}})
結果:{ _id: null, total: 100000 }
select count(1) from orders group by status

db.orders.aggregate({$group:{_id:"$status",total:{$sum:1}}})
結果:
{ _id: 'created', total: 20087 }
{ _id: 'shipping', total: 20017 }
{ _id: 'cancelled', total: 19978 }
{ _id: 'completed', total: 20015 }
{ _id: 'fulfilled', total: 19903 }
select count(1) from orders group by status having count(1) > 20000

db.orders.aggregate([
    {$group:{_id:{status:'$status'},total:{$sum:1}}},
    {$match:{total:{$gte:20000}}}
 ])
結果:
{ _id: { status: 'created' }, total: 20087 }
{ _id: { status: 'shipping' }, total: 20017 }
{ _id: { status: 'completed' }, total: 20015 }
select count(1) total
from orders 
group by status,year(orderDate),month(orderDate)
order by year(orderDate),month(orderDate)

db.orders.aggregate([
  {
    $group:{
      _id:{
         status:'$status',
         orderDate:{
           year:{$year:"$orderDate"},
           month:{$month:"$orderDate"}
         }
       },
       total:{$sum:1}
     }
  },{
    $sort:{"_id.orderDate.year":1,"_id.orderDate.month":1}
  }
])
結果:
{ _id: { status: 'cancelled', orderDate: { year: 2019, month: 1 } }, total: 2066 }
{ _id: { status: 'shipping', orderDate: { year: 2019, month: 1 } }, total: 2058 }
{ _id: { status: 'completed', orderDate: { year: 2019, month: 1 } }, total: 2068 }
{ _id: { status: 'created', orderDate: { year: 2019, month: 1 } }, total: 2047 }
{ _id: { status: 'fulfilled', orderDate: { year: 2019, month: 1 } }, total: 2076 }
{ _id: { status: 'cancelled', orderDate: { year: 2019, month: 2 } }, total: 1816 }
{ _id: { status: 'created', orderDate: { year: 2019, month: 2 } }, total: 1817 }
{ _id: { status: 'shipping', orderDate: { year: 2019, month: 2 } }, total: 1844 }
{ _id: { status: 'completed', orderDate: { year: 2019, month: 2 } }, total: 1813 }
{ _id: { status: 'fulfilled', orderDate: { year: 2019, month: 2 } }, total: 1913 }
......
select *
from(
    select month(orderDate) month,name,status
    from orders
) order
where month = 2

db.orders.aggregate([{$project:{month:{$month:"$orderDate"},name:1,status:1}},{$match:{month:2}}]) 
結果:
{ _id: ObjectId("5dbe7a542411dc9de6429190"),name: 'Kris Hansen',status: 'cancelled',month: 2 }
{ _id: ObjectId("5dbe7a542411dc9de6429191"),name: 'Constantin Wuckert',status: 'completed',month: 2 }
{ _id: ObjectId("5dbe7a545368f69de2b4d375"),name: 'Reed Jerde',status: 'fulfilled',month: 2 }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1d2"),name: 'Lyric Hodkiewicz',status: 'cancelled',month: 2 }
.....
select count(*) from orders where month(orderDate) >= 3 group by month(orderDate)

db.orders.aggregate([
    {$match:{$expr:{$gte:[{$month:"$orderDate"},3]}}},
    {$group:{_id:{month:{$month:"$orderDate"}},count:{$sum:1}}}
 ]);
#結果
{ _id: { month: 6 }, count: 9915 }
{ _id: { month: 4 }, count: 10083 }
{ _id: { month: 10 }, count: 9928 }
{ _id: { month: 5 }, count: 10142 }
{ _id: { month: 8 }, count: 10194 }
{ _id: { month: 9 }, count: 9779 }
{ _id: { month: 7 }, count: 10240 }
{ _id: { month: 3 }, count: 10201 }

MongoDB聚合(Aggregate)

MongoDB 中聚合(aggregate)主要用於處理資料(諸如統計平均值,求和等),並返回計算後的資料結果

一個aggregate由多個階段(Stage)組成。上一階段產生的結果會作為下一階段的輸入,所以也會被形象的稱為流水線(Pipeline)。

表示式:處理輸入文件並輸出。表示式是無狀態的,只能用於計算當前聚合管道的文件,不能處理其它的文件。

這裡我們介紹一下聚合框架中常用的幾個操作:

  • $project:修改輸入文件的結構。可以用來重新命名、增加或刪除域,也可以用於建立計算結果以及巢狀文件。
  • $match:用於過濾資料,只輸出符合條件的文件。$match使用MongoDB的標準查詢操作。
  • $limit:用來限制MongoDB聚合管道返回的文件數。
  • $skip:在聚合管道中跳過指定數量的文件,並返回餘下的文件。
  • $unwind:將文件中的某一個陣列型別欄位拆分成多條,每條包含陣列中的一個值。
  • $group:將集合中的文件分組,可用於統計結果。
  • $sort:將輸入文件排序後輸出。
  • $geoNear:輸出接近某一地理位置的有序文件。

下面是一個aggregate的基本處理流程

image-20211121220504864

  • db.collection.aggregate() 可以用多個構件建立一個管道,對於一連串的文件進行處理。這些構件包括:篩選操作的match、對映操作的project、分組操作的group、排序操作的sort、限制操作的limit、和跳過操作的skip
  • db.collection.aggregate()使用了MongoDB內建的原生操作,聚合效率非常高,支援類似於SQL Group By操作的功能。
  • 每個階段管道限制為100MB的記憶體。如果一個節點管道超過這個極限,MongoDB將產生一個錯誤。為了能夠在處理大型資料集,可以設定allowDiskUse為true來在聚合管道節點把資料寫入臨時檔案。這樣就可以解決100MB的記憶體的限制。
  • db.collection.aggregate()可以作用在分片集合,但結果不能輸在分片集合,MapReduce可以 作用在分片集合,結果也可以輸在分片集合。
  • db.collection.aggregate()方法可以返回一個指標(cursor),資料放在記憶體中,直接操作。跟Mongo shell 一樣指標操作。
  • db.collection.aggregate()輸出的結果只能儲存在一個文件中,BSON Document大小限制為16M。可以通過返回指標解決,版本2.6中後面:db.collect.aggregate()方法返回一個指標,可以返回任何結果集的大小。

$count

返回文件統計數

先看一些非聚合操作中的count使用方法

#對應查詢出來的是orders這個集合中的所有資料總和
db.orders.count();
#結果:{"result": 100000}
#對應查出來自Malaysia這個國家的訂單總和
db.orders.find({country:"Malaysia"}).count()
#結果:{"result": 392}

使用聚合操作中$count來彙總行數

#使用聚合查出來自Malaysia這個國家的訂單總和,並且返回給counts欄位
db.orders.aggregate([
    {$match:{country:"Malaysia"}},
    {$count:"counts"}
])
#結果:{"counts": 392}
#下面是兩種不同的寫法只是在 $match的時候有所區別(可以先體驗以下)
db.orders.aggregate([
    {$match:{country:{$eq:"Malaysia"}}},
    {$count:"counts"}
])
db.orders.aggregate([
    {$match:{$expr:{$eq:["$country","Malaysia"]}}},
    {$count:"counts"}
])
#結果:{"counts": 392}

除此以外可以靈活使用group+$sum來實現$count

#對應查詢出來的是orders這個集合中的所有資料總和,並且返回給counts欄位
db.orders.aggregate({$group:{_id:null,counts:{$sum:1}}})
#結果:{"_id": null,"counts": 100000}
#使用聚合查出來自Malaysia這個國家的訂單總和,並且返回給counts欄位
db.orders.aggregate([
    {$match:{country:{$eq:"Malaysia"}}},
    {$group:{_id:null,counts:{$sum:1}}}
])
#結果:{"_id": null,"counts": 392}

$group

按照指定表示式對文件進行分組

$group使用的基本語法:

{ $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... } }
  1. _id+表示式用來做分組條件,也就是_id後面的內容與sqlgroup by後面的表示式的用途相同
  2. _id後面的 欄位+accumulator操作符與sql中做完group by後在select後面的的聚合函式用途相同,例如:sum()avg()max()min()

例如:

db.orders.aggregate({$group:{_id:"$country",total:{$sum:"$total"}}})
#結果
{ _id: 'Guam', total: Decimal128("182335") }
{ _id: 'El Salvador', total: Decimal128("159475") }
{ _id: 'Saint Martin', total: Decimal128("163267") }
{ _id: 'Botswana', total: Decimal128("189330") }
{ _id: 'San Marino', total: Decimal128("174200") }
{ _id: 'Czech Republic', total: Decimal128("178602") }
{ _id: 'Estonia', total: Decimal128("172816") }
.......
#上面的mql相當於sql中的
select sum(total) from orders group by country

$group階段的記憶體限制為100M。預設情況下,如果stage超過此限制,$group將產生錯誤。但是,要允許處理大型資料集,請將allowDiskUse選項設定為true以啟用$group操作以寫入臨時檔案。

名稱描述類比sql
$avg計算均值avg
$first返回每組第一個文件,如果有排序,按照排序,如果沒有按照預設的儲存的順序的第一個文件。limit 0,1
$last返回每組最後一個文件,如果有排序,按照排序,如果沒有按照預設的儲存的順序的最後個文件。-
$max根據分組,獲取集合中所有文件對應值得最大值。max
$min根據分組,獲取集合中所有文件對應值得最小值。min
$push將指定的表示式的值新增到一個陣列中。-
$addToSet將表示式的值新增到一個集合中(無重複值,無序)。-
$sum計算總和sum
$stdDevPop返回輸入值的總體標準偏差(population standard deviation)-
$stdDevSamp返回輸入值的樣本標準偏差(the sample standard deviation)-

下面我們按照以上文件依次用一下每一個表示式

  1. $avg計算平均值

    --計算每個國家的每個訂單的平均消費
    db.orders.aggregate({$group:{
            _id:"$country",
            avgMoney:{$avg:"$total"}
        }})
    --結果    
    { _id: 'Saudi Arabia',avgMoney: Decimal128("433.4898419864559819413092550790068") }
    { _id: 'New Caledonia',avgMoney: Decimal128("441.9833729216152019002375296912114") }
    { _id: 'Congo',avgMoney: Decimal128("451.8834951456310679611650485436893") }
    { _id: 'Turkey',avgMoney: Decimal128("425.7422434367541766109785202863962") }
    { _id: 'Cuba',avgMoney: Decimal128("437.2074074074074074074074074074074") }
    { _id: 'Uruguay',avgMoney: Decimal128("434.1564792176039119804400977995110") }
    ......
  2. $first返回第一個文件

    --根據國家分組,每組第一筆訂單的訂單商品列表
    db.orders.aggregate({$group:{
        _id:"$country",
        firstOrderLines:{$first:"$orderLines"}
    }})
    --結果
    { _id: 'Malta',firstOrderLines: [ 
      { product: 'Sleek Frozen Table',sku: '368',qty: 30,price: Decimal128("24.00"),cost:Decimal128("19.44") },
      { product: 'Intelligent Metal',sku: '179',qty: 62,price: Decimal128("91.00"),cost: Decimal128("90.09") },
      { product: 'Intelligent Granite',sku: '9',qty: 31,price: Decimal128("68.00"),cost: Decimal128("61.88") },
      { product: 'Licensed Cotton',sku: '6846',qty: 9,price: Decimal128("16.00"),cost: Decimal128("15.68") } 
    ] }
    { _id: 'Papua New Guinea',firstOrderLines: [ 
      { product: 'Fantastic Wooden',sku: '4333',qty: 32,price: Decimal128("58.00"),cost: Decimal128("57.42") }
      ...
    ] }
    ......
    
    --根據國家分組,每組第一筆訂單的訂單商品列表裡面的第一條商品資訊
    db.orders.aggregate({$group:{
        _id:"$country",
        firstOrder:{$first:{$first:"$orderLines"}}
    }})
    ---結果
    { _id: 'Malta',firstOrder:  
     { product: 'Sleek Frozen Table',sku: '368',qty: 30,price: Decimal128("24.00"),cost:Decimal128("19.44") }
    }
    { _id: 'Papua New Guinea',firstOrder:
      { product: 'Fantastic Wooden',sku: '4333',qty: 32,price: Decimal128("58.00"),cost: Decimal128("57.42") }
    }
    ......
  3. $last返回最後一個文件

    --根據每個國家分組,每筆最後一個訂單的orderDate
    db.orders.aggregate([{$group:{
        _id:"$country",
        lastOrderDate:{$last:"$orderDate"}
    }}])
    --結果
    { _id: 'Micronesia', lastOrderDate: 2019-01-15T07:23:18.002Z }
    { _id: 'Malaysia', lastOrderDate: 2019-05-15T20:16:56.644Z }
    { _id: 'San Marino', lastOrderDate: 2019-09-29T06:10:07.292Z }
  4. $max$min:最大值和最小值

    --根據年月分組,查出每組第一筆訂單時間和最後一組訂單時間
    db.orders.aggregate({$group:{
        _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
        maxDate:{$max:"$orderDate"},
        minDate:{$min:"$orderDate"}
    }})
    --結果
    { _id: { year: 2019, month: 1 }, maxDate: 2019-01-31T23:53:57.308Z, minDate: 2019-01-01T00:03:59.661Z }
    { _id: { year: 2019, month: 4 }, maxDate: 2019-04-30T23:57:03.352Z, minDate: 2019-04-01T00:02:12.224Z }
    { _id: { year: 2019, month: 3 }, maxDate: 2019-03-31T23:55:10.312Z, minDate: 2019-03-01T00:13:53.761Z }
    { _id: { year: 2019, month: 7 }, maxDate: 2019-07-31T23:55:51.718Z, minDate: 2019-07-01T00:00:07.540Z }
  5. $push將指定值新增到一個陣列當中可以push到一個已經存在的陣列當中,如果不存在會建立這樣一個陣列

    --根據城市、年、月分組,將每組的下單時間push到一個新的 orderDates 陣列當中
    db.orders.aggregate({$group:{
        _id:{city:"$city",year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
        orderDates:{$push:"$orderDate"},
    }})
    --結果
    { _id: { city: 'Kennedifurt', year: 2019, month: 9 }, orderDates: [ 2019-09-30T10:25:19.763Z ] }
    { _id: { city: 'South Jewelstad', year: 2019, month: 1 }, orderDates: [ 2019-01-06T19:59:03.380Z ] }
    { _id: { city: 'Germanmouth', year: 2019, month: 9 }, orderDates: [ 2019-09-25T07:45:54.260Z ] }
    { _id: { city: 'Fayebury', year: 2019, month: 8 }, orderDates: [ 2019-08-12T11:08:37.815Z ] }
    { _id: { city: 'New Lailaport', year: 2019, month: 1 }, orderDates: [ 2019-01-19T12:28:56.978Z ] }
    { _id: { city: 'Port Bennyside', year: 2019, month: 2 }, orderDates: [ 2019-02-25T01:18:21.657Z ] }
    { _id: { city: 'Abernathymouth', year: 2019, month: 6 }, orderDates: 
       [ 2019-06-03T18:03:21.149Z,
         2019-06-13T23:35:32.994Z,
         2019-06-18T11:32:22.229Z ] 
    }
  6. $addToSet將指定值新增到一個集合當中集合是無序的並且會去重

    --根據月份分組,將每個月都下單過的國家都新增到 countrySet 中去
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            countrySet:{$addToSet:"$country"}
        }
    })
    --結果  
    {
        "_id": {
          "year": 2019,
          "month": 1
        },
        "countrySet": ["French Guiana", "Germany", "Poland", "Comoros", "Portugal", "Fiji", "France", "Benin", "Greece", "Belarus", "Vietnam", "Ireland", "Vanuatu", "Netherlands Antilles", "Iceland", "Palestinian Territory", "Malawi", "Brazil", "Libyan Arab Jamahiriya", "Kuwait", "Liechtenstein", "Suriname", "Uganda", "New Caledonia", "Bolivia", "Nicaragua", "Burundi", "Uzbekistan", "Jamaica", "South Georgia and the South Sandwich Islands", "Tajikistan", "Mexico", "Singapore", "Sri Lanka", "Antarctica (the territory South of 60 deg S)", "Myanmar", "Tonga", "Slovenia", "Latvia", "Ukraine", "Oman", "Saint Helena", "Bosnia and Herzegovina", "Hungary", "Aruba", "Jordan", "Solomon Islands", "Mozambique", "Svalbard & Jan Mayen Islands", "Taiwan", "Cyprus", "Thailand", "Equatorial Guinea", "Belize", "Niger", "Israel", "Hong Kong", "Senegal", "Costa Rica", "Sierra Leone", "Kiribati", "Lesotho", "Nepal", "Serbia", "Barbados", "Spain", "Czech Republic", "Saint Martin", "Saint Pierre and Miquelon", "Togo", "Somalia", "Northern Mariana Islands", "Maldives", "British Indian Ocean Territory (Chagos Archipelago)", "Montenegro", "Cote d'Ivoire", "United Arab Emirates", "Guernsey", "Bulgaria", "Netherlands", "Greenland", "Niue", "Colombia", "Egypt", "Madagascar", "Brunei Darussalam", "Iraq", "Mauritius", "French Polynesia", "Jersey", "Canada", "Grenada", "Honduras", "New Zealand", "Cocos (Keeling) Islands", "Mayotte", "Virgin Islands, British", "Finland", "Macedonia", "Cook Islands", "Micronesia", "Christmas Island", "Turks and Caicos Islands", "Falkland Islands (Malvinas)", "El Salvador", "Estonia", "Eritrea", "Afghanistan", "San Marino", "Malaysia", "Cambodia", "Anguilla", "Philippines", "Zambia", "Republic of Korea", "Mauritania", "Yemen", "South Africa", "Gambia", "Namibia", "Peru", "Samoa", "Qatar", "Guinea", "Monaco", "Mongolia", "Cayman Islands", "Bouvet Island (Bouvetoya)", "Romania", "Sweden", "Guam", "Guyana", "Japan", "Bangladesh", "Djibouti", "Reunion", "Central African Republic", "Martinique", "Sudan", "Norway", "Guadeloupe", "Malta", "Papua New Guinea", "Macao", "Tunisia", "Iran", "Ghana", "Trinidad and Tobago", "Syrian Arab Republic", "French Southern Territories", "Russian Federation", "Botswana", "Pakistan", "Luxembourg", "Ethiopia", "Austria", "Rwanda", "Holy See (Vatican City State)", "American Samoa", "Tanzania", "Morocco", "Lao People's Democratic Republic", "Faroe Islands", "Bahrain", "China", "Indonesia", "Ecuador", "Tuvalu", "Panama", "Algeria", "Gibraltar", "Nigeria", "Kyrgyz Republic", "Chile", "Cape Verde", "Palau", "Armenia", "Dominican Republic", "Bhutan", "Liberia", "India", "Mali", "Switzerland", "Isle of Man", "Argentina", "Virgin Islands, U.S.", "Swaziland", "Timor-Leste", "Azerbaijan", "Bahamas", "Guatemala", "Saint Lucia", "Sao Tome and Principe", "United States Minor Outlying Islands", "Australia", "Italy", "Paraguay", "Tokelau", "Gabon", "Wallis and Futuna", "Cameroon", "Norfolk Island", "Guinea-Bissau", "Chad", "Zimbabwe", "Nauru", "Pitcairn Islands", "Georgia", "Kenya", "Bermuda", "Kazakhstan", "Democratic People's Republic of Korea", "Puerto Rico", "Croatia", "Antigua and Barbuda", "Seychelles", "Marshall Islands", "Burkina Faso", "Denmark", "United Kingdom", "Dominica", "Albania", "Angola", "Slovakia (Slovak Republic)", "Western Sahara", "Belgium", "Saudi Arabia", "Turkey", "Congo", "Cuba", "Uruguay", "Montserrat", "United States of America", "Lebanon", "Saint Vincent and the Grenadines", "Saint Kitts and Nevis", "Saint Barthelemy", "Haiti", "Moldova", "Heard Island and McDonald Islands", "Lithuania", "Turkmenistan", "Venezuela", "Andorra"]
      },
      {
        "_id": {
          "year": 2019,
          "month": 9
        },
        "countrySet": ["Germany", "Poland", "French Guiana", "Fiji", "France", "Comoros", "Portugal", "Benin", "Greece", "Belarus", "Ireland", "Vietnam", "Brazil", "Malawi", "Vanuatu", "Netherlands Antilles", "Palestinian Territory", "Iceland", "Kuwait", "Libyan Arab Jamahiriya", "Liechtenstein", "New Caledonia", "Suriname", "Uganda", "Bolivia", "Uzbekistan", "Burundi", "Nicaragua", "Tajikistan", "Jamaica", "South Georgia and the South Sandwich Islands", "Sri Lanka", "Mexico", "Singapore", "Antarctica (the territory South of 60 deg S)", "Tonga", "Myanmar", "Slovenia", "Latvia", "Oman", "Saint Helena", "Ukraine", "Bosnia and Herzegovina", "Aruba", "Jordan", "Hungary", "Mozambique", "Solomon Islands", "Svalbard & Jan Mayen Islands", "Thailand", "Taiwan", "Cyprus", "Equatorial Guinea", "Belize", "Niger", "Senegal", "Hong Kong", "Israel", "Kiribati", "Costa Rica", "Sierra Leone", "Lesotho", "Saint Martin", "Spain", "Barbados", "Nepal", "Togo", "Maldives", "Czech Republic", "Somalia", "Saint Pierre and Miquelon", "Serbia", "Northern Mariana Islands", "Montenegro", "British Indian Ocean Territory (Chagos Archipelago)", "Cote d'Ivoire", "United Arab Emirates", "Guernsey", "Niue", "Bulgaria", "Netherlands", "Egypt", "Colombia", "Greenland", "Brunei Darussalam", "Madagascar", "Mauritius", "Iraq", "Canada", "French Polynesia", "Jersey", "Grenada", "Cocos (Keeling) Islands", "New Zealand", "Honduras", "Virgin Islands, British", "Mayotte", "Cook Islands", "Finland", "Macedonia", "Micronesia", "Turks and Caicos Islands", "Christmas Island", "Estonia", "Falkland Islands (Malvinas)", "El Salvador", "Eritrea", "Malaysia", "San Marino", "Afghanistan", "Anguilla", "Cambodia", "Zambia", "Republic of Korea", "Mauritania", "Philippines", "South Africa", "Gambia", "Yemen", "Qatar", "Peru", "Namibia", "Guinea", "Samoa", "Cayman Islands", "Monaco", "Mongolia", "Bouvet Island (Bouvetoya)", "Romania", "Sweden", "Guam", "Guyana", "Djibouti", "Japan", "Bangladesh", "Reunion", "Central African Republic", "Sudan", "Norway", "Martinique", "Guadeloupe", "Papua New Guinea", "Malta", "Tunisia", "Macao", "Iran", "Ghana", "Syrian Arab Republic", "Trinidad and Tobago", "French Southern Territories", "Botswana", "Luxembourg", "Russian Federation", "Pakistan", "Ethiopia", "Holy See (Vatican City State)", "Panama", "Austria", "Rwanda", "American Samoa", "Faroe Islands", "Tanzania", "Morocco", "Lao People's Democratic Republic", "Ecuador", "China", "Indonesia", "Bahrain", "Algeria", "Tuvalu", "Gibraltar", "Nigeria", "Kyrgyz Republic", "Chile", "Palau", "Cape Verde", "Bhutan", "Dominican Republic", "Armenia", "Mali", "Isle of Man", "Liberia", "India", "Switzerland", "Argentina", "Virgin Islands, U.S.", "Timor-Leste", "Swaziland", "Azerbaijan", "United States Minor Outlying Islands", "Saint Lucia", "Bahamas", "Guatemala", "Australia", "Sao Tome and Principe", "Tokelau", "Paraguay", "Italy", "Wallis and Futuna", "Gabon", "Cameroon", "Guinea-Bissau", "Chad", "Norfolk Island", "Zimbabwe", "Nauru", "Georgia", "Kenya", "Pitcairn Islands", "Bermuda", "Kazakhstan", "Democratic People's Republic of Korea", "Croatia", "Puerto Rico", "Antigua and Barbuda", "Seychelles", "Marshall Islands", "Burkina Faso", "Dominica", "Denmark", "Albania", "United Kingdom", "Angola", "Slovakia (Slovak Republic)", "Western Sahara", "Belgium", "Turkey", "Congo", "Saudi Arabia", "Uruguay", "Cuba", "United States of America", "Montserrat", "Lebanon", "Saint Kitts and Nevis", "Saint Vincent and the Grenadines", "Saint Barthelemy", "Haiti", "Moldova", "Lithuania", "Heard Island and McDonald Islands", "Turkmenistan", "Venezuela", "Andorra"]
      }
    .......
  7. $sum計算總和

    --根據月份分組,獲取每組的收入總和 sumTotal
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            sumTotal:{$sum:"$total"}
        }
    })
    --結果
    { _id: { year: 2019, month: 2 }, sumTotal: Decimal128("4072808") }
    { _id: { year: 2019, month: 10 }, sumTotal: Decimal128("4356471") }
    { _id: { year: 2019, month: 5 }, sumTotal: Decimal128("4460433") }
    ......
  8. $stdDevPop返回輸入值的總體標準偏差

    image-20211129005142568

    --根據月份分組,計算總體準偏差計算
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            popTotal:{$stdDevPop:"$total"}
        }
    })
    --結果
    { _id: { year: 2019, month: 2 }, popTotal: 189.3064965965138 }
    { _id: { year: 2019, month: 10 }, popTotal: 187.19676293125292 }
    { _id: { year: 2019, month: 5 }, popTotal: 189.54277980510432 }
    { _id: { year: 2019, month: 8 }, popTotal: 189.52305549485735 }
    { _id: { year: 2019, month: 6 }, popTotal: 189.99641948294692 }
    { _id: { year: 2019, month: 1 }, popTotal: 188.89723701416594 }
    { _id: { year: 2019, month: 4 }, popTotal: 189.33635941008336 }
    { _id: { year: 2019, month: 3 }, popTotal: 190.39465578257668 }
    { _id: { year: 2019, month: 7 }, popTotal: 189.01641050584374 }
    { _id: { year: 2019, month: 9 }, popTotal: 188.10379143822877 }
  9. $stdDevSamp返回輸入值的樣本標準偏差

    image-20211129005507209

    --根據月份分組,計算樣本準偏差計算
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            sampTotal:{$stdDevSamp:"$total"}
        }
    })
    --結果
    { _id: { year: 2019, month: 2 }, sampTotal: 189.31678247750685 }
    { _id: { year: 2019, month: 9 }, sampTotal: 188.1134099175866 }
    { _id: { year: 2019, month: 7 }, sampTotal: 189.02564049879336 }
    { _id: { year: 2019, month: 3 }, sampTotal: 190.40398862519802 }
    { _id: { year: 2019, month: 5 }, sampTotal: 189.55212494401323 }
    { _id: { year: 2019, month: 4 }, sampTotal: 189.34574899869335 }
    { _id: { year: 2019, month: 1 }, sampTotal: 188.90639411415503 }
    { _id: { year: 2019, month: 8 }, sampTotal: 189.53235199281477 }
    { _id: { year: 2019, month: 6 }, sampTotal: 190.00600146946147 }
    { _id: { year: 2019, month: 10 }, sampTotal: 187.20619136123352 }

$match

接受一個指定查詢條件的文件。查詢語法與讀操作查詢語法相同。

基本的語法{ $match: { <query> } }

在實際應用中儘可能將$match放在管道的前面位置。這樣有兩個好處:

  1. 可以快速將不需要的文件過濾掉,以減少管道的工作量
  2. 如果再投射和分組之前執行$match查詢可以使用索引
--類似於in查詢
db.orders.aggregate({
    $match:{
        country:{$in:["Romania", "Sweden", "Guam", "Guyana"]}
    }
})
--結果:查出這幾個國家的訂單
--範圍查詢
db.orders.aggregate({
    $match:{
        orderDate:{$gte:ISODate("2019-02-01"),$lt:ISODate("2019-02-04")}
    }
})
--結果:查出 2019-02-01 到 2019-02-03這三天的所有訂單

$expr使用聚合表示式

--查詢3月份以及往後的資料 - 根據月份分組 - sum訂單數
db.orders.aggregate([
    {$match:{$expr:{$gte:[{$month:"$orderDate"},3]}}},
    {$group:{_id:{month:{$month:"$orderDate"}},count:{$sum:1}}}
 ]);
--結果
{ _id: { month: 7 }, count: 10240 }
{ _id: { month: 5 }, count: 10142 }
{ _id: { month: 6 }, count: 9915 }
{ _id: { month: 4 }, count: 10083 }
{ _id: { month: 10 }, count: 9928 }
{ _id: { month: 9 }, count: 9779 }
{ _id: { month: 3 }, count: 10201 }
{ _id: { month: 8 }, count: 10194 }

$mod使用取模運算子

--查詢total屬性後面是00結尾的訂單
db.orders.aggregate([
    {$match:{total:{$mod:[100,0]}}}
 ]);
--結果
{ _id: ObjectId("5dbe7a575368f69de2b4d4db"),
  street: '5929 Elroy Points',
  city: 'Retaberg',
  state: 'Utah',
  country: 'Cote d\'Ivoire',
  zip: '73722-0034',
  phone: '113.509.1520',
  name: 'Sanford Runte',
  userId: 7843,
  orderDate: 2019-02-21T20:26:32.458Z,
  status: 'completed',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Steel Shoes',
       sku: '1374',
       qty: 82,
       price: Decimal128("15.00"),
       cost: Decimal128("13.35") },
     { product: 'Sleek Frozen Salad',
       sku: '2698',
       qty: 79,
       price: Decimal128("41.00"),
       cost: Decimal128("33.21") },
     { product: 'Intelligent Granite Mouse',
       sku: '17',
       qty: 55,
       price: Decimal128("54.00"),
       cost: Decimal128("50.76") },
     { product: 'Handcrafted Wooden Chicken',
       sku: '2079',
       qty: 4,
       price: Decimal128("17.00"),
       cost: Decimal128("17") } ],
  total: Decimal128("500") }
{ _id: ObjectId("5dbe7a575368f69de2b4d50c"),
  street: '6159 Vandervort Camp',
  city: 'South Bobby',
  state: 'Montana',
  country: 'Guernsey',
  zip: '55141',
  phone: '173.672.8440 x661',
  name: 'Jovan Rice',
  userId: 3526,
  orderDate: 2019-09-14T21:05:45.049Z,
  status: 'shipping',
  shippingFee: Decimal128("9.00"),
  orderLines: 
   [ { product: 'Small Metal Sausages',
       sku: '8130',
       qty: 11,
       price: Decimal128("80.00"),
       cost: Decimal128("67.2") },
     { product: 'Intelligent Rubber Chicken',
       sku: '3775',
       qty: 61,
       price: Decimal128("10.00"),
       cost: Decimal128("8") },
     { product: 'Generic Rubber Table',
       sku: '7102',
       qty: 36,
       price: Decimal128("10.00"),
       cost: Decimal128("8.5") } ],
  total: Decimal128("100") }
......

$regex使用正規表示式匹配

--以184開頭的手機號的訂單數量
db.orders.aggregate([
    {$match:{ phone: { $regex: /^184/ }}},
    {$count:"counts"}
 ]);
--結果
{"counts": 55}

$unwind

將陣列拆分成單獨的文件

格式

{
  $unwind:
  {
      path: <field path>,
      includeArrayIndex: <string>,  
      preserveNullAndEmptyArrays: <boolean> 
  } 
}

includeArrayIndex:可選,一個新欄位的名稱用於存放元素的陣列索引。該名稱不能以$開頭。

preserveNullAndEmptyArrays:可選,預設為false,若為true,如果path沒有對應的欄位或者對應的陣列size為0,則$unwind輸出文件,預設false不輸出。

--篩選一條資料,將陣列拆分
db.orders.aggregate([
    {$match:{_id:ObjectId("5dbe7aa650fc769de3e1b551")}},
    {$unwind:"$orderLines"},
 ]);
--結果
{ _id: ObjectId("5dbe7aa650fc769de3e1b551"),
  street: '3340 Marielle Manors',
  city: 'New Maymie',
  state: 'Connecticut',
  country: 'Malawi',
  zip: '22434-3104',
  phone: '184.544.4826 x4858',
  name: 'Annette Langworth',
  userId: 9830,
  orderDate: 2019-01-23T11:56:14.972Z,
  status: 'shipping',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   { product: 'Sleek Granite Gloves',
     sku: '6176',
     qty: 31,
     price: Decimal128("74.00"),
     cost: Decimal128("71.04") },
  total: Decimal128("313") }
{ _id: ObjectId("5dbe7aa650fc769de3e1b551"),
  street: '3340 Marielle Manors',
  city: 'New Maymie',
  state: 'Connecticut',
  country: 'Malawi',
  zip: '22434-3104',
  phone: '184.544.4826 x4858',
  name: 'Annette Langworth',
  userId: 9830,
  orderDate: 2019-01-23T11:56:14.972Z,
  status: 'shipping',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   { product: 'Licensed Soft Cheese',
     sku: '2702',
     qty: 70,
     price: Decimal128("55.00"),
     cost: Decimal128("53.9") },
  total: Decimal128("313") }
......

$project

從文件中指定想要的欄位和不想要的欄位

格式

{ $project: { <specification(s)> } }

specifications有以下形式:

<field>: <1 or true> 包含該欄位
<field>: <0 or false> 不包含該欄位

_id: <0 or false>        是否指定_id欄位
--如果有一個屬性為或幾個屬性為1,那麼只顯示這一個或幾個屬性 + _id
db.orders.aggregate({
    $project:{name:1}
})
--結果
{ _id: ObjectId("5dbe7a545368f69de2b4d36e"), name: 'Destinee Schneider' }
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"), name: 'Ashlynn Sipes' }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1cc"), name: 'Genoveva Bauch' }
{ _id: ObjectId("5dbe7a542411dc9de6429190"), name: 'Kris Hansen' }

--如果有一個屬性為或幾個屬性為0,那麼顯示除了這個一個或幾個屬性的其它所有屬性
db.orders.aggregate({
    $project:{orderLines:0}
})
--結果
{
    "_id": {"$oid": "5dbe7a545368f69de2b4d36e"},
    "city": "Champlinberg",
    "country": "Malaysia",
    "name": "Destinee Schneider",
    "orderDate": {"$date": "2019-03-26T03:20:08.805Z"},
    "phone": "425.956.7743 x4621",
    "shippingFee": {"$numberDecimal": 8.00},
    "state": "Texas",
    "status": "created",
    "street": "493 Hilll Curve",
    "total": {"$numberDecimal": 407},
    "userId": 3573,
    "zip": "24344-1715"
},
{
    "_id": {"$oid": "5dbe7a545368f69de2b4d36f"},
    "city": "Linwoodburgh",
    "country": "United States of America",
    "name": "Ashlynn Sipes",
    "orderDate": {"$date": "2019-07-18T07:21:53.530Z"},
    "phone": "508.326.5494 x1218",
    "shippingFee": {"$numberDecimal": 7.00},
    "state": "Indiana",
    "status": "shipping",
    "street": "39476 Lacey Harbor",
    "total": {"$numberDecimal": 439},
    "userId": 2500,
    "zip": "84551"
}
......

--只展示巢狀屬性
db.orders.aggregate({
    $project:{"orderLines.price":1}
})
或者
db.orders.aggregate({
    $project:{orderLines:{price:1}}
})
--結果
{ _id: ObjectId("5dbe7a542411dc9de6429193"),
  orderLines: 
   [ { price: Decimal128("75.00") },
     { price: Decimal128("64.00") },
     { price: Decimal128("34.00") },
     { price: Decimal128("98.00") },
     { price: Decimal128("88.00") },
     { price: Decimal128("20.00") },
     { price: Decimal128("59.00") },
     { price: Decimal128("20.00") },
     { price: Decimal128("90.00") },
     { price: Decimal128("45.00") },
     { price: Decimal128("42.00") },
     { price: Decimal128("28.00") } ] }
{ _id: ObjectId("5dbe7a5450fc769de3e19d20"),
  orderLines: 
   [ { price: Decimal128("51.00") },
     { price: Decimal128("10.00") },
     { price: Decimal128("63.00") },
     { price: Decimal128("12.00") },
     { price: Decimal128("37.00") },
     { price: Decimal128("43.00") },
     { price: Decimal128("39.00") },
     { price: Decimal128("68.00") },
     { price: Decimal128("21.00") } ] }
......

$cond-if-then-else的使用相當於SQL中的case-when-then-else

$$REMOVE是在滿足這個條件的時候移除這個屬性

--不是7月的文件,移除這個屬性
db.orders.aggregate({
    $project:{
        name:1,
        orderDate:{
            $cond: {
                   if: { $ne: [ {"$month":"$orderDate"}, 7 ] },
                   then: "$$REMOVE",
                   else: "$orderDate"
            }
        }
    }
})
--結果
{ _id: ObjectId("5dbe7a545368f69de2b4d36e"), name: 'Destinee Schneider' }
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"), name: 'Ashlynn Sipes', orderDate: 2019-07-18T07:21:53.530Z }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1cc"), name: 'Genoveva Bauch' }
{ _id: ObjectId("5dbe7a542411dc9de6429190"), name: 'Kris Hansen' }

對映到一個屬性包含多個屬性

--使用substr擷取第一個字母,使用strLenCP取name的長度
db.orders.aggregate({
    $project:{
        _id: 0,
        nm:{
            name:"$name",
            firstLetter:{$substr:["$name",0,1]},
            nameLenth:{$strLenCP:"$name"}
        }
    }
})
--結果
{ nm: { name: 'Destinee Schneider', firstLetter: 'D', nameLenth: 18 } }
{ nm: { name: 'Ashlynn Sipes', firstLetter: 'A', nameLenth: 13 } }
{ nm: { name: 'Genoveva Bauch', firstLetter: 'G', nameLenth: 14 } }
{ nm: { name: 'Kris Hansen', firstLetter: 'K', nameLenth: 11 } }
{ nm: { name: 'Dudley Kertzmann', firstLetter: 'D', nameLenth: 16 } }
......

將多個屬性的值對映到一個陣列當中

db.orders.aggregate({
    $project:{
        _id: 0,
        msg:[
            "$name","$orderDate","$orderLines.price"
        ]
    }
})
--結果
{msg: 
   [ 
     'Gudrun Stamm',
     2019-09-10T01:00:00.679Z,
     [ 
       Decimal128("17.00"),
       Decimal128("91.00"),
       Decimal128("51.00"),
       Decimal128("10.00"),
       Decimal128("18.00"),
       Decimal128("46.00"),
       Decimal128("69.00"),
       Decimal128("18.00"),
       Decimal128("89.00"),
       Decimal128("99.00") 
    ] 
  ] 
}
{ msg: 
   [ 
     'Jalon Erdman',
     2019-03-06T08:30:55.042Z,
     [ 
       Decimal128("37.00"),
       Decimal128("91.00"),
       Decimal128("88.00"),
       Decimal128("20.00"),
       Decimal128("75.00"),
       Decimal128("46.00") 
    ] 
  ] 
}
{ msg: 
   [ 
     'Mossie Ankunding',
     2019-05-25T09:40:13.662Z,
     [ 
       Decimal128("14.00"),
       Decimal128("49.00"),
       Decimal128("38.00"),
       Decimal128("55.00"),
       Decimal128("20.00") 
     ] 
   ] 
}
{ msg: 
   [ 
     'Jorge Toy',
     2019-09-28T23:07:35.137Z,
     [ 
       Decimal128("71.00"),
       Decimal128("62.00"),
       Decimal128("59.00"),
       Decimal128("43.00"),
       Decimal128("55.00"),
       Decimal128("65.00"),
       Decimal128("57.00") 
     ] 
   ] 
}
......

$limit

限制條數,獲取前n條資料

db.orders.aggregate({
    $limit:2
})
--結果
{ _id: ObjectId("5dbe7a545368f69de2b4d36e"),
  street: '493 Hilll Curve',
  city: 'Champlinberg',
  state: 'Texas',
  country: 'Malaysia',
  zip: '24344-1715',
  phone: '425.956.7743 x4621',
  name: 'Destinee Schneider',
  userId: 3573,
  orderDate: 2019-03-26T03:20:08.805Z,
  status: 'created',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   [ { product: 'Refined Fresh Tuna',
       sku: '2057',
       qty: 25,
       price: Decimal128("56.00"),
       cost: Decimal128("46.48") },
     { product: 'Intelligent Wooden Towels',
       sku: '5674',
       qty: 72,
       price: Decimal128("84.00"),
       cost: Decimal128("68.88") },
     { product: 'Refined Steel Bacon',
       sku: '5009',
       qty: 8,
       price: Decimal128("53.00"),
       cost: Decimal128("50.35") } ],
  total: Decimal128("407") }
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"),
  street: '39476 Lacey Harbor',
  city: 'Linwoodburgh',
  state: 'Indiana',
  country: 'United States of America',
  zip: '84551',
  phone: '508.326.5494 x1218',
  name: 'Ashlynn Sipes',
  userId: 2500,
  orderDate: 2019-07-18T07:21:53.530Z,
  status: 'shipping',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Soft Soap',
       sku: '6274',
       qty: 71,
       price: Decimal128("91.00"),
       cost: Decimal128("89.18") },
     { product: 'Intelligent Steel Chair',
       sku: '8278',
       qty: 13,
       price: Decimal128("67.00"),
       cost: Decimal128("62.31") },
     { product: 'Small Rubber Shoes',
       sku: '3534',
       qty: 60,
       price: Decimal128("76.00"),
       cost: Decimal128("71.44") } ],
  total: Decimal128("439") }
......

$skip

跳過前n行資料查詢

--查詢第2、3條
db.orders.aggregate([{
        $skip:1
    },{
        $limit:2
}])
--結果
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"),
  street: '39476 Lacey Harbor',
  city: 'Linwoodburgh',
  state: 'Indiana',
  country: 'United States of America',
  zip: '84551',
  phone: '508.326.5494 x1218',
  name: 'Ashlynn Sipes',
  userId: 2500,
  orderDate: 2019-07-18T07:21:53.530Z,
  status: 'shipping',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Soft Soap',
       sku: '6274',
       qty: 71,
       price: Decimal128("91.00"),
       cost: Decimal128("89.18") },
     { product: 'Incredible Concrete Chips',
       sku: '3756',
       qty: 6,
       price: Decimal128("18.00"),
       cost: Decimal128("15.12") },
     { product: 'Intelligent Steel Chair',
       sku: '8278',
       qty: 13,
       price: Decimal128("67.00"),
       cost: Decimal128("62.31") },
     { product: 'Small Rubber Shoes',
       sku: '3534',
       qty: 60,
       price: Decimal128("76.00"),
       cost: Decimal128("71.44") } ],
  total: Decimal128("439") }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1cc"),
  street: '699 Harvey Row',
  city: 'Electamouth',
  state: 'South Dakota',
  country: 'Burundi',
  zip: '61826',
  phone: '(936) 449-4255 x58095',
  name: 'Genoveva Bauch',
  userId: 8302,
  orderDate: 2019-03-15T13:53:48.925Z,
  status: 'shipping',
  shippingFee: Decimal128("5.00"),
  orderLines: 
   [ { product: 'Intelligent Soft Salad',
       sku: '3711',
       qty: 85,
       price: Decimal128("86.00"),
       cost: Decimal128("76.54") },
     { product: 'Generic Cotton Ball',
       sku: '2112',
       qty: 44,
       price: Decimal128("21.00"),
       cost: Decimal128("19.32") },
     { product: 'Rustic Plastic Keyboard',
       sku: '6451',
       qty: 19,
       price: Decimal128("81.00"),
       cost: Decimal128("77.76") } ],
  total: Decimal128("341") }

$sort

對文件進行排序 升序:1 降序:-1

--用名字排序
db.orders.aggregate([{
        $sort:{name:1}
    },{
        $project:{_id:0,name:1}
    }
])
--結果
{ name: 'Aaliyah Bruen' }
{ name: 'Aaliyah Erdman' }
{ name: 'Aaliyah Fahey' }
{ name: 'Aaliyah Gerhold' }
{ name: 'Aaliyah Graham' }
{ name: 'Aaliyah Greenfelder' }
{ name: 'Aaliyah Konopelski' }
{ name: 'Aaliyah Kovacek' }
{ name: 'Aaliyah Kuphal' }
{ name: 'Aaliyah Lueilwitz' }
{ name: 'Aaliyah Maggio' }
......

$sortByCount

根據某個欄位分組計算 count 值然後按照這個值降序排序

db.orders.aggregate({
    $sortByCount:"$status"
})
--結果
{ _id: 'created', count: 20087 }
{ _id: 'shipping', count: 20017 }
{ _id: 'completed', count: 20015 }
{ _id: 'cancelled', count: 19978 }
{ _id: 'fulfilled', count: 19903 }

相關文章