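1. Overview
Recently several classmates and readers have messaged me with questions about MongoDB, so I have put together this post covering MongoDB for everyone's study and reference. The contents are as follows:
- Basic operations
- CRUD
- MapReduce
Everything in this article is demonstrated on a MongoDB cluster (Sharding + Replica Sets), so the operations are at the cluster level and some commands differ from those used with a standalone MongoDB instance. For the cluster setup itself, see my post "Highly Available MongoDB Cluster" (《高可用的MongoDB叢集》).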
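2. Basic operations
Commonly used shell commands are as follows:
db.help() # database help
db.collections.help() # collection help
rs.help() # replica set help
show dbs # list database names
show collections # list the collections in the current database
use db_name # switch to a database
Basic information about a collection can be viewed as follows:
# show help
db.yourColl.help();
# number of documents in the current collection
db.yourColl.count();
# data size
db.userInfo.dataSize();
# the db that the current collection belongs to
db.userInfo.getDB();
# status of the current collection
db.userInfo.stats();
# total size of the collection
db.userInfo.totalSize();
# storage size of the collection
db.userInfo.storageSize();
# shard version information
db.userInfo.getShardVersion()
# rename the collection from userInfo to users
db.userInfo.renameCollection("users");
# drop the current collection
db.userInfo.drop();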
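3. CRUD
3.1 Create
In the cluster, we add a friends database and enable sharding on it (run against the admin database). The command is as follows:
db.runCommand({enablesharding:"friends"});
Once the database exists, we create a sharded user collection in it. The command is as follows:
db.runCommand( { shardcollection : "friends.user", key : { _id : 1 } }); # a shard key is required; _id is used here only as an example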
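3.2 Insert
In MongoDB, both save and insert can add data, but they differ. If the object does not exist yet, both insert it into the collection. If an object with the same _id already exists, save calls update to overwrite the stored record, whereas insert leaves it untouched and fails with a duplicate key error.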
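The difference between the two calls can be sketched with a small in-memory model. This is only an illustration and not MongoDB's implementation: FakeCollection and its auto-generated ids are hypothetical, and the point is just that save overwrites by _id while insert leaves an existing document untouched (in MongoDB this surfaces as a duplicate key error).

```javascript
// In-memory stand-in for a collection, keyed by _id (hypothetical, not MongoDB).
class FakeCollection {
  constructor() {
    this.docs = new Map();
    this.counter = 0; // stand-in id source; MongoDB would generate an ObjectId
  }

  // insert: adds a new document and refuses to overwrite an existing _id.
  insert(doc) {
    const copy = { ...doc };
    if (copy._id === undefined) copy._id = `auto-${++this.counter}`;
    if (this.docs.has(copy._id)) {
      throw new Error(`duplicate key: _id=${copy._id}`);
    }
    this.docs.set(copy._id, copy);
    return copy._id;
  }

  // save: inserts when _id is missing, otherwise replaces the document by _id.
  save(doc) {
    if (doc._id === undefined) return this.insert(doc);
    this.docs.set(doc._id, { ...doc });
    return doc._id;
  }
}

const coll = new FakeCollection();
coll.insert({ _id: 1, name: "zhangsan", age: 22 });
coll.save({ _id: 1, name: "zhangsan", age: 23 }); // overwrites the stored document

let insertError = null;
try {
  coll.insert({ _id: 1, name: "lisi" }); // same _id again
} catch (err) {
  insertError = err.message;
}

console.log(coll.docs.get(1).age); // 23
console.log(insertError); // duplicate key: _id=1
```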
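In addition, insert accepts a whole list in one call, with no need to iterate, which is more efficient; with save you have to iterate over the list and insert the items one by one. The prototypes of the two functions show this: for a remote call, insert posts the entire list to MongoDB in one go and lets MongoDB process it, which is more efficient.
The prototype of the save function is as follows:
The prototype of the insert function (partial code) is as follows: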
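3.3 Query
3.3.1 Query all records
db.user.find();
By default, 20 records are shown per page. When the results do not fit on one page, use the it command to iterate to the next page. Note: the it command must not be followed by ";". You can also set the number of records per page with DBQuery.shellBatchSize = 50; each page then shows 50 records.
3.3.2 Query the distinct values of a field in the current collection
db.user.distinct("name"); # removes duplicate values of name; equivalent to: select distinct name from user;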
3.3.3 Query with an equality condition
db.user.find({"age": 24});
# equivalent to: select * from user where age = 24;
3.3.4 Query with a greater-than condition
db.user.find({age: {$gt: 24}}); # equivalent to: select * from user where age > 24;
3.3.5 Query with a less-than condition
db.user.find({age: {$lt: 24}}); # equivalent to: select * from user where age < 24;
3.3.6 Query with a greater-than-or-equal condition
db.user.find({age: {$gte: 24}}); # equivalent to: select * from user where age >= 24;
3.3.7 Query with a less-than-or-equal condition
db.user.find({age: {$lte: 24}}); # equivalent to: select * from user where age <= 24;
3.3.8 Query with AND and OR conditions
- AND
db.user.find({age: {$gte: 23, $lte: 26}}); # equivalent to: select * from user where age >= 23 and age <= 26;
- OR
db.user.find({$or: [{age: 22}, {age: 25}]}); # equivalent to: select * from user where age = 22 or age = 25;
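To make the meaning of these operators concrete, here is a minimal sketch of how such a query document could be evaluated against plain objects. It is an illustration only, not how MongoDB evaluates queries internally: matches is a hypothetical helper, the sample users are made up, and only $gt, $gte, $lt, $lte, $or and plain equality are handled.

```javascript
// Comparison operators supported by this sketch.
const comparators = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
};

// Returns true when doc satisfies every top-level condition in query.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    if (field === "$or") {
      // $or takes an array of sub-queries; one match is enough.
      return condition.some((sub) => matches(doc, sub));
    }
    if (condition !== null && typeof condition === "object") {
      // Several operators on the same field are combined with AND.
      return Object.entries(condition).every(([op, bound]) => {
        if (!(op in comparators)) throw new Error(`unsupported operator: ${op}`);
        return comparators[op](doc[field], bound);
      });
    }
    return doc[field] === condition; // plain equality
  });
}

const users = [
  { name: "a", age: 22 },
  { name: "b", age: 24 },
  { name: "c", age: 25 },
  { name: "d", age: 27 },
];

const between = users.filter((u) => matches(u, { age: { $gte: 23, $lte: 26 } }));
const either = users.filter((u) => matches(u, { $or: [{ age: 22 }, { age: 25 }] }));

console.log(between.map((u) => u.name)); // [ 'b', 'c' ]
console.log(either.map((u) => u.name)); // [ 'a', 'c' ]
```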
3.3.9 Fuzzy query
db.user.find({name: /mongo/}); # equivalent to: select * from user where name like '%mongo%';
3.3.10 Prefix match
db.user.find({name: /^mongo/});
# similar to the SQL like syntax: select * from user where name like 'mongo%';
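The two patterns behave like ordinary JavaScript regular expressions, so their effect can be checked on a plain array first. The names below are made-up sample data; note that the match is case-sensitive unless the i flag is added.

```javascript
const names = ["mongodb", "my mongo notes", "MongoDB", "redis"];

// /mongo/ matches anywhere in the string, like '%mongo%'.
const contains = names.filter((n) => /mongo/.test(n));

// /^mongo/ is anchored at the start, like 'mongo%'.
const startsWith = names.filter((n) => /^mongo/.test(n));

// Adding the i flag makes the match case-insensitive.
const containsAnyCase = names.filter((n) => /mongo/i.test(n));

console.log(contains); // [ 'mongodb', 'my mongo notes' ]
console.log(startsWith); // [ 'mongodb' ]
console.log(containsAnyCase); // [ 'mongodb', 'my mongo notes', 'MongoDB' ]
```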
3.3.11 Query specified columns
db.user.find({}, {name: 1, age: 1}); # equivalent to: select name, age from user;
You can also use true or false for name: true has the same effect as name: 1, while false excludes name and shows all columns other than name.
3.3.12 Specified columns + conditions
db.user.find({age: {$gt: 25}}, {name: 1, age: 1}); # equivalent to: select name, age from user where age > 25;
db.user.find({name: 'zhangsan', age: 22});
# equivalent to:
select * from user where name = 'zhangsan' and age = 22;
3.3.13 Sorting
# ascending:
db.user.find().sort({age: 1});
# descending:
db.user.find().sort({age: -1});
3.3.14 Query the first 5 records
db.user.find().limit(5); # equivalent to: select * from user limit 5;
3.3.15 Records after the first 10
db.user.find().skip(10); # equivalent to: select * from user where id not in ( select id from user limit 10 );
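3.3.16 Query records within a range
# skip the first 5 records and return the next 10 (records 6 to 15)
db.user.find().limit(10).skip(5);
This can be used for pagination: limit is pageSize and skip is the zero-based page number * pageSize.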
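The pagination arithmetic can be sketched with a plain array standing in for the collection. pageWindow and page are hypothetical helper names, and the page index is assumed to be zero-based, so page 0 is the first page.

```javascript
// Computes the skip/limit pair for a zero-based page index.
function pageWindow(pageIndex, pageSize) {
  if (!Number.isInteger(pageIndex) || pageIndex < 0) {
    throw new RangeError("pageIndex must be a non-negative integer");
  }
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError("pageSize must be a positive integer");
  }
  return { skip: pageIndex * pageSize, limit: pageSize };
}

// Array model of find().skip(skip).limit(limit).
function page(records, pageIndex, pageSize) {
  const { skip, limit } = pageWindow(pageIndex, pageSize);
  return records.slice(skip, skip + limit);
}

// 23 sample records numbered 1..23.
const records = Array.from({ length: 23 }, (_, i) => i + 1);

console.log(pageWindow(2, 10)); // { skip: 20, limit: 10 }
console.log(page(records, 0, 10).length); // 10
console.log(page(records, 2, 10)); // [ 21, 22, 23 ]
```

With these values, the window could be passed on as db.user.find().skip(20).limit(10) for the third page.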
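3.3.17 COUNT
db.user.find({age: {$gte: 25}}).count(); # equivalent to: select count(*) from user where age >= 25;
3.3.18 Sort a result set
db.userInfo.find({sex: {$exists: true}}).sort({sex: 1}); # sort() needs a sort specification; sex is used here only as an example
3.3.19 Not equal to NULL
db.user.find({sex: {$ne: null}}) # equivalent to: select * from user where sex is not null;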
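3.4 Indexes
Create an index and designate the primary-key field. The commands are as follows (newer MongoDB releases replace ensureIndex with createIndex and no longer support dropDups):
db.epd_favorites_folder.ensureIndex({"id":1},{"unique":true,"dropDups":true})
db.epd_focus.ensureIndex({"id":1},{"unique":true,"dropDups":true})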
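3.5 Update
The format of the update command is as follows:
db.collection.update(criteria,objNew,upsert,multi)
Parameters:
criteria: the query condition.
objNew: the update object and any update operators.
upsert: whether to insert objNew as a new document when no record matches; true inserts it, and the default, false, does not.
multi: defaults to false, which updates only the first record found. If true, all records matching the condition are updated.
Below is an example that updates the price of the record whose id is 1:
db.user.update({id: 1},{$set:{price:2}}); # equivalent to: update user set price=2 where id=1;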
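The effect of the upsert and multi flags can be illustrated with a small in-memory sketch. This is a simplified model and not MongoDB's implementation: update here is a hypothetical function that only supports equality criteria and a $set-style field assignment, and the sample documents are made up.

```javascript
// Applies setFields to documents matching criteria (equality only).
function update(docs, criteria, setFields, upsert = false, multi = false) {
  const isMatch = (doc) =>
    Object.entries(criteria).every(([key, value]) => doc[key] === value);

  let modified = 0;
  for (const doc of docs) {
    if (!isMatch(doc)) continue;
    Object.assign(doc, setFields); // behaves like $set
    modified++;
    if (!multi) break; // multi=false: stop after the first match
  }

  if (modified === 0 && upsert) {
    // upsert=true: nothing matched, so insert a new document.
    docs.push({ ...criteria, ...setFields });
    return { modified: 0, upserted: 1 };
  }
  return { modified, upserted: 0 };
}

const docs = [
  { id: 1, price: 1 },
  { id: 1, price: 5 },
  { id: 2, price: 9 },
];

const first = update(docs, { id: 1 }, { price: 2 }); // only the first match
const firstSnapshot = docs.map((d) => d.price); // [2, 5, 9]

const all = update(docs, { id: 1 }, { price: 3 }, false, true); // every match
const inserted = update(docs, { id: 7 }, { price: 4 }, true); // no match, so insert

console.log(first, all, inserted);
console.log(docs);
```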
3.6 Delete
3.6.1 Delete specific records
db.user.remove({id: 1}); # equivalent to: delete from user where id=1;
3.6.2 Delete all records
db.user.remove({}); # equivalent to: delete from user;
3.6.3 DROP
db.user.drop(); # equivalent to: drop table user;
4. MapReduce
MapReduce in MongoDB works by writing JavaScript, which MongoDB then parses and executes. Below, the Java API is used to run an MR job. The code is as follows:
The MongdbManager class, used to initialize MongoDB:
package cn.mongo.util;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.mongodb.DB;
import com.mongodb.Mongo;
import com.mongodb.MongoOptions;

/**
 * @Date Mar 3, 2015
 *
 * @author dengjie
 *
 * @Note mongodb manager
 */
public class MongdbManager {

    private static final Logger logger = LoggerFactory.getLogger(MongdbManager.class);
    private static Mongo mongo = null;
    private static String tag = SystemConfig.getProperty("dev.tag");

    private MongdbManager() {
    }

    static {
        initClient();
    }

    // get DB object
    public static DB getDB(String dbName) {
        return mongo.getDB(dbName);
    }

    // get DB object without param
    public static DB getDB() {
        String dbName = SystemConfig.getProperty(String.format("%s.mongodb.dbname", tag));
        return mongo.getDB(dbName);
    }

    // init mongodb pool
    private static void initClient() {
        try {
            String[] hosts = SystemConfig.getProperty(String.format("%s.mongodb.host", tag)).split(",");
            // try each configured host until one answers
            for (int i = 0; i < hosts.length; i++) {
                try {
                    String host = hosts[i].split(":")[0];
                    int port = Integer.parseInt(hosts[i].split(":")[1]);
                    mongo = new Mongo(host, port);
                    if (mongo.getDatabaseNames().size() > 0) {
                        logger.info(String.format("connection success,host=[%s],port=[%d]", host, port));
                        break;
                    }
                } catch (Exception ex) {
                    ex.printStackTrace();
                    logger.error(String.format("create connection has error,msg is %s", ex.getMessage()));
                }
            }
            // configure the connection pool
            MongoOptions opt = mongo.getMongoOptions();
            opt.connectionsPerHost = SystemConfig.getIntProperty(String.format("%s.mongodb.poolsize", tag)); // poolsize
            opt.threadsAllowedToBlockForConnectionMultiplier = SystemConfig.getIntProperty(String.format(
                    "%s.mongodb.blocksize", tag)); // blocksize
            opt.socketKeepAlive = true;
            opt.autoConnectRetry = true;
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The MongoDBFactory class, which wraps the business operations, is as follows:
package cn.mongo.util;

import java.util.ArrayList;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import cn.diexun.domain.MGDCustomerSchema;

import com.mongodb.BasicDBList;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.util.JSON;

/**
 * @Date Mar 3, 2015
 *
 * @Author dengjie
 */
public class MongoDBFactory {

    private static Logger logger = LoggerFactory.getLogger(MongoDBFactory.class);

    // save data to mongodb
    public static void save(MGDCustomerSchema mgs, String collName) {
        DB db = null;
        try {
            db = MongdbManager.getDB();
            DBCollection coll = db.getCollection(collName);
            DBObject dbo = (DBObject) JSON.parse(mgs.toString());
            coll.insert(dbo);
        } catch (Exception ex) {
            ex.printStackTrace();
            logger.error(String.format("save object to mongodb has error,msg is %s", ex.getMessage()));
        } finally {
            if (db != null) {
                db.requestDone();
                db = null;
            }
        }
    }

    // batch insert
    public static void save(List<?> mgsList, String collName) {
        DB db = null;
        try {
            db = MongdbManager.getDB();
            DBCollection coll = db.getCollection(collName);
            BasicDBList data = (BasicDBList) JSON.parse(mgsList.toString());
            List<DBObject> list = new ArrayList<DBObject>();
            int commitSize = SystemConfig.getIntProperty("mongo.commit.size");
            int rowCount = 0;
            long start = System.currentTimeMillis();
            for (Object dbo : data) {
                rowCount++;
                list.add((DBObject) dbo);
                // commit one full batch every commitSize rows
                if (rowCount % commitSize == 0) {
                    try {
                        coll.insert(list);
                        list.clear();
                        logger.info(String.format("current commit rowCount = [%d],commit spent time = [%s]s",
                                rowCount, (System.currentTimeMillis() - start) / 1000.0));
                    } catch (Exception ex) {
                        ex.printStackTrace();
                        logger.error(String.format("batch commit data to mongodb has error,msg is %s",
                                ex.getMessage()));
                    }
                }
            }
            // commit the remaining rows that did not fill a whole batch
            if (rowCount % commitSize != 0) {
                try {
                    coll.insert(list);
                    logger.info(String.format("insert data to mongo has spent total time = [%s]s",
                            (System.currentTimeMillis() - start) / 1000.0));
                } catch (Exception ex) {
                    ex.printStackTrace();
                    logger.error(String.format("commit end has error,msg is %s", ex.getMessage()));
                }
            }
        } catch (Exception ex) {
            ex.printStackTrace();
            logger.error(String.format("save object list to mongodb has error,msg is %s", ex.getMessage()));
        } finally {
            if (db != null) {
                db.requestDone();
                db = null;
            }
        }
    }
}
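The batch save above commits once every commitSize rows and then flushes whatever is left over. The slicing idea on its own can be sketched in a few lines; toBatches is a hypothetical helper, and in practice each batch would be handed to a single insert call.

```javascript
// Splits items into consecutive batches of at most commitSize elements.
function toBatches(items, commitSize) {
  if (!Number.isInteger(commitSize) || commitSize <= 0) {
    throw new RangeError("commitSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += commitSize) {
    batches.push(items.slice(i, i + commitSize));
  }
  return batches;
}

const rows = [1, 2, 3, 4, 5, 6, 7];
const batches = toBatches(rows, 3);

console.log(batches); // [ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7 ] ]
```

The last batch holds the remainder, which corresponds to the final rowCount % commitSize != 0 branch in the Java code.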
The LoginerAmountMR class is a MapReduce job that counts user logins. The code is as follows:
package cn.mongo.mapreduce;

import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import org.bson.BSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import cn.diexun.conf.ConfigureAPI.MR;
import cn.diexun.conf.ConfigureAPI.PRECISION;
import cn.diexun.domain.Kpi;
import cn.diexun.util.CalendarUtil;
import cn.diexun.util.MongdbManager;
import cn.diexun.util.MysqlFactory;

import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MapReduceOutput;
import com.mongodb.ReadPreference;

/**
 * @Date Mar 13, 2015
 *
 * @Author dengjie
 *
 * @Note use mr jobs stats user login amount
 */
public class LoginerAmountMR {

    private static Logger logger = LoggerFactory.getLogger(LoginerAmountMR.class);

    // map function, built by concatenating JS strings
    private static String map() {
        String map = "function(){";
        map += "if(this.userName != \"\"){";
        map += "emit({" + "kpi_code:'login_times',username:this.userName,"
                + "district_id:this.districtId,product_style:this.product_style,"
                + "customer_property:this.customer_property},{count:1});";
        map += "}";
        map += "}";
        return map;
    }

    // reduce function, built by concatenating JS strings
    private static String reduce() {
        String reduce = "function(key,values){";
        reduce += "var total = 0;";
        reduce += "for(var i=0;i<values.length;i++){";
        reduce += "total += values[i].count;}";
        reduce += "return {count:total};";
        reduce += "}";
        return reduce;
    }

    public static void main(String[] args) {
        loginNumbers("t_login_20150312");
    }

    /**
     * login user amount
     *
     * @param collName
     */
    public static void loginNumbers(String collName) {
        DB db = null;
        try {
            db = MongdbManager.getDB();
            db.setReadPreference(ReadPreference.secondaryPreferred());
            DBCollection coll = db.getCollection(collName);
            String result = MR.COLLNAME_TMP;
            long start = System.currentTimeMillis();
            MapReduceOutput mapRed = coll.mapReduce(map(), reduce(), result, null);
            logger.info(String.format("mr run spent time=%ss", (System.currentTimeMillis() - start) / 1000.0));
            start = System.currentTimeMillis();
            DBCursor cursor = mapRed.getOutputCollection().find();
            List<Kpi> list = new ArrayList<Kpi>();
            while (cursor.hasNext()) {
                DBObject obj = cursor.next();
                BSONObject key = (BSONObject) obj.get("_id");
                BSONObject value = (BSONObject) obj.get("value");
                Object kpiValue = value.get("count");
                Object userName = key.get("username");
                Object districtId = key.get("district_id");
                Object customerProperty = key.get("customer_property");
                Object productStyle = key.get("product_style");
                Kpi kpi = new Kpi();
                try {
                    kpi.setUserName(userName == null ? "" : userName.toString());
                    kpi.setKpiCode(key.get("kpi_code").toString());
                    kpi.setKpiValue(Math.round(Double.parseDouble(kpiValue.toString())));
                    kpi.setCustomerProperty(customerProperty == null ? "" : customerProperty.toString());
                    // compare by content: == on objects would compare references, and the key may be missing
                    kpi.setDistrictId(districtId == null || districtId.toString().isEmpty() ? 0
                            : Integer.parseInt(districtId.toString()));
                    kpi.setProductStyle(productStyle == null ? "" : productStyle.toString());
                    kpi.setCreateDate(collName.split("_")[2]);
                    kpi.setUpdateDate(Timestamp.valueOf(CalendarUtil.formatMap.get(PRECISION.HOUR).format(new Date())));
                    list.add(kpi);
                } catch (Exception exx) {
                    exx.printStackTrace();
                    logger.error(String.format("parse type or get value has error,msg is %s", exx.getMessage()));
                }
            }
            MysqlFactory.insert(list);
            logger.info(String.format("store mysql spent time is %ss", (System.currentTimeMillis() - start) / 1000.0));
        } catch (Exception ex) {
            ex.printStackTrace();
            logger.error(String.format("run map-reduce jobs has error,msg is %s", ex.getMessage()));
        } finally {
            if (db != null) {
                db.requestDone();
                db = null;
            }
        }
    }
}
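5. Summary
When writing MapReduce jobs for MongoDB, take care when concatenating the JavaScript strings, since it is very easy to make mistakes. The code above is only part of the full code and is provided for reference and study. Also, if you need to run MapReduce jobs, I recommend Hadoop's MapReduce framework; MongoDB's MapReduce framework is covered here only as an introduction.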
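One way to lower the risk of mistakes in the concatenated strings is to try the same map and reduce logic as plain JavaScript before embedding it. The following is a minimal sketch under several assumptions: runMapReduce is a hypothetical local harness and not a MongoDB API, emit is passed in as a parameter instead of being the global that MongoDB provides, the login documents are made-up sample data, and the emitted key is trimmed to three fields for brevity.

```javascript
// map: emits one {count: 1} per login record with a non-empty userName.
// "this" is the current document, as in MongoDB; emit is injected by the harness.
function map(emit) {
  if (this.userName !== "") {
    emit(
      { kpi_code: "login_times", username: this.userName, district_id: this.districtId },
      { count: 1 }
    );
  }
}

// reduce: sums the counts emitted for one key.
function reduce(key, values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i].count;
  }
  return { count: total };
}

// Hypothetical local harness: groups emitted values by key, then reduces them.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, { key, values: [] });
    groups.get(id).values.push(value);
  };
  for (const doc of docs) mapFn.call(doc, emit);
  return [...groups.values()].map(({ key, values }) => ({
    _id: key,
    // MongoDB does not call reduce for a key with a single value; mirror that.
    value: values.length === 1 ? values[0] : reduceFn(key, values),
  }));
}

const logins = [
  { userName: "alice", districtId: 1 },
  { userName: "bob", districtId: 2 },
  { userName: "alice", districtId: 1 },
  { userName: "", districtId: 3 }, // skipped by map
];

const output = runMapReduce(logins, map, reduce);
console.log(JSON.stringify(output, null, 2));
```

Each element of output has the same _id / value shape that the Java code reads back from the output collection.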
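6. Closing remarks
That is all for this post. If you run into any problems while studying, you can join the group to discuss them or send me an email, and I will do my best to answer. Let us encourage each other!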