為什麼我會從MongoDB遷移到PostgreSQL

jieforest發表於2012-09-20
I recently concluded a migration away from MongoDB to PostgreSQL for one of my apps - digiDoc. I’d like to tell you why I did so.

To be honest, the decision to use MongoDb was an ill-thought out one. Lesson learned - thoroughly research any new technology you introduce into your stack, know well the strengths and weaknesses thereof and evaluate honestly whether it fits your needs or not - no matter how much hype there is surrounding said technology. What follows is also not a litany of the usual compaints against MongoDB such as data corruption, global write lock, shard configuration and so on. For our use case, mongodb failed at a much more basic level.

digiDoc is all about converting paper documents like receipts and business cards into searchable database, and so a document database seemed like a logical fit(!). Alas, not being aware of the mathematics behind relational algebra, I could not see clearly the trap I was falling into - document databases are remarkably hard to run aggregations on and aggregating the data and presenting meaningrful statistics on your receipts is one of the core features of digiDoc. Without the powerful aggregation features that we take for granted in RDBMSs, I would constantly be fighting with unweildy map-reduce constructs when all I want is SUM(amount) FROM receipts WHERE GROUP BY . I even contributed some patches to mongoid-map-reduce but the whole experience of aggregating data with mongodb was so ugh that I couldn’t bring myself to work on the app beyond a point. That is of course, a bad place to be in.

Secondly, with a document database, you lose the independence of your data access paths. People keep complaining that JOINs make your data hard to scale. Well, the converse is also true - Not havingJOINs makes your data an intractable lump of mud. Consider a simple thing like an audit trail. In a document database, normally, this would be a set of embedded documents in the item being audited. Now, lets say you want to see a list of all the actions performed by a particular user. Boing! Trapped. You have to load every document in the database and extract the udit trail from it, then filter it in your app for the user you’re looking for. Just the thought of what that would do to my hardware was enough to turn me off the whole idea. JOINs are cool! And guess which problem you are more likely to have - needing joins, or scaling beyond facebook?

Thirdly, mongodb is quite feature poor. Perhaps this has changed in the last few months, but when I last looked something as simple as case-insensitive search did not exist. The recommended solutionwas to have a field in the model with all your search data in it in lower case. Now my model, which I have carefully constructed so as to adhere to the Single Responsibility Principal needs to have callback hooks to save this search string everytime it is updated. And if I add a new field to the model? Time to regenerate all the search strings. I can only come to the conclusion that mongodb is a well-funded and elaborate troll.

Fourthly, and this one completely blew my mind - somewhere along the stack of mongodb, mongoid and mongoid-map-reduce, somewhere there, type information was being lost. I thought we were scaling hard when one of our customers suddenly had 1111 documents overnight. Imagine my disappointment when I realised it was actually four 1s, added together. They’d become strings along the way. Now in this case, perhaos the fault was mine but somehow, I can’t see this happening with Postgres. And when you add up all the hours it takes to deal with these niggling problems and the surprising lack of features, it all leads to some pretty obvious conclusions.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/301743/viewspace-744446/,如需轉載,請註明出處,否則將追究法律責任。

相關文章