This is just a train of thoughts about how to reproduce the functionality of a SQL VIEW/MATERIALIZED VIEW in mongoDB.
You need the following software:
- Mongo shell to run
bad.js
(the 1st example). You start a single instance (no rs) usingmongod --dbpath <YOUR_DATA_PATH> --smallfiles
- Node.js, and the mongodb, mongo-oplog packages (install via npm) to run
bad3.js
(the 3rd example). You start a replica set (can be even just a single member) usingmongod --dbpath <YOUR_DATA_PATH> --smallfiles --replSet "rstest"
- If you're curious about what happened to
bad2.js
... I'll tell you if you help me making oplog tailing work in the mongo shell ;)
The data model is:
- One db
test1
with one collectionsales
. You can add elements in it by:
{
"desc" : "fourth sale in october",
"date" : ISODate("2014-10-14T23:00:00.000Z")
}
change the description and the date as you wish (as the aggregation is grouping by month, you need several entries on different months)
bad2.js
uses aggregation framework$out
to create the collectionmonthlySalesReport
- same is done by
bad3.js
on every new insert into thesales
collection (sort of a "trigger")
The "presentation" (such a pretentious word) goes like that:
-
Although mostly used to store and retrieve data via CRUD operations, MongoDB have an aggregation framework that allows to write more complex queries (e.g. group, sum, etc). I can then use this aggregation framework to write queries that aggregate and reports on data.
-
I would like, at this point, to be able to create the equivalent of an
SQL VIEW
, that is, a virtual table (collection, in mongo jargon), whose content is the result of a query, that I can query as any other normal table (oops... collection! :). Unfortunately, this seems not to be a priority (the jira issue has been updated in Sept 2014 but the "will do" from Eliot dates back 2011...) -
As of 2.6, I can use
$out
to output the result of an aggregation query to a new collection. This is great but not optimal, for instance because the new collection is a completely new collection that takes spaces for both data and index storage, ... (more a MATERIALIZED VIEW than a VIEW) -
Anyway, the
$out
it's better than nothing. However, we can't tell mongodb to continuously update the target collection, and we can't even schedule the aggregation query to run periodically to update the$out
target collection. We need to do it from outside mongoDB. -
First thing that pops to mind is wrap the query into a
setTimeout
and execute it via mongo shell. Something like:
setTimeout(function(){
// run the aggregation framework query here
}, 5000);
to invoke via mongo shell mongo <FILENAME>.js
. Too bad, no window.setTimeout() :(
-
Quick&dirty solution is what you see in
bad.js
(that's why it's called "bad" in the first place: awhile(true)
. If you launchmongo bad.js
on a mongo instance withtest1
db and thesales
collection, you'll find a newmonthlySalesReport
collection with the aggregated data. so far so good. -
while(true)
is really ugly. You may want to remove it, so that the script run the query only once, and schedule the execution via cron: no realtime update, but that's all we can get. -
Another solution that have seen much buzz recently is oplog tailing. Google it if you want to know what that is, but basically, the oplog is a capped collection that mongo uses to replicate data between nodes of a replica set. The gist is that we tap that collection and we update our
monthlySalesReport
collection with data as soon as this is written in the originalsales
collection. -
So how do we do that. mongo-oplog and many other libraries makes our life easier. Just
npm install mongo-oplog
andnpm install mongodb
and we're ready to runnode bad3.js
. Please note that I implemented only theinsert
case... do the others as exercise, and continue reading below when you are doingupdate
;) -
This looks good. So much like TRIGGERS. But is it? The answer is no, as it's still external to the db, and most important because the update "trigger" don't have access to the old value of the updated document... :(
-
Please please please MongoDB: implement native VIEWs!!!
Bedtime reading, if you like the topic: http://docs.mongodb.org/manual/reference/sql-comparison/
As a reference, the SQL equivalent of the data model above is (PostgreSQL):
-
sales
table:CREATE TABLE sales ( date date, "desc" character varying)
-
sales
table entries:INSERT INTO sales VALUES ('2014-09-03', 'another sale')
-
monthlySalesReport
query/view (to create the view, prefix the query withCREATE VIEW AS
):SELECT concat(date_part('year', date), '-', date_part('month', date)) as id, count(*) FROM sales GROUP BY id