How efficient can Meteor be while sharing a huge c

Posted 2019-01-07 02:20

Question:

Imagine the following case:

  • 1,000 clients are connected to a Meteor page displaying the content of the "Somestuff" collection.

  • "Somestuff" is a collection holding 1,000 items.

  • Someone inserts a new item into the "Somestuff" collection

What will happen:

  • All Meteor.Collections on clients will be updated, i.e. the insertion is forwarded to all of them (which means one insertion message sent to each of the 1,000 clients)

What is the cost in terms of CPU for the server to determine which client needs to be updated?

Is it accurate that only the inserted value will be forwarded to the clients, and not the whole list?

How does this work in real life? Are there any benchmarks or experiments of such scale available?

Answer 1:

The short answer is that only new data gets sent down the wire. Here's how it works.

There are three important parts of the Meteor server that manage subscriptions: the publish function, which defines the logic for what data the subscription provides; the Mongo driver, which watches the database for changes; and the merge box, which combines all of a client's active subscriptions and sends them out over the network to the client.

Publish functions

Each time a Meteor client subscribes to a collection, the server runs a publish function. The publish function's job is to figure out the set of documents that its client should have and send each document property into the merge box. It runs once for each new subscribing client. You can put any JavaScript you want in the publish function, such as arbitrarily complex access control using this.userId. The publish function sends data into the merge box by calling this.added, this.changed and this.removed. See the full publish documentation for more details.

Most publish functions don't have to muck around with the low-level added, changed and removed API, though. If a publish function returns a Mongo cursor, the Meteor server automatically connects the output of the Mongo driver (added, changed, and removed callbacks) to the input of the merge box (this.added, this.changed and this.removed). It's pretty neat that you can do all the permission checks up front in a publish function and then directly connect the database driver to the merge box without any user code in the way. And when autopublish is turned on, even this little bit is hidden: the server automatically sets up a query for all documents in each collection and pushes them into the merge box.

On the other hand, you aren't limited to publishing database queries. For example, you can write a publish function that reads a GPS position from a device inside a Meteor.setInterval, or polls a legacy REST API from another web service. In those cases, you'd emit changes to the merge box by calling the low-level added, changed and removed DDP API.
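As a rough illustration of the low-level API, here is a minimal sketch of a publish handler that feeds a non-database source into the merge box. It runs outside Meteor: makeStubSubscription is a hypothetical stand-in for the object Meteor passes as `this`, and publishPosition, readPosition, the "positions" collection name, and the "device1" id are all made up for the example. Only the call shapes this.added(collection, id, fields), this.changed(collection, id, fields), this.removed(collection, id) and this.ready() are taken from Meteor's publish API.

```javascript
// Hypothetical stand-in for the subscription object Meteor passes as `this`
// to a publish function. It only records calls; it is not Meteor's code.
function makeStubSubscription() {
  const calls = [];
  return {
    calls,
    added: (collection, id, fields) => calls.push(["added", collection, id, fields]),
    changed: (collection, id, fields) => calls.push(["changed", collection, id, fields]),
    removed: (collection, id) => calls.push(["removed", collection, id]),
    ready: () => calls.push(["ready"]),
  };
}

// Hypothetical publish handler for a non-database source (a device position).
// readPosition is an assumed function that returns the current reading.
function publishPosition(readPosition) {
  // A regular function, so `this` is the subscription object.
  return function () {
    this.added("positions", "device1", readPosition());
    this.ready();
    // Return one polling step; in a real app this step would be scheduled
    // with Meteor.setInterval instead of being called by hand.
    return () => this.changed("positions", "device1", readPosition());
  };
}

// Usage: drive the handler by hand with canned readings.
const readings = [{ lat: 1, lng: 2 }, { lat: 1, lng: 3 }];
let nextReading = 0;
const sub = makeStubSubscription();
const poll = publishPosition(() => readings[nextReading++]).call(sub);
poll();
console.log(sub.calls.map((call) => call[0]));
```

The handler itself never touches the network. It only reports added and changed events, and the merge box decides what actually goes to the client.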

The Mongo driver

The Mongo driver's job is to watch the Mongo database for changes to live queries. These queries run continuously and return updates as the results change by calling added, removed, and changed callbacks.

Mongo is not a real time database. So the driver polls. It keeps an in-memory copy of the last query result for each active live query. On each polling cycle, it compares the new result with the previous saved result, computing the minimum set of added, removed, and changed events that describe the difference. If multiple callers register callbacks for the same live query, the driver only watches one copy of the query, calling each registered callback with the same result.
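The poll-and-diff idea can be sketched in plain JavaScript. This is not the Meteor driver's code: diffResults, LiveQuery, and runQuery are hypothetical names, the deep comparison is a naive JSON.stringify, and the sketch skips details such as replaying the current result to a newly registered observer.

```javascript
// Compare two query results (arrays of documents with an _id) and report the
// added, changed, and removed events that turn the first into the second.
function diffResults(previous, current) {
  const before = new Map(previous.map((doc) => [doc._id, doc]));
  const after = new Map(current.map((doc) => [doc._id, doc]));
  const events = [];
  for (const [id, doc] of after) {
    if (!before.has(id)) {
      events.push({ type: "added", id, doc });
    } else if (JSON.stringify(before.get(id)) !== JSON.stringify(doc)) {
      // Naive comparison; assumes stable key order within documents.
      events.push({ type: "changed", id, doc });
    }
  }
  for (const id of before.keys()) {
    if (!after.has(id)) events.push({ type: "removed", id });
  }
  return events;
}

// One live query shared by many callers: poll once, notify every callback.
class LiveQuery {
  constructor(runQuery) {
    this.runQuery = runQuery; // assumed function returning the current result
    this.lastResult = [];     // in-memory copy of the last result
    this.callbacks = [];
  }
  observe(callback) {
    this.callbacks.push(callback);
  }
  poll() {
    const result = this.runQuery();
    const events = diffResults(this.lastResult, result);
    // Store a copy so later mutations of `result` cannot hide changes.
    this.lastResult = JSON.parse(JSON.stringify(result));
    for (const event of events) {
      for (const callback of this.callbacks) callback(event);
    }
    return events;
  }
}

// Usage: two observers share one query over a fake in-memory "database".
const db = [{ _id: "a", text: "first" }];
const query = new LiveQuery(() => db.map((doc) => ({ ...doc })));
let received = 0;
query.observe(() => received++);
query.observe(() => received++);
query.poll(); // first poll reports "a" as added
db.push({ _id: "b", text: "second" });
const events = query.poll(); // second poll reports only "b"
console.log(events, received);
```

The query runs once per poll regardless of how many observers are attached, which is the point of sharing a live query between callers.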

Each time the server updates a collection, the driver recalculates each live query on that collection. (Future versions of Meteor will expose a scaling API for limiting which live queries recalculate on update.) The driver also polls each live query on a 10 second timer to catch out-of-band database updates that bypassed the Meteor server.

The merge box

The job of the merge box is to combine the results (added, changed and removed calls) of all of a client's active publish functions into a single data stream. There is one merge box for each connected client. It holds a complete copy of the client's minimongo cache.

In your example with just a single subscription, the merge box is essentially a pass-through. But a more complex app can have multiple subscriptions which might overlap. If two subscriptions both set the same attribute on the same document, the merge box decides which value takes priority and only sends that to the client. We haven't exposed the API for setting subscription priority yet. For now, priority is determined by the order the client subscribes to data sets. The first subscription a client makes has the highest priority, the second subscription is next highest, and so on.

Because the merge box holds the client's state, it can send the minimum amount of data to keep each client up to date, no matter what a publish function feeds it.
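To make the merge box behaviour concrete, here is a simplified sketch, assuming that priority follows subscription order as described above. MergeBox, set, effective, and the message shape are hypothetical and only loosely modelled on the description. It handles primitive field values only and ignores removals.

```javascript
// Simplified per-client merge box. It remembers what each subscription has
// published and sends the client only fields whose visible value changed.
class MergeBox {
  constructor(send) {
    this.send = send;            // assumed function that delivers a message
    this.subscriptionOrder = []; // earlier entries take priority
    this.docs = new Map();       // id -> Map(field -> Map(subscription -> value))
  }

  // Value the client should see: the one from the earliest subscription.
  effective(fieldValues) {
    for (const sub of this.subscriptionOrder) {
      if (fieldValues.has(sub)) return fieldValues.get(sub);
    }
    return undefined;
  }

  // Used for both added and changed calls in this sketch.
  set(sub, id, fields) {
    if (!this.subscriptionOrder.includes(sub)) this.subscriptionOrder.push(sub);
    const isNew = !this.docs.has(id);
    if (isNew) this.docs.set(id, new Map());
    const doc = this.docs.get(id);
    const delta = {};
    for (const [field, value] of Object.entries(fields)) {
      if (!doc.has(field)) doc.set(field, new Map());
      const fieldValues = doc.get(field);
      const before = this.effective(fieldValues);
      fieldValues.set(sub, value);
      const after = this.effective(fieldValues);
      if (after !== before) delta[field] = after;
    }
    if (isNew) {
      this.send({ msg: "added", id, fields: delta });
    } else if (Object.keys(delta).length > 0) {
      this.send({ msg: "changed", id, fields: delta });
    }
  }
}

// Usage: two overlapping subscriptions publish the same document.
const sent = [];
const box = new MergeBox((message) => sent.push(message));
box.set("subA", "doc1", { title: "from A" });           // sent as added
box.set("subB", "doc1", { title: "from B", extra: 1 }); // title shadowed, only extra sent
box.set("subA", "doc1", { title: "from A" });           // nothing new, nothing sent
console.log(sent);
```

Because the box knows what the client already has, repeating a value or publishing a shadowed one produces no traffic.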

What happens on an update

So now we've set the stage for your scenario.

We have 1,000 connected clients. Each is subscribed to the same live Mongo query (Somestuff.find({})). Since the query is the same for each client, the driver is only running one live query. There are 1,000 active merge boxes. And each client's publish function registered added, changed, and removed callbacks on that live query, which feed into that client's merge box. Nothing else is connected to the merge boxes.

First, the Mongo driver. When one of the clients inserts a new document into Somestuff, it triggers a recomputation. The Mongo driver reruns the query for all documents in Somestuff, compares the result to the previous result in memory, finds that there is one new document, and calls each of the 1,000 registered added callbacks.

Next, the publish functions. There's very little happening here: each of the 1,000 added callbacks pushes data into its client's merge box by calling this.added.

Finally, each merge box checks these new attributes against its in-memory copy of its client's cache. In each case, it finds that the values aren't yet on the client and don't shadow an existing value. So the merge box emits a DDP DATA message on the SockJS connection to its client and updates its server-side in-memory copy.

Total CPU cost is the cost to diff one Mongo query, plus the cost of 1,000 merge boxes checking their clients' state and constructing a new DDP message payload. The only data that flows over the wire is a single JSON object sent to each of the 1,000 clients, corresponding to the new document in the database, plus one RPC message to the server from the client that made the original insert.
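The cost breakdown can be checked with a toy simulation. This is only a counting model under the assumptions of the scenario (one shared query, one merge box per client, every client already holding the full collection). simulateInsert and its counters are hypothetical names, and it says nothing about real timings.

```javascript
// Count the work triggered by one insert when every client subscribes to the
// same query. Documents are represented by their ids only.
function simulateInsert(clientCount, collectionSize) {
  const counts = { rpcIn: 0, queryReruns: 0, callbackCalls: 0, messagesOut: 0 };

  // Previous query result, and one cache per client that mirrors it.
  const previous = new Set();
  for (let i = 0; i < collectionSize; i++) previous.add("doc" + i);
  const clientCaches = Array.from({ length: clientCount }, () => new Set(previous));

  // One client sends the insert to the server.
  counts.rpcIn += 1;
  const current = new Set(previous);
  current.add("newDoc");

  // The shared live query is rerun and diffed once.
  counts.queryReruns += 1;
  const added = [...current].filter((id) => !previous.has(id));

  // Each client's callback feeds its merge box, which sends only unknown docs.
  for (const id of added) {
    for (const cache of clientCaches) {
      counts.callbackCalls += 1;
      if (!cache.has(id)) {
        cache.add(id);
        counts.messagesOut += 1;
      }
    }
  }
  return counts;
}

const counts = simulateInsert(1000, 1000);
console.log(counts);
// { rpcIn: 1, queryReruns: 1, callbackCalls: 1000, messagesOut: 1000 }
```

The query diff happens once, while the per-client work (callback, merge box check, message) scales with the number of connected clients.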

Optimizations

Here's what we definitely have planned.

  • More efficient Mongo driver. We optimized the driver in 0.5.1 to only run a single observer per distinct query.

  • Not every DB change should trigger a recomputation of a query. We can make some automated improvements, but the best approach is an API that lets the developer specify which queries need to rerun. For example, it's obvious to a developer that inserting a message into one chatroom should not invalidate a live query for the messages in a second room.

  • The Mongo driver, publish function, and merge box don't need to run in the same process, or even on the same machine. Some applications run complex live queries and need more CPU to watch the database. Others have only a few distinct queries (imagine a blog engine), but possibly many connected clients -- these need more CPU for merge boxes. Separating these components will let us scale each piece independently.

  • Many databases support triggers that fire when a row is updated and provide the old and new rows. With that feature, a database driver could register a trigger instead of polling for changes.
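To illustrate the second point in the list above, here is one way a developer-specified invalidation scheme could look. This is purely hypothetical and not a Meteor API: InvalidationRegistry, register, invalidate, and the "room:1" style keys are invented for the sketch.

```javascript
// Hypothetical registry, not part of Meteor. Live queries declare the key
// they depend on; writes name the key they touch.
class InvalidationRegistry {
  constructor() {
    this.rerunsByKey = new Map(); // key -> array of rerun functions
  }
  register(key, rerun) {
    if (!this.rerunsByKey.has(key)) this.rerunsByKey.set(key, []);
    this.rerunsByKey.get(key).push(rerun);
  }
  // Rerun only the queries registered under this key; return how many ran.
  invalidate(key) {
    const reruns = this.rerunsByKey.get(key) || [];
    for (const rerun of reruns) rerun();
    return reruns.length;
  }
}

// Usage: a message inserted into room 1 leaves the room 2 query alone.
const registry = new InvalidationRegistry();
const rerunCounts = { room1: 0, room2: 0 };
registry.register("room:1", () => rerunCounts.room1++);
registry.register("room:2", () => rerunCounts.room2++);
const ran = registry.invalidate("room:1");
console.log(ran, rerunCounts);
```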



Answer 2:

In my experience, serving many clients while sharing a huge collection in Meteor is essentially unworkable, as of version 0.7.0.1. I'll try to explain why.

As described in the above post and also in https://github.com/meteor/meteor/issues/1821, the Meteor server has to keep a copy of the published data for each client in the merge box. This is what allows the Meteor magic to happen, but it also means that any large shared database is held in the memory of the Node process many times over. Even when using a possible optimization for static collections, such as the one in "Is there a way to tell meteor a collection is static (will never change)?", we experienced a huge problem with the CPU and memory usage of the Node process.

In our case, we were publishing a completely static collection of 15k documents to each client. The problem is that copying these documents to a client's merge box (in memory) upon connection basically brought the Node process to 100% CPU for almost a second, and resulted in a large additional usage of memory. This is inherently unscalable, because any connecting client will bring the server to its knees (and simultaneous connections will block each other) and memory usage will go up linearly in the number of clients. In our case, each client caused an additional ~60MB of memory usage, even though the raw data transferred was only about 5MB.

In our case, because the collection was static, we solved this problem by sending all the documents as a .json file, which was gzipped by nginx, and loading them into an anonymous collection, resulting in only a ~1MB transfer of data with no additional CPU or memory in the node process and a much faster load time. All operations over this collection were done by using _ids from much smaller publications on the server, allowing for retaining most of the benefits of Meteor. This allowed the app to scale to many more clients. In addition, because our app is mostly read-only, we further improved the scalability by running multiple Meteor instances behind nginx with load balancing (though with a single Mongo), as each Node instance is single-threaded.
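The lookup pattern described here can be sketched in plain JavaScript. This is not the code from our app: a Map stands in for the anonymous client-side collection, the static documents are inlined instead of being fetched as a .json file, and indexById, resolveIds, and the sample data are made up for the example.

```javascript
// Index statically loaded documents by _id. The Map is a stand-in for the
// anonymous client-side collection.
function indexById(staticDocs) {
  return new Map(staticDocs.map((doc) => [doc._id, doc]));
}

// Resolve the _ids delivered by a small publication against the local index,
// skipping ids that are not in the static data.
function resolveIds(ids, staticIndex) {
  return ids.filter((id) => staticIndex.has(id)).map((id) => staticIndex.get(id));
}

// Usage: the static data would normally come from the gzipped .json file.
const staticDocs = [
  { _id: "p1", name: "Alpha" },
  { _id: "p2", name: "Beta" },
  { _id: "p3", name: "Gamma" },
];
const staticIndex = indexById(staticDocs);
const publishedIds = ["p3", "p1", "missing"]; // ids from a small publication
const resolved = resolveIds(publishedIds, staticIndex);
console.log(resolved.map((doc) => doc.name));
```

The server only has to track the small set of ids per client, while the bulky static documents never enter a merge box.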

However, the issue of sharing large, writeable collections among multiple clients is an engineering problem that needs to be solved by Meteor. There is probably a better way than keeping a copy of everything for each client, but that requires some serious thought as a distributed systems problem. The current issues of massive CPU and memory usage just won't scale.



Answer 3:

Here is an experiment you can use to answer this question:

  1. Create a test Meteor app: meteor create --example todos
  2. Run it under Webkit inspector (WKI).
  3. Examine the contents of the XHR messages moving across the wire.
  4. Observe that the entire collection is not moved across the wire.

For tips on how to use WKI check out this article. It's a little out of date, but mostly still valid, especially for this question.



Answer 4:

This question is a year old now, so I think the answers reflect pre-"Meteor 1.0" knowledge and things may have changed since. I'm still looking into this. http://meteorhacks.com/does-meteor-scale.html leads to a "How to scale Meteor?" article: http://meteorhacks.com/how-to-scale-meteor.html