Old documents are not resent? #41

Open
Streemo opened this issue May 30, 2015 · 5 comments

Streemo commented May 30, 2015

Hey @arunoda - First of all, I'm a big fan of your packages, I use a handful of them.

So I read this in your "Why?" section:

"When you are subscribing inside a Deps.autorun computation, all the subscriptions started on the previous computation will be stopped...
...Also, this will force the Meteor server to resend data you already had in the client. It will waste your server's CPU and network bandwidth."

I ran some basic tests on the pub/sub by observing the DDP messages on the client. Inside a Tracker.autorun, no duplicate documents are sent. This is with a bare-bones pub/sub on a Test collection. For example, if I subscribe to all docs $near myReactiveCoordinates, the computation is re-triggered when I move 10 feet, but the documents are mostly the same, and no duplicate added messages appear in the DDP logs.

Can you explain what you mean? I'm running Meteor 1.1.0.2, and it seems to be smart enough not to duplicate the data.
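
Roughly the kind of setup I mean (nearbyDocs, Geo.get(), and the loc field are placeholder names; the DDP logging uses Meteor's internal, undocumented _stream API):

```js
// client: log raw DDP messages (internal, undocumented API)
Meteor.connection._stream.on('message', function (msg) {
  console.log('DDP:', msg);
});

// client: resubscribe whenever the reactive coordinates change
Tracker.autorun(function () {
  Meteor.subscribe('nearbyDocs', Geo.get()); // Geo.get(): hypothetical reactive [lng, lat]
});

// server: publish docs near the given point (requires a 2dsphere index on loc)
Meteor.publish('nearbyDocs', function (coords) {
  return Test.find({
    loc: {
      $near: {
        $geometry: { type: 'Point', coordinates: coords },
        $maxDistance: 500 // meters
      }
    }
  });
});
```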

arunoda commented May 31, 2015

I suggest you read this article: https://meteorhacks.com/subscriptions-manager-is-here

Streemo commented May 31, 2015

@arunoda I've been toying around with the idea of caching data for a very common use case: pagination when there are many documents. Users go back and forth between pages very quickly, and each sub is only 20 docs. Using your package, I would be observing Posts.find(0 to 19) AND Posts.find(20 to 39) AND so on. Instead, I think it would be better to simply kill the first observer and re-observe the same query with limit N+20. This way, we only have one observer per user instead of M, one for each of the M pages the user has gone through.
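
A rough sketch of what I mean, with one growing limit instead of one sub per page (the posts name, fields, and page variable are placeholders):

```js
// client: one subscription whose limit grows with the page number
var page = new ReactiveVar(1);

Tracker.autorun(function () {
  // rerunning this stops the old sub and starts a wider one,
  // so the server only ever runs a single observer per user
  Meteor.subscribe('posts', { limit: 20 * page.get() });
});

// server
Meteor.publish('posts', function (opts) {
  check(opts, { limit: Number });
  // clamp the limit to protect the server from abusive clients
  return Posts.find({}, { sort: { createdAt: -1 }, limit: Math.min(opts.limit, 1000) });
});
```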

The problem I see with any of these solutions is that the app could become laggy on crappy clients.

Another solution: kind of like fast-render, but with older data. Don't cache old subscriptions; instead, cache document ids on the server, on a per-connection basis. Then, in the new publish function, we check for cached docs for that connection and run a self.added for each one, so the doc doesn't get removed from the client. However, the doc will no longer be observed by the server. The server is leaner because we only observe 20 docs at a time, but the client keeps the old docs to quick-render pages it was at before. Upon going back to an old page, the old data is loaded, the server runs the appropriate observe as dictated by the subscription, and the client receives new data updates as usual.
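
A rough sketch of the idea (idCache and posts.page are made-up names; a real version would need cache eviction, and would probably suppress removed for docs it wants to keep cached):

```js
// server: remember which doc ids each connection has already seen
var idCache = {}; // connectionId -> { docId: true, ... }

Meteor.publish('posts.page', function (skip) {
  var self = this;
  var seen = idCache[self.connection.id] = idCache[self.connection.id] || {};
  var published = {};

  // replay cached docs once so the client can quick-render old pages;
  // these docs are sent but NOT observed afterwards
  Object.keys(seen).forEach(function (id) {
    var doc = Posts.findOne(id);
    if (doc) {
      self.added('posts', id, _.omit(doc, '_id'));
      published[id] = true;
    }
  });

  // observe only the current page of 20 docs
  var handle = Posts.find({}, { skip: skip, limit: 20 }).observeChanges({
    added: function (id, fields) {
      seen[id] = true;
      if (published[id]) self.changed('posts', id, fields); // already replayed above
      else { published[id] = true; self.added('posts', id, fields); }
    },
    changed: function (id, fields) { self.changed('posts', id, fields); },
    removed: function (id) { self.removed('posts', id); }
  });

  self.ready();
  self.onStop(function () { handle.stop(); });
});
```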

Possible feature: cache entire docs on the server and then run a diff algorithm on each sub to see if anything needs to be updated, reducing bandwidth. Con: more CPU usage on the server.
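
For example, a shallow per-field diff (hypothetical helper; a real one would also handle removed fields):

```js
// return only the fields that differ between the cached and fresh copies,
// suitable for passing to self.changed(collection, id, diff)
function changedFields(cachedDoc, freshDoc) {
  var diff = {};
  Object.keys(freshDoc).forEach(function (key) {
    if (!EJSON.equals(cachedDoc[key], freshDoc[key])) diff[key] = freshDoc[key];
  });
  return diff;
}
```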

Con with all solutions: more memory usage on the client, which can get slow at around 10^3 documents.

Streemo commented May 31, 2015

@arunoda sorry, that was from my phone; a few typos.

I am interested in the "another solution"; imagine the following user situation:

  1. go to page 1, get top 20 posts.
  2. go to page 2, get posts 21-40.
  3. go back to page 1.
    • Regardless of the solution we choose, we need to have some data ready-to-render here, which was the whole point - a clean UX.
    • Using your solution, it can be page 1's data, fully up-to-date, since the subscription handles were never destroyed.
    • My idea: I do not believe background-updating data the user is NOT looking at is necessary - caching it, however, is. See below for my explanation. We can cache the old data and have the client render it when it resubscribes to the old page, if it exists. Then the real-time updates to the old data will come, since Meteor is informed that the user is now looking at the old page. Basic pub/sub.
    • My theory: To the average user, rendering OLD data and then updating it in the background is the same as rendering NEW data and then updating it in the background. The user just wants to see something of value instead of that damned loading circle. Either way: the user sees data immediately after requesting it, and it is real-time updated. Most users will not notice that the data had zero changes. If they do, they will assume that no changes were made since they were last there. Before they can think about it, it will be up-to-date. Combine this with the fact that sub rate is high, and no one will notice the difference at all.

Basically, we only need to tell the user about changes when they are actually looking at the content in question. This use case is very common for anyone with enough data to be paginated:

Pros with your package:

  • The user's data is cached so a quick-render can happen without waiting on the same data again.

Cons:

  • Meteor is under the impression that the user is at page X for all X in [1,N], where N is the number of pages the user has been to in the paginated scheme. So, it runs N observers here. But, taking into account my hypothesis, this is a waste of resources.
  • Lots of data being cached in the client can result in a laggy client for crappy computers.

New Package Pros:

  • The user's data is cached so a quick-render can happen without waiting on the same data again.
  • Meteor does not observe N queries for each user, instead it observes 1 query per user.

Cons:

  • Lots of data being cached in the client can result in a laggy client for crappy computers.
    • Manually diffing newly observed data against the cache could eat CPU; I'm not sure whether this would be better than having N observers.

As a side note, I am wondering if there's a way to sample the client's RAM and CPU and then decide how many docs to cache based on that.
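
One rough possibility, using optional browser hints (navigator.deviceMemory and navigator.hardwareConcurrency are not available in every browser, and the scaling formula here is just a guess):

```js
// pick a client-side cache size from coarse hardware hints
function pickCacheSize() {
  var mem = navigator.deviceMemory || 2;          // approximate GB of RAM
  var cores = navigator.hardwareConcurrency || 2; // logical CPU cores
  return Math.min(1000, Math.round(100 * mem * Math.max(1, cores / 2)));
}
```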

arunoda commented Jun 2, 2015

There are two main reasons we developed subs-manager:

  1. To give fast page switching (data caching)
  2. To reduce the server load

You might be interested in point 2. Subscribing and unsubscribing are among the costly operations on the server, so with subs-manager we stopped doing that. For some apps, that leads to an over 50% reduction in server load.
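
For reference, typical subs-manager usage looks roughly like this (the postsList name and currentPage variable are placeholders):

```js
// client: cache subscriptions instead of stopping/restarting them
var subs = new SubsManager({
  cacheLimit: 10, // keep up to 10 subscriptions alive
  expireIn: 5     // minutes before an unused subscription is stopped
});

Tracker.autorun(function () {
  // switching back to a recent page reuses the cached subscription,
  // so the server skips the costly unsub/resub cycle
  subs.subscribe('postsList', currentPage.get());
});
```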

I get your point about background updates. I think that's a different problem to address. Still, we need to benchmark the server and client costs.

Streemo commented Jun 2, 2015

What about the case of pagination, which leads to either overlapping observers, or many small observers observing what a single larger observer could?

I am also curious about the CPU and RAM cost of subscribing vs. the CPU and RAM cost of query observing...

I will check your site; I think you have a study on this, IIRC.
