Revise topic subscription quota
Closed, Resolved · Public

Description

In T263817#7039894, we introduced a limit that prevents people from being subscribed to more than 5,000 distinct conversations per the guidance the Data Persistence team shared in T263817#7040637. This task is about increasing this quota if/when we notice people coming close to it.

Open questions

  • What – if any – concerns does the Data Persistence Team have about removing the quota?

Event Timeline

ppelberg updated the task description.

We will prioritize work on this task if/when we notice people coming close to reaching the 5,000 distinct discussion subscription quota.

Note: during the team's 4-November standup, we decided to revisit this quota proactively rather than wait until we have to remove it urgently out of necessity.

Change 747535 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/DiscussionTools@master] Move user subscription limit to config

https://gerrit.wikimedia.org/r/747535

Change 747535 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Move user subscription limit to config

https://gerrit.wikimedia.org/r/747535
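A minimal sketch of what a config-driven subscription limit could look like, written in Python for illustration only. The names `DEFAULT_SUBSCRIPTION_LIMIT` and `can_subscribe` are hypothetical; this task does not state the actual setting name or code path used in DiscussionTools.

```python
from typing import Optional

# Hypothetical config value: 5000 matches the quota described in this task.
# None is used here (as an assumption) to mean "no quota at all".
DEFAULT_SUBSCRIPTION_LIMIT: Optional[int] = 5000


def can_subscribe(current_count: int,
                  limit: Optional[int] = DEFAULT_SUBSCRIPTION_LIMIT) -> bool:
    """Return True if a user with current_count subscriptions may add one more."""
    if limit is None:
        # Quota disabled via config.
        return True
    return current_count < limit
```

Keeping the limit in configuration rather than in code means it can be raised or disabled per wiki without a code change.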

Change 761986 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] Remove limit on the number of topic subscriptions per user

https://gerrit.wikimedia.org/r/761986

The most enthusiastic users on enwiki have exceeded 500 subscriptions. Users on a few other projects I checked randomly have fewer than that, around 100.

Hi, I'm from the Data Persistence team and I'll be looking into this.

So the table at this size is definitely fine: in enwiki it's only 11MB and the growth is small. But I'm worried about its future growth, which will be much harder to fix later (like now it's really hard to fix templatelinks). So it's better to clean it up before it gets out of hand.

You have two ways to reduce the size:

  • vertically: Remove the title string column? Why page_id is not enough?
  • horizontally: Remove subscriptions older than a year or two.

I don't know how feasible (technically and PM-wise) it is to make those changes, but this work is quite important and I foresee it will get a lot of users (thank you for building it!).

I don't think we should block the change on fixing these scalability concerns, but it is blocked on having a clear roadmap for fixing them.
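The "horizontal" option above (dropping subscriptions older than a year or two) can be sketched as follows. This is an illustration under stated assumptions, not the extension's schema: the table name `subscriptions`, its columns, and the 14-digit `YYYYMMDDHHMMSS` timestamp strings are all hypothetical, and SQLite stands in for the production database.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune_old_subscriptions(conn: sqlite3.Connection,
                            now: datetime,
                            max_age_days: int = 730) -> int:
    """Delete subscriptions created more than max_age_days ago; return rows removed."""
    cutoff = (now - timedelta(days=max_age_days)).strftime("%Y%m%d%H%M%S")
    # Fixed-width timestamp strings compare correctly as plain text.
    cur = conn.execute("DELETE FROM subscriptions WHERE created < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE subscriptions (user_id INTEGER, item TEXT, created TEXT)"
    )
    conn.executemany(
        "INSERT INTO subscriptions VALUES (?, ?, ?)",
        [
            (1, "topic-a", "20190101000000"),  # older than two years
            (1, "topic-b", "20220101000000"),  # recent
        ],
    )
    removed = prune_old_subscriptions(
        conn, datetime(2022, 4, 1, tzinfo=timezone.utc)
    )
    remaining = [row[0] for row in conn.execute("SELECT item FROM subscriptions")]
    return removed, remaining
```

On a large production table a delete like this would normally be run in small batches rather than as one statement; the single statement here only keeps the sketch short.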

One more thing: once this is enabled everywhere, please make sure bots don't get subscribed to discussions they start. Lots of bots create talk page discussions, and if you automatically subscribe the creators of discussion sections, you might end up with a table where 99% of the rows are bots subscribed to sections.

vertically: Remove the title string column? Why page_id is not enough?

I think our logic here was "this is a table that's just like watchlist, let's do exactly what watchlist does".

Once this is enabled everywhere, please make sure bots don't get subscribed to discussions they start.

As-is, automatic subscription only happens for users who're using our APIs, so this won't be a problem unless bot-authors start migrating to new APIs. I presume that we could add a check on just $user->isBot() to avoid this.
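The guard suggested above amounts to a one-line condition. A rough sketch, in Python rather than the extension's PHP, with a hypothetical `User` class whose `is_bot` field stands in for `$user->isBot()`:

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical stand-in for a MediaWiki user object.
    name: str
    is_bot: bool = False


def should_auto_subscribe(user: User, started_topic: bool) -> bool:
    """Auto-subscribe only non-bot users to topics they start."""
    return started_topic and not user.is_bot
```

Bots could still be subscribed explicitly under this sketch; only the automatic subscription on topic creation is skipped.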

vertically: Remove the title string column? Why page_id is not enough?

I think our logic here was "this is a table that's just like watchlist, let's do exactly what watchlist does".

Yes but:

  • The watchlist table is one of the biggest tables in our infra and needed a lot of love (including deleting millions of bots' watchlist rows)
  • watchlist is per page, this is per section (am I understanding it correctly?), so in theory it can get pretty big.
  • The reason watchlist keeps the title string is that you can watch a non-existing page (e.g. for recreation by spammers). I'm not sure if that would be the case here. This also helps in page moves etc.

Once this is enabled everywhere, please make sure bots don't get subscribed to discussions they start.

As-is, automatic subscription only happens for users who're using our APIs, so this won't be a problem unless bot-authors start migrating to new APIs. I presume that we could add a check on just $user->isBot() to avoid this.

There are lots of bots that create discussion threads.

This is true, but I doubt they're using the new APIs, so it'd at least be a gradual increase over time rather than a rapid surge. 😁 (That said: I have a patch up.)

That's even worse, we don't know it's growing until it's too late and trust me, clean ups are hard 😭 (see T289249 and T296380).

Thanks for fixing that. What do you think about the page title column? (Note that if you go with page id, you don't need to make sure it changes after page moves.)

We actually want it to display the original title after page moves. This was an intentional decision: T264885#6586037.

So the table at this size is definitely fine, in enwiki it's only 11MB and the growth is small but I'm worried about its growth in future and that will be much harder to fix (like now it's really hard to fix templatelinks). So it's better to clean it before it gets out of hand.

You have two ways to reduce the size:

  • vertically: Remove the title string column? Why page_id is not enough?
  • horizontally: Remove subscriptions older than a year or two.

I don't know how feasible (technically and PM-wise) is to do those changes but this work is quite important and I foresee it will get a lot of users (thank you for building it!).

I don't think we should block the change on fixing these scalability concerns but it's blocked on having a clear roadmap of fixing those concerns.

I am sorry, but I do not understand your worry at all. Can you elaborate why we should be concerned?

My understanding is:

  • Storing 11 MB of data, into eternity and in triplicate, costs like… five dollars
  • The simple queries we run on this data would still be just as fast even if the table was 100000x the size

Is this incorrect? Can you give me a better intuition?

Given the above, I see no reason to remove old subscriptions from this table. I can easily imagine users referring back to their subscriptions a few years later, e.g. to find a discussion they remember participating in. I do that often myself on Phabricator.

I am sorry, but I do not understand your worry at all. Can you elaborate why we should be concerned?

My understanding is:

  • Storing 11 MB of data, into eternity and in triplicate, costs like… five dollars
  • The simple queries we run on this data would still be just as fast even if the table was 100000x the size

Is this incorrect? Can you give me a better intuition?

Well, yes, it is incorrect. Databases are much more complicated than that. I'll try to explain; I'm sorry if I repeat something you already know.

Putting a cost on data heavily depends on the architecture and on how you store and read that data. The cost of maintaining an 11MB table in a core database is much higher than five dollars, whereas in media storage (swift), where you don't have that many reads, it is much cheaper.

Data in core databases is stored per section (enwiki gets its own section; let's stick to enwiki for now), where there is a primary and around 20-30 replicas (across both DCs). Five to ten replicas serve live traffic. A replica serving traffic must serve as much as possible from the memory cache and, given the load we have, we can't read rows from disk in more than 0.01% of cases. A standard db host, which is extremely beefy and quite expensive, has around 350GB of memory for the cache (plus 150GB for temp tables, and the remainder of the 512GB for everything else). That means the data hotter than 0.01% can't exceed 350GB across the whole database, including the revision table, rc table, etc., and those are heavily read. Keep in mind this is a vertical scalability limitation: there is nothing we can do hardware-wise, because this does not scale horizontally.

The number that you gave (100000x) would yield 1.05TB, which is way higher than the 350GB limit (ignoring the fact that we also need to serve rc and other data) and would definitely bring everything down. Of course, when you store data in a different way, for example revision texts in ExternalStorage, there are ways to make it scale horizontally, which is expensive but possible. That's not the case here.
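The arithmetic behind that figure can be checked quickly. This sketch assumes binary units (1 TB = 1024 GB = 1024² MB), which is what reproduces the 1.05TB number; the variable names are only illustrative.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

table_bytes = 11 * MIB      # current table size on enwiki, per the comment above
cache_bytes = 350 * GIB     # memory available for the cache on one db host
factor = 100_000            # the growth factor under discussion

projected = table_bytes * factor
projected_tib = projected / TIB          # about 1.05
ratio_to_cache = projected / cache_bytes  # about 3.07

# Largest growth factor at which this one table alone would still fit
# in the cache, ignoring every other table that must share it.
max_factor = cache_bytes // table_bytes

print(f"projected size: {projected_tib:.2f} TiB")
print(f"times the cache size: {ratio_to_cache:.2f}")
print(f"max growth factor for this table alone: {max_factor}")
```

Even with the entire cache dedicated to this table, it would stop fitting at roughly 32,000x its current size, well short of 100000x.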

Again, 11MB in itself is not a problem, but it will grow into a big problem if you let it grow like this. We are already doing a lot of work to compact the database (including the actor and comment migration, which is not done yet btw, and then the links tables), but that just keeps wikis (especially commons, which is in serious trouble atm) from exploding, since the current growth path leads to them being completely unusable in a year or two. We don't have capacity to add more in that space, and I'm sorry, this is neither a $5 problem nor scalable to 100000x the size.

Next steps

  • 1. @matmarex and @Ladsgroup to talk about whether removing the subscriptions is indeed a non-starter
  • If "1." proves to be true, we'll need to scope work for deciding what happens when people approach/reach the 5,000 subscription quota.

Outcome of the conversation:

  • @Ladsgroup is fine with the changes proposed in this task, but would like us to have a plan to make some of the improvements suggested in T294881#7707795 (even if we won't have the time to actually make those changes, just a plan is needed)
  • Therefore I filed T306199, which describes a plan to remove some columns from the table, which would become possible after the work on T296801 is completed
  • We can remove the limit on topic subscriptions

Change 761986 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Remove limit on the number of topic subscriptions per user

https://gerrit.wikimedia.org/r/761986

ppelberg claimed this task.