Track metrics on Portuguese Wikipedia relating to IP-editing turn off
Closed, Resolved (Public)

Description

Goal

Recently ptwiki turned off editing for IP editors. This has been done using an AbuseFilter for the time being. We should monitor some metrics to see how this impacts the health of the project long term.

@jwang recommended keeping these metrics in a notebook that auto-refreshes periodically. This will allow the metrics to be shared more broadly.

To begin with, this could include:

  • Number of active editors
  • Number of edits
  • Number of blocks
  • Number of reverts
  • Number of accounts created
  • Retention rate
  • Checkuser checks

In future we can include:

  • Quality of edits with ORES
Delivery:
Weekly report

link: https://analytics.wikimedia.org/published/notebooks/AHT/ptwiki_dashboard.html

Covered metrics:

  • Number of active editors
  • Number of edits
  • Number of blocks
  • Number of reverts
    • Definition 1: number of edits reverted by the snapshot time
    • Definition 2: number of edits reverted within 48 hours
  • Number of accounts created
  • Retention rate
  • Checkuser checks
  • Number of non-reverted edits
    • Definition 1: number of edits which were not reverted by the snapshot time
    • Definition 2: number of edits which were not reverted within 48 hours
    • Definition 3: number of content edits which were not reverted within 48 hours, excluding bot edits (https://phabricator.wikimedia.org/T273518)
    • Definition 4: number of edits which were not reverted within 48 hours, excluding bot edits (https://phabricator.wikimedia.org/T273518)
  • Number of edits by non-bot vs bot registered users
  • Quality of edits with ORES

Observations by the end of November 2020 (week 48):
After turning off IP editing on ptwiki, we saw:

  • a 57% YoY increase in active registered editors
  • a 20% YoY increase in new accounts
  • a 10% YoY decrease in total edits
  • a 50% YoY decrease in reverts
  • a 3% YoY decrease in non-reverted edits
  • an 85% YoY decrease in blocks
  • a 3% YoY decrease in non-reverted content edits excluding bot edits. (https://phabricator.wikimedia.org/T273518)
  • a 7% YoY decrease in non-reverted edits excluding bot edits. (https://phabricator.wikimedia.org/T273518)
Summary report:

link: https://meta.wikimedia.org/wiki/IP_Editing:_Privacy_Enhancement_and_Abuse_Mitigation/Impact_report_for_Login_Required_Experiment_on_Portuguese_Wikipedia

Event Timeline


I made some queries to get some initial results:

Number of active editors: https://quarry.wmflabs.org/query/48869. This is not the same definition of active editors that the research team uses, but it shows that ptwiki saw an increase in active users after registration became mandatory for editing, in the middle of October 4th.

Number of edits: "Edits per day" graph tool. I created that tool with a query similar to the one I used for active users. The graph shows that IPs used to make approximately 1700 edits per day. After registration became mandatory, edits by new users rose by approximately 700 per day (from ~700 to ~1400), which suggests that about 700 of the edits formerly made by IPs are now being made by newly registered users and about 1000 are no longer being made.

Number of blocks and accounts created: https://quarry.wmflabs.org/query/48865. This shows an increase in account creation (almost doubled), no apparent significant change in total blocks, and an apparent increase in blocks of registered users, which was expected, since some users who would have made bad edits as IPs are now making them as registered users.

Number of reverts: https://quarry.wmflabs.org/query/48872. I used the standard revert edit summary pattern to get this data, as there is no easy way to get revert data. Relying on the summary makes the data imprecise, but it gives us an idea that reverts have decreased, from an average of approximately 250 per day to approximately 100 per day.

Quality of edits with ORES: https://quarry.wmflabs.org/query/48860. I used the ORES damaging model to estimate the proportion of damaging edits. The data shows that it decreased from approx. 18% to approx. 7%. That suggests that the approx. 1000 edits per day that are no longer being made by IPs were worse than the approx. 700 that are now being made by newly registered users.

I hope this data is useful. You can fork these queries to get updated data.

Number of edits: "Edits per day" graph tool. I created that tool with a query similar to the one I used for active users. The graph shows that IPs used to make approximately 1700 edits per day. After registration became mandatory, edits by new users rose by approximately 700 per day (from ~700 to ~1400), which suggests that about 700 of the edits formerly made by IPs are now being made by newly registered users and about 1000 are no longer being made.
(...)
Quality of edits with ORES: https://quarry.wmflabs.org/query/48860. I used the ORES damaging model to estimate the proportion of damaging edits. The data shows that it decreased from approx. 18% to approx. 7%. That suggests that the approx. 1000 edits per day that are no longer being made by IPs were worse than the approx. 700 that are now being made by newly registered users.

With the data provided, is it possible to know if, for example, the situation would not be similar to the following?

  • Damaging edits prevented: 250
  • Damaging edits kept: 50
  • Good edits prevented: 750
  • Good edits kept: 650

With these subdivisions we would have the same decrease in proportion of damaging edits from 18% (~300/1700) to 7% (~50/700). And if that were the case, ptwiki would be giving up on more than half the good edits (~54% = 750÷(750+650)) made by unregistered users when preventing those 1000 edits (750 good and 250 damaging). [crossposted to https://pt.wikipedia.org/w/index.php?diff=59559093]
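The arithmetic behind this hypothetical split can be checked with a short sketch. The four counts are the illustrative figures from the list above, not measured data, and the variable names are chosen here only for illustration.

```python
# Hypothetical daily split of the ~1700 IP edits, taken from the list above.
damaging_prevented = 250
damaging_kept = 50
good_prevented = 750
good_kept = 650

# Totals before and after mandatory registration under this scenario.
before_total = damaging_prevented + damaging_kept + good_prevented + good_kept
after_total = damaging_kept + good_kept

# Share of damaging edits before and after.
damaging_rate_before = (damaging_prevented + damaging_kept) / before_total
damaging_rate_after = damaging_kept / after_total

# Share of good edits that would be lost.
good_lost_share = good_prevented / (good_prevented + good_kept)

print(f"before: {damaging_rate_before:.1%}")   # before: 17.6%
print(f"after: {damaging_rate_after:.1%}")     # after: 7.1%
print(f"good edits lost: {good_lost_share:.1%}")  # good edits lost: 53.6%
```

The same drop in the damaging rate is therefore compatible with losing a majority of the good edits, which is why the proportion alone cannot distinguish the two scenarios.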

One interesting thing to consider is stats for experienced editors. One of the main arguments for banning IPs was that freeing experienced editors from reviewing IP edits would enhance their productivity by giving them more time to create and expand articles. So the number of revert, protect, or block actions will most likely go down, but other types of edits should go up.

Abuse filter stats would also be interesting to monitor, number of actions prevented in particular.

These measure the number of IPs triggering that particular filter. I was talking about monitoring all filters, particularly the ones that prevent edits by newcomers, not only IPs.

The weekly summary is published at: https://analytics.wikimedia.org/published/notebooks/AHT/ptwiki_dashboard.html
Metrics covered are:

  • Number of active editors
  • Number of edits
  • Number of blocks
  • Number of reverts
  • Number of accounts created
  • Retention rate
  • Checkuser checks

The report is refreshed weekly.

Reviewed with @kzimmerman and @Niharika , added a few more metrics and changes based on their comments.

Added non-reverted edits to measure how many non-reverted edits we lose after turning off IP editing. I evaluated this using the two revert definitions that product teams usually use. The overall revert count is the number of edits that had been reverted by the snapshot time. Some product teams prefer to measure reverts within 48 hours of the edit. I measured reverts and non-reverted edits on ptwiki using both methods. The data shows that on ptwiki around 60% of reverts happened within 48 hours of the edit, and that the number of non-reverted edits decreased after IP editing was turned off.
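As a rough illustration of how the two revert definitions differ, here is a minimal sketch. It assumes each edit is available as a pair of (edit time, revert time or None); the function name, record layout, and sample timestamps are hypothetical and are not taken from the actual notebook.

```python
from datetime import datetime, timedelta

REVERT_WINDOW = timedelta(hours=48)

def count_reverts(edits, snapshot_time):
    """Count edits under both revert definitions.

    edits: list of (edit_time, revert_time) tuples; revert_time is None
    when the edit had not been reverted. Hypothetical record layout.
    """
    reverted_by_snapshot = 0
    reverted_within_48h = 0
    for edit_time, revert_time in edits:
        # Reverts that happened after the snapshot are not visible in it.
        if revert_time is None or revert_time > snapshot_time:
            continue
        reverted_by_snapshot += 1
        if revert_time - edit_time <= REVERT_WINDOW:
            reverted_within_48h += 1
    total = len(edits)
    return {
        "total": total,
        "reverted_by_snapshot": reverted_by_snapshot,
        "reverted_within_48h": reverted_within_48h,
        "non_reverted_by_snapshot": total - reverted_by_snapshot,
        "non_reverted_within_48h": total - reverted_within_48h,
    }

# Made-up sample: never reverted, reverted after 5 hours, reverted after 10 days.
sample_edits = [
    (datetime(2020, 11, 1, 10), None),
    (datetime(2020, 11, 2, 10), datetime(2020, 11, 2, 15)),
    (datetime(2020, 11, 3, 10), datetime(2020, 11, 13, 10)),
]
counts = count_reverts(sample_edits, snapshot_time=datetime(2020, 12, 1))
print(counts)
```

An edit reverted after ten days counts as reverted under the snapshot definition but as non-reverted under the 48-hour definition, so the second definition always yields at least as many non-reverted edits as the first.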

I also made a few changes to the graphs: marked the turn-off date, changed the graph title from non-bot editors to non-bot registered editors, and added a zoomed-in graph for blocks.

Observations by the end of November 2020
After turning off IP editing on ptwiki, we saw:

  • an increase in active registered editors
  • an increase in new accounts
  • a decrease in total edits
  • a decrease in reverts
  • a decrease in non-reverted edits
  • a decrease in blocks
jwang updated the task description.
kaldari renamed this task from Track metrics on ptwiki relating to IP-editing turn off to Track metrics on Portuguese Wikipedia relating to IP-editing turn off. Dec 22 2020, 11:20 PM
kaldari added subscribers: kaldari, JJMC89, DannyH.
Quality analysis of edits with ORES score.

I explored ORES damage scores of edits on ptwiki. Here are my findings.

If we measure the rate of damaging edits identified by the ORES model per total edits by all editors (IP editors, registered bot editors, and non-bot editors), we see a significant drop since the 40th week of 2020.

image.png (674×1 px, 129 KB)

However, the ORES model is known for being biased against IP editors. (Reference: https://arxiv.org/pdf/2006.03121.pdf). It is not surprising that the damages/edits rate has dropped since the 40th week, when IP editing was turned off. To get an apples-to-apples comparison, I would exclude IP editors from the baseline.

Meanwhile, I measured the ORES scores of edits by bot editors. The ORES model is friendly to bots and usually does not mark bot edits as damaging. However, the number of bot edits fluctuates dramatically from month to month because of how the bots operate. To reduce the fluctuation in monthly trends, I will exclude bot editors from the analysis.

Here is the Damage Rate of non-bot editors.

image.png (666×1 px, 126 KB)

We can see that for non-bot edits, the damages/edits rate has increased since the 40th week. This is mainly because the share of newcomers increased, and the damages/edits rate is usually high for newcomers.

image.png (662×1 px, 127 KB)

image.png (622×1 px, 97 KB)

As a next step, we will keep monitoring the damage rate over the following few months to see whether it decreases as newcomers become veterans.
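The comparison described in this comment (damage rate over all editors versus non-bot registered editors only) can be sketched as follows. The record layout, the editor type labels, the 0.5 score cutoff, and the sample scores are all assumptions made for illustration; they are not the notebook's actual query or the threshold it uses.

```python
from collections import defaultdict

# Assumed cutoff on the ORES damaging probability; the real analysis may differ.
DAMAGING_THRESHOLD = 0.5

def damage_rate_by_week(edits, include_types):
    """Return {week: damaging edits / total edits} for the chosen editor types.

    edits: iterable of (week, editor_type, damaging_score) tuples
    (hypothetical layout).
    """
    totals = defaultdict(int)
    damaging = defaultdict(int)
    for week, editor_type, score in edits:
        if editor_type not in include_types:
            continue
        totals[week] += 1
        if score >= DAMAGING_THRESHOLD:
            damaging[week] += 1
    return {week: damaging[week] / totals[week] for week in sorted(totals)}

# Made-up sample: week 39 is before the turn-off, week 41 is after it.
sample = [
    (39, "ip", 0.9), (39, "ip", 0.2), (39, "registered", 0.1), (39, "bot", 0.0),
    (41, "registered", 0.7), (41, "registered", 0.1), (41, "bot", 0.0), (41, "bot", 0.0),
]

all_rate = damage_rate_by_week(sample, {"ip", "registered", "bot"})
non_bot_registered_rate = damage_rate_by_week(sample, {"registered"})
print(all_rate)
print(non_bot_registered_rate)
```

Changing the set of editor types in the baseline changes the trend, which is why the choice of which groups to exclude needs to be stated alongside any damage-rate figure.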

However, the ORES model is known for being biased against IP editors.

@jwang - This seems like an extremely important detail. Can you elaborate on it? I skimmed through the paper you cited, but didn't see anything about it.

However, the ORES model is known for being biased against IP editors. (Reference: https://arxiv.org/pdf/2006.03121.pdf). It is not surprising that the damages/edits rate has dropped since the 40th week, when IP editing was turned off. To get an apples-to-apples comparison, I would exclude IP editors from the baseline.

This is interesting. It could be useful at some point to rebuild the ORES model with IP editing turned off (even if IP edits are eventually turned back on). From anecdotal experience, though, some types of (more obvious) vandalism are no longer occurring, which could at least partially explain the decrease in edits marked as damaging.

Observations by the end of November 2020
After turning off IP editing on ptwiki, we saw:

  • an increase in active registered editors
  • an increase in new accounts
  • a decrease in total edits
  • a decrease in reverts
  • a decrease in non-reverted edits
  • a decrease in blocks

@jwang - If you have time, could you add percentages to these observations? A 1% increase is a lot different than a 500% increase. It would be good to get an idea of the magnitudes involved (and maybe incorporate the data since November as well).

However, the ORES model is known for being biased against IP editors.

@jwang - This seems like an extremely important detail. Can you elaborate on it? I skimmed through the paper you cited, but didn't see anything about it.

Thanks for asking about this! As I was the one that supplied Jennifer with this information, I dug a little further into things to make sure I wasn't perpetuating incorrect information.

I also skimmed through the linked paper, where I was hoping to learn more about it, since I remember talking to Nate (@Groceryheist, the main author) about ORES' dependency injection. Studying the paper more closely, I see that the research questions they're answering and the methodology don't focus on that. However, it was positive to see that their findings indicate that the ORES-driven RC filters mitigate some of the bias issue.

Digging a little further, I found @Halfak's collection of links about the bias. It's mainly about how the SVC classifier had problematic behaviour when making predictions for non-registered editors, and that this was mitigated by switching to a GradientBoost classifier. However, as Aaron points out in the slides and presentation about it, the problem doesn't go away completely. I'd argue that it's because the underlying data reflects what the arXiv pre-print refers to as "over-profiled users": non-registered edits are scrutinized more closely and thereby reverted at a higher rate than edits by registered editors. Aaron or Nate might have more insight into this though, hence the pings.

@kaldari thanks for the suggestion. I have added a quantitative summary to the description.
Observations by the end of November 2020 (week 48)
After turning off IP editing on ptwiki, we saw:

  • a 57% YoY increase in active registered editors
  • a 20% YoY increase in new accounts
  • a 10% YoY decrease in total edits
  • a 50% YoY decrease in reverts
  • a 3% YoY decrease in non-reverted edits
  • an 85% YoY decrease in blocks
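For readers unfamiliar with the YoY figures, the calculation is a plain relative change against the same period one year earlier. The sketch below uses made-up counts purely to show the formula; the function name is hypothetical and the numbers are not ptwiki data.

```python
def yoy_change(current, previous):
    """Relative year-over-year change: (current - previous) / previous."""
    if previous == 0:
        raise ValueError("previous-year value must be non-zero")
    return (current - previous) / previous

# Made-up weekly counts, for illustration only.
increase = yoy_change(current=230, previous=200)
decrease = yoy_change(current=90, previous=100)

print(f"{increase:+.0%}")  # +15%
print(f"{decrease:+.0%}")  # -10%
```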

a 3% YoY decrease in non-reverted edits

From the graphs it seems most of that is from bot edits (perhaps automated warnings?). Could we break out content edits separately as well?

@jwang, it may be interesting to insert another graphic, indicating the total number of protected pages in each week. What do you think?

Hi @jwang. Great job on this! I was wondering: when discounting reverted edits, shouldn't you also discount the reverting edits themselves? I mean, if people don't make the edits that get reverted, the reverts would not be needed either.

@GoEThe, good point.

Currently, the non-reverted edits metric measures the number of 'good' edits by human users. It makes sense to also check the edit count after excluding revert edits. That would measure the number of edits by human users, excluding edits made in response to vandalism. I have created a ticket (T278587) to track it.

I wonder if we could have a version for other wikis as well (e.g. enwiki as the largest one, jawiki as a sample for wikis with more unregistered users), probably on a separate page.
The data we currently have allows a comparison of ptwiki before and after editing by IP editors was turned off; having other wikis would allow us to make sense of ptwiki data before turning off editing by IP editors, especially on how similar/different the situation was between ptwiki and other wikis, so it is easier to assess whether the ptwiki experience could be applied on other wikis.

As for the implementation, there could be some difference in namespace ids, but otherwise I suppose it is mainly substituting "ptwiki" with other wiki names.

I wonder if we could have a version for other wikis as well (e.g. enwiki as the largest one, jawiki as a sample for wikis with more unregistered users), probably on a separate page.
The data we currently have allows a comparison of ptwiki before and after editing by IP editors was turned off; having other wikis would allow us to make sense of ptwiki data before turning off editing by IP editors, especially on how similar/different the situation was between ptwiki and other wikis, so it is easier to assess whether the ptwiki experience could be applied on other wikis.

There is going to be an RFC on en.Wiki soon and this data would be essential. Is this being prepared? Is there an ETA on it?

I wonder if we could have a version for other wikis as well (e.g. enwiki as the largest one, jawiki as a sample for wikis with more unregistered users), probably on a separate page.
The data we currently have allows a comparison of ptwiki before and after editing by IP editors was turned off; having other wikis would allow us to make sense of ptwiki data before turning off editing by IP editors, especially on how similar/different the situation was between ptwiki and other wikis, so it is easier to assess whether the ptwiki experience could be applied on other wikis.

As for the implementation, there could be some difference in namespace ids, but otherwise I suppose it is mainly substituting "ptwiki" with other wiki names.

@Kudpung @patilise This request should have its own phabricator ticket.

The Portuguese Wikipedia metrics report is now published on Meta.

I brought this up with @jwang. We won't be able to have a continuously updating dashboard up because of technical and capacity constraints but we can consider running one-off updates if that would be helpful.