[SPIKE] Determine what – if any – new instrumentation is needed for notifications
Closed, ResolvedPublic

Description

This task is about determining what – if any – new instrumentation is needed for us to measure the impact of the notification interventions being implemented (T273920) as part of the Talk pages project.

Background

Work on this task will begin once we know:

  1. The experiments we will run to evaluate the impact of the notification interventions and
  2. The metrics we will use in these "experiments"

"1." and "2." will be finalized once the notifications measurement plan (T274215) is drafted.

Instrumentation changes

The instrumentation changes needed to evaluate the impact of the notification interventions we have planned (T273920) are documented in the Topic Notifications/Instrumentation Spec.

Open questions

  • 1. How are we – collectively – defining "topics," "comments," and "replies"?
  • 2. Which of the below do we care to know/measure?
    • A. ✅ When/if someone posts a comment in response to a new topic 'you've' started?
      • We will potentially use this information to evaluate the impact of this feature.
        • Emphasis placed on "potentially" above to signal the possibility that T263821 could be de-scoped.
    • B. ✅ When/if someone comments in a section after you have already commented in it?
      • We will not use this information to evaluate the impact of this feature; however, we will use it as a "Curiosity" to understand how topic subscriptions might be impacting the cadence of conversations. This has been added as "Curiosity #3" in T280895's task description.
    • C. ✅ When/if someone replies directly to a comment you've written?
      • We will use this data as an input to our KPI metric (T280895) in the scenario that T263821 is de-scoped.
  • 3. How will we relate comments to sections?
    • Note: this assumes our interest is limited to comments and topics, NOT replies.
    • This question will be answered in T280100.
  • 4. How will we relate replies to comments and comments to sections?
    • Note: this assumes we are interested in topics, comments and replies posted in direct response to comments.
    • This question will be answered in T284200.
  • 5. Is it a priority to be able to track the full "lifecycle" of the notification flow?
    • Where "Notification flow" = notification is sent (start) → response is posted (end).
    • No; the rationale can be found in T277349#7087010. Instead, we would like to be able to track/define the notification flow as follows:
      • 1) Notification sent
      • 2) Notification delivered
      • 3) Notification opened/read
      • 4) Notification interacted with
  • 6. For new notification instrumentation (read: Echo), to what schema will these new events be added? An existing schema? A new schema?
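
A minimal sketch of how the four-step notification flow from question 5 could be summarized as a funnel. The `(notification_id, stage)` pair layout and the stage strings are assumptions made for illustration; they are not field names from Echo or EchoInteraction.

```python
# Stages of the notification flow, in the order listed under question 5.
STAGES = ["sent", "delivered", "opened", "interacted"]


def funnel_counts(events):
    """Count the distinct notifications that reached each stage.

    `events` is an iterable of (notification_id, stage) pairs.
    Both the pair layout and the stage names are hypothetical.
    """
    reached = {stage: set() for stage in STAGES}
    for notification_id, stage in events:
        if stage in reached:  # ignore stages outside the defined flow
            reached[stage].add(notification_id)
    return {stage: len(reached[stage]) for stage in STAGES}


example_events = [
    (1, "sent"), (1, "delivered"), (1, "opened"), (1, "interacted"),
    (2, "sent"), (2, "delivered"),
    (3, "sent"),
]
print(funnel_counts(example_events))
```

Counting distinct notification ids per stage, rather than raw events, keeps a notification that is opened twice from inflating the "opened" step.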

Done

  • All "Open questions" above are answered and documented in the ticket
  • The table in the Instrumentation changes section above is populated with the new instrumentation and/or changes to existing instrumentation that is needed to evaluate the impact of the notification interventions we have planned.

Event Timeline

matmarex renamed this task from [SPIKE] Determine what – if any – new instrumentation is needed to [SPIKE] Determine what – if any – new instrumentation is needed for notifications. Mar 18 2021, 4:48 PM
MNeisler triaged this task as Medium priority. Mar 18 2021, 8:47 PM
MNeisler added a project: Product-Analytics.
MNeisler moved this task from Triage to Upcoming Quarter on the Product-Analytics board.

I drafted a list of events that need to be tracked in order to calculate the metrics outlined in the measurement plan. Please see the "Events to be tracked" column in the instrumentation spec document.

Note: I'm currently using the instrumentation spec as my working document but can add these to the task description once finalized.

Notes from meeting with @MNeisler on 5-May
During today's conversation with Megan, the following open questions emerged. They are also now represented in the task's description.

Open questions

  • 1. How are we – collectively – defining "topics," "comments," and "replies"?
  • 2. Which of the below do we care to know/measure?
    • A. When/if someone posts a comment in response to a new topic 'you've' started?
    • B. When/if someone comments in a section after you have already commented in it?
    • C. When/if someone replies directly to a comment you've written?
  • 3. How will we relate comments to sections?
    • Note: this assumes our interest is limited to comments and topics, NOT replies.
  • 4. How will we relate replies to comments and comments to sections?
    • Note: this assumes we are interested in topics, comments and replies posted in direct response to comments.
  • 5. Is it a priority to be able to track the full "lifecycle" of the notification flow?
    • Where "Notification flow" = notification is sent (start) → response is posted (end).

Next steps

  • Editing + Product Analytics to meet to arrive at answers to the "Open questions" above.

Provisional/partial answers to questions 1., 2., and 5. from T277349#7064137, below.

Next step

  • Answer and/or create a plan for finalizing the answers to all questions when @DLynch, @MNeisler, and I meet the week of 24-May.

Open questions

  • 1. How are we – collectively – defining "topics," "comments," and "replies"?

Here is what I currently understand the above to mean:

  • Topic: an edit to a talk page that contains:
    • 1) An ==H2==
    • 2) Some text
    • 3) A signature.
  • Comment: an edit to a talk page that:
    • 1) Is posted within a topic
    • 2) Is not indented,
    • 3) Contains some text
    • 4) Contains a signature.
  • Reply: an edit to a talk page that:
    • 1) Is posted within a topic
    • 2) Is indented
    • 3) Contains some text
    • 4) Contains a signature.
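
To make the working definitions above concrete, here is a rough sketch that classifies the wikitext added by a single edit. It is only an illustration of the three definitions, not how DiscussionTools parses talk pages: the regular expressions are simplified assumptions, a signature is recognized only by its trailing UTC timestamp, and the "posted within a topic" condition is not checked because it needs the surrounding page.

```python
import re

# Simplified, assumed patterns (not the parser's real rules).
HEADING = re.compile(r"^==[^=].*?==\s*$", re.MULTILINE)  # an ==H2== line
SIGNATURE = re.compile(r"\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\)")  # timestamp left by ~~~~


def classify_edit(added_wikitext):
    """Return "topic", "comment", "reply", or None for the added text."""
    if not SIGNATURE.search(added_wikitext):
        return None  # no signature: none of the three definitions apply
    # Require some text besides the heading and the timestamp.
    body = SIGNATURE.sub("", HEADING.sub("", added_wikitext)).strip()
    if not body:
        return None
    if HEADING.search(added_wikitext):
        return "topic"
    first_line = next(line for line in added_wikitext.splitlines() if line.strip())
    # Indentation markers at the start of the line signal a reply.
    return "reply" if first_line.startswith((":", "*")) else "comment"


print(classify_edit("== New topic ==\nHello there. 10:00, 5 May 2021 (UTC)"))
print(classify_edit(":Me too. 10:06, 5 May 2021 (UTC)"))
```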
  • 2. Which of the below do we care to know/measure?
    • A. When/if someone posts a comment in response to a new topic 'you've' started?
    • B. When/if someone comments in a section after you have already commented in it?
    • C. When/if someone replies directly to a comment you've written?

I think the answer to this question depends on the complexity and effort involved with adding the instrumentation required for us to track direct responses to comments people have written.

If the complexity/effort is not significantly greater than the complexity/effort with tracking comments posted in sections "you've" started or already commented in, then I think it is worth it.

Reason being: the KPI we've defined for Topic Subscriptions has to do with reducing the amount of time between someone ("Person A") posting on a talk page and someone else ("Person B") responding to what they've said. Where "responding" implies a "Person B" is addressing "Person A." As such, it would be ideal for the instrumentation to enable us to track the specific interactions we are interested in affecting.
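
As a sketch of the KPI described above, the snippet below computes the time between a post by "Person A" and the first direct response from a different person. The record keys (`comment_id`, `parent_id`, `user`, `posted_at`) are hypothetical; they stand in for whatever identifiers the instrumentation ends up providing.

```python
from datetime import datetime, timedelta


def time_to_first_response(posts):
    """Map each comment id to the delay before its first response by another user.

    `posts` is a list of dicts with the hypothetical keys
    comment_id, parent_id, user, and posted_at (a datetime).
    """
    by_id = {post["comment_id"]: post for post in posts}
    first_response = {}
    for post in sorted(posts, key=lambda p: p["posted_at"]):
        parent = by_id.get(post["parent_id"])
        if parent is None or parent["user"] == post["user"]:
            continue  # not a response, or the author responding to themselves
        # setdefault keeps only the earliest response per parent comment.
        first_response.setdefault(
            parent["comment_id"], post["posted_at"] - parent["posted_at"]
        )
    return first_response


t0 = datetime(2021, 5, 5, 10, 0)
example_posts = [
    {"comment_id": "c1", "parent_id": None, "user": "A", "posted_at": t0},
    {"comment_id": "c2", "parent_id": "c1", "user": "B", "posted_at": t0 + timedelta(hours=2)},
]
print(time_to_first_response(example_posts))
```

This only works if each response can be tied to the comment it addresses, which is the reason question 2.C matters for the KPI.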

  • 5. Is it a priority to be able to track the full "lifecycle" of the notification flow?
    • Where "Notification flow" = notification is sent (start) → response is posted (end).

Priority? No. Nice to have? Yes.

Ideally, we'd be able to evaluate the impact of a notification being sent on a person's likelihood of engaging with the content/person responsible for triggering said notification.

With the above said, the measurement plan is "asking for" information about the "Notification flow" in the context of evaluating the extent to which Junior and Senior Contributors are successfully using topic subscriptions. The measurement plan is not "asking for" information about the "Notification flow" in the context of evaluating the impact of the feature.

As such, if it turns out to require a significant amount of work to relate people's actions/engagement with notifications to their actions/engagement with the page about which they are being notified, I think we can let go of tracking it for now. Perhaps if we encounter a scenario where we are not seeing topic subscriptions impacting how people contribute to talk pages, we can explore extending the "Notification flow" further.

TASK DESCRIPTION UPDATE
I've updated the task description with the decisions Megan and I finalized during the meeting we had today, 2-June.

I'm now assigning this over to Megan to complete "1." from the "NEXT STEPS" listed below.


NEXT STEPS

Task description update

  • ADDED question #6: "For new notification instrumentation (read: Echo), to what schema will these new events be added? An existing schema? A new schema?"

@ppelberg
I've updated the instrumentation spec to ensure it reflects all of the events we will need to track. During my review, I identified one additional open question noted below. Once that's resolved, this is ready for @DLynch's review.

OPEN QUESTION:
Our measurement plan currently identifies the following as a high-priority, low-likelihood measure of disruption:

"People receive multiple notifications for the same comment (e.g. mention, topic subscription, user_talk page)."

Is this a priority to track? Do we want to track notifications sent for the same comment as well as the same new topic? If so, we will need to add a unique comment and new topic identifier to the notification instrumentation as well.

@ppelberg
I've updated the instrumentation spec to ensure it reflects all of the events we will need to track. During my review, I identified one additional open question noted below. Once that's resolved, this is ready for @DLynch's review.

Excellent and noted.

OPEN QUESTION:
Our measurement plan currently identifies the following as a high-priority, low-likelihood measure of disruption:

"People receive multiple notifications for the same comment (e.g. mention, topic subscription, user_talk page)."

Is this a priority to track? Do we want to track notifications sent for the same comment as well as the same new topic?

How would you categorize the effort required to add the tracking that will help us verify that people are not receiving multiple notifications for the same comment or topic [i]? Is the instrumentation difficult to implement? Were the instrumentation to be implemented, would the analysis required to make sense of the data it produces be difficult to complete?

I ask because if implementing this instrumentation, and conducting the analysis needed to glean information from the data it produces, is less than straightforward, then I think we can skip it for now.

Reason being:

  1. I assume active editors are mindful of the number of notifications they receive
  2. If/when they notice themselves receiving duplicative notifications, they will tell us and it will be relatively straightforward for us to verify the issue(s) they are encountering

i. E.g. Person A subscribes to Conversation 1. Person B posts a comment to Conversation 1 that includes a mention of Person A. In this scenario, Person A would receive one notification that they have been pinged (read: they would NOT receive a new comment notification in this case).

Is the instrumentation difficult to implement? Were the instrumentation to be implemented, would the analysis required to make sense of the data it produces be difficult to complete?

@ppelberg Assuming there is just a unique identifier for each new comment, the analysis should be pretty straightforward. To detect the specific issue identified in the measurement plan, I think it will be more important to have a unique comment identifier versus topic identifier as I could see other legitimate reasons a user might receive multiple notifications for the same topic (e.g. there were multiple comments to a topic).

That being said, I agree that we could probably skip this for the same reasons you mentioned. I also think monitoring the number of notifications sent per user over time will help identify any sudden increases or decreases that will alert us to some type of bug that is causing a significant amount of duplicate notifications.
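
For reference, a minimal sketch of the "pretty straightforward" analysis, assuming each notification record carries a unique comment identifier. The `(recipient_user_id, comment_id, notification_type)` tuple layout and the example type strings are hypothetical.

```python
from collections import defaultdict


def duplicate_notifications(notifications):
    """Find recipients who were notified more than once about the same comment.

    `notifications` is an iterable of hypothetical
    (recipient_user_id, comment_id, notification_type) tuples.
    Returns {(recipient_user_id, comment_id): [notification types]}.
    """
    types_seen = defaultdict(list)
    for recipient, comment_id, notification_type in notifications:
        types_seen[(recipient, comment_id)].append(notification_type)
    return {key: types for key, types in types_seen.items() if len(types) > 1}


example = [
    (7, "c-100", "mention"),
    (7, "c-100", "topic-subscription"),  # same person, same comment
    (7, "c-101", "topic-subscription"),  # same topic, different comment: legitimate
    (8, "c-100", "topic-subscription"),
]
print(duplicate_notifications(example))
```

Keying on the comment rather than the topic avoids flagging the legitimate case where several comments in one topic each trigger a notification.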

I left my notes on the instrumentation spec. To sum up, there's a lot of "no, because although the data is there it's not actually being recorded in a helpful manner". We could mine data from echo_event, but I think it's more trouble than it's worth.

I also left a few comments asking how some events should be distinguished, since I realized I couldn't tell from the spec.

I'd say that we're going to need some new instrumentation mostly around the comment-posting. I believe it'll be simple, since it just has to hook in exactly where the notifications are already being generated and log a slightly different view of what they're sending. We'd also probably want some schema-logging around the toggling of subscriptions. Also either enabling the existing (disabled) Echo schema, or doing some limited incorporation of what it logs into one of those other schemas I just mentioned.

Assuming there is just a unique identifier for each new comment, the analysis should be pretty straightforward. To detect the specific issue identified in the measurement plan, I think it will be more important to have a unique comment identifier versus topic identifier as I could see other legitimate reasons a user might receive multiple notifications for the same topic (e.g. there were multiple comments to a topic).

The trivially-accessible information that we're already sending with the notification is:

'subscribed-comment-name' => $heading->getName(),
'comment-id' => $newComment->getId(),
'comment-name' => $newComment->getName(),
'content' => $newComment->getBodyText( true ),
'section-title' => $heading->getText(),
'revid' => $newRevRecord->getId(),
'mentioned-users' => $mentionedUsers,

Given what's there, I can certainly access an id for the comment, its parent, and the thread it's contained within.

Next step


Note: you can see an overview of the process we are tracking in the task description of T284848.

Quick update on task progress:

I met with @DLynch this past Tuesday (13-July). We resolved any remaining questions on the instrumentation spec regarding how events should be distinguished and discussed where events should be stored in a way that is helpful for analysis. Based on our discussions,

  • Comment Posting Events: Will be stored in a new talk page events schema.
  • Notification Events: Combination of Echo (inactive; will need to be enabled) and EchoInteraction schema.
  • Notification enabling and disabling/subscription toggling: There is not an eventlogging schema to track these events; however, data is stored in the discussiontools_subscription table. I am currently reviewing and will confirm if this will be sufficient to track the metrics identified for this project.

Next Steps: I am currently working on proposing the schema design and field names for any new instrumentation. I should have that completed by early next week and will send to @DLynch to review and begin work on instrumentation.

(cc @ppelberg)

Hi @MNeisler ,
Thanks so much for posting this very concise and helpful update. I want to make sure we save this outcome/ decision and Phab comments can be lossy. I wonder if there is a doc where we can put the decisions down. Maybe it's already part of an associated instrumentation doc, in which case, can you please share that link with me.

Nothing at all to do if it's not in an instrumentation doc, then I will just jot it down for myself in a doc because I don't want to make any extra work!

Maybe it's already part of an associated instrumentation doc, in which case, can you please share that link with me.

@LZaman - Yes, here's a link to the instrumentation doc which is where we are documenting the instrumentation plan. Column D currently contains details on where each event will be stored.

@DLynch

Sorry for the delay. I've updated the instrumentation spec with some proposed fields for the new schema to track comment posting on talk pages and confirmed the fields in Echo/EchoInteraction and the discussiontools_subscription table which can be used to calculate the identified metrics for notifications.

Let me know if you have any questions or suggested changes.

Summary of where data will be logged, and some open questions, listed below:

Comment Posting Events:

  • These will be stored in a new schema.
  • Suggested name of new schema: talk_page_event (Recommend we use talk page instead of discussion tool in the schema name since we will be tracking both comments made using discussion tools and wikieditor page editing).
  • Please see talk_page_event schema proposal for suggested schema fields and naming conventions.

Open Questions:

  • For non-discussion tool comments and topics, will it be possible to distinguish between posting a comment/topic and making an edit to an existing comment in this schema? If possible, we only want this schema to track published new topics, comments or responses. (Associated with the work to define a qualifying edit done in T262107.)
  • Do we want this to track any other actions beyond a user publishing a comment, topic, or response? I currently added an event.action field which can be used if we decide to track actions such as "subscribe" in this schema. Otherwise I think this can be removed.

Notification Events

  • Will be stored in Echo (need to be enabled) and EchoInteraction.
  • Confirmed the existing fields in those two schemas should be sufficient to track identified notification related metrics

Open Questions:

  • Can the Echo and EchoInteraction schemas be joined using the eventId and userId fields?
  • We will need to add values to EchoInteraction's event.notificationType field to differentiate the different notifications a user will receive. I'm not completely clear right now on all the possible notification types but happy to help propose some event names if I get a list of those.

Notification Preference Changes

  • The discussiontools_subscription table should be sufficient to track metrics identified for this project pending resolution of open questions below:

Open Questions:

  • Is it possible to join EchoInteraction/Echo with the discussiontools_subscription table and/or the new talk_page_event schema? This will be needed for the following metrics:
  • Percent of contributors that receive notifications and turn them off
  • A contributor that makes an edit to a talk page manually enables or disables topic subscriptions.
  • For the metric, "Sudden increase in the percent of contributors that unsubscribe from a conversation" we will need to track changes in preferences over time. Can this be added to the talk_page_event schema as another type of action or should we store elsewhere?

For non-discussion tool comments and topics, will it be possible to distinguish between posting a comment/topic and making an edit to an existing comment in this schema? If possible, we only want this schema to track published new topics, comments or responses. (Associated with the work to define a qualifying edit done in T262107.)

So long as the signature isn't changed, we shouldn't be triggering any events for edits to existing comments. If they change the signature, we'd pick it up as a new comment.

Do we want this to track any other actions beyond a user publishing a comment, topic, or response? I currently added an event.action field which can be used if we decide to track actions such as "subscribe" in this schema. Otherwise I think this can be removed.

More of a product question, I think? @ppelberg?

Can the Echo and EchoInteraction schemas be joined using the eventId and userId fields?

They both have an eventId, which should match up. I believe that you can then match Echo.recipientUserId and EchoInteraction.userId.
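
A minimal sketch of that join, assuming rows from each schema are available as dicts. Only `eventId`, `recipientUserId`, and `userId` come from the discussion above; the `action` key in the example is a placeholder.

```python
def join_echo(echo_rows, interaction_rows):
    """Pair each EchoInteraction row with the Echo row it refers to.

    Matches on eventId, then on Echo.recipientUserId == EchoInteraction.userId.
    Interaction rows with no matching Echo row are dropped (an inner join).
    """
    sent_index = {
        (row["eventId"], row["recipientUserId"]): row for row in echo_rows
    }
    joined = []
    for interaction in interaction_rows:
        sent = sent_index.get((interaction["eventId"], interaction["userId"]))
        if sent is not None:
            joined.append({**sent, **interaction})
    return joined


echo = [{"eventId": 1, "recipientUserId": 7}]
interactions = [
    {"eventId": 1, "userId": 7, "action": "example-action"},  # placeholder value
    {"eventId": 2, "userId": 7, "action": "example-action"},  # no matching Echo row
]
print(join_echo(echo, interactions))
```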

We will need to add values to EchoInteraction's event.notificationType field to differentiate the different notifications a user will receive. I'm not completely clear right now on all the possible notification types but happy to help propose some event names if I get a list of those.

notificationType is just a string and not an enum, so I believe that we won't need to make any schema changes for this. My understanding about the implementation of our notifications is that there's only one notificationType: dt-subscribed-new-comment

Is it possible to join EchoInteraction/Echo with the discussiontools_subscription table and/or the new talk_page_event schema? This will be needed for the following metrics:

You can join it with talk_page_event, with topic_id and user_id. Echo and EchoInteraction are non-starters for this, unless you only wanted "has received any notification", followed by "has unsubscribed from any thread" without the specifics matching up.
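
A rough sketch of the join on `topic_id` and `user_id` described above, finding contributors who posted in a topic and are currently unsubscribed from it. The dict layout is an assumption, the `talk_page_event` schema was still a proposal at this point, and `sub_state == 0` is taken to mean "unsubscribed" based on the query given later in this reply.

```python
def posters_who_unsubscribed(talk_page_events, subscriptions):
    """Return the (user_id, topic_id) pairs that posted and are now unsubscribed.

    Both inputs are lists of dicts. The keys user_id and topic_id are
    assumed join keys; sub_state == 0 is assumed to mean unsubscribed.
    """
    unsubscribed = {
        (row["user_id"], row["topic_id"])
        for row in subscriptions
        if row["sub_state"] == 0
    }
    posted = {(event["user_id"], event["topic_id"]) for event in talk_page_events}
    return posted & unsubscribed


events = [
    {"user_id": 7, "topic_id": "t-1"},
    {"user_id": 8, "topic_id": "t-1"},
]
subs = [
    {"user_id": 7, "topic_id": "t-1", "sub_state": 0},
    {"user_id": 8, "topic_id": "t-1", "sub_state": 1},
]
print(posters_who_unsubscribed(events, subs))
```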

For the metric, "Sudden increase in the percent of contributors that unsubscribe from a conversation" we will need to track changes in preferences over time. Can this be added to the talk_page_event schema as another type of action or should we store elsewhere?

We don't have a way to do this currently -- the discussiontools_subscription table stores a creation timestamp and a state, but not a modified timestamp -- so we can't tell if someone was manually-subscribed and then later unsubscribed. That said, if all you care about is "sudden jump in the number of subscriptions recently created that are currently an unsubscribe", you could do that (SELECT * FROM discussiontools_subscription WHERE sub_state = 0 AND sub_created > (NOW() - INTERVAL 1 DAY) or similar).
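
The same filter can be sketched outside the database. This assumes `sub_created` is stored as a 14-digit `YYYYMMDDHHMMSS` string, which is an assumption about the table made here, not something stated in this thread; if the column is a native datetime the parsing step is unnecessary.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"  # assumed storage format for sub_created


def recent_unsubscribes(rows, now, window=timedelta(days=1)):
    """Return rows that are currently unsubscribed and were created within `window`.

    Mirrors: sub_state = 0 AND sub_created > (now - window).
    """
    cutoff = now - window
    return [
        row
        for row in rows
        if row["sub_state"] == 0
        and datetime.strptime(row["sub_created"], TIMESTAMP_FORMAT) > cutoff
    ]


rows = [
    {"sub_state": 0, "sub_created": "20210720120000"},
    {"sub_state": 1, "sub_created": "20210720120000"},  # still subscribed
    {"sub_state": 0, "sub_created": "20210701120000"},  # created too long ago
]
print(recent_unsubscribes(rows, now=datetime(2021, 7, 21, 0, 0)))
```

As noted above, this can only see the current state of rows created recently; it cannot tell when a subscription changed state.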

Do we want this to track any other actions beyond a user publishing a comment, topic, or response? I currently added an event.action field which can be used if we decide to track actions such as "subscribe" in this schema. Otherwise I think this can be removed.

More of a product question, I think? @ppelberg?

As discussed, we are going to include the event.action field.

Is it possible to join EchoInteraction/Echo with the discussiontools_subscription table and/or the new talk_page_event schema? This will be needed for the following metrics:

You can join it with talk_page_event, with topic_id and user_id. Echo and EchoInteraction are non-starters for this, unless you only wanted "has received any notification", followed by "has unsubscribed from any thread" without the specifics matching up.

Confirmed: we are interested in knowing when someone disables the feature (read: changes their preferences). As such, we will depend on [Schema:PrefUpdate](Schema:PrefUpdate) for this information.

For the metric, "Sudden increase in the percent of contributors that unsubscribe from a conversation" we will need to track changes in preferences over time. Can this be added to the talk_page_event schema as another type of action or should we store elsewhere?

We don't have a way to do this currently -- the discussiontools_subscription table stores a creation timestamp and a state, but not a modified timestamp -- so we can't tell if someone was manually-subscribed and then later unsubscribed. That said, if all you care about is "sudden jump in the number of subscriptions recently created that are currently an unsubscribe", you could do that (SELECT * FROM discussiontools_subscription WHERE sub_state = 0 AND sub_created > (NOW() - INTERVAL 1 DAY) or similar).

  • @ppelberg to confirm that it is sufficient for us to only track instances where someone has been automatically subscribed to a thread and later unsubscribes from it.

@MNeisler: it is sufficient for us to only track instances where people have been automatically subscribed to a thread and later unsubscribe from it.

Thinking: tracking a sudden increase in the number of people unsubscribing from a thread will be sufficient for helping us to detect whether people are finding topic subscriptions unreliable, which this metric is intended to help us do. [i]


i. See: the "Unreliable" scenario in the "Pre-mortem" section of the measurement plan.
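
One way the "sudden increase" check could be sketched, assuming a daily count of unsubscribes is available. The 7-day window and the 2x threshold are arbitrary placeholders, not values from the measurement plan.

```python
from statistics import mean


def is_sudden_increase(daily_counts, window=7, factor=2.0):
    """Flag the latest day if it exceeds `factor` times the trailing average.

    `daily_counts` is a list of daily unsubscribe counts, oldest first.
    `window` and `factor` are placeholder values chosen for illustration.
    """
    if len(daily_counts) <= window:
        return False  # not enough history to form a baseline
    *history, latest = daily_counts
    baseline = mean(history[-window:])
    return baseline > 0 and latest > factor * baseline


print(is_sudden_increase([10, 12, 9, 11, 10, 10, 11, 30]))
print(is_sudden_increase([10, 10, 10, 10, 10, 10, 10, 10]))
```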

@MNeisler I'm assigning this task over to you to update the Instrumentation spec with the decisions made in T277349#7227660, T277349#7244933, and T277349#7247776. [i]

Once this is done, please assign this back over to me to resolve.


i. I think this is all that's left to be done on this task. Although, if this is not the case, please say as much.

@ppelberg - I've updated the instrumentation spec to reflect the decisions in the comments you identified. The instrumentation spec now identifies all the new and existing instrumentation required to calculate the identified metrics for topic subscriptions.

Assigning this back over to you to resolve.

As the next step, I'm going to document the new instrumentation that needs to be implemented in a format I think would work best for legal's review. This will be done in T286076. Let me know if that's not the case.

@ppelberg - I've updated the instrumentation spec to reflect the decisions in the comments you identified. The instrumentation spec now identifies all the new and existing instrumentation required to calculate the identified metrics for topic subscriptions.

Excellent. I've updated the task description to reflect the fact that the instrumentation spec is finalized.

As the next step, I'm going to document the new instrumentation that needs to be implemented in a format I think would work best for legal's review. This will be done in T286076. Let me know if that's not the case.

This sounds great. I've updated the "Instrumentation Steps" section in T284848 with where we are in the process. See: T284848#7256902.