
Grants talk:IEG/Editor Interaction Data Extraction and Visualization: Difference between revisions

From Meta, a Wikimedia project coordination wiki
Latest comment: 9 years ago by Fabian Flöck in topic overlooked interaction type



Revision as of 15:49, 18 October 2014

Clarifying roles and responsibilities

Hi Pine and Halfak (WMF), thanks for all your work on this proposal!

I'm wondering if you can share some more details about how you'd be dividing up roles and responsibilities for this project, as I understand it mostly rests on the 2 of you to do the labor. Specifically:

  • Pine, as your time is the sole "expense" this grant would fund, can you please share a bit more about what activities you'd be responsible for? I'm particularly wondering what things would be in your wheelhouse as a Research Analyst.
  • Aaron, I note that you are volunteering with your WMF staff account. Does that mean that this project needs WMF staff to do some of the extraction or other work in order to complete this project? What are the pieces of this project that only you can do (either as staff, or in your personal volunteer capacity)?

This feels like a new experimental case for IEG, and I want to make sure we understand it fully before marking the proposal eligible. Best wishes, Siko (WMF) (talk) 00:02, 4 October 2014 (UTC)

Hi Siko (WMF), some of this is very much variable depending on the scope of this project, and Aaron has just found another person who may work with us. I think it would be good for Aaron and me to meet to flesh out this proposal some more. Unfortunately my time to work on this is limited until the end of next week. This proposal could have used at least another week of discussion prior to submitting it, but we were up against the submission deadline, so we are still sorting out how this project will work.

Questions that Aaron and I should discuss:

  • Aaron's role: is he working in a WMF capacity, in a volunteer capacity, or both?
  • Pine's tasks and time expectations
  • Role of our third participant
  • Timelines and deliverables

Halfak (WMF) can you set up a time with me to answer these and other questions from Siko? Perhaps we could meet on Wednesday morning.

--Pine 00:36, 4 October 2014 (UTC)

Hey Siko (WMF) and Pine. I apologize for the late response. I've been under the weather this weekend and traveling this week.
This project is not an official part of my duties as WMF staff. My role on this project is that of a volunteer. I'll help Pine with what he needs to get the project up and running (e.g. edits to this proposal) and help him find collaborators (e.g. Fabian Flöck and HaithamS). I'll also work to produce some of the editor interaction datasets and consult with Pine and other volunteers/advisors about formats, distribution and APIs for editor interaction data. After typing this out, it's clear that I should have signed the proposal with my volunteer account (EpochFail). I'll go fix that right away. Sorry for the confusion. --Halfak (WMF) (talk) 03:32, 7 October 2014 (UTC)
Done --EpochFail (talk) 03:40, 7 October 2014 (UTC)
Thanks for this clarification, EpochFail, that's helpful! Pine, I expect the committee will want to better understand the answers to the other 3 bullet points on your list of questions when they review this proposal (particularly your tasks in each activity specified, and any gaps remaining in the team as it forms), so I'd encourage you to keep working towards adding these pieces into the proposal over coming days as you sort them out more clearly. Meanwhile we'll mark this proposal eligible so you can proceed towards review :) Best wishes, Siko (WMF) (talk) 20:37, 9 October 2014 (UTC)
Project meetings: we had one meeting today and are planning another one for Saturday. Hopefully by the end of Saturday we will have a reasonable project plan that all participants agree with and that the Committee can review. --Pine 18:55, 13 October 2014 (UTC)

Sensitivity of data

Hi Pine. I realize that my name is listed as an advisor on the project, but I would like to ask a few questions as this project is now being proposed for IEG. One of the major questions I have in mind is around data sensitivity. The proposal states that one of its main goals is to publish user interaction data. Do you think that some users might raise concerns (privacy, potential for misuse, etc.) around the published data? If yes, what kind of concerns, and how might they conflict with the goals of this project? Thanks. --HaithamS (WMF) (talk) 22:07, 9 October 2014 (UTC)

Hi HaithamS (WMF), thanks for your question. The current plan is to use only public data, in much the same way that xtools uses public data. --Pine 20:03, 10 October 2014 (UTC)
Yes, the data is public, but I think some care should be taken when showing interaction examples involving specific individuals. It could be perceived as "picking on" individuals unfairly. I don't think we want to be showing actual usernames in those cases; User1, User2, etc., or other more descriptive but non-identifying names should be chosen instead. There may be situations (e.g. ArbCom) where the ability to use the tools to produce interaction logs with actual user names is perhaps appropriate, but not in the "lab rat" situation. I guess I am saying that the software should always anonymise except when instructed to reveal the names of specific users when legitimately requested. Kerry Raymond (talk) 02:11, 18 October 2014 (UTC)
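The "anonymise by default, reveal on request" behaviour described above could look roughly like the following. This is only a minimal sketch: the function name `pseudonymise`, the `reveal` parameter and the sample usernames are assumptions made for illustration, not part of the proposal.

```python
from typing import Dict, Iterable, List, Tuple


def pseudonymise(
    interactions: Iterable[Tuple[str, str]],
    reveal: Iterable[str] = (),
) -> List[Tuple[str, str]]:
    """Replace usernames with stable placeholders (User1, User2, ...).

    Names listed in `reveal` are kept as-is, for the case where showing
    specific users has been legitimately requested.
    """
    revealed = set(reveal)
    aliases: Dict[str, str] = {}

    def alias(name: str) -> str:
        if name in revealed:
            return name
        if name not in aliases:
            # Placeholders are numbered in order of first appearance.
            aliases[name] = f"User{len(aliases) + 1}"
        return aliases[name]

    return [(alias(actor), alias(actee)) for actor, actee in interactions]


# Hypothetical (actor, actee) pairs, invented for this example.
pairs = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol")]
anonymised = pseudonymise(pairs)
partly_revealed = pseudonymise(pairs, reveal={"Bob"})
```

The same user always maps to the same placeholder within one call, so interaction patterns stay readable without naming anyone.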

Eligibility confirmed, round 2 2014

This Individual Engagement Grant proposal is under review!

We've confirmed your proposal is eligible for round 2 2014 review. Please feel free to ask questions and make changes to this proposal as discussions continue during this community comments period.

The committee's formal review for round 2 2014 begins on 21 October 2014, and grants will be announced in December. See the schedule for more details.

Questions? Contact us.

Jtud (WMF) (talk) 22:20, 9 October 2014 (UTC)

Analysis and visualization methods

"We will make editor interaction data easy to understand by using visualizations."

Who will generate these visualizations? I'm down for a good 2d visualization here and there, but I don't have much experience in network visualization. Who will do that work? --EpochFail (talk) 15:50, 13 October 2014 (UTC)

That task will probably be led by me, possibly in cooperation with Fabian or Haitham. --Pine 18:53, 13 October 2014 (UTC)
It would be good to understand whether either of the others has actually confirmed they are willing and able to contribute to these tasks, Pine. My impression so far is that HaithamS (WMF) offered advice on the general idea of this project before it became a grant proposal, but I'm less sure how your recent conversations have gone with him about his role. Generating the visualizations can take significant volunteer time and that's well beyond the scope of an advisor. Cheers, Siko (WMF) (talk) 22:32, 17 October 2014 (UTC)
As pretty as network visualisations can be, I don't know if it's actually massive networks that we need to visualise here. Further to my comments elsewhere on this page, I think we are probably most interested in visualising the interactions that have high individual significance or are cumulatively significant. Any visualisation I can think of would probably have time as the X axis, perhaps sometimes using a log scale to emphasise recent history over ancient history. Just to take a simple question: is reverting more common in 2014 (as a percentage of all edits) than it was in the past? Who is being reverted, newbies vs editors of varying levels of experience? Who is doing the reverting? Is that changing over time? Is the first edit a user makes to an article more likely to be reverted than subsequent edits? Is that changing over time? I think the game with visualisation is looking to see correlations with editor attrition. I think the kinds of questions that might be better addressed with one of those network visualisations might be to visualise interaction against high-level categories. So do a layout based on distance between categories and colour them by the scoring of the interactions. That might show that certain categories were more prone to certain kinds of interaction patterns, e.g. POV issues around politics articles, reverting unsourced material in BLPs. Are disagreements more likely to occur in politics than geography? Kerry Raymond (talk) 03:38, 18 October 2014 (UTC)
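The first question above (reverts as a percentage of all edits, over time) reduces to a simple aggregation once each edit is labelled as reverted or not. A minimal sketch, assuming edits arrive as hypothetical `(year, was_reverted)` pairs; the sample values are made up and are not real Wikipedia figures:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def revert_share_by_year(edits: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Return, per year, the percentage of edits that were reverted."""
    totals: Dict[int, int] = defaultdict(int)
    reverted: Dict[int, int] = defaultdict(int)
    for year, was_reverted in edits:
        totals[year] += 1
        if was_reverted:
            reverted[year] += 1
    return {year: 100.0 * reverted[year] / totals[year] for year in sorted(totals)}


# Invented sample data: (year of edit, whether it was later reverted).
sample = [
    (2013, False), (2013, True), (2013, False), (2013, False),
    (2014, True), (2014, False),
]
shares = revert_share_by_year(sample)
```

The resulting per-year series is what would go on a plot with time as the X axis; splitting the input by editor experience would answer the "who is being reverted" variants.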

Community notification

We're a bit behind here. It seems to me that the Wiki Researcher community will be our primary audience for this work. If so, here's a few places it seems that we should canvass:

I'm sure I'm missing others, but this seems like a good place to start. --EpochFail (talk) 15:53, 13 October 2014 (UTC)

Woops. Nearly forgot about gendergap-l and Gender gap.

I suspect that, for newcomer interactions, hosts at en:WP:Teahouse would be interested too. --EpochFail (talk) 15:55, 13 October 2014 (UTC)

Formats

Hey folks. I figure now is as good of a time as any to start talking about formats for editor interaction datasets.

I propose that the core event of an editor interaction can be represented as a triple of:

<interaction> ::= <person> <person> <timestamp: int>
<person> ::= inst. of Human
<timestamp> ::= int

Since the wiki software represents persons as users -- registered and anonymous -- it seems clear that we need to simplify to:

<interaction> ::= <actor: user> <actee: user> <timestamp: int>
<user> ::= <registered user> | <anonymous user>
<registered user> ::= <id: int>
<anonymous user> ::= <text: str>

We'd also like to carry a payload of metadata about the event (was it positive or negative? what was the topic of conversation? etc.):

<interaction> ::= <actor: user> <actee: user> <timestamp: int> <meta>
<meta> ::= <type: str> ...
... ::= A relevant data structure for the type.
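To make the grammar concrete, here is a minimal Python sketch of the same structure. The class names (`RegisteredUser`, `AnonymousUser`, `Interaction`) and the check on `meta` are assumptions for illustration; only the field names and the sample values come from this thread.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Union


@dataclass(frozen=True)
class RegisteredUser:
    id: int  # numeric user id


@dataclass(frozen=True)
class AnonymousUser:
    text: str  # e.g. the IP address shown for an anonymous edit


User = Union[RegisteredUser, AnonymousUser]


@dataclass
class Interaction:
    actor: User
    actee: User
    timestamp: int
    meta: Dict[str, Any]

    def __post_init__(self) -> None:
        # The grammar requires <meta> to start with a <type: str>.
        if not isinstance(self.meta.get("type"), str):
            raise ValueError("meta must carry a string 'type'")


interaction = Interaction(
    actor=AnonymousUser(text="123.123.123.123"),
    actee=RegisteredUser(id=987654),
    timestamp=1984567890,
    meta={
        "type": "talk_page_section",
        "section": {"index": 1, "title": "His stolen watch."},
        "conversers": 2,
    },
)
event = asdict(interaction)   # nested dicts, same shape as the JSON events
encoded = json.dumps(event)   # serialisable as-is
```

Keeping the two user kinds as separate types means a consumer can tell `{id: ...}` from `{text: ...}` without guessing from the value.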

Now for an example:

Revision 1
wikitext:
== His stolen watch. ==
The article is missing information about [...]
event (JSON): none

Revision 2
wikitext:
== His stolen watch. ==
The article is missing information about [...]
: What information are you talking about?  Was his [...]
event (JSON):
{
  "actor": {"text": "123.123.123.123"},
  "actee": {"id": 987654},
  "timestamp": 1984567890,
  "meta": {
    "type": "talk_page_section",
    "section": {
      "index": 1,
      "title": "His stolen watch."
    },
    "conversers": 2
  }
}

Revision 3
wikitext:
== His stolen watch. ==
The article is missing information about [...]
: What information are you talking about?  Was his [...]
:: Yes it was.  There's an article in the [...]
event (JSON):
{
  "actor": {"id": 987654},
  "actee": {"text": "123.123.123.123"},
  "timestamp": 1984567890,
  "meta": {
    "type": "talk_page_section",
    "section": {
      "index": 1,
      "title": "His stolen watch."
    },
    "conversers": 2
  }
}

--EpochFail (talk) 16:39, 13 October 2014 (UTC)

I would suggest pulling out useful things from <meta> into top-level fields, like "positive<boolean>, topic<string>, etc.". That would lend itself better to dumping into a database table and allowing people to query the dataset. But generating this interaction dataset seems to me like the best part of the proposal, I love it. As a side note, I would side with Nemo that it would be better to start on a different wiki. Start on something like etwiki and it'll be much faster to generate the dataset; then, if it's useful, enwiki folks will be begging you to do it there as well. Also, in the time it would take to analyze enwiki, you could probably analyze a dozen small wikis and you'd have some very nice cross-wiki comparisons to make. Milimetric (WMF) (talk) 13:58, 17 October 2014 (UTC)
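As a rough illustration of the "top-level fields in a database table" idea, the sketch below loads a few flattened events into an in-memory SQLite table and queries them. The table name, the column set, and all row values (including the `positive` labels) are assumptions invented for this example.

```python
import sqlite3

# Hypothetical flattened events: "positive" and "topic" are pulled out of
# <meta> into top-level fields. Users are stored as text for simplicity.
events = [
    {"actor": "123.123.123.123", "actee": "987654", "timestamp": 1984567890,
     "type": "talk_page_section", "positive": True, "topic": "His stolen watch."},
    {"actor": "987654", "actee": "123.123.123.123", "timestamp": 1984567900,
     "type": "talk_page_section", "positive": False, "topic": "His stolen watch."},
    {"actor": "987654", "actee": "555", "timestamp": 1984567950,
     "type": "revert", "positive": False, "topic": None},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE interaction (
        actor     TEXT    NOT NULL,
        actee     TEXT    NOT NULL,
        timestamp INTEGER NOT NULL,
        type      TEXT    NOT NULL,
        positive  INTEGER,           -- 1 = positive, 0 = negative, NULL = unknown
        topic     TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO interaction VALUES "
    "(:actor, :actee, :timestamp, :type, :positive, :topic)",
    events,
)

# Example query: how many negative interactions did each actor initiate?
negative_by_actor = conn.execute(
    "SELECT actor, COUNT(*) FROM interaction "
    "WHERE positive = 0 GROUP BY actor ORDER BY actor"
).fetchall()
```

Type-specific details that do not fit a fixed column could still live in a separate JSON column next to these fields.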
I'd suggest sitting down and deciding on the theoretic model, then the information model, before deciding on any data representation. Otherwise the risk is that the chosen representation isn't powerful enough or isn't efficient for the types of queries one might need to do. Kerry Raymond (talk) 02:40, 18 October 2014 (UTC)

I would agree that where the interaction happens does matter, and that all interactions have a timestamp for each party, as in Wikipedia we have no actual single-point interaction (because they would be edit conflicts) but rather a set of individual actions at the same page separated in time. Kerry Raymond (talk) 03:45, 18 October 2014 (UTC)

Use case missing

The scope is not defined anywhere, as all sections are tautological. One section says what we do is requested and the next says what is requested we do, then goals repeat the same and then several sections dive into implementation details. Please define what you actually are talking about. --Nemo 08:29, 17 October 2014 (UTC)

Not one of the original proponents, but I think the end game here is to reduce editor attrition resulting from unpleasant interactions. There is plenty of anecdotal evidence that it is interaction with other editors that drives people away, both the newbies and the experienced editors, but I don't know if we have any handle on the nature and patterns of such interactions. Is it one person being repeatedly (and perhaps deliberately) unpleasant to another over a long time? Is it a process of being worn down drip by drip by a series of unpleasant interactions with a large number of people (probably mostly unintended)? Are there signs of revenge? You reverted my edit a month ago on article X, so I'll say something negative on a talk page about something you've contributed on an apparently unrelated article. So I think we need to find all the places at which editors can interact to build profiles of interaction. A hypothesis might be "Editor attrition follows a big argument around a single article, which is characterised by <some pattern of interaction>". If so, we could have software watching for such patterns and try to intervene to calm things down before editor attrition occurs. Kerry Raymond (talk) 02:35, 18 October 2014 (UTC)

English Wikipedia: no

Anything which begins with the English Wikipedia is 99% certain to fail to expand anywhere else. I oppose. Start with one or two wikis other than the English Wikipedia and I may believe that one day this will become available for all languages; otherwise, just say you'll never go beyond en.wiki. --Nemo 08:29, 17 October 2014 (UTC)

I think the choice of language is effectively constrained by the languages spoken by the researchers. One would often need to compare the quantitative data with qualitative data. You can't judge whether an interaction is a friendly or unfriendly one if you don't understand that language. Also, many major research journals are published in English, so English examples are more useful for publication. I don't see a problem at this stage with being explicit and restricting the scope to en.WP. Indeed, I think for an initial foray into this space, it is better to narrow the focus. Kerry Raymond (talk) 02:19, 18 October 2014 (UTC)

overlooked interaction type

Possibly it was overlooked because it was so obvious that it didn't need mentioning, but I would have thought editing the same article was an interaction. The current list only mentions reverts as an interaction of interest, but I can certainly build up a liking or disliking of another editor even when we are not actually reverting each other. OK, two editors editing the same article 5 years apart probably isn't a terribly interesting interaction, but 5 mins apart is. So, I think editing the same article is relevant, but its significance is modified by the time gaps between them. Again, I dislike at least one editor because of a series of unsourced edits they did some years ago across a large number of articles. So, the significance/strength of our interaction in any one article is low, but cumulatively I've grown to dislike them a lot. How does that play out if this editor and I come into close contact on another article? Am I now pre-disposed to be unfriendly toward them over an unrelated matter?

I suspect we need to be able to find all interactions involving a pair of editors and develop some kind of weighting to determine the likely "strength/significance" of that interaction as well as the "sentiment" of it (friendly/positive, or not). Specifically, I would think the distance in time and the distance apart in the text of the article are both relevant to the significance of the interaction. If years apart we edit different sections, it seems a very minimal interaction. If minutes apart we edit the same sentence, it seems a very significant interaction. For example, I often react to unsourced facts being added to articles on my watchlist by adding in a citation if I can easily find one. Clearly this fits the "close in time, close in text" as a significant interaction. I think that I am being helpful, and I hope the other person thinks so too, but maybe the other person perceives my actions as an implicit criticism of their contribution.

Even with talk pages and voting pages, even if we aren't interacting directly on individual topics or votes, the fact that both of us edit that page suggests we may well be reading one another's remarks and forming opinions positive or negative about the other. Again, it's a weaker interaction than if we are going head-to-head with one supporting and one opposing on the same issue. So I think the weighting model probably has to be different for different types of pages. If you oppose my request for admin rights, I am probably going to take it more personally than if we disagree on a borderline case of notability when neither of us has contributed to the article.

So I would think we would want to be able to extract all possible interactions and then weight them rather than decide in advance that some aren't significant. The drip-drip-drip of growing anger/frustration may arise from lots of low-significance interactions. We probably want to calculate some kind of score across all interactions to measure the extent of absolute interaction, as well as a score that then uses sentiment analysis on the interactions to determine the polarity of those interactions, which then leads to some sense of the feelings between the people. Of course, this assumes that interactions can cancel each other out. Does your revert of my edit yesterday get forgotten if you give me a barnstar today? Or would a "thanks" today be sufficient? I assume thanks is on the list of interactions? Kerry Raymond (talk) 03:11, 18 October 2014 (UTC)
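One way to picture the weighting idea above is a strength that decays with both the time gap and the textual distance, combined with a sentiment sign per interaction. The sketch below is only an illustration under loud assumptions: the exponential form, the scale parameters (one day, 500 characters) and the +1/-1 sentiment labels are all invented here, and the sentiment analysis itself is not implemented.

```python
import math
from typing import Iterable, Tuple


def strength(
    time_gap_s: float,
    text_gap_chars: float,
    time_scale_s: float = 86_400.0,   # assumed: strength fades over about a day
    text_scale_chars: float = 500.0,  # assumed: strength fades over about 500 chars
) -> float:
    """Strength in (0, 1]: 1.0 for same place and moment, near 0 when far apart."""
    return math.exp(-time_gap_s / time_scale_s) * math.exp(
        -text_gap_chars / text_scale_chars
    )


def pair_scores(interactions: Iterable[Tuple[float, float, int]]) -> Tuple[float, float]:
    """Return (absolute, signed) scores for one pair of editors.

    Each interaction is (time gap in seconds, text gap in characters,
    sentiment), where sentiment is +1 or -1 and is assumed to be supplied
    by some separate sentiment analysis.
    """
    absolute = 0.0
    signed = 0.0
    for time_gap_s, text_gap_chars, sentiment in interactions:
        w = strength(time_gap_s, text_gap_chars)
        absolute += w           # extent of interaction, regardless of polarity
        signed += sentiment * w  # positive and negative interactions cancel out
    return absolute, signed


minutes_apart_same_sentence = strength(5 * 60, 10)
years_apart_other_section = strength(5 * 365 * 86_400, 5_000)
absolute, signed = pair_scores([(300, 10, -1), (600, 20, +1), (86_400, 0, -1)])
```

The signed score encodes the "interactions can cancel each other out" assumption raised above; a model where a revert is not forgotten after a barnstar would need to track the two polarities separately.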

I certainly agree with that. This interaction type is important although not that explicit. See also my comment in the intra-article interaction discussion below. We can surely extract that; it is more an issue of how to define what counts as an interaction and what doesn't, and how to express it. That is a non-trivial task (How far away in time or textual distance do I have to edit for my edit to constitute an interaction with you? And what type will it be? Or how much will I weigh it?). What has been done in research so far (sorry, don't have the papers in mind), as far as I can recall, was mostly building networks of co-editorship inter-article. Combining this with intra-article editorship (e.g. in the same section in the same day) is very interesting. --Fabian Flöck (talk) 15:21, 18 October 2014 (UTC)
In the same vein, I would strongly recommend extending the kinds of interactions described in the grant along these lines, to make them more systematic:
  • Intra-page
    • articles
      • antagonistic (delete/undo) = reverts
      • supportive (reintroductions/redeletes)
      • co-editing (editing in vicinity of each other (time/space) under certain constraints in an article)
    • talk pages
      • non-user talk: replying to each other in a thread
      • user talk: posting on each other's talk pages
      • for discussion: other talk spaces
  • Inter-page
    • Co-editing of articles and talk pages (constrained or weighted by time/space)


--Fabian Flöck (talk) 15:34, 18 October 2014 (UTC)

questions from rubin16

Hello, Pine! :) Could you please expand on the problem you want to solve? Who are the researchers with requests, what strategic objectives are involved, and what do you expect to change or introduce to the wiki community as a result of this research? rubin16 (talk) 13:11, 18 October 2014 (UTC)

Editor interaction data based on edit activity inside articles: Dataformats/-sets and Visualizations

Hi, so I'm gonna sketch out what I already discussed with Halfak and Pine on hangouts in terms of what I could provide to the project. (I didn't know exactly how to integrate it into the main article, so I start my draft here; please move what you feel is relevant or tell me).

This covers intra-article interactions (so in one article at a time), although the sets of editors/nodes from single articles could later be merged to generate a graph for a whole category or even the whole of Wikipedia.

What we have so far:

We extended the wikiwho algorithm we wrote (see here) to generate relationship data between editors in an article based on edits.

Basic wikiwho authorship detection

So what is given from the original wikiwho algorithm is an output that tracks the authorship of single tokens of text and looks something like this (simplified):

(Legend: for each revision, the "tokens" row lists the tokens (words + special chars) of that revision, the "author" row gives the original author of each token, and the "origin" row gives the revision of origin for that token. The "deleted" row lists the tokens of the previous revision that were deleted by this revision.)

Revision 0, editor A: add
tokens: There is a house on a hill .
author: A A A A A A A A
origin: 0 0 0 0 0 0 0 0

Revision 1, editor B: light deletion B->A, add
deleted: is (token 2), a (token 6)
tokens: There was a house on the hill . A tree was standing close !
author: A B A A A B A A B B B B B B
origin: 0 1 0 0 0 1 0 0 1 1 1 1 1 1

Revision 2, editor C: deletion C->B
deleted: A tree was standing close ! (tokens 9-14)
tokens: There was a house on the hill .
author: A B A A A B A A
origin: 0 1 0 0 0 1 0 0

Revision 3, editor D: full revert D->C, reintro. B
tokens: There was a house on the hill . A tree was standing close !
author: A B A A A B A A B B B B B B
origin: 0 1 0 0 0 1 0 0 1 1 1 1 1 1

Revision 4, editor C: light delete C->B, add
deleted: close (token 13), ! (token 14)
tokens: There was a house on the hill . A tree was standing nearby .
author: A B A A A B A A B B B B C C
origin: 0 1 0 0 0 1 0 0 1 1 1 1 4 4
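For readers who want to play with the example, the token attribution above can be reproduced with a small stand-in built on Python's `difflib`. To be clear, this is not the wikiwho algorithm: it diffs consecutive revisions token by token and only recognises reintroductions when a whole revision is identical to an earlier one, which is a deliberate simplification. The function name `track_authorship` and the `(editor, text)` input format are assumptions for this sketch.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

Token = Tuple[str, str, int]  # (text, original author, revision of origin)


def track_authorship(revisions: List[Tuple[str, str]]) -> List[List[Token]]:
    """Attribute every token of every revision to its original author.

    `revisions` is a list of (editor, text); the list index is the revID.
    """
    history: List[List[Token]] = []
    seen: Dict[Tuple[str, ...], List[Token]] = {}
    previous: List[Token] = []
    for rev_id, (editor, text) in enumerate(revisions):
        words = text.split()
        key = tuple(words)
        if key in seen:
            # Identical to an earlier revision: restore that attribution, so
            # reintroduced tokens keep their original author and origin.
            current = list(seen[key])
        else:
            current = []
            matcher = SequenceMatcher(
                a=[tok[0] for tok in previous], b=words, autojunk=False
            )
            for tag, i1, i2, j1, j2 in matcher.get_opcodes():
                if tag == "equal":
                    current.extend(previous[i1:i2])  # surviving tokens
                elif tag in ("replace", "insert"):
                    current.extend((w, editor, rev_id) for w in words[j1:j2])
                # "delete": the tokens are simply not carried over
            seen[key] = current
        history.append(current)
        previous = current
    return history


example = [
    ("A", "There is a house on a hill ."),
    ("B", "There was a house on the hill . A tree was standing close !"),
    ("C", "There was a house on the hill ."),
    ("D", "There was a house on the hill . A tree was standing close !"),
    ("C", "There was a house on the hill . A tree was standing nearby ."),
]
history = track_authorship(example)
authors_rev4 = [author for _text, author, _origin in history[4]]
origins_rev4 = [origin for _text, _author, origin in history[4]]
```

On this example the sketch yields the same author and origin rows as listed above; on real article histories a plain diff would misattribute moved or partially restored text, which is exactly the case the real algorithm is meant to handle.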
Interaction extraction

Now we can transform that output into explicit interactions between the editors, as shown in the table below.

There are 4 different types of interactions:

  1. "delete" --> a token gets deleted, the deleting editor is the sender, the editor whose token was deleted is the receiver of the edge
  2. "undo" --> undoing a deletion or a reintroduction of a token. The "undoer" is the sender, the editor getting her action undone is the receiver.
  3. "reintroduction" --> the sender reintroduces content of the receiver that was previously deleted.
  4. "redeletion" --> the receiver deleted content, it was subsequently reintroduced and the sender now deletes the content again.

The first two interactions are regarded as antagonistic from the sender towards the receiver and marked with a "-", the latter two (3.+4.) are taken to be supportive and marked with a "+". (We deliberately refrained from using "revert" here, as actually both the "antagonistic" actions are reverts of some sort. When making the translation to a "revert", it would be these two. Cf. different kinds of reverts.)

"Weight" indicates how many tokens were affected by the action; "age" is an optional indicator for how old the affected tokens were (e.g., in revision 4 author C deletes 2 tokens from author B that were introduced in revision 1, hence age=4-1=3).

editor interactions derived from the above-listed example revisions
revision | sender | receiver | type | weight | age
1 | B | A | delete (-) | 2 | 1
2 | C | B | delete (-) | 6 | 1
3 | D | C | undo (-) | 6 | 1
3 | D | B | reintroduction (+) | 6 | 2
4 | C | B | delete (-) | 2 | 3
4 | C | D | undo (-) | 2 | 1
4 | C | C | redeletion (+) | 2 | 2

These can be computed rather efficiently with our algorithm using the full-history dumps. Here I also agree that maybe starting with a smaller Wikipedia would be nice to have a full result set much faster.

As for the format I will comment on the "formats" thread above.

The plan is to integrate the output format with the project's needs and we would compute the interactions for the articles of whatever Wikipedias the project decides.

Further extensions / variations of the interaction extraction

Variations:

  • The 2 (-) and 2 (+) interaction types can and should imho actually be aggregated to only (+) and (-) for graph visualization approaches (for sure) and for analytics as well (probably), as otherwise it gets too messy. For recording them, I'm not sure if that level of granularity is actually required.
  • ...?
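The aggregation to only (+) and (-) could be sketched as follows. The row layout `(revision, sender, receiver, type, weight)`, the `SIGN` mapping and the function name are assumptions for this illustration; the rows are a subset of the example table.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Mapping of the four interaction types to their polarity, as described above.
SIGN = {"delete": "-", "undo": "-", "reintroduction": "+", "redeletion": "+"}

Row = Tuple[int, str, str, str, int]  # (revision, sender, receiver, type, weight)

rows = [
    (1, "B", "A", "delete", 2),
    (2, "C", "B", "delete", 6),
    (3, "D", "C", "undo", 6),
    (4, "C", "B", "delete", 2),
    (4, "C", "C", "redeletion", 2),
]


def aggregate(rows: Iterable[Row]) -> Dict[Tuple[str, str], Dict[str, int]]:
    """Sum token weights per directed (sender, receiver) edge and polarity."""
    edges: Dict[Tuple[str, str], Dict[str, int]] = defaultdict(
        lambda: {"+": 0, "-": 0}
    )
    for _revision, sender, receiver, kind, weight in rows:
        edges[(sender, receiver)][SIGN[kind]] += weight
    return dict(edges)


edges = aggregate(rows)
```

Each directed edge then carries at most two numbers, which is the form a signed-graph visualization would consume; the per-type rows can still be kept in the recorded dataset.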

Possible extensions:

  • Apart from weight and age (or "delay") of an interaction, we could also record more features per interaction:
    • Did the sender's edit affect all of the actions of a specific former edit? ("action" defined as each distinct interaction created by an editor changing a word, corresponding to the "weights" in the table) Then we could mark it as "full undo" or "full restore". E.g., D's reintroduction of the 6 tokens deleted by C in our example would be a "full undo" (or "full revert" if you will), thus the entry in the table for revision 3, D->C would be marked with a "1" in an additional column "full".
    • Did the edit create a revision identical to one seen before (known as "identity revert" in other contexts)? Then we could mark it in a new column/variable "identity revert".
    • Other metadata relating to the changed tokens are imaginable, such as, e.g., the average length of the tokens changed in the interaction, their total length, to what extent they were stopwords, etc....
  • More interaction types could be introduced (although I'm unsure re: the usefulness of the granularity and feel they might be very ambiguous/hard to define):
    • E.g. if two editors add words next to each other (define "next": could be in a paragraph, <40 chars apart...) without antagonizing each other (define how long afterwards), we could infer that they work together and add a "collaboration" tie or the like. This is related to Kerry's comment. (Although hers would also include looking at networks of co-editorship in different articles as far as I understood, which is a complementary extension one level up)
    • ... ?
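The "collaboration" tie suggested above could be prototyped along these lines. Everything here is an assumption for illustration: the `Addition` record, the thresholds (40 characters, 10 revisions), and the simplification that character offsets are comparable across revisions, which in practice they are not without position tracking.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple


class Addition(NamedTuple):
    editor: str
    revision: int
    offset: int  # character offset of the added text (assumed comparable)


def collaboration_ties(
    additions: List[Addition],
    antagonistic_pairs: Set[FrozenSet[str]],
    max_chars: int = 40,
    max_revisions: int = 10,
) -> Set[Tuple[str, ...]]:
    """Pairs of editors who added text close together without antagonism."""
    ties: Set[Tuple[str, ...]] = set()
    for i, first in enumerate(additions):
        for second in additions[i + 1:]:
            if first.editor == second.editor:
                continue
            pair = frozenset((first.editor, second.editor))
            if pair in antagonistic_pairs:
                continue  # a delete/undo between them rules out the tie
            near_in_text = abs(first.offset - second.offset) < max_chars
            near_in_time = abs(first.revision - second.revision) <= max_revisions
            if near_in_text and near_in_time:
                ties.add(tuple(sorted(pair)))
    return ties


# Invented sample: A, B and D add text near offset 100; C adds far away;
# A and D are known to have an antagonistic (delete/undo) interaction.
additions = [
    Addition("A", 1, 100),
    Addition("B", 2, 120),
    Addition("C", 3, 900),
    Addition("D", 4, 110),
]
ties = collaboration_ties(additions, {frozenset(("A", "D"))})
```

How "next to each other" and "how long afterwards" are defined is precisely the open question raised above, so both thresholds are left as parameters.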
Visualization

Using the interactions extracted as shown above, we are currently also working on a D3 (hence browser-based) visualization of the graph between editors in an article over time. So far it includes only the antagonistic edges. The implementation is based on the nice model proposed by Brandes et al. (paper here) and is a custom graph drawing approach for Wikipedia and negative edges. This is still very alpha, so I can only provide this screenshot so far (nodes are editors; it will also include a lot of meta-info on them). It also features a slider to navigate the network as it changes over revisions/time.

During the project I (with some researcher colleagues of mine) would work on that together with Pine probably to see how we can integrate it with other visualizations.

--Fabian Flöck (talk) 15:01, 18 October 2014 (UTC)