:Misplaced Pages Signpost/2018-02-05/Recent research: Difference between revisions - Misplaced Pages


Revision as of 02:43, 29 January 2018

This is a draft of a potential Signpost article, and should not be interpreted as a finished piece. Its content is subject to review by the editorial team and ultimately by JPxG, the editor in chief. Please do not link to this draft as it is unfinished and the URL will change upon publication. If you would like to contribute and are familiar with the requirements of a Signpost article, feel free to be bold in making improvements!




Last revised 02:43, 29 January 2018 (UTC) by Eddie891
The Signpost

Recent research

Automated Q&A from Misplaced Pages articles; Who succeeds in talk page discussions?

By Eddie891, Thomas Niebler, Barbara (WVS) and Tilman Bayer

A monthly overview of recent academic research about Misplaced Pages and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"Reading Misplaced Pages to Answer Open-Domain Questions"

Reviewed by Thomas Niebler

This paper by Chen et al. proposes using the Misplaced Pages article corpus as a source of world knowledge for answering open-domain questions. The authors point out that Misplaced Pages articles contain far more information than current knowledge bases (KBs) such as DBpedia or Freebase. While knowledge in KBs is encoded in a more machine-friendly way, the vast majority of Misplaced Pages's knowledge is not covered by KBs; it is contained in unstructured text and is thus difficult to access algorithmically. The proposed approach, called "DrQA", aims to overcome that limitation by working directly with the article text. It first retrieves Misplaced Pages articles relevant to a question, and then uses a recurrent neural network (RNN) to detect passages in the articles' paragraphs that could serve as answers. The RNN is based on a set of pretrained word embeddings as well as a set of other features.
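The two-stage "retrieve, then read" idea can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two-document corpus, the function names, and above all the reader, which is a simple term-overlap heuristic standing in for the paper's RNN. It is not the authors' implementation.

```python
import math
import re
from collections import Counter

# Toy corpus (hypothetical); the real system searches the full article corpus.
DOCS = {
    "Warsaw": "Warsaw is the capital of Poland. It lies on the Vistula river.",
    "Pittsburgh": "Pittsburgh is a city in Pennsylvania. "
                  "Carnegie Mellon University is located there.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs.values() for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_score(question, text, idf):
    """Sum of tf * idf over the question terms that occur in the text."""
    tf = Counter(tokenize(text))
    return sum(tf[term] * idf.get(term, 0.0) for term in set(tokenize(question)))


def retrieve(question, docs, idf, k=1):
    """Stage 1: rank article titles by TF-IDF overlap with the question."""
    ranked = sorted(docs, key=lambda t: tfidf_score(question, docs[t], idf), reverse=True)
    return ranked[:k]


def read(question, text, idf):
    """Stage 2 stand-in: pick the best-matching sentence (NOT an RNN reader)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return max(sentences, key=lambda s: tfidf_score(question, s, idf))


def answer(question, docs=DOCS):
    """Run both stages and return (article title, candidate answer sentence)."""
    idf = build_idf(docs)
    title = retrieve(question, docs, idf, k=1)[0]
    return title, read(question, docs[title], idf)
```

Calling `answer("What is the capital of Poland?")` selects the "Warsaw" article first and then the sentence within it that overlaps most with the question. A neural reader would instead predict an answer span inside the retrieved paragraphs.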

Their results indicate that DrQA appears better suited to answering open-domain questions than competing systems, based on a set of four question benchmarks. While the improvement in evaluation score seems rather small (77.3 vs. 78.8 F1), the overall task of machine reading at scale using Misplaced Pages points to interesting directions for future research and applications. For example, depending on the speed of the framework (which unfortunately was not discussed), a new Misplaced Pages service for answering such open-domain questions could be established. Furthermore, this process of answering common-knowledge questions could help improve chatbots.
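The review does not say how the F1 score is computed. A minimal sketch, assuming the token-overlap F1 that is customary for extractive question answering benchmarks, is shown below; the function name `token_f1` is hypothetical, and the normalization steps that benchmark scripts usually apply (stripping punctuation and articles) are omitted.

```python
from collections import Counter


def token_f1(prediction, gold):
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this definition a prediction that contains the reference answer plus extra words is penalized on precision but not on recall, so partially correct answers still earn partial credit.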

Are you a policy wonk? Who succeeds in talk page discussions

Reviewed by Eddie891

This Carnegie Mellon University study quantified the success of editors who engage in talk page discussions and the roles they play in these discussions. The roles assigned to editors were:

  • Moderator - decides when a decision is final to support their views
  • Architect - designs the article and its sections to support their views
  • Policy Wonk - quotes acronyms that represent policy/rules/guidelines to support their view
  • Wordsmith - determines the best article titles and section titles based upon their point of view
  • Expert - interjects facts into the discussion to support their point of view

Unlike earlier studies exploring editor interactions, this study allowed an editor to hold several roles simultaneously on an article talk page. Each editor's success was determined by analyzing the subsequent edits to the article under discussion that the editor had promoted, and the longevity of those edits. Editors who are more detail-oriented tend to have more success than those more interested in organization. When multiple editors assume an organizing role, the success of individual editors decreases. The study assessed 7,211 articles, 21,108 discussion threads, 21,108 editor discussion pairs, and the average number of editors per discussion. An editor's total number of edits is not associated with success.

"Determining Quality of Articles in Polish Misplaced Pages Based on Linguistic Features"

Summarized by Eddie891

This article focuses on the 1.2 million unassessed articles in the Polish Misplaced Pages, and considers "over 100 linguistic features to determine the quality of Misplaced Pages articles in Polish language." From the conclusion: "Use of linguistic features is valuable for automatic determination of quality of Misplaced Pages article in Polish language. Better results in terms of precision can be achieved when the whole text of article is taken into the account. Then our model shows over 93% classification precision using such features as relative number of unique nouns and verbs (unique, 3rd person, impersonal). However, if we take into account only leading section of an article, relative quantity of common words, locatives, vocatives and third person words are the most significant for determination of quality. Using the obtained quality models we asses 500 000 randomly chosen unevaluated articles from Polish Misplaced Pages. According to result, about 4-5% of assessed articles can be considered by Misplaced Pages community as high quality articles."

Conferences and events

See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

Compiled by Tilman Bayer
  • "Enrichment of Information in Multilingual Misplaced Pages Based on Quality Analysis" From the abstract: "Misplaced Pages articles may include infobox, which used to collect and present a subset of important information about its subject. This study presents method for quality assessment of Misplaced Pages articles and information contained in their infoboxes. Choosing the best language versions of a particular article will allow for enrichment of information in less developed version editions of particular articles." See also coverage of related papers involving the same author above, in our last issue: "Assessing article quality and popularity across 44 Misplaced Pages language versions", and below:
  • "Analysis of References Across Misplaced Pages Languages" From the abstract: "This paper presents an analysis of using common references in over 10 million articles in several Misplaced Pages language editions: English, German, French, Russian, Polish, Ukrainian, Belarussian. Also, the study shows the use of similar sources and their number in language sensitive topics."
  • "Misplaced Pages as a space for discursive constructions of globalization" From the abstract: "This article compares, through computer-assisted text analysis and qualitative reading, entries for the word ‘globalization’ in six major Western languages: English, German, French, Spanish, Portuguese, and Italian. Given Misplaced Pages’s model of open editing and open contribution, it would be logical to expect that definitions of globalization across different languages reflect variations related to diverse cultural contexts and collective writing. Results show, however, more similarities than differences across languages, demonstrated by an overall pattern of economic framing of the term, and an overreliance on English language sources."
  • "FRISK: A Multilingual Approach to Find twitteR InterestS via wiKipedia" From the abstract: "In this paper we describe Frisk a multilingual unsupervised approach for the categorization of the interests of Twitter users. Frisk models the tweets of a user and the interests (e.g., politics, sports) as bags of articles and categories of Misplaced Pages respectively"
  • "Introduction to anatomy on Misplaced Pages" From the abstract: "No work parallels the amount of attention, scope or interdisciplinary layout of Misplaced Pages, and it offers a unique opportunity to improve the anatomical literacy of the masses. Anatomy on Misplaced Pages is introduced from an editor's perspective. Article contributors, content, layout and accuracy are discussed, with a view to demystifying editing for anatomy professionals."
  • "The institutionalization of free culture movement based on the study of Wikimedia projects in the East-Central Europe" From the English abstract: "The author of the publication presents the processes of institutionalization occurring in the projects of the Wikimedia Foundation, co-organized in the framework of the free culture movement. These processes on the one hand lead to the relative closing up of the members of groups belonging to regional cultures, especially those who speak the same language, on the other hand to encouraging interregional cooperation. Common enterprises undertaken by partners from East-Central Europe are not only contribution to the free culture movement, but may also point to emphasizing the common identity of prosumers of post-socialist societies."
  • "The Russian-language Misplaced Pages as a Measure of Society Political Mythologization" From the abstract: "The analyzed in this article myth about inheritance rights of Russia to the Kyivan Rus’ arose in the 15th century. Recently this myth is being actively spread by the Russian propaganda in the mass media – in particular this is performed through Misplaced Pages being one of the most attended Internet resources. the purpose of this myth consists in activation of separatist sentiments of Russian-speaking Ukrainian citizens. Purpose – to explore vulnerability of Misplaced Pages policy of openness on the basis of a specific example as well as to explore its efficiency for formation of political myths; to analyze the technology used for creation of Misplaced Pages articles in the process of formation of myths. Methods. Comparison method is applied – texts of Misplaced Pages articles on various time stages of their creation were compared; results of analyzing Misplaced Pages pages were correlated to political events of Russian-Ukrainian relations. Results. Mythology not obliged to prove anything and Misplaced Pages aimed at forming the concept and creating only an impression of scientificness and not knowledge as such are perfectly agreed. That is why Misplaced Pages is one of the most efficient spreaders of myths (first of all political myths) supporting a definite ideology."
  • "Analysing Timelines of National Histories across Misplaced Pages Editions: A Comparative Computational Approach" From the abstract: "... we aim to automatically identify such differences by computing timelines and detecting temporal focal points of written history across languages on Misplaced Pages. In particular, we study articles related to the history of all UN member states and compare them in 30 language editions. We develop a computational approach that allows to identify focal points quantitatively, and find that Misplaced Pages narratives about national histories (i) are skewed towards more recent events (recency bias) and (ii) are distributed unevenly across the continents with significant focus on the history of European countries (Eurocentric bias). We also establish that national historical timelines vary across language editions, although average interlingual consensus is rather high ..."
  • "Using WikiProjects to Measure the Health of Misplaced Pages" From the abstract: "We analysed 3.2 million Misplaced Pages articles associated with 618 active Misplaced Pages projects. The dataset contained the logs of over 115 million article revisions and 15 million talk entries both representing the activity of 15 million unique Wikipedians altogether. Our analysis revealed that per WikiProject, the number of article and talk contributions are increasing, as are the number of new Wikipedians contributing to individual WikiProjects." From the results section: "In comparison to Suh et al. and Halfaker et al., our findings suggest that based on the WikiProject activity, Misplaced Pages is not in decline, but still enjoying growth with new users, edits, and discussion activity. Akin to other complex online communities, using traditional methods to measure community and system health may not reflect their true state ..."


References

  1. Danqi Chen; Adam Fisch; Jason Weston; Antoine Bordes: Reading Misplaced Pages to Answer Open-Domain Questions. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
  2. Maki, Keith; Yoder, Michael; Jo, Yohan; Rosé, Carolyn (2017). "Roles and Success in Misplaced Pages Talk Pages: Identifying Latent Patterns of Behavior". Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 1: 1026–1035.
  3. Lewoniewski, Włodzimierz; Węcel, Krzysztof; Abramowicz, Witold (2018-01-03). "Determining Quality of Articles in Polish Misplaced Pages Based on Linguistic Features". doi:10.20944/preprints201801.0017.v1.
  4. Lewoniewski, Włodzimierz (2017-06-28). Enrichment of Information in Multilingual Misplaced Pages Based on Quality Analysis. International Conference on Business Information Systems. Lecture Notes in Business Information Processing. Springer, Cham. pp. 216–227. doi:10.1007/978-3-319-69023-0_19. ISBN 9783319690223. (closed access)
  5. Lewoniewski, Włodzimierz; Węcel, Krzysztof; Abramowicz, Witold (2017-10-12). Analysis of References Across Misplaced Pages Languages. International Conference on Information and Software Technologies. Communications in Computer and Information Science. Springer, Cham. pp. 561–573. doi:10.1007/978-3-319-67642-5_47. ISBN 9783319676418. (closed access; author's copy / conference presentation video recording)
  6. Rubira, Rainer; Gil-Egui, Gisela (2017-10-30). "Misplaced Pages as a space for discursive constructions of globalization". International Communication Gazette: 1748048517736415. doi:10.1177/1748048517736415. ISSN 1748-0485. (closed access)
  7. Jipmo, Coriane Nana; Quercini, Gianluca; Bennacer, Nacéra (2017-11-05). FRISK: A Multilingual Approach to Find twitteR InterestS via wiKipedia. International Conference on Advanced Data Mining and Applications. Lecture Notes in Computer Science. Springer, Cham. pp. 243–256. doi:10.1007/978-3-319-69179-4_17. ISBN 9783319691787. (closed access; author copy)
  8. Ledger, Thomas Stephen (2017-09-01). "Introduction to anatomy on Misplaced Pages". Journal of Anatomy. 231 (3): 430–432. doi:10.1111/joa.12640. ISSN 1469-7580. (closed access)
  9. Skolik, Sebastian (2017). "Instytucjonalizacja ruchu wolnej kultury na przykładzie projektów Wikimedia w przestrzeni Europy Środkowo-Wschodniej". Wydawnictwo Uniwersytetu Śląskiego: 347–367. (closed access; in Polish, book chapter from ISBN 978-83-8012-916-0)
  10. Sokolova, Sofiia (2017). "The Russian-language Misplaced Pages as a Measure of Society Political Mythologization". Journal of Modern Science. 33 (2): 147–176. ISSN 1734-2031. (closed access)
  11. Samoilenko, Anna; Lemmerich, Florian; Weller, Katrin; Zens, Maria; Strohmaier, Markus (2017-05-24). "Analysing Timelines of National Histories across Misplaced Pages Editions: A Comparative Computational Approach". arXiv:1705.08816. (published version)
  12. Tinati, Ramine; Luczak-Roesch, Markus; Shadbolt, Nigel; Hall, Wendy (2015). Using WikiProjects to Measure the Health of Misplaced Pages. ACM Press. pp. 369–370. doi:10.1145/2740908.2745937. ISBN 9781450334730. (closed access) / Tinati, Ramine; Luczak-Rösch, Markus; Hall, Wendy; Shadbolt, Nigel (2015-05-23). Using WikiProjects to measure the health of Misplaced Pages. Web Science Track, World Wide Web Conference.

    Discuss this story

    Possibly, but it is based upon a data dump from 2008. Carnegie Mellon has a whole department dedicated to WP. I've been there. Ironically, most do not edit. Barbara (WVS)   14:16, 5 February 2018 (UTC)
    @Barbara (WVS): Note how the 2011 paper isolated authority citations and opinion changes ("alignment moves") as the primary features (beyond the writing parties, their semantic assertions, etc.) of talk pages. While the CMU paper says as much in section 2 on page 1027, they proceed to focus solely on authority claims in section 5.2.3 on page 1030, along with the other features in section 5.2, but ignore the crucial instance of participants coming into agreement with others. That's a really stark omission and I am sure their analysis would have been stronger if they included it. Do you know the authors? If so, please suggest that if it makes sense to you. 213.86.87.228 (talk) 18:36, 5 February 2018 (UTC)