Misplaced Pages

:Misplaced Pages Signpost/2018-02-05/Recent research: Difference between revisions - Misplaced Pages




Revision as of 01:20, 19 January 2018

This is a draft of a potential Signpost article, and should not be interpreted as a finished piece. Its content is subject to review by the editorial team and ultimately by JPxG, the editor in chief. Please do not link to this draft as it is unfinished and the URL will change upon publication. If you would like to contribute and are familiar with the requirements of a Signpost article, feel free to be bold in making improvements!




Last revised 01:20, 19 January 2018 (UTC) by Eddie891
The Signpost

Recent research


A monthly overview of recent academic research about Misplaced Pages and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Determining Quality of Articles in Polish Misplaced Pages Based on Linguistic Features

This article focuses on the 1.2 million unassessed articles in the Polish Misplaced Pages, and considers "over 100 linguistic features to determine the quality of Misplaced Pages articles in Polish language." The authors conclude that linguistic features are valuable for automatically determining the quality of Polish Misplaced Pages articles. Precision is better when the whole text of an article is taken into account: in that setting their model reaches over 93% classification precision, using features such as the relative number of unique nouns and of verbs (unique, 3rd person, impersonal). When only the lead section of an article is considered, the most significant features for determining quality are the relative quantities of common words, locatives, vocatives and third-person words. Using the resulting quality models, the authors assessed 500,000 randomly chosen unevaluated articles from Polish Misplaced Pages. According to the results, about 4-5% of the assessed articles could be considered high-quality articles by the Misplaced Pages community.
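To make the two quantities in this summary concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes tokens arrive already tagged with a part of speech; the tag names, the function names `relative_unique_count` and `precision`, and the toy sentence are all hypothetical and chosen only for illustration.

```python
def relative_unique_count(tagged_tokens, tag):
    """Distinct words carrying `tag`, divided by the total number of tokens."""
    if not tagged_tokens:
        return 0.0
    distinct = {word.lower() for word, t in tagged_tokens if t == tag}
    return len(distinct) / len(tagged_tokens)


def precision(predicted, actual, positive):
    """True positives divided by all items predicted as `positive`."""
    predicted_positive = [a for p, a in zip(predicted, actual) if p == positive]
    if not predicted_positive:
        return 0.0
    true_positive = sum(1 for a in predicted_positive if a == positive)
    return true_positive / len(predicted_positive)


# Hypothetical, hand-tagged toy sentence (not data from the paper).
sample = [("kot", "noun"), ("śpi", "verb"), ("Kot", "noun"), ("dom", "noun")]
print(relative_unique_count(sample, "noun"))  # 0.5
```

Here "relative" is read as "divided by the total number of tokens", which is an assumption; the paper may normalize differently. Precision is computed per class, so a figure such as "over 93%" would describe how often articles predicted to be in a given quality class actually belong to it.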

Briefly

Conferences and events

See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

  • "..."
  • "..."
  • "..."
  • "..."
  • "..."

References

  1. Lewoniewski, Włodzimierz; Węcel, Krzysztof; Abramowicz, Witold (2018-01-03). "Determining Quality of Articles in Polish Misplaced Pages Based on Linguistic Features". doi:10.20944/preprints201801.0017.v1.

    Discuss this story

    Possibly, but it is based upon a data dump from 2008. Carnegie Mellon has a whole department dedicated to WP. I've been there. Ironically, most do not edit. Barbara (WVS)   14:16, 5 February 2018 (UTC)
    @Barbara (WVS): Note how the 2011 paper isolated authority citations and opinion changes ("alignment moves") as the primary features (beyond the writing parties, their semantic assertions, etc.) of talk pages. While the CMU paper says as much in section 2 on page 1027, they proceed to focus solely on authority claims in section 5.2.3 on page 1030, along with the other features in section 5.2, but ignore the crucial instance of participants coming into agreement with others. That's a really stark omission and I am sure their analysis would have been stronger if they included it. Do you know the authors? If so, please suggest that if it makes sense to you. 213.86.87.228 (talk) 18:36, 5 February 2018 (UTC)