Misplaced Pages

talk:Copyright problems - Misplaced Pages


This is an old revision of this page, as edited by Doc James (talk | contribs) at 02:46, 1 June 2014 (Crimean status referendum, 2014). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

For image or media copyright questions, see Misplaced Pages:Media copyright questions.
This is not the page to report a specific article's copyright problem. To do so, list the article on today's entry at the project page after following the appropriate instructions.
This is the talk page for discussing Copyright problems and anything related to its purposes and tasks.
Archives: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
Auto-archiving period: 30 days
See also: m:copyright, m:Do fair use images violate the GFDL?, m:GFDL, m:GFDL Workshop, and m:non-free content

Stack Exchange attribution

Heartbleed has contained text copied from Stack Exchange since this revision. As it is CC-BY-SA, the text should be attributed properly. Normal CC-BY-SA attribution can be done with Template:CCBYSASource; however, Stack Exchange requires that nofollow not be used. What to do? --Muelleum (talk) 00:33, 16 April 2014 (UTC)

Many of the attribution requirements (not just the "nofollow" one) in that blog post go far beyond those published in Creative Commons's own guidelines and the licence itself:
  • According to the CC-BY-SA human-readable summary, "appropriate credit" is explained as "If supplied, you must provide the name of the creator and attribution parties, a copyright notice, a license notice, a disclaimer notice, and a link to the material. Prior versions of CC licenses have slightly different attribution requirements."
  • According to CC's attribution comparison chart, attribution in adapted works should be "reasonable to medium and means".
  • The CC-BY-SA 3.0 licence itself also indicates that attribution in adapted works should be "reasonable to the medium and means".
The first, third, and fourth requirements in the blog post directly contradict the licence, making it impossible to reuse Stack Exchange material in any non-visual or non-hypertext medium. It is impossible, for example, to use a "text blurb" to "visually indicate" the source in an audio recording, and it is impossible to "hyperlink directly to the original question" and "hyperlink each author name" in print publications. These scenarios are important for us because Misplaced Pages is regularly reproduced in audio and print media.
The blog post does seem to constitute the latest, official, entire attribution requirements for user content on Stack Exchange, since it's referenced in every page footer: "user contributions licensed under cc by-sa 3.0 with attribution required". So there is apparently a contradiction between the attribution requirements in the blog post and those in the licence they purport to use. Absent any further clarification, we should assume that the attribution requirements are stipulations over and above the ones specified by the licence, meaning that the licence for Stack Exchange content, while arguably still free, is incompatible with the unmodified CC-BY-SA 3.0 licence used by Misplaced Pages. This means we should remove all Stack Exchange content from Misplaced Pages, unless we get a statement from the original author(s) that they waive the additional attribution requirements. —Psychonaut (talk) 10:03, 16 April 2014 (UTC)
Wha? That is a ridiculous demand about nofollow. http://wiki.creativecommons.org/Marking/Users doesn't say anything specific about it but doesn't give the slightest indication that anyone is obliged to propagate the link through search engines. I wonder if the Creative Commons folks might have anything to say about the issue. Stack Exchange has also taken to issuing referral codes that its users then spam other forums with, trying to get other people to join SO. So I'm getting rather annoyed with them.

I don't see the SO content in that revision of the heartbleed article. If someone can say where it is, I can probably rewrite it. (Better to use the article talk page for this). 70.36.142.114 (talk) 10:42, 16 April 2014 (UTC)

Oh, I see, the info is in this diff: It's a list of vulnerable programs. I'd be inclined to leave it alone for now, and dig up independent citations for the individual programs involved as opportunity permits. 70.36.142.114 (talk) 10:54, 16 April 2014 (UTC)
Well, this is a problem not just with the Heartbleed article but generally. Many other Misplaced Pages articles (Steinmetz solid, for example) incorporate material copied from Stack Exchange websites. If the licences are incompatible then all this material needs to be identified and removed (or kept, in the unlikely event that the authors can be contacted and agree to relicence it appropriately—Stack Exchange websites don't make it possible for contributors to be contacted directly). —Psychonaut (talk) 11:14, 16 April 2014 (UTC)
Does Stack Exchange claim copyright on user-contributed content? That's even more obnoxious than everything else mentioned so far. Misplaced Pages is by far the largest publisher of CC content and it uses nofollow on all pages and noindex on most (500 million noindexed history pages on en.wp alone) plus it gets a stupendous amount of web traffic and it has millions of contributors in hundreds of languages. Stack Exchange is a pipsqueak by comparison. So I think Misplaced Pages presents a far more credible standard of CC contributor expectations than Stack Exchange. That page on SE is one guy spouting fringe opinions that most of the SE contributors have probably never noticed. 70.36.142.114 (talk) 12:05, 16 April 2014 (UTC)
Actual terms of service are at -- they seem convoluted to me, in that they claim CC by SA but then appear to impose terms that go beyond CC by SA. NE Ent 12:20, 16 April 2014 (UTC)
Actually, the Terms of Service you linked to are much more reasonably worded than the blog post. While they still contain the "nofollow" provision, they are careful to specify that the hyperlink requirements apply only to "Internet use", and that the attribution to Stack Exchange need not be "visual":
You agree that You will follow the attribution rules of the Creative Commons Attribution Share Alike license as follows:
a. You will ensure that any such use of Subscriber Content visually displays or otherwise indicates the source of the Subscriber Content as coming from the Stack Exchange Network. This requirement is satisfied with a discreet text blurb, or some other unobtrusive but clear visual indication.
b. You will ensure that any such Internet use of Subscriber Content includes a hyperlink directly to the original question on the source site on the Network (e.g., http://stackoverflow.com/questions/12345)
c. You will ensure that any such use of Subscriber Content visually display or otherwise clearly indicate the author names for every question and answer so used.
d. You will ensure that any such Internet use of Subscriber Content Hyperlink each author name directly back to his or her user profile page on the source site on the Network (e.g., http://stackoverflow.com/users/12345/username), directly to the Stack Exchange domain, in standard HTML (i.e. not through a Tinyurl or other such indirect hyperlink, form of obfuscation or redirection), without any “nofollow” command or any other such means of avoiding detection by search engines, and visible even with JavaScript disabled.
Even assuming that these terms of service override the blog post (and, given that it's the blog post rather than the ToS which is linked to in the copyright notice on every page, I'm not sure that they do), the question remains as to whether the technical restrictions in (d) are reasonably covered under the terms of the original CC-BY-SA 3.0 licence. I still strongly suspect that they are not, which would make Stack Exchange user content unusable on Misplaced Pages and other CC-BY-SA-licensed projects. —Psychonaut (talk) 12:41, 16 April 2014 (UTC)
In the blog post, Jeff Atwood, co-founder of SO, writes "All the content contributed to Stack Overflow or other Stack Exchange sites is cc-wiki (aka cc-by-sa) licensed, intended to be shared and remixed". So there is an intent by SO for content to be shared. Before we start removing SO content, we should ask SO first whether they clarify their attribution terms, and/or make them more permissive. I think the blog post was intended to stop simple copies of SO only made to aggregate traffic and display ads, and not to render SO content incompatible with well intentioned CC-BY-SA projects like WP. So there is a potential to talk.
The SO TOS contains "You agree that all Subscriber Content that You contribute to the Network is perpetually and irrevocably licensed to Stack Exchange under the Creative Commons Attribution Share Alike license.". This gives them a blank CC license by the original contributors. When the attribution terms are breaking CC, the copyright of the original authors can be breached, as they are breaching the 'No additional restrictions'-part of CC. Original authors give SO an additional license called "Content License", which permits SO to display the content. This "Content License" however does not permit SO to re-license the content. So when SO relicenses the content, they must follow CC-BY-SA. Even when original author copyright is breached, WP cannot use the content until the original authors enforce the CC onto SO, or also license it to WP under CC. (AFAIK, when A licenses b to C under a CC license, one of A or C still has to agree when D wants to use b under CC terms).
Because of the blank CC license, I think SO has the right to modify the attribution terms. Therefore I think we should ask SO for assistance, even when the TOS indicate that they think the CC allows the distributor to set "Attribution terms". I wonder, whether those "Attribution terms" propagate on remix, so that wikipedia then would also have to enforce those attribution terms to all people citing WP.
Please also note that I'm no lawyer. --Muelleum (talk) 13:44, 16 April 2014 (UTC)
Assuming those extra attribution conditions are valid and in effect, could Stack Exchange's modification of them now apply retroactively to past contributions? Or would this require the consent of all contributors? —Psychonaut (talk) 15:05, 16 April 2014 (UTC)
Misplaced Pages should inform Stack Exchange that it is not allowed to republish Misplaced Pages content through those additional terms. I don't think they are going to loosen up for us. If anything, that nofollow thing was aimed directly at Misplaced Pages. I don't see how the SE TOS applies to non-SE users who simply republish under CC, since the idea of CC is you don't have to agree to additional terms. The thing about "any other such means of avoiding detection by search engines" would seem to mean it's not ok to put the stuff on a web forum that's accessible only to logged in users. The whole purpose seems to make content re-users assist in SE's SEO operations. That is almost surely uncontemplated under CC-BY which allows commercial re-use through the channels of one's choice (e.g. to humans but not to search engines). I think they are trying to undermine CC principles and we should ask CC itself to intervene. Sue Gardner and Jimmy Wales are both on the CC advisory board. I'm going to leave a message on Jimbo's talk page. 70.36.142.114 (talk) 14:45, 16 April 2014 (UTC)
I think WP should follow WP:GF and not start a war with SE before we've heard SE's opinion on this. Just because they want to make money does not make them evil. They try to be open. We should let them decide whether they go the open way or the GeoGebra way. --Muelleum (talk) 15:23, 16 April 2014 (UTC)
Sure, of course WP should make a polite request before escalating, however it should go in with the view that the !nofollow demand is incompatible with the CC license that it claims to use. I notice the SE pages now say at the bottom "cc by-sa 3.0 with attribution required rev 2014.4.16.1551" which suggests something changed yesterday--could that be in response to this discussion? Keep in mind also that SE doesn't publish those user contributions as a copyright holder entitled to put forth terms in the first place. It publishes them as a licensee of the contributors. The blog post tries to argue that forbidding nofollow is a reasonable expectation of users wanting to ensure proper attribution, but 1) as said below, the vast majority of CC-BY-SA content contributors are on Misplaced Pages rather than SE, and WP contributors observably don't have that expectation; 2) the SE demand is self-servingly bogus since it wants the contributor "credited" with a link back to SE rather than to the contributor's own site. 70.36.142.114 (talk) 19:13, 16 April 2014 (UTC)
Math Overflow operated independently for several years before becoming part of the SE network. I very much doubt that MO had anything like that nofollow prohibition in its explication of CC-BY-SA. MO contributors before the acquisition didn't agree to the SE TOS. SE is attempting to relicense their contributions under more restrictive terms than they contributed under, contravening the Share-alike part of the license. I had a bad feeling when MO joined up with SE, and now I feel sickened. I wonder if MO can withdraw from SE. 70.36.142.114 (talk) 15:05, 16 April 2014 (UTC)

LOL, stackexchange itself serves links with nofollow, including links to wikipedia. And they have a thread about nofollow. 70.36.142.114 (talk) 15:50, 16 April 2014 (UTC)

I left Jimbo a talk message, but his page mentions that he's unavailable for a week or so. I also wonder if that SE TOU purports to require contributors to enforce that nofollow condition on SE's behalf. That would of course be even crazier than the other parts. 70.36.142.114 (talk) 15:49, 17 April 2014 (UTC)

I left a message on Mindspillage's user talk (she does legal stuff for CC now). 70.36.142.114 (talk) 10:04, 18 April 2014 (UTC)

Anyone? Anything? Muelleum (talk) 18:15, 20 May 2014 (UTC)

I just left Jimbo another talk message. He responded and is going to point Jeff Atwood at the discussion. 70.36.142.114 (talk) 21:10, 21 May 2014 (UTC)

Hey, this is Jay from Stack Exchange. Thanks for highlighting this - I'm going to work with our lawyers on how we can clarify our ToS, but here's the gist:

  • Forget the blog post; it's old, and our ToS (and probably CC-SA) have been updated since then. (I know we link to it weirdly in places, and will look at cleaning that up.)
  • We do not claim any copyright on user-created content; it's simply licensed to us under CC-SA, which we relicense to others under CC-SA
  • You're correct that CC-SA does not allow for downstream restrictions
  • CC-SA specifies some specific things that may be required, like the author's name, additional "attribution parties" (like us, in this case), among other things, but leaves the manner of linking, presentation, etc. to be covered by the phrase, "reasonable to the medium or means You are utilizing"
  • The CC-SA license is what covers what's allowed w/r/t user content. The specifics regarding nofollow are our interpretation of "reasonable to the medium or means You are utilizing"
  • We read that to mean roughly, "in the normal, typical manner for a given medium," which on the internet seems to us like an actual hyperlink (not just the text of the URL) without additional restrictions like nofollow, etc.
  • But the license is CC-SA, and any interpretation we may make of what "reasonable to the medium or means You are utilizing" is superseded by the rules of the license itself.

Put another way, assuming you're sure that you're giving attribution for our content in a manner consistent with CC-SA, you should be fine. We love Misplaced Pages, and one of the main reasons we use CC-SA is the desire to be two-way compatible with your stuff (mostly so our users can post excerpts when needed). JaydlesSE (talk) 21:09, 22 May 2014 (UTC)JaydlesSE

Reporting copyright problems

Dear copyright experts: I have several times tried to report copyright problems here, and despite doing my best to follow directions, not once have my entries ended up in the right place. I apologize in advance, since this will likely happen again unless I just don't report any more of these. —Anne Delong (talk) 01:45, 18 May 2014 (UTC)

Hi, User:Anne Delong. :) When you put the {{copyvio}} template on a page, it generates a link for you to follow to put the report in the right place as well as the code for making the report. For example, look at Osman Nuhu Sharubutu, which I flagged today. There's a section labeled "Instructions for filing". It should give you everything you need.
That said, I'm sorry you've found this so difficult. Maybe if you talk a little more about where you are running into issues, we can fix it. :) The WP:CP page is more like WP:AFD or WP:3RR than it is a regular noticeboard, so it's not really set up like a normal discussion page. It's template-driven, generally. :) --Moonriddengirl 02:07, 18 May 2014 (UTC)
Well, for example, today at the bottom of the page it said put new article listings in "Misplaced Pages:Copyright problems/2014 May 18", which was a red link. I clicked on the link, thinking that this would start a new page for that day which would then be transcluded into the main page. However, when I looked later my entry was on the previous day's list. I didn't realize that I was supposed to be tagging the page; I don't think I want to do that when I am not sure there is a violation, so maybe I'll just leave this type of thing for someone bolder. —Anne Delong (talk) 02:25, 18 May 2014 (UTC)
No, you put it on the right day - Misplaced Pages:Copyright problems/2014 May 18. The problem is that our bots are broken, so the header with the date is not automatically created anymore. :/ I've asked our botmaster to help out, but unfortunately his non-Misplaced Pages life is consuming his time. :/
Okay, thanks. —Anne Delong (talk) 03:20, 18 May 2014 (UTC)
If you're not sure there's a violation, the better place to ask is here generally. Or you're always welcome to stop by my talk page. :)
I think it would be a shame if you stopped following up on these because of fear that you're not doing something correctly. In terms of WP:IAR, this is definitely one of those things where even if you weren't doing it correctly, it would be better for the encyclopedia for you to keep doing it. :) --Moonriddengirl 02:32, 18 May 2014 (UTC)
I've checked out the issue, Anne Delong, and you were right to flag it. :) One thing to keep in mind, though, is that {{copyvio}} is perfectly fine for ambiguous cases. That's partially what it's for. It's okay to blank a section of the article pending resolution of copyright status. But, again, if you're unsure and don't want to tag, please feel free to inquire. :) I greatly appreciate the attention you've been showing to copyright issues, and I certainly wouldn't want you discouraged from that. --Moonriddengirl 02:39, 18 May 2014 (UTC)

Alfred Dunhill

Hi, although this photograph of the deceased Alfred Dunhill was taken in 1893, I have not been able to prove that it was published before 1944 or whenever the copyright limitations expire. Upload: Tom (talk) 17:47, 18 May 2014 (UTC)

Hello, Tom. If you have questions about the copyright status of media, you should consider asking them at WP:MCQ. :) --Moonriddengirl 19:16, 20 May 2014 (UTC)

Uskok War

The 3rd paragraph of this article is copied from the reference cited for it. Consequently I am also concerned about the 2nd and 4th paragraphs, but I don't have access to the cited work to check them. Lavateraguy (talk) 16:34, 24 May 2014 (UTC)

Thank you, User:Lavateraguy. Ordinarily, I'd ask you to flag it and list it in accordance with the directions at Misplaced Pages:Copyvio101, but I went ahead this time to review it. The third paragraph was added by a different user than the 2nd and 4th, and I've verified the issues and removed it. I've noted on the talk page that there is some concern especially with machine translation. If you'd like to share any additional concerns there, please do! It may help others clean up any remaining issues. --Moonriddengirl 14:46, 25 May 2014 (UTC)

Soham murders - alleged possible copy-pasting from a book

A newly-registered account alleges that "This article contains copyrighted material from a biography of Ian Huntley, Beyond Evil by Nathan Yates ISBN 1844541428, but without citing the book. I have entered citations and a reference". I don't know if this means entire paragraphs were copy-pasted, but it does appear that even after the new account has entered citations, the copied material is still not enclosed within quotation marks. It looks like the material indicated may have been in the article since at least 2009. --Demiurge1000 (talk) 15:47, 25 May 2014 (UTC)

(Note: MRG is currently following this up on the talk page of the person who originally reported the issue.) --Demiurge1000 (talk) 19:43, 28 May 2014 (UTC)

Should a different presentation of material have a link?

Could someone have a look at Template talk:Weather box#Separate templates and User talk:CambridgeBayWeather#You wrong? Is Subtropical-man correct that the templates require attribution? Thanks. CBWeather, Talk, Seal meat for supper? 16:40, 26 May 2014 (UTC)

Main discussion is here: Template_talk:Weather_box#Separate_templates. Subtropical-man talk (en-2) 16:59, 26 May 2014 (UTC)
@CambridgeBayWeather:, @Subtropical-man::
Copyright is complicated. As far as Misplaced Pages is concerned, the law that matters is the US copyright law (individual editors are governed by laws in their jurisdiction, but since US copyright law governs the website it is that to which our content must conform). If the templates are completely uncreative, then their content does not require attribution under US law, as US law does not recognize "sweat of the brow". ("Sweat of the brow" is the right to be recognized for the labor in your work; the U.S. does not recognize labor, but only creativity.) If there is creativity in the contents (and the US government sets the bar for creativity deliberately very low), then the material may be copyrighted, and attribution is required. Since attribution costs nothing and since failure to attribute can create problems for the project and especially for the person who fails to attribute, I myself would lean towards attribution. This isn't a question of "Do we have to delete it?" but "Do we have to name the person who created it?" It's a far different paradigm than copying from external sources.
If Template:Green Bay, Wisconsin weatherbox were listed at CV, I'd say that the only creativity I see there is perhaps the arrangement of the table (including selection of colors) and footnote (a). To check creativity of arrangement, I'd look at other tables representing weather data to see if that's a standard method of display. If it is, it's not creative, and attribution for that is not required. At two sentences, footnote (a) is minimal, but, again, since attribution costs nothing but failure to attribute can cost much, I'd attribute. It's my opinion that failure to attribute those two sentences is not likely to rise to the level of copyright violation, as it's not likely to be substantial.
Misplaced Pages:Copyright in lists talks a little bit about creativity in content versus creativity of arrangement and how the US has interpreted these things.
Some other countries do recognize "sweat of the brow", and in those countries, attribution might be required for the table data itself. This has nothing to do with Misplaced Pages, though - while the user who placed the content might be held to that law if he lives in a country where it is practiced, it's not a violation of copyright on English Misplaced Pages if it doesn't violate US law. --Moonriddengirl 10:53, 28 May 2014 (UTC)

Crimean status referendum, 2014

I am not sure whether this is the right place to start this discussion, but I was led here by a link at Template:copyvio-revdel.

RevDel of a range spanning hundreds of edits of Crimean status referendum, 2014 unrelated to the copyvio has been requested. I oppose the RevDel as this would break the attribution of these edits. Petr Matas 09:32, 28 May 2014 (UTC)

Hi, Petr Matas. As the English Misplaced Pages community interprets policy, RevDel doesn't break attribution of edits as long as it leaves the usernames. Everyone who contributes content to a Wikimedia project agrees to accept any of the following forms of attribution:

Through hyperlink (where possible) or URL to the article to which you contributed (since each article has a history page that lists all authors and editors);
Through hyperlink (where possible) or URL to an alternative, stable online copy that is freely accessible, which conforms with the license, and which provides credit to the authors in a manner equivalent to the credit given on the Project website; or
Through a list of all authors (but please note that any list of authors may be filtered to exclude very small or irrelevant contributions).

This is from our site's WMF:Terms of Use. As long as the names are accurate and intact in history, the list of all authors is retained and attribution is met. --Moonriddengirl 10:37, 28 May 2014 (UTC)
I see that my understanding of attribution was wrong. Petr Matas 11:00, 28 May 2014 (UTC)

Still, I think that such a RevDel would do too much harm to the article's history, which contains a large amount of information that would be lost to the public. I think that the removal of the copyvio in a single sourced paragraph reporting on the opinion of the Hungarian ministry is not worth it. Petr Matas 11:00, 28 May 2014 (UTC)

I think that's a valid perspective, Petr Matas, but there are different opinions on that matter. I myself tend to use RevDelete when large amounts of content are involved and when the risk of it returning to publication is high, or when the deletion does not cost us anything substantial. Others believe that deletion should be applied more routinely, which I think is also a valid perspective. (The legal team recently posted meta:Wikilegal/Copyright Status of Misplaced Pages Page Histories, wherein they discuss, on community request, the status of copyright violations in article histories.) We haven't yet reached a wide consensus on when and how this should be applied to balance the needs of copyright cleanup with transparency. --Moonriddengirl 11:26, 28 May 2014 (UTC)

WP:Turnitin

Wondering if everyone here is aware of this effort? The hope is to run new edits over a certain size through Turnitin and flag those which are likely to have copyright issues for human follow-up. The plan was initially to launch it for medical articles. Would this tool be useful to this group as well? We have some support from the Wiki Education Foundation as well as a number of other Wikipedians. Doc James (talk · contribs · email) (if I write on your page reply on mine) 02:46, 1 June 2014 (UTC)