ClueBot NG Links: Report False Positives • Review edits for the dataset • Frequently Asked Questions

False Positives: If you believe that ClueBot NG has made a mistake, please follow the directions in the warning it gave or click here. Please do not report them here. It takes less time to report them to the correct location, and we can handle them more effectively if reported in the correct location.

Purpose of this Page: This page is for comments on or questions about the ClueBots.

The current status of ClueBot NG is: Running
Praise should go on the praise page. Barnstars and other awards should go on the awards page.
Use the "new section" button at the top of this page to add a new section. Use the link above each section to edit that section.
This page is automatically archived by ClueBot III.
The ClueBots' owner or someone else who knows the answer to your question will reply on this page.
ClueBots: ClueBot NG / Anti-vandalism · ClueBot II / ClueBot Script · ClueBot III / Archive · Talk page for all ClueBots
Collaboration with the Axis Powers
Why do you keep reverting useful information? I AM ADDING references one by one. Don't use the label vandalism so recklessly. — Preceding unsigned comment added by 114.79.52.10 (talk) 04:35, 20 December 2013 (UTC)
- I saw your question, and although I don't run ClueBot, I thought I would jump in and try to answer it for you. You probably triggered the bot by deleting as much information as you did here. You can't just remove 2/3 of the article like that; it is not constructive. I see that you added a single reference for one change, but any time you make changes, you need to cite a reference for each change - at the time of the change. Hope this helps. Josh3580talk/hist 04:40, 20 December 2013 (UTC)
- OK, thank you for your response. One clarification regarding the giant deletion: it was not my intention to delete so much. It was meant to be a minor edit, as stated in the edit summary; it only turned into a major deletion due to a computer error (I was using a touch-screen rather than a mouse, and it must have been wrongly moved before my last edit, so it selected a large part of the article for deletion). I would surely have corrected it had the bot not done it before me. As for the sources, I am searching for more to add for each edited paragraph, but the bot was moving so fast that it deleted everything before I could add more. — Preceding unsigned comment added by 114.79.51.217 (talk) 13:49, 22 December 2013 (UTC)
No one at the controls?
The ClueBot false positives report page leads to this page, for which the registration has expired. User:Cobi (the bot owner) appears to be only marginally active.
ClueBot is very nice, but it does throw false positives from time to time, like here, and these need to have a live human being looking at them, figuring out what went wrong, and addressing the issue. If there's no one at the throttle, it's likely to run off the rails more and more, I'd think. Can User:Cobi be persuaded to turn this bot over to someone else if he's burnt out on it? Herostratus (talk) 03:34, 21 December 2013 (UTC)
- There's a sensitivity / specificity trade-off, which essentially means the bot is designed to make a small number of false positives; don't think of it as something going wrong that can be investigated and fixed (and certainly not as "running off the rails"). That said, it's been known for a long time that the remaining developers are largely inactive, and the detection could be improved using more-modern machine learning methods, as to what we do about that… edit: Also, that line had a leading space; this is the version you reverted to. benmoore 14:51, 21 December 2013 (UTC)
- Hmmm. I would suppose that the bot is not designed to make a small number of false positives but rather expected to. Naturally nothing's perfect, and ClueBot, which is amazingly good, isn't either. However, we have to keep in mind that each false positive falsely accuses the person of vandalism, which is a pretty serious charge. Given that, I would expect that each false positive would be looked at by a human, who would address it. There are a number of ways to address false positives, I guess. I'm not a programmer, and the stuff about machine learning is way over my pay grade, but even if the program writes and corrects its own rules and algorithms, at the end of the day some human must have written the code that allows it to do that, and so if that's working suboptimally it ultimately falls to some human to make any necessary corrections, I would think. Anyway, it's not clear to me how the robot can correct itself if it's not accepting error reports anymore. Hoping this is not too naive, I suppose some ways you can address a false positive would be:
1. Find out why it occurred and change the code so it doesn't happen anymore, or not as often.
2. Log it, look for patterns, analyze the underlying cause of the problems, and at some future time change the code to reduce the occurrence of certain classes of problems.
3. Try to find out why it occurred and be unable to do so, which happens of course.
4. Find out why it occurred, conclude that it's not possible to fix it, either at all or without an unreasonable amount of effort and/or without degrading the tool in some worse way, and accept that.
5. Do nothing, but make general-purpose soothing noises in the manner of "your complaint is important to us, be assured that our top people are on the case" or whatever.
- However, there's apparently no one with sufficient interest to even do #5, which is troubling. It would therefore be an increase in message accuracy, and therefore a user interface upgrade, to replace the message "False positive? Report it here" with "False positive? Sucks to be you" or some more formal equivalent; such a boldly stated but accurate message would possibly cause a political problem for ClueBot, so we do have a problem here, I think.
- ClueBot is very good, but very busy and powerful, and shouldn't run unattended such that false positives are accepted with no attempt to prevent or reduce them in future, or at least pretend to, I think. I wonder if the Foundation, with its full-time software developers, could absorb it into the base software as an adjunct feature or something? I'm inclined to propose that (it probably won't go anywhere, but you never know); is there any objection to my doing that? Herostratus (talk) 17:15, 21 December 2013 (UTC)
- Oh please no, do not suggest having the Foundation with their execrable programming record take over ClueBot. I'm sure the registration will be renewed quickly; this kind of embarrassing slip-up in paperwork happens to major corporations, and although it's problematic when false positives can't be reported, the bot has remarkably few false positives. I'll go jump up and down on IRC. Yngvadottir (talk) 17:37, 21 December 2013 (UTC)
- If you review the NG Userpage you'll understand why what you suggest is not feasible; it's not the simple rule-based system many people on this page imagine (i.e., there's not a system of interpretable rules and thresholds that can be adjusted; it's a neural network which "learns" its rules from the training set, making it non-trivial to reverse-engineer its output). And when I say "designed to" make false positives, that's really what I mean: to understand, consider a receiver operating characteristic curve. It's possible to limit ClueBot's false positives to a level where we almost never see them, but in doing so we reduce our true positive rate as well, and a lot of vandalism goes unreverted (a rough sketch of this trade-off follows below).
- Regarding the WMF taking hold of a vandalism bot, this is something I've assumed (/hoped) they aim to do anyway. This issue has been discussed here and elsewhere each time a problem comes up and it takes time to find any knowledgeable party; something needs to be done sooner or later for sure. benmoore 18:26, 21 December 2013 (UTC)
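To make the threshold trade-off described above concrete, here is a minimal sketch in Python. It is not ClueBot NG's code; the scores, labels, and threshold values are invented solely to show that raising a revert threshold lowers the false positive rate and the true positive rate together.

```python
# Illustration only: a hypothetical classifier assigns each edit a vandalism score.
# Raising the revert threshold cuts false positives *and* true positives.

# (score, is_vandalism) pairs, made up for this sketch:
edits = [
    (0.05, False), (0.20, False), (0.35, False), (0.55, False), (0.80, False),
    (0.40, True),  (0.60, True),  (0.75, True),  (0.90, True),  (0.95, True),
]

def rates(threshold):
    """Return (true positive rate, false positive rate) at a given revert threshold."""
    tp = sum(1 for s, v in edits if v and s >= threshold)
    fp = sum(1 for s, v in edits if not v and s >= threshold)
    positives = sum(1 for _, v in edits if v)
    negatives = sum(1 for _, v in edits if not v)
    return tp / positives, fp / negatives

for t in (0.3, 0.5, 0.7, 0.9):
    tpr, fpr = rates(t)
    print(f"threshold {t:.1f}: catches {tpr:.0%} of vandalism, falsely reverts {fpr:.0%} of good edits")
```

Choosing which point on that curve to operate at is exactly the kind of decision discussed in this thread: no threshold eliminates false positives without letting more vandalism through.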
- Yes OK sure, I won't get involved. Hmmm yes ben, I see what you're saying (without, of course, actually understanding it). However, if that's true, I wonder what is the function of the "here" in "False positive, report it here". I gather that it may be that reporting false positives doesn't do anything useful, it's just an alternative to "False positive? Don't worry about it, all this is way over your head, just go about your business" which would rub people the wrong way. Which, you know, is actually reasonable.
- Also -- again, I'm not trying to cause trouble or denigrate ClueBot, just making a point -- over at Wikipedia:WikiProject Editor Retention the lament was made that new editors' early contributions are reverted more (and more ruthlessly) than in earlier days, and this hurts editor recruitment. The main cause of this by far (if it even is a problem -- it's complicated) is other human editors, but ClueBot false positives don't help. The decision of what level of false positives to tolerate is ultimately a business decision and has to be understood in total context. That said, ClueBot is much better than a gawping civilian like me would have ever expected, and so there's no action item regarding the actual level of false positives at this time. Herostratus (talk) 19:17, 21 December 2013 (UTC)
- I can't profess to knowing the ins and outs of the "report false positive" process (Damian?), but I expect it will add the edge case to the training set (after review), which to some degree will sway the network away from making the same mistake (though the training set has thousands of ham/spam cases). So I don't believe it's a pointless endeavour, but it is presumably more of a fine-tuning instrument than a hard-and-fast "don't do that again". Regarding your second paragraph, I agree, and if the bot were under the jurisdiction of the WMF, that would presumably make that kind of decision more transparent and amenable to community input. benmoore 20:17, 21 December 2013 (UTC)
- The report interface feeds those edits, along with a bunch of other random edits, into the review interface, where people can review edits to generate a corpus of edits to be fed to the bot the next time it is trained. While any specific instance makes ever so small a change (there are tens to hundreds of thousands of edits in the corpus already), it is a useful endeavor. As to the note about more modern machine learning methods, there hasn't been much improvement in that area of research that I am aware of since ClueBot NG was written. There are some minor things that could be improved, like understanding more wikitext, but that is not in the realm of machine learning; it is rather just more inputs to the existing machine learning system. Furthermore, wikitext is not an easily parsable language, so it would take a significant amount of time to implement parsing of wikitext. The main thing that will help it is a larger corpus. And that takes man-hours to categorize huge amounts of edits -- both good and bad edits, in about the correct proportion that Wikipedia gets. -- Cobi 21:52, 21 December 2013 (UTC)
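For readers unfamiliar with how such a review-and-retrain loop fits together, here is a rough sketch of the flow Cobi describes. The class and function names are hypothetical; they are not ClueBot NG's real report or review interfaces.

```python
# Sketch of the review-and-retrain loop (hypothetical names, not ClueBot NG's code).
import random

class ReviewQueue:
    """Stand-in for the dataset review interface."""
    def __init__(self):
        self.pending = []

    def add(self, edit):
        self.pending.append(edit)

    def reviewed(self):
        # In reality, humans classify each pending edit as constructive or vandalism;
        # here each edit is assumed to already carry its human-assigned label.
        for edit in self.pending:
            yield edit

def build_training_corpus(reported_false_positives, random_sample, queue):
    """Reported edits and a random sample of ordinary edits both go through review;
    everything reviewed joins the corpus used the next time the bot is trained."""
    for edit in reported_false_positives + random_sample:
        queue.add(edit)
    return list(queue.reviewed())

# Usage: the real corpus already holds tens to hundreds of thousands of edits,
# so one report only nudges the next training run -- fine-tuning, not a hard rule.
queue = ReviewQueue()
corpus = build_training_corpus(
    reported_false_positives=[{"id": 1, "vandalism": False}],
    random_sample=[{"id": n, "vandalism": random.random() < 0.1} for n in range(2, 12)],
    queue=queue,
)
print(f"{len(corpus)} reviewed edits ready for the next training run")
```

The point of the sketch is only that a single report enters a large, human-reviewed corpus, so its effect on the next training run is gradual rather than an immediate rule change.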
Reporting error
Yea, this robot is so retarded! I undo vandalism reverts like on jizz and it stupidly reverts me. Since WP:CIVIL and WP:NPA doesnt protect bots from policy, u r the stupidest invention I have EVER seen. Dragonron (talk) —Preceding undated comment added 18:03, 24 December 2013 (UTC)
- Responded on your talk page; the bot was right. Yngvadottir (talk) 21:19, 24 December 2013 (UTC)
Dehumanizing
I rarely noticed vandalism before. Just another toy for the Wikileetists to wield over the "anyone" that is SUPPOSED to be able to edit wikipedia. If you want to make it your own private encyclopedia then do so but stop trying to pretend that "anyone" can edit it because between the bots and their wiki-nazi overlords us commoners can't. — Preceding unsigned comment added by 75.164.232.68 (talk) 22:14, 24 December 2013 (UTC)
#Dataset Review Interface
There is an incorrect link under #Dataset Review Interface. --Greenmaven (talk) 23:25, 26 December 2013 (UTC)