Misplaced Pages

:Bots/Requests for approval/BetacommandBot Task 9 - Misplaced Pages


This is the current revision of this page, as edited by John Vandenberg (talk | contribs) at 11:27, 12 February 2012 (replace C: with CAT:; see meta:Requests for comment/Wikimedia Commons). The present address (URL) is a permanent link to this version.


The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

BetacommandBot Task 9

Function: Replacing all images on en.wiki with commons versions that have the same SHA1 hashes (gathered from toolserver queries), double-checked with MD5 hash checks when running. Once all en.wiki usages of an image are converted to the commons name, the bot tags the local image as a commons dupe.
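The two-stage hash check described in the function summary can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the function names are hypothetical, and it assumes the raw bytes of both files are already in hand.

```python
import hashlib


def file_digests(data: bytes) -> tuple[str, str]:
    """Return the (SHA-1, MD5) hex digests of a file's raw bytes."""
    return hashlib.sha1(data).hexdigest(), hashlib.md5(data).hexdigest()


def is_exact_duplicate(local: bytes, commons: bytes) -> bool:
    """Treat two files as exact copies only when both digests agree."""
    return file_digests(local) == file_digests(commons)
```

Because the digests are computed over the file bytes, a PNG and an SVG of the same drawing would not match; only byte-for-byte copies do.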

Discussion

Will this end up tagging a PNG image if the commons version is an SVG? And will the SHA1 and MD5 hashes ensure pixel by pixel similarity under all circumstances? Also if the description is different, what will happen? MBisanz 03:25, 17 April 2008 (UTC)

Images of different file types don't have the same hashes; only exact copies have the same hashes. β 03:27, 17 April 2008 (UTC)
I have an article for you to test this on, if you want. In general, how do you know a local copy isn't needed for some reason, for instance as an anti-vandalism measure (eg. DYK)? In some cases the local copy may not have the same name. Gimmetrow 03:37, 17 April 2008 (UTC)
If there are templates used for DYK, I can add them to the bot's ignore list, but in general {{NoCommons}} works. As for different names, the bot replaces them with the new name. β 03:39, 17 April 2008 (UTC)
Will it then correct any articles that may have included the renamed picture? Q 03:41, 17 April 2008 (UTC)
Define renamed image. What Gimmetrow was saying was that images on commons may not have the same name as the version on en. What I said was that I would update en with the commons name if they were different. β 03:43, 17 April 2008 (UTC)
Well, say a picture is uploaded to commons as ILoveBeans.jpg, and then some user uploads it here as I_Love_Beans.jpg. From what I can tell, your bot will say "Hey, these two pictures are the same, let's delete this local one and use the commons pic". So what happens to articles that were including I_Love_Beans.jpg? Q 03:46, 17 April 2008 (UTC)
I replace I_Love_Beans.jpg with the commons name (ILoveBeans.jpg). (I thought I said that  :/) β 03:47, 17 April 2008 (UTC)
Oh, ok, ignore me, I was just equating replace with something else :) Q 03:53, 17 April 2008 (UTC)
If an image from commons is used for DYK, a local copy is uploaded to avoid vandalism. It's supposed to have {{c-uploaded}} and usually has the same name, but not always. Such images shouldn't be replaced by the commons name. Gimmetrow 03:50, 17 April 2008 (UTC)
Images with {{c-uploaded}} are now skipped. β 03:52, 17 April 2008 (UTC)
Presumably such images would also not be tagged as dupes because they are linked from a protected page and the bot therefore couldn't replace all links with the commons link. Would it get halfway through replacing the links before it discovered the protection? Bovlb (talk) 07:30, 17 April 2008 (UTC)
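The renaming step discussed in this thread (swapping I_Love_Beans.jpg for ILoveBeans.jpg wherever it is used) might look something like the sketch below. It is an assumption-laden illustration: `replace_image_name` is a hypothetical helper, and it relies on the wiki treating spaces and underscores in file names as equivalent and the first letter as case-insensitive.

```python
import re


def replace_image_name(wikitext: str, old_name: str, new_name: str) -> str:
    """Replace every usage of old_name in wikitext with new_name."""
    name = old_name.strip(" _")
    head, tail = name[0], name[1:]
    # The first letter of a file name may appear in either case.
    if head.isalpha():
        head_re = "[" + re.escape(head.upper()) + re.escape(head.lower()) + "]"
    else:
        head_re = re.escape(head)
    # Spaces and underscores are interchangeable inside the name.
    tail_re = "[ _]+".join(re.escape(part) for part in re.split(r"[ _]+", tail))
    # The lookarounds stop a match inside a longer name such as Not_I_Love_Beans.jpg.
    pattern = re.compile(r"(?<!\w)" + head_re + tail_re + r"(?!\w)")
    # A function replacement avoids backslash processing in new_name.
    return pattern.sub(lambda match: new_name, wikitext)
```

A real run would also have to cope with protected pages, as raised above, which this sketch does not attempt.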

Looks pretty solid to me. You might want to look around a bit for other templates like {{NoCommons}}. How common is its use in the described situation? SQL 03:53, 17 April 2008 (UTC)

Another question, will it tag the old enwiki images for CSD? SQL 03:54, 17 April 2008 (UTC) Reading the function summary twice should be required. SQL 03:55, 17 April 2008 (UTC)

It seems to me that the function request is missing the second half of the process. As I read the request, it relates to replacing images that are found in the Image name space and renaming them, if necessary, to match the name used in Commons. No mention seems to be made of the articles in the Article name space that make use of the images. Can the function request be expanded to include replacing (updating) all image links where the image name has been changed to match that used in Commons? Dbiel 04:11, 17 April 2008 (UTC)

It will replace all usages of en.wiki's copy with the commons name regardless of the namespace (a complete replacement). β 04:14, 17 April 2008 (UTC)
Thanks for the reply; I thought it would, but it just was not clearly stated in the request or in the discussion that followed. So thanks for clearing that up. It sounds like a good use for the bot. Dbiel 04:59, 17 April 2008 (UTC)
Ok, my questions are answered. As long as commons will be able to get deleted image page stuff on request, then I have no issue. MBisanz 04:31, 17 April 2008 (UTC)

Could the bot also do duplicates on only en.wiki? Example: Image:Tunday_Akintan.jpg and Image:Tunday.jpg. My count has us at about 4,000 duplicates on en.wiki alone. Perhaps simply give favor to the one with more image links? --MZMcBride (talk) 04:39, 17 April 2008 (UTC)

That would be a separate request that is down the road. β 04:43, 17 April 2008 (UTC)

I fundamentally disagree with adding yet another task which will produce a large number of edits to this already function-heavy bot. I would request that this new task be run under a new bot account. AKAF (talk) 06:44, 17 April 2008 (UTC)

Sounds like a good task, but I echo AKAF in that there shouldn't be a big deal with moving it to another account. — Werdna talk 06:58, 17 April 2008 (UTC)

This sounds like a good and helpful task. Please create a new bot account for it instead of adding yet another task to BetacommandBot. rspeer / ɹəədsɹ 07:22, 17 April 2008 (UTC)

Do you have an estimate on how many images are affected, and how many articles? Bovlb (talk) 07:30, 17 April 2008 (UTC)

Suppose an image on en.wiki is being validly used under fair use. Someone (wrongly) copies it to Commons. The bot then deletes the original image. Then, on Commons, the copied image is deleted because fair use images are not permitted. The image is then lost from its original fair use article. What is the best approach? Should the bot not move images for which fair use is being claimed? Thincat (talk) 09:32, 17 April 2008 (UTC)

Reading all this again (and I am not familiar with these sort of operations) will the bot merely edit articles to point to identical commons images, or will it additionally flag en.wiki images for deletion, or delete identical en.wiki images? Thincat (talk) 10:58, 17 April 2008 (UTC)

As stated in the second sentence of the summary, it will tag them (i.e. flag them for deletion). — Werdna talk 10:59, 17 April 2008 (UTC)

Thank you. So, regarding my "fair use" issue, I'll try and find how this is dealt with for images tagged "as a commons dupe". Any pointers? Of course, the bot wouldn't create a new problem but it might amplify any existing problem. Thincat (talk) 11:11, 17 April 2008 (UTC) Referring to Category:Images on Wikimedia Commons and Category:Images with the same name on Wikimedia Commons as of 17 April 2008 (are these the best references?), there seem to be no warnings about not deleting images with an apparently valid fair use claim (and possibly deleting the commons duplicate). The instructions should be improved before the bot is run or the bot should not handle images with a fair use claim. Thincat (talk) 11:30, 17 April 2008 (UTC)
Images used under fair use locally and then copied to Commons don't last long if we're made aware of them (we usually are), and deletion occurs very swiftly. It would be nice, though, to have a bot that makes us Commons administrators aware of duplicate images that are used under fair use here and exist on Commons either without a licence tag or with a different licence. BCBot could make a list of potential copyright problems so that en.wp and Commons admins can decide whether to delete the local image or the Commons image. If BCBot finds an image here on en.wp that's not under a free licence and finds a potential duplicate on Commons, then instead of tagging the local image for deletion and changing the links to the image, it could add a template to the image that lists the name of the duplicate on Commons and add the image to a category so that we may examine the images. Only identical images with the exact same licence on both en.wp and Commons should be tagged for deletion here and have their image links changed; everything else should be looked at to see which image was uploaded first, whether there's deleted history, and such. Nick (talk) 16:28, 17 April 2008 (UTC)
I'll add in a check to skip pages with {{non-free or {{non free, which should filter out all fair use media. But as for creating a new account, I don't really see a need; it just makes my life harder. β 12:44, 17 April 2008 (UTC)
Thank you, I think this is the best way to deal with fair use claims. There is clearly a problem with duplicated images where one has a fair use claim on en-wiki. Perhaps a bot listing of such images might help towards manual investigation? Thincat (talk) 12:54, 17 April 2008 (UTC)
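The exclusion rules agreed so far ({{NoCommons}}, {{c-uploaded}}, and any template whose name begins with "non-free" or "non free") could be checked along these lines. This is only a sketch: `should_skip` and the two constants are hypothetical names, and the template matching is deliberately simple (it does not resolve template redirects).

```python
import re

# Template names that exclude an image outright (compared case-insensitively).
EXACT_SKIP = ("nocommons", "c-uploaded")
# Name prefixes that mark fair use media.
PREFIX_SKIP = ("non-free", "non free")

# Captures the template name between "{{" and the first "|" or "}}".
_TEMPLATE_NAME = re.compile(r"\{\{\s*([^{}|]+?)\s*(?:\||\}\})")


def should_skip(wikitext: str) -> bool:
    """Return True if the image description page carries an excluding template."""
    for match in _TEMPLATE_NAME.finditer(wikitext):
        name = match.group(1).replace("_", " ").lower()
        if name in EXACT_SKIP or name.startswith(PREFIX_SKIP):
            return True
    return False
```

Keeping the lists as data makes it easy to add further templates if more exclusions turn up during a trial.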


There is reasonable consensus that another account should be created, for a multitude of reasons. My approval is conditional on a new account for that part of the bot. — Werdna talk 12:48, 17 April 2008 (UTC)

And since you've already been running the bot from your user account (I count about 700(?) edits in your user contributions, but since you're also running the defaultsort bot and the crosswiki link removal bot, it's a bit hard to be sure), I don't see the difference between migrating the code to betacommandbot or to betacommandbot_t9. AKAF (talk) 13:09, 17 April 2008 (UTC)
Please stop making unfounded attacks against me. You have no clue what you're talking about. Instead of making assumptions, why not ask me? I do use some python tools, but who says they are bots? There is no proof, so shut up. β 13:17, 17 April 2008 (UTC)
Let me put it like this: You've been doing exactly this task in an automated fashion with hundreds of edits with exactly the same edit summary from your user account. You now want to add this task to betacommandbot. I think it's fair to ask why WP:DUCK doesn't apply? AKAF (talk) 13:21, 17 April 2008 (UTC)
You're throwing around assumptions that I am running un-approved bots on my main account. I take offense to that. What I do is fairly simple: I use a semi-auto script like AWB to test the idea out. I did the same thing when I started non-free image tagging. So please stop making personal attacks against me without proof. β 13:27, 17 April 2008 (UTC)
Okay then, I'm sorry. I actually mainly wanted to imply that you already had a tested code, but perhaps a test of logic would be more accurate. If that's the case then you can simply point people with questions to the list of edits which you've already made. I still think it would be better if you put the bot on a new account though. I appreciate that it's more work for you, but I think it's important, for reasons which I clarified a while back on your user page. AKAF (talk) 13:50, 17 April 2008 (UTC)

Is the trial/test period of this bot going to be foregone due to the tests done as a user? MickMacNee (talk) 14:43, 17 April 2008 (UTC)

I suppose that would depend on what was run. Was it the same tool, except in manual mode? (I've got a plugin for mine that I can drop in to have my tools force me to review diffs, etc.; it's very useful for debugging or evaluating a task, by the way.) I'd really still lean towards seeing a few days' test (preferably linked here in the edit summary), so that any immediate issues may be brought to the surface while we're all still paying close attention. Additionally, I agree with Werdna regarding splitting this task to a new account.
Yes, I know it's a pain, and my own framework works the same way: all the core code has to be moved to a new installation. One thing that worked for me, and that may or may not work for pywiki, was creating a new directory where my new bot would go (last time it was User:SQLBot-Hello) and symlinking all the relevant code files over, making sure to skip the files that stored the configs, cookies, etc. Anyhow, I hope this helps. SQL 20:41, 17 April 2008 (UTC)
I really don't want to copy my whole install (100+ MB) and other shared code that I am constantly tweaking and updating. There is no need for another account: 98% of the NCC tagging is already done (I think I found 20 images on my last run), BCBot is not really doing much, and when BCBot is operating I am normally nearby and am pinged (loudly) when someone edits either of the talkpages. There really isn't a reason to split accounts other than to make my work more difficult. β 21:13, 17 April 2008 (UTC)
I'm still not sure whether it'd work with pywikibot or not, but I wrote a shell script to help symlink all files in one dir to the CWD, which might help you. (You'd just have to delete the cookiejars, configs, etc. afterwards.) SQL 22:29, 17 April 2008 (UTC)
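The symlinking approach described above could be sketched in Python roughly as follows. This is not the shell script mentioned in the thread; `link_shared_code` is a hypothetical helper, and which files count as per-account configs or cookies is left to the caller as an assumption.

```python
import os
from pathlib import Path


def link_shared_code(source_dir, target_dir, skip_names):
    """Symlink every entry of source_dir into target_dir, except skip_names.

    Returns the names that were linked, in sorted order.
    """
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    linked = []
    for entry in sorted(source.iterdir()):
        destination = target / entry.name
        # Leave per-account files (configs, cookies) and existing entries alone.
        if entry.name in skip_names or os.path.lexists(destination):
            continue
        os.symlink(entry.resolve(), destination)
        linked.append(entry.name)
    return linked
```

With this layout the shared code lives in one place, so tweaks to it apply to every bot directory at once, while each directory keeps its own login files.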

I don't see any technical issues here (beyond getting the right list of templates that cause an image to be skipped). It appears Betacommand has already manually reviewed enough edits to be confident the script is working correctly. If everyone thinks the exclusion list is sufficient (right now it looks like nocommons and non-free images are excluded), are there other issues remaining besides a new account?

Betacommand tells me he is planning to start with just 500 per day, so if there are issues it won't take long to fix them. — Carl (CBM · talk) 15:43, 17 April 2008 (UTC)

I would think that if Betacommand simply added the task number as part of the edit summary that the need for a separate bot account would be unnecessary as the edits would be clearly identified as being done as a separate bot task. Example: Bot task 9 ........ Dbiel 22:18, 17 April 2008 (UTC)

Well there's an excellent idea, how does this work for the folks who really really wanted him to run it under a new account? SQL 22:32, 17 April 2008 (UTC)

Isn't the point of having multiple accounts allowing one account to be blocked as malfunctioning without the others? — Werdna talk 01:01, 18 April 2008 (UTC)

Well, the multiple tasks thing could work well, with edit summaries, and something like User:SQLBot/tagem.run (Read say every 50 edits, could completely eliminate the need to block BCBot anymore....) But, I'm not trying to tell anyone how to run their bot... There does seem to be a fair argument for breaking it off, it seems, however. SQL 01:59, 18 April 2008 (UTC)
As I have repeatedly and repeatedly and repeatedly said, I am sitting by the computer when the bot is running. Leave me a note on my talk page, and the bot shuts down fairly quickly if there is an issue. β 02:46, 18 April 2008 (UTC)
PS: I used to have BCBot shut down when it got new messages, but people don't know how to read and kept posting notices there, so I had to disable that function. β 02:47, 18 April 2008 (UTC)
Past history has shown Betacommand's preceding statement to be true. The bot has a history of reliability. The main issues with his bot have related to tasks that have not been specifically approved for his bot, and even then, shutting down the individual function while keeping the rest of the functions running has not been a major problem. Betacommand is rather headstrong and independent and can be a real pain, but when it comes to bot operators, you will be hard pressed to find anyone who maintains and monitors his bot any better. His track record speaks for itself. And in his case, it would seem to be best to allow a single account until running a single account actually becomes a problem. At that point we would need to insist on separate accounts, but until then, why create more work for him and increase the chance of errors caused by trying to keep multiple accounts in sync with each other? What is so different about this task that it needs a separate account, while we continue to allow all the other tasks to run on a single account? If it is working, don't "fix" it. Dbiel 03:23, 18 April 2008 (UTC)
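The per-task run page idea mentioned above (something like User:SQLBot/tagem.run, read every 50 edits or so) could work along these lines. This is a sketch only: `run_task` and `is_enabled` are hypothetical, and how the run page is fetched and what content counts as "enabled" are assumptions left to the caller.

```python
from typing import Callable, Iterable


def run_task(
    edits: Iterable[Callable[[], None]],
    is_enabled: Callable[[], bool],
    check_every: int = 50,
) -> int:
    """Perform edits in order, re-checking the task's run page periodically.

    is_enabled is called before the first edit and then after every
    check_every edits; the task stops as soon as it returns False.
    Returns the number of edits performed.
    """
    done = 0
    for edit in edits:
        if done % check_every == 0 and not is_enabled():
            break
        edit()
        done += 1
    return done
```

Under this scheme a single task can be stopped by editing its run page, without blocking the account and halting the bot's other tasks.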

There is no chance of errors as a result of running separate accounts (nothing needs to be 'synced'). There have previously been problems in blocking Betacommandbot for an unauthorised task due to fears of damaging the so-called "mission-critical" tasks on the account. And it is not difficult to reprogram anything to use multiple accounts. — Werdna talk 06:04, 18 April 2008 (UTC)

This is, indeed, the reason. Even when BetacommandBot has run unapproved tasks, admins have been reluctant to block it, because blocking it involves blocking many unrelated tasks, many of which are useful. Running a single account has been a problem many times in the past. It would be preferable if all the tasks were separated, but this is at least a place to start. rspeer / ɹəədsɹ 08:06, 18 April 2008 (UTC)
Agreed. Along with confusion on all sides as to what tasks are approved for an account, when the number of tasks becomes large. Currently betacommandbot has 13 BRFA linked from the bot, which an interested user would have to locate and read. I wouldn't ask for a new account for every new task, but a new account for each major task, and for each group of related minor tasks allows all parties to keep track. AKAF (talk) 10:54, 18 April 2008 (UTC)
All of these issues and remedies were proposed in the last arbitration case, and ignored/declined. MickMacNee (talk) 12:39, 18 April 2008 (UTC)
...As it's a matter for the community to work out, which, I think we're actually doing pretty well at. Off-topic discussion aside, there seems to be a good, workable solution below, that may preclude the need to split this bot into separate accounts, do you guys have any comments on that? SQL 19:32, 21 April 2008 (UTC)

break

I'm in the school of thought that thinks that each edit that falls under a specific task should have a link to that task. It doesn't have to be any longer than ], which gives task 1, with that page describing the task, and new pages being created for each task number. It is not so much separate accounts, but having enough information in the edit summaries to be able to separate out the different edits made under different tasks. Carcharoth (talk) 12:58, 18 April 2008 (UTC)

I would accept this as a second-best option. Part of the reason that I would like to see betacommand in particular split off major new functions, is that I see many of his past problems as stemming from a lack of organisation, in particular regarding clear communication to others about his bot's functions. Therefore I don't necessarily accept betacommand's argument that creating a new bot account is more work for him, since I suspect that the extra work he invests now will be repaid with a reduced workload dealing with complaints and comments later. I think we would all like to find ways to reduce betacommand's blood pressure, and it is my hope that the confusion and vitriol on his talk page can be reduced by a one-bot one-function policy.AKAF (talk) 13:17, 18 April 2008 (UTC)
AKAF, there is a very easy way to stop the vitriol: admins just need to enforce Misplaced Pages:CIVIL and Misplaced Pages:NPA, something they don't like doing when it's directed at me. As for the one bot/one function, that is crap. Handling standard questions is not an issue. My philosophy in regards to bots is simple: one operator, one flag. I have only really had issues with the NFCC#10c tagging because we had a very large percentage of users who did not know the policy or did not want to follow it. β 15:01, 18 April 2008 (UTC)
Betacommand, my comment about the edit summary was in regards to the redlinked category issue. Similarly, there was the SVG issue. See the list at User:Nilfanion/SVGlist. I asked East if he would be willing to undelete, and he replied here, saying "you'll have to revert the speedy deletion tag very fast before somebody else working CAT:SD deletes it". Would you be willing to have your bot edits rolled back as the images are undeleted, or even get your bot to do that itself? Also, would you be willing to do the work to get a full list of the redlinked categories your bot removed? I think the list you provided was only partial, but if you could provide a guaranteed list of all the redlinked category removals, that would be good. If you can't do this, then please consider organising your bot better in future so that (a) you can easily reverse the actions if needed; and/or (b) provide a list of the actions it provided (ie. make the log of its actions more understandable and easy to analyse and filter). At the moment, I can't make progress on these issues, and I need your help. Carcharoth (talk) 15:45, 18 April 2008 (UTC)
If you know how to read BCBot's edit summaries (each task has fairly unique edit summaries), it's not that hard. If BCBot has an error (they are known to happen for one reason or another), I do mass revert BCBot, using the bot account. As for the SVG issue, given the amount of time that has passed, I'm not sure how many replacement reversions are possible, but if I were an admin or had access to undelete and deleted contribs, I could script a mass un-delete tool. But I will never give admin bots to users without full and 120% testing, and complete trust. β 16:06, 18 April 2008 (UTC)
"If you know how to read BCBot's edit summaries (each task has fairly unique edit summaries), it's not that hard." - Betacommand, this would be a lot easier if you provided a list of what the edit summaries are for each task. As far as I can see, the form of your edit summaries is inconsistent. I've tried to analyse them and filter out the edits, but I've failed. Which is why I'm asking you. Again. "As for the SVG issue, given the amount of time that has passed" - this should not be a problem. If you think it will be a problem, do the analysis to prove it. ie. Actually go and look at the images and articles they were used in, rather than shrugging your shoulders and walking away. Carcharoth (talk) 16:37, 18 April 2008 (UTC)
This is not the place for this conversation. It is not relevant to the given task. If you would like to have this discussion, you know where my talk page is. β 16:51, 18 April 2008 (UTC)
The relevant bit for this task is: if you don't put this on a separate account, what edit summary will you use and what sort of information would you consider putting in the edit summaries? Carcharoth (talk) 17:19, 18 April 2008 (UTC)
I've done some semi-auto versions of this on my main account; if you take a look there you can see the two edit summaries that I use. β 17:37, 18 April 2008 (UTC)
How are we meant to find these diffs on your main account? The easiest thing would be for you to find them and provide diffs and quotes of the edit summaries. Someone quoted this above. "replacing image with commons copy" could be a tad more informative. What about the suggestion that you use the bot task number in the edit summary? Carcharoth (talk) 21:44, 18 April 2008 (UTC)
Here is one example. If someone wants to write a quick summary, I'll link to that in my summaries. β 22:14, 18 April 2008 (UTC)
See below for my suggested edit summaries and documentation. Carcharoth (talk) 17:09, 21 April 2008 (UTC)
This response by Betacommand shows that he does not understand why his bot has caused problems, even after he has been told by many people and gone through an arbitration case over it. Beta, it's not uncivil to say you can't do whatever you want with bots, that you have to be accountable for their actions. Here, a large number of people want you to start a separate account, for a number of reasons, so please do that. rspeer / ɹəədsɹ 15:58, 18 April 2008 (UTC)

I have to agree about the need to defer that task to a distinct clone of BCbot. Also, I'm not crazy about doing this too fast: there's still need for admin intervention to delete the en.wiki copies and if the bot creates a sudden huge backlog, the ensuing deletion will end up being sloppy which is a pretty bad idea. We still need to make sure that the commons images are categorized and that they include the original upload log (when the images were first uploaded to en.wiki). Obviously this is not an urgent task and though it makes sense in the long run, it really doesn't matter whether that cleanup is done in a week or in three years. Pascal.Tesson (talk) 12:29, 21 April 2008 (UTC)

FWIW, I really don't see a good rational reason why it should be denied based on under what user it runs as. Q 14:22, 21 April 2008 (UTC)

if there are any objections except for the account name of the bot that operates the task, I will begin trial runs in 48 hours. β 14:23, 21 April 2008 (UTC)
I think there is a word "not" missing somewhere. Under which account do you propose running the trial? Thincat (talk) 14:28, 21 April 2008 (UTC)
the bots, there is an argument that wants this task to be done under a new bot name. β 14:29, 21 April 2008 (UTC)
OverlordQ, this is an astounding comment, coming as it does at the end of a rather long list of good rational reasons why this task should be split off into different tasks (and a why it shouldn't). This BRFA is somewhat unusual, in that people from outside the BAG are actually trying their best to participate and give some feedback as to likely objections of the wider community. The BAG has previously indicated that this might be a good way to avoid previous debacles. If the BAG finds this extra input just gets in the way of doing things as they have been done, then future comment will be reduced. However I would note that this has been the source of some community dissatisfaction in the past. I would like to see this task run on a new account for a number of reasons, listed above. However I think that there are not a few people who would accept any token from betacommand and the BAG that the errors of the past have been acknowledged (even if not corrected). I see some compromise from betacommand above, reacting to questions about potential problem areas, which I find reassuring. Betacommand does not, however, currently appear to think that the recent spate of complaints were an indication of any deeper problem. It is my hope that both betacommand and the BAG will continue the recent improvement toward being better able to respond appropriately to concerns from the community about their use of bots. Best of all would be before the bots go live. AKAF (talk) 16:05, 21 April 2008 (UTC)
Creating a new account for this task doesn't do anything. Currently BCBot doesn't really have much going on. I've spent the last year involved with NFCC tagging; that has been handed over to other bots, so my involvement is minimal. If you post to my talkpage or the bot's, use my name in an edit summary, or post to one of several pages, then if I'm near my PC I know about it within seconds, so shutting down BCBot is not that hard: post a note to my talkpage with the error and a link to a diff and it will be taken care of. I have the ability to mass revert BCBot if something goes wrong, so that is not an issue. The only issue I really see right now is politics and wikilawyering. β 16:12, 21 April 2008 (UTC)
It's not about being able to get the bot shut down at a moment's notice, it is about more careful documentation to enable others to understand what your bot is doing (and hence reduce complaints), and make it easier for you (or at least others if you walk away from any problems or stonewall requests for you to revert any edits made by your bot) to fix any problems that are spotted later. See User:BHGbot and the way the tasks are laid out: 1, 2, 3, 4, and 7. If you had done this for the SVG tagging and for the redlinked categories, things would have been much easier to fix. Would you like someone else to write such a page for you? Carcharoth (talk) 16:34, 21 April 2008 (UTC)
I'm working on fairly detailed file logging for this task (I had a similar one for NFCC tagging). Because I don't know the size of this logfile, I'm not planning on posting it, but if people want I can post excerpts on request. β 16:37, 21 April 2008 (UTC)
Any reason it can't be posted on-wiki? Linking to versions in the page history when you add or clear a set of data is an efficient way of maintaining on-wiki data without having huge live pages that people load up each time they view it (though if I've made myself look silly here, with the resources needed to load diffs or to wipe a page being much more than incremental additions, just say so). Anyway, if I write an edit summary, to make filtering out the edits due to this task easier, would you use that? I think you agreed to that up above. Carcharoth (talk) 16:47, 21 April 2008 (UTC)
This will be a per-run logfile. I'm not sure how fast it could grow, but some of my NFCC logs were easily 15+ MB of text. Due to the sheer possible size (I don't know; I've never used the logging yet), that is not something appropriate for the wiki. β 16:51, 21 April 2008 (UTC)
Well, OK, forget that then (what is in the log files that makes them so large?). See below for my suggested edit summaries and documentation. Carcharoth (talk) 17:09, 21 April 2008 (UTC)
When I log I go for all details so that finding a problem is easy. β 17:34, 21 April 2008 (UTC)
About the only valid reason I saw was that it might be hard to tell what task it was working on, I really don't find the 'make it hard to block' and other arguments hold their water. Q 21:23, 23 April 2008 (UTC)

Edit summaries

  • example - suggest:
    • "image is on commons, tagging with ] (see ])" (or however you normally indicate a template name in an edit summary - use {{CommonsNow}} if that is shorter), which renders as:
  • example - suggest:
    • "replacing ] with ] (see ])" (using the image names, to save people clicking a diff to find out this information - not needed for the first example, because the image name is part of the log), which renders as:

Whether or not you mark them as minor is up to you. Would those edit summaries be OK? I realise long image names may mess things up, and you may not want to check for long image names, but would you consider this? You would also need to create User:BetacommandBot/9, but as I said, anyone can do that for you, if you don't have time. This would address my concerns, which are not about stopping the bot or it doing too many tasks, but about tracking what it is doing, and which edits are related to approved tasks and which ones are not. Carcharoth (talk) 17:09, 21 April 2008 (UTC)

other than changing (see ]) to ] and placing that at the beginning, (to shorten and avoid it getting cut off if the file names are too long) I can implement that. β 17:38, 21 April 2008 (UTC)
Thanks. No more objections from me - not that they would have counted for anything anyway! :-) (please don't take that the wrong way!) Carcharoth (talk) 19:20, 21 April 2008 (UTC)
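The edit summary scheme agreed here (task link first, so it survives if long file names cause the summary to be cut off) might be built as in the sketch below. The names are hypothetical, the link text in the usage line is only an illustration based on the proposed User:BetacommandBot/9 page, and the 255-character default is an assumed limit rather than a confirmed one.

```python
def build_summary(task_link: str, action: str, max_length: int = 255) -> str:
    """Put the task link first so truncation only ever trims the action text.

    max_length is an assumed limit; adjust it to the wiki's actual
    edit summary length limit.
    """
    summary = f"{task_link} {action}"
    if len(summary) <= max_length:
        return summary
    return summary[: max_length - 3] + "..."


# Illustrative usage with hypothetical link text and file names.
example = build_summary(
    "[[User:BetacommandBot/9|Task 9]]",
    "replacing I_Love_Beans.jpg with ILoveBeans.jpg",
)
```

Because every summary for the task starts with the same link, the task's edits can be filtered out of the bot's contributions by that prefix.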
This seems like a good compromise to me. Are there any objections to a trial run, using the edit summary method of identification, run under the account User:BetacommandBot? My reasoning for proposing this, is that I believe the main reason (hard to separate tasks) has been addressed. Betacommandbot no longer does NFCC tagging (It seems about 250 edits per day now, as opposed to up to 2000-10000 per day), and, with the edit summary method it would make it very easy to tell them apart. I would suggest, that BetacommandBot's other tasks be labeled similarly. What's everyone think? SQL 19:37, 21 April 2008 (UTC)
Reading the whole thing, I still don't see what the problem is with running the bot on a separate account. Logging in and out of Misplaced Pages is not a hard thing to do for a well written bot. The assertion that it would complicate anybody's life seems to indicate either misunderstanding of what bots can do, or stubborn weaseling out of community consensus. I'm not going to make the judgment which answer is more likely. Zocky | picture popups 05:52, 22 April 2008 (UTC)
Zocky, accusations aside, have you tried this with a pywikipedia bot? While I don't have as extensive of a setup as Betacommand, and I certainly do not speak python, I tried this, and found his statements to be true. Am I doing something wrong? How's yours set up, if I am? SQL 06:06, 22 April 2008 (UTC)
Someone has already suggested a way to use symbolic links to run multiple pywikipedia bots on the same version of the code. It's not hard. I don't like the edit summary option, because it doesn't address most of the concerns that have been brought up here. rspeer / ɹəədsɹ 05:18, 23 April 2008 (UTC)
(That was me :) I didn't try them in my run, however) SQL 23:42, 23 April 2008 (UTC)
What rspeer said, plus, who says that this all has to run from the same code? Python has the import command, so any shared code can be imported, and different tasks should be run as separate programs. Zocky | picture popups 09:36, 23 April 2008 (UTC)
Zocky, thanks for showing you have no clue how Python works. All of my scripts are separate files; the issues happen when importing. I think I know my code. The issue happens due to the centralauth nature of the program and it wanting to share the same login throughout the code. β 14:56, 23 April 2008 (UTC)
It doesn't matter how Python works. Logging in and out of websites is a simple operation. If your code can't handle it easily, then it's broken. Zocky | picture popups 19:45, 23 April 2008 (UTC)
You have no clue what you're talking about, so please don't spread nonsense and insults. β 21:19, 23 April 2008 (UTC)
So in one python install, all scripts have to use the same login? Even if run at different times? MickMacNee (talk) 22:30, 23 April 2008 (UTC)
It's not a problem with Python, it's a problem with Pywikipedia. But computers do what you ask them to, and there are a couple of ways I can think of to solve this:
  1. Use a separate installation of pywikipedia for each bot. BC objects to this because of the difficulty of updating it, which I understand. So:
  2. Use symbolic links to create different directories that look like separate installations of pywikipedia, but share all the same core files, and differ in only the login settings and the script that runs.
  3. Use a single directory, but change config.py so it sets a different value of usernames depending on what script is being run. One way to distinguish this is to check sys.argv, which contains the name of the running script.
rspeer / ɹəədsɹ 23:04, 23 April 2008 (UTC)
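Option 3 can be sketched roughly as follows. This is only an illustration of the `sys.argv` idea: the script names, account names, the `account_for` helper, and the nested layout of `usernames` are all assumptions, and the thread itself disputes whether the approach fits Betacommand's setup.

```python
import os
import sys

# Hypothetical mapping from script file name to bot account. Neither the
# script names nor the account names come from the discussion.
ACCOUNT_BY_SCRIPT = {
    "commons_dupes.py": "ExampleBot (task 9)",
    "other_task.py": "ExampleBot",
}
DEFAULT_ACCOUNT = "ExampleBot"


def account_for(argv):
    """Return the account name for the script named in argv[0]."""
    script = os.path.basename(argv[0]) if argv and argv[0] else ""
    return ACCOUNT_BY_SCRIPT.get(script, DEFAULT_ACCOUNT)


# In a pywikipedia-style config file, this is where the per-site username
# table would be filled in. The nested-dict layout is assumed here.
usernames = {"wikipedia": {"en": account_for(sys.argv)}}
```

The lookup happens once, when the config module is first imported, so it only separates tasks that are started as separate processes.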
You guys can talk till you're blue in the face; like I have said, I'm not separating this task. If you want to stop playing wikipolitics and actually help improve the encyclopedia, let's do it, but I will not pander to wikilawyering and politics. Now can we move on and actually do something productive? Symbolic links won't work, and your idea to change config.py makes no sense if you understand Python. β 23:09, 23 April 2008 (UTC)
I do understand Python, Beta. I know that with BAG's emphasis on technical expertise, it's easy to write off someone's views by claiming they don't understand the code, but that's not going to work on me. I've written in Python for about 8 years, and I get paid to write Python and teach other people how to write it. I've also written several wiki bots using pywikipedia (to maintain private wikis). And I just experimented with the sys.argv-based version, and found that it worked exactly as I described. You are not the only one with technical expertise here, so why not try taking a simple suggestion from someone else once in a while? rspeer / ɹəədsɹ 23:34, 23 April 2008 (UTC)
Please stop telling me how you think I should program my bot. Your idea would work if it was a monodimensional program like what you are working with. I personally use a polydimensional program design, so that won't work. Like I have said, I'm done with the wikipolitics, and people trying to tell me how to code my bots when they have no fucking clue about my program design. I am not going to split the bot, end of story. So let's get the politics and stupidity out of the way and get this task going. β 00:34, 24 April 2008 (UTC)
Polydimensional program design? Impressive terminology. Too bad it doesn't mean anything. And when I say "anything", I mean Google has never heard of it. This finally convinces me that you simply do not know how to make the bot use separate accounts for separate tasks, and you're embarrassed to admit. Zocky | picture popups 00:52, 24 April 2008 (UTC)
There may not be much on Google, but it does exist; see "FISh supports polydimensional programming" and "but its typical use is for supporting polydimensional programming". Dbiel 01:33, 24 April 2008 (UTC)
I don't think that's exactly what BC is talking about, but thanks for the links. Very interesting. Zocky | picture popups 01:38, 24 April 2008 (UTC)
Zocky, polydimensional program design is based on a modular coding design that wraps easy plug-in play code with an adaptive interface. There is a reason that it's not on Google: my employer is developing the design for use in their software. Since I don't have several hours to explain the theory and implementation of the code design, let's just say it wraps user interfaces with code very easily and with little work, and is infinitely expandable. Just because the design is non-notable doesn't mean it doesn't exist. β 01:01, 24 April 2008 (UTC)

(Outdent) OK, so you have a new, as yet not widely known, programming paradigm, which has easy plug-ins, adaptive interfaces and is infinitely expendable. And you still can't make your scripts log into Misplaced Pages using separate settings? Zocky | picture popups 01:10, 24 April 2008 (UTC)

Let's stop beating a dead horse. Betacommand has made it clear that he does not want to run separate bots. So the question is simple: do we want Betacommand to handle this bot function or not? If yes, approve it. If no, let's drop it and move on. As far as the issue of blocking goes, Betacommand has already said that if his bot acts up and he does not respond in a matter of minutes, go ahead and block it. That should take care of the blocking issue. Betacommand seems to have agreed to use the edit summary to identify separate bot functions, so that should take care of that point. So should the task be approved or not? I would say yes. What about the rest of you? Dbiel 00:41, 24 April 2008 (UTC)

Agreed, spinning wheels go nowhere. My personal feeling on this is that (as stated above) clearly identifying the task in edit summaries would be about the best compromise we are going to come to. That seems OK to me, as the main complaint is that it is hard to identify what tasks it is running. The other concern about not splitting it off is that blocking it affects other tasks. If that's how he wants to do it, so be it. If one task breaks, unless it's going completely haywire, we generally get ahold of Beta (he is usually on when the bot's running, especially as NFCC has now stopped (I think)), and failing that, block the bot. The community should not rely on any one bot to do any mission-critical tasks anyhow, and temporarily blocking a malfunctioning bot is not a big deal.
I am going to conditionally trial this bot. Here are the conditions:
  1. The bot will use the edit summary method described here. It's strongly encouraged that this be done for all tasks.
  2. The page linked in the edit summary will clearly indicate that this bot is under trial approval, and will link here.
Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. SQL 01:42, 24 April 2008 (UTC)
Trial started. The first round of taggings is done; logs for that can be found at User:Betacommand/Sandbox. There are a few minor bugs that I'm working on: there are some uses that for some reason it does not replace, and I'm looking into that. When it doesn't replace a use, it also skips tagging the image. β 18:03, 25 April 2008 (UTC)

{{BAGAssistanceNeeded}}

What? You said you were busy doing something. — Werdna talk 10:04, 6 May 2008 (UTC)

It does not concern this issue. {{BAGAssistanceNeeded}} I would like to get this approved so I can get rolling. β 12:11, 6 May 2008 (UTC)

Working in tandem with MetsBot?

It would be a big big plus for this bot to be run in conjunction with MetsBot, who (among other tasks) verifies that Commons dupes pass the requirements for speedy deletion I8. MetsBot is imperfect in the sense that it provides a lot of false positives but to the best of my knowledge (I've seen many of its reports), it has never created a single false negative and, in cases of false positives, provides very clear and helpful info about the potential problem. I'd like to see the BCbot for this task work hand-in-hand with MetsBot for otherwise the deletion process of dupes will either be very very slow or very very careless. Pascal.Tesson (talk) 18:05, 21 April 2008 (UTC)

MetsBot normally checks all images anyway, so I would assume that it would check these. β 18:09, 21 April 2008 (UTC)
Actually, MetsBot only runs every now and then (basically whenever Mets501 remembers that he should run the bot) and for some mysterious reason does not process all these images. Like I said above, I see no rush for BCbot to start trial runs for this task and I think that you should first take a few days to talk it out with Mets501 and see how the two tasks can be run simultaneously. It's pretty trivial to arrange and I don't think it makes sense for admins clearing these backlogs to run back and forth between two bot operators, trying to make sure they are working in coherent fashion. For what it's worth, BC stated above:
if there are any objections except for the account name of the bot that operates the task, I will begin trial runs in 48 hours.
I consider my objection above to be sufficient to deny trial runs for now. Pascal.Tesson (talk) 18:40, 21 April 2008 (UTC)

I have explained to Betacommand the two reasons that I believe that separated accounts are good, on the whole. In brief, they are:

  1. Allows the clear delineation of tasks, which also facilitates better detection of unapproved tasks.
  2. Allows the blocking of a single malfunctioning or unapproved task without affecting the others.

Number one is partially addressed by the edit summary method. Number two is not. — Werdna talk 06:08, 22 April 2008 (UTC)

I agree with you there, Werdna. Especially on number two. However, to quote someone (I can't recall whom), no bot task is so important that it should / could not have a doppelganger or backup somewhere. If the bot account is being used in a manner that requires it be blocked, so be it. Bots are a nice thing to have, but by no means a requirement. SQL 06:11, 22 April 2008 (UTC)
Like I have said before, blocking is not an issue: leave a note on my talk page, and if I don't respond in about five minutes, block. Most of BCBot's other tasks are not really active, so separating tasks and blocking are not really an issue. β

Betacommand, we can't have special rules for you because you think you're above them. I propose that approval is conditional on the bot task being split into multiple accounts. If there is a need for the task, I'll code up the task myself. — Werdna talk 01:13, 24 April 2008 (UTC)

I'm not asking for special rules; I'm asking for additional task approval, the same thing I've done countless times before. There are no issues with using the same mainly dormant account. Like I said, the idea that BCBot should be forced into separate accounts is bullshit. Why not just use some common sense and approve the task? The task has consensus; it's the wikilawyers who are stopping this task. Werdna, since you obviously cannot objectively look at this request, I ask that you recuse yourself from approving the task. Now let's actually be productive. β 01:24, 24 April 2008 (UTC)

That sounds reasonable. I will not approve or deny this bot, because I am working on a substitute that will work on the Werdnabot II account. — Werdna talk 01:27, 24 April 2008 (UTC)

After reviewing the trial output at User:Betacommand/Sandbox, I have no objection to this task being approved. As far as the multiple accounts issue, I haven't followed that thread of comments and will leave it to the BAG to sort out. MBisanz 18:08, 25 April 2008 (UTC)

Hi guys. I've set up MetsBot to run every day automatically now, assuming that my computer is on. If it's not, then it'll run when I turn on the computer. Essentially, it will run every day except when I'm on vacation (which shouldn't be happening for a while). If anything changes, we can make other plans, but for now, you can assume that it will run every day. —METS501 (talk) 01:13, 28 April 2008 (UTC)

Unanswered questions

I asked a couple of questions above and I can't see any response to them. Perhaps they got lost in the Great Bot Account Debate. Sorry to take so long to repost them:

  • How does the bot handle the case where an image is linked from a protected page?
  • Do you have an estimate on how many images are affected, and how many articles?

Cheers, Bovlb (talk) 20:20, 7 May 2008 (UTC)

Part one: it changes all other usages and then doesn't tag the image, as the old one is still in use (the same thing happens for the spam blacklist, edit conflicts, etc.). It then logs everything into a log file for future review. If it comes to a protected page, I'll either ask an admin to do the change or file an {{editprotected}} for the change. As for part two, I don't know about the numbers; I am taking this task fairly slowly and being nice to all servers: en, commons and toolserver. I am guessing over 10,000 images are affected by this issue. β 03:10, 8 May 2008 (UTC)
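The handling described in part one can be sketched as below. This is a simplified illustration under stated assumptions, not the bot's code: `try_replace`, `tag_image` and `log` are hypothetical callbacks standing in for whatever the bot actually does to edit a page, tag the local image and write its log file.

```python
def replace_usages(usages, try_replace):
    """Try the replacement on every page; return (done, failed)."""
    done, failed = [], []
    for page in usages:
        ok, reason = try_replace(page)  # hypothetical: returns (bool, str)
        if ok:
            done.append(page)
        else:
            failed.append((page, reason))
    return done, failed


def process_image(usages, try_replace, tag_image, log):
    """Tag the local image only if every usage was replaced."""
    done, failed = replace_usages(usages, try_replace)
    for page, reason in failed:
        # Protected pages, spam blacklist hits and edit conflicts end up here.
        log(f"skipped {page}: {reason}")
    if failed:
        return False  # old image still in use, so leave it untagged
    tag_image()
    return True
```

Failed pages do not stop the loop, which matches the description that all other usages are still changed and only the tagging is withheld.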
Thanks. Bovlb (talk) 03:27, 8 May 2008 (UTC)
The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.