Revision as of 15:01, 18 April 2008
BetacommandBot Task 9
Function: Replacing all images on en.wiki with Commons versions that have the same SHA1 hashes (gathered from toolserver queries), double-checked with MD5 hash checks when running. Once all en.wiki usages are converted to the Commons name, it tags the local image as a Commons dupe.
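The two-stage hash check in the function summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not BetacommandBot's actual code: the function names `file_digests` and `is_exact_duplicate` are hypothetical, and both files' raw bytes are assumed to be already in memory.

```python
import hashlib


def file_digests(data: bytes) -> tuple[str, str]:
    """Return the (SHA-1, MD5) hex digests of a file's raw bytes."""
    return hashlib.sha1(data).hexdigest(), hashlib.md5(data).hexdigest()


def is_exact_duplicate(local: bytes, commons: bytes) -> bool:
    """Treat two files as duplicates only if BOTH digests match."""
    return file_digests(local) == file_digests(commons)
```

Because the digests are computed over the raw bytes, a PNG and an SVG of the same picture, or a re-encoded copy of the same JPEG, would not be reported as duplicates by a check like this.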
Discussion
Will this end up tagging a PNG image if the commons version is an SVG? And will the SHA1 and MD5 hashes ensure pixel by pixel similarity under all circumstances? Also if the description is different, what will happen? MBisanz 03:25, 17 April 2008 (UTC)
- Images of different file types don't have the same hashes; only exact copies have the same hashes. β 03:27, 17 April 2008 (UTC)
- I have an article for you to test this on, if you want. In general, how do you know a local copy isn't needed for some reason, for instance as an anti-vandalism measure (eg. DYK)? In some cases the local copy may not have the same name. Gimmetrow 03:37, 17 April 2008 (UTC)
- If there are templates used for DYK I can add them to the bot's ignore list, but in general {{NoCommons}} works. As for different names, the bot replaces them with the new name. β 03:39, 17 April 2008 (UTC)
- Will it then correct any articles that may have included the renamed picture? Q 03:41, 17 April 2008 (UTC)
- Define renamed image. What Gimmetrow was saying was that images on Commons may not have the same name as the version on en. What I said was that I would update en with the Commons name if they were different. β 03:43, 17 April 2008 (UTC)
- Well, say a picture is uploaded to Commons as ILoveBeans.jpg, and then some user uploads it here as I_Love_Beans.jpg. From what I can tell, your bot will say "Hey, these two pictures are the same, let's delete this local one and use the Commons pic". So what happens to articles that were including I_Love_Beans.jpg? Q 03:46, 17 April 2008 (UTC)
- I replace I_Love_Beans.jpg with the commons name (ILoveBeans.jpg). (I thought I said that :/) β 03:47, 17 April 2008 (UTC)
- Oh, ok, ignore me, I was just equating replace with something else :) Q 03:53, 17 April 2008 (UTC)
- If an image from commons is used for DYK, a local copy is uploaded to avoid vandalism. It's supposed to have {{c-uploaded}} and usually has the same name, but not always. Such images shouldn't be replaced by the commons name. Gimmetrow 03:50, 17 April 2008 (UTC)
- Images with {{c-uploaded}} are now skipped. β 03:52, 17 April 2008 (UTC)
- Presumably such images would also not be tagged as dupes because they are linked from a protected page and the bot therefore couldn't replace all links with the commons link. Would it get halfway through replacing the links before it discovered the protection? Bovlb (talk) 07:30, 17 April 2008 (UTC)
Looks pretty solid to me. You might want to look around a bit for other templates like {{NoCommons}}. How common is its use in the described situation? SQL 03:53, 17 April 2008 (UTC)
Another question, will it tag the old enwiki images for CSD? SQL 03:54, 17 April 2008 (UTC)
- Reading the function summary twice should be required. SQL 03:55, 17 April 2008 (UTC)
It seems to me that the function request is missing the second half of the process. As I read the request, it relates to replacing images that are found in the Image name space and renaming them, if necessary, to match the name used in Commons. No mention seems to be made of the articles in the Article name space that make use of the images. Can the function request be expanded to include replacing (updating) all image links where the image name has been changed to match that used in Commons? Dbiel 04:11, 17 April 2008 (UTC)
- it will replace all usages of en.wiki's copy with the commons name regardless of the namespace. (a complete replacement) β 04:14, 17 April 2008 (UTC)
- Thanks for the reply; I thought it would, but it just was not clearly stated in the request or in the discussion that followed. So thanks for clearing that up. It sounds like a good use for the bot. Dbiel 04:59, 17 April 2008 (UTC)
- Ok, my questions are answered. As long as commons will be able to get deleted image page stuff on request, then I have no issue. MBisanz 04:31, 17 April 2008 (UTC)
Could the bot also do duplicates on only en.wiki? Example: Image:Tunday_Akintan.jpg and Image:Tunday.jpg. My count has us at about 4,000 duplicates on en.wiki alone. Perhaps simply give favor to the one with more image links? --MZMcBride (talk) 04:39, 17 April 2008 (UTC)
- that would be a separate request that is down the road. β 04:43, 17 April 2008 (UTC)
I fundamentally disagree with adding yet another task which will produce a large number of edits to this already function-heavy bot. I would request that this new task be run under a new bot account. AKAF (talk) 06:44, 17 April 2008 (UTC)
Sounds like a good task, but I echo AKAF in that there shouldn't be a big deal with moving it to another account. — Werdna talk 06:58, 17 April 2008 (UTC)
This sounds like a good and helpful task. Please create a new bot account for it instead of adding yet another task to BetacommandBot. rspeer / ɹəədsɹ 07:22, 17 April 2008 (UTC)
Do you have an estimate on how many images are affected, and how many articles? Bovlb (talk) 07:30, 17 April 2008 (UTC)
Suppose an image on en.wiki is being validly used under fair use. Someone (wrongly) copies it to Commons. The bot then deletes the original image. Then, on Commons, the copied image is deleted because fair use images are not permitted. The image is then lost from its original fair use article. What is the best approach? Should the bot not move images for which fair use is being claimed? Thincat (talk) 09:32, 17 April 2008 (UTC)
- Reading all this again (and I am not familiar with these sort of operations) will the bot merely edit articles to point to identical commons images, or will it additionally flag en.wiki images for deletion, or delete identical en.wiki images? Thincat (talk) 10:58, 17 April 2008 (UTC)
As stated in the second sentence of the summary, it will tag them (i.e. flag them for deletion). — Werdna talk 10:59, 17 April 2008 (UTC)
- Thank you. So, regarding my "fair use" issue, I'll try and find how this is dealt with for images tagged "as a commons dupe". Any pointers? Of course, the bot wouldn't create a new problem but it might amplify any existing problem. Thincat (talk) 11:11, 17 April 2008 (UTC)
- Referring to Category:Images on Wikimedia Commons and Category:Images with the same name on Wikimedia Commons as of 17 April 2008 (are these the best references?), there seem to be no warnings about not deleting images with an apparently valid fair use claim (and possibly deleting the commons duplicate). The instructions should be improved before the bot is run or the bot should not handle images with a fair use claim. Thincat (talk) 11:30, 17 April 2008 (UTC)
- Images used under fair use locally and then copied to Commons don't last long if we're made aware of them (we usually are), and deletion occurs very swiftly. But it would be nice to have a bot which can make us Commons administrators aware of duplicate images used under fair use here and existing on Commons, either without a licence tag or with a different licence. In this case, it would be nice to have BCBot make a list of potential copyright problems so that en.wp and Commons admins can decide whether to delete the local image or the Commons image. So if BCBot finds an image here on en.wp that's not under a free licence, and finds a potential duplicate on Commons, it would be nice if, instead of tagging the local image for deletion and changing the links to the image, it added a template to the image listing the name of the duplicate on Commons and then added the image to a category so that we may examine the images. It would be advisable to tag for deletion here, and change the image links for, only identical images with the exact same licence on both en.wp and Commons; everything else should be looked at to see which image was uploaded first, whether there's deleted history, and such. Nick (talk) 16:28, 17 April 2008 (UTC)
- I'll add in a skip for pages with {{non-free or {{non free, which should filter out all fair use media. But as for creating a new account, I don't really see a need; it just makes my life harder. β 12:44, 17 April 2008 (UTC)
- Thank you, I think this is the best way to deal with fair use claims. There is clearly a problem with duplicated images where one has a fair use claim on en-wiki. Perhaps a bot listing of such images might help towards manual investigation? Thincat (talk) 12:54, 17 April 2008 (UTC)
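The skip rules mentioned so far ({{NoCommons}}, {{c-uploaded}}, and anything starting with {{non-free or {{non free) could be checked along these lines. This is only a sketch: `SKIP_PREFIXES`, `TEMPLATE_RE` and `should_skip` are hypothetical names, the prefix list is assembled from this discussion and is probably incomplete, and the regular expression is a rough approximation of template syntax rather than a real wikitext parser.

```python
import re

# Template-name prefixes that cause an image page to be skipped.
# Assumption: this list only reflects the templates named in the discussion.
SKIP_PREFIXES = ("nocommons", "c-uploaded", "non-free", "non free")

# Captures the template name that follows "{{", up to the first "|" or "}".
TEMPLATE_RE = re.compile(r"\{\{\s*([^|}]+)")


def should_skip(wikitext: str) -> bool:
    """Return True if any template on the page starts with a skip prefix."""
    for match in TEMPLATE_RE.finditer(wikitext):
        # Normalise case and underscores so "Non_free" matches "non free".
        name = match.group(1).strip().lower().replace("_", " ")
        if name.startswith(SKIP_PREFIXES):
            return True
    return False
```

Matching on the template name rather than on the whole page text avoids skipping a free image merely because its description happens to contain the words "non-free".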
There is reasonable consensus that another account should be created, for a multitude of reasons. My approval is conditional on a new account for that part of the bot. — Werdna talk 12:48, 17 April 2008 (UTC)
- And since you've already been running the bot from your user account (I count about 700(?) edits in your user contributions, but since you're also running the defaultsort bot and the crosswiki link removal bot, it's a bit hard to be sure), I don't see the difference between migrating the code to betacommandbot or to betacommandbot_t9. AKAF (talk) 13:09, 17 April 2008 (UTC)
- Please stop making unfounded attacks against me. You have no clue what you're talking about. Instead of making assumptions, why not ask me? I do use some Python tools, but who says they are bots? There is no proof, so shut up. β 13:17, 17 April 2008 (UTC)
- Let me put it like this: You've been doing exactly this task in an automated fashion with hundreds of edits with exactly the same edit summary from your user account. You now want to add this task to betacommandbot. I think it's fair to ask why WP:DUCK doesn't apply? AKAF (talk) 13:21, 17 April 2008 (UTC)
- You're throwing around assumptions that I am running unapproved bots on my main account. I take offense to that. What I do do is fairly simple: I use a semi-auto script like AWB to test the idea out. I did the same thing when I started non-free image tagging. So please stop making personal attacks against me without proof. β 13:27, 17 April 2008 (UTC)
- Okay then, I'm sorry. I actually mainly wanted to imply that you already had a tested code, but perhaps a test of logic would be more accurate. If that's the case then you can simply point people with questions to the list of edits which you've already made. I still think it would be better if you put the bot on a new account though. I appreciate that it's more work for you, but I think it's important, for reasons which I clarified a while back on your user page. AKAF (talk) 13:50, 17 April 2008 (UTC)
Is the trial/test period of this bot going to be foregone due to the tests done as a user? MickMacNee (talk) 14:43, 17 April 2008 (UTC)
- I suppose, that would depend on what was run. Was it the same tool, except in manual mode (I've got a plugin for mine, that I can drop in, and have my tools force me to review diffs, etc, for instance -- very useful for debugging, or, evaluating a task by the way.) I'd really still lean towards seeing a few days test, (preferably linked here in the edit summary) so that any immediate issues may be brought to the surface, while we're all still paying close attention. Additionally, I agree with Werdna, regarding splitting this task to a new account
- Yes, I know it's a pain, and, my own framework works the same way -- all the core code has to be moved to a new installation. One thing that worked for me, that may or may not work for pywiki, was creating a new directory, where my new bot would go (last time it was User:SQLBot-Hello), and, symlinking all the relevant code files over, making sure to skip the files that stored the configs, cookies, etc. Anyhow, I hope this helps. SQL 20:41, 17 April 2008 (UTC)
- I really don't want to copy my whole install (100+MB) and other shared code that I am constantly tweaking and updating. There is no need for another account: 98% of the NCC tagging is already done (I think I found 20 images on my last run), BCBot is not really doing much, and when BCBot is operating I am normally nearby and am pinged (loudly) when someone edits either of the talkpages. There really isn't a reason to split accounts other than to make my work more difficult. β 21:13, 17 April 2008 (UTC)
- I'm still not sure whether it'd work with pywikibot or not, but I wrote a shell script to help symlink all files in one dir to the CWD, which might help you. (You'd just have to delete the cookiejars, configs, etc. afterwards.) SQL 22:29, 17 April 2008 (UTC)
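For illustration, the symlink approach could look something like the sketch below. It is not SQL's script: it is written in Python rather than shell, it skips the per-account files up front instead of deleting them afterwards, and the file names in `PER_ACCOUNT_FILES` are hypothetical placeholders for whatever configs and cookie jars a given framework actually uses.

```python
import os

# Files that must NOT be shared between bot accounts.
# Assumption: these names are placeholders, not real framework file names.
PER_ACCOUNT_FILES = {"user-config.py", "login-data", "cookies.txt"}


def link_shared_files(source_dir: str, target_dir: str) -> list[str]:
    """Symlink every regular file in source_dir into target_dir.

    Per-account files and names that already exist in target_dir are
    skipped. Returns the names that were linked, in sorted order.
    """
    os.makedirs(target_dir, exist_ok=True)
    source_dir = os.path.abspath(source_dir)
    linked = []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        dst = os.path.join(target_dir, name)
        if name in PER_ACCOUNT_FILES or not os.path.isfile(src):
            continue
        if os.path.lexists(dst):
            continue  # never overwrite the new account's own files
        os.symlink(src, dst)
        linked.append(name)
    return linked
```

Since the links point back at the original files, shared code that is tweaked in the main install is picked up by the second account without copying the whole install.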
I don't see any technical issues here (beyond getting the right list of templates that cause an image to be skipped). It appears Betacommand has already manually reviewed enough edits to be confident the script is working correctly. If everyone thinks the exclusion list is sufficient (right now it looks like nocommons and non-free images are excluded), are there other issues remaining besides a new account?
Betacommand tells me he is planning to start with just 500 per day, so if there are issues it won't take long to fix them. — Carl (CBM · talk) 15:43, 17 April 2008 (UTC)
I would think that if Betacommand simply added the task number as part of the edit summary that the need for a separate bot account would be unnecessary as the edits would be clearly identified as being done as a separate bot task. Example: Bot task 9 ........ Dbiel 22:18, 17 April 2008 (UTC)
- Well there's an excellent idea, how does this work for the folks who really really wanted him to run it under a new account? SQL 22:32, 17 April 2008 (UTC)
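To make the edit-summary idea concrete, here is a rough sketch of how a task number in the summary lets edits from one account be separated by task. The "Bot task N:" format follows Dbiel's example; the helper names `make_summary` and `group_by_task` are hypothetical, and nothing here reflects how BetacommandBot actually writes its summaries.

```python
import re
from collections import defaultdict

# Matches summaries that start with "Bot task <number>:".
TASK_RE = re.compile(r"^Bot task (\d+):")


def make_summary(task: int, text: str) -> str:
    """Prefix an edit summary with the task number it was made under."""
    return f"Bot task {task}: {text}"


def group_by_task(summaries):
    """Group edit summaries by task number; untagged edits go under None."""
    groups = defaultdict(list)
    for summary in summaries:
        match = TASK_RE.match(summary)
        key = int(match.group(1)) if match else None
        groups[key].append(summary)
    return dict(groups)
```

With a consistent prefix, anyone reviewing the contribution list can filter to a single task, which addresses the identification concern, though not the separate question of blocking one task without the others.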
Isn't the point of having multiple accounts allowing one account to be blocked as malfunctioning without the others? — Werdna talk 01:01, 18 April 2008 (UTC)
- Well, the multiple tasks thing could work well, with edit summaries, and something like User:SQLBot/tagem.run (Read say every 50 edits, could completely eliminate the need to block BCBot anymore....) But, I'm not trying to tell anyone how to run their bot... There does seem to be a fair argument for breaking it off, it seems, however. SQL 01:59, 18 April 2008 (UTC)
- As I have repeatedly and repeatedly and repeatedly said, I am sitting by the computer when the bot is running. Leave me a note on my talk page, and the bot shuts down fairly quickly if there is an issue. β 02:46, 18 April 2008 (UTC)
- PS I used to have BCBot shut down when it got new messages, but people don't know how to read and kept posting notices there, so I had to disable that function. β 02:47, 18 April 2008 (UTC)
- Past history has shown Betacommand's preceding statement to be true. The bot has a history of reliability. The main issues with his bot have related to tasks that have not been specifically approved for his bot, and even then, shutting down the individual function while keeping the rest of the functions running has not been a major problem. Betacommand is rather headstrong and independent and can be a real pain, but when it comes to bot operators, you will be hard pressed to find anyone who maintains and monitors his bot any better. His track record speaks for itself. And in his case, it would seem to be best to allow a single account until running a single account actually becomes a problem. At that point we would need to insist on separate accounts, but until then, why create more work for him and increase the chance of errors caused by trying to keep multiple accounts in sync with each other? What is so different about this task that it needs a separate account, while we continue to allow all the other tasks to run on a single account? If it is working, don't "fix" it. Dbiel 03:23, 18 April 2008 (UTC)
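The per-task run page that SQL mentions (checked "say every 50 edits") amounts to a kill switch that is polled periodically. A minimal sketch follows, assuming the check is supplied as a callable: `run_task`, `is_enabled` and `check_every` are hypothetical names, and how the run page is actually fetched and parsed is left out entirely.

```python
from typing import Callable, Iterable


def run_task(edits: Iterable[Callable[[], None]],
             is_enabled: Callable[[], bool],
             check_every: int = 50) -> int:
    """Perform edits in order, polling is_enabled() before the first edit
    and again after every check_every edits.

    Stops as soon as a poll returns False. Returns the number of edits made.
    """
    done = 0
    for edit in edits:
        if done % check_every == 0 and not is_enabled():
            break
        edit()
        done += 1
    return done
```

In practice `is_enabled` would read the task's own run page, so disabling one task would stop it within at most `check_every` edits without touching any other task on the same account.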
There is no chance of errors as a result of running separate accounts (nothing needs to be 'synced'). There have previously been problems in blocking Betacommandbot for an unauthorised task due to fears of damaging the so-called "mission-critical" tasks on the account. And it is not difficult to reprogram anything to use multiple accounts. — Werdna talk 06:04, 18 April 2008 (UTC)
- This is, indeed, the reason. Even when BetacommandBot has run unapproved tasks, admins have been reluctant to block it, because blocking it involves blocking many unrelated tasks, many of which are useful. Running a single account has been a problem many times in the past. It would be preferable if all the tasks were separated, but this is at least a place to start. rspeer / ɹəədsɹ 08:06, 18 April 2008 (UTC)
- Agreed. Along with confusion on all sides as to what tasks are approved for an account, when the number of tasks becomes large. Currently betacommandbot has 13 BRFA linked from the bot, which an interested user would have to locate and read. I wouldn't ask for a new account for every new task, but a new account for each major task, and for each group of related minor tasks allows all parties to keep track. AKAF (talk) 10:54, 18 April 2008 (UTC)
- All of these issues and remedies were proposed in the last arbitration case, and ignored/declined. MickMacNee (talk) 12:39, 18 April 2008 (UTC)
break
I'm in the school of thought that thinks that each edit that falls under a specific task should have a link to that task. It doesn't have to be any longer than ], which gives task 1, with that page describing the task, and new pages being created for each task number. It is not so much separate accounts, but having enough information in the edit summaries to be able to separate out the different edits made under different tasks. Carcharoth (talk) 12:58, 18 April 2008 (UTC)
- I would accept this as a second-best option. Part of the reason that I would like to see betacommand in particular split off major new functions, is that I see many of his past problems as stemming from a lack of organisation, in particular regarding clear communication to others about his bot's functions. Therefore I don't necessarily accept betacommand's argument that creating a new bot account is more work for him, since I suspect that the extra work he invests now will be repaid with a reduced workload dealing with complaints and comments later. I think we would all like to find ways to reduce betacommand's blood pressure, and it is my hope that the confusion and vitriol on his talk page can be reduced by a one-bot one-function policy.AKAF (talk) 13:17, 18 April 2008 (UTC)
- AKAF, there is a very easy way to stop the vitriol: admins just need to enforce WP:CIVIL and WP:NPA, something they don't like doing when it's directed at me. As for the one bot/one function, that is crap. Handling standard questions is not an issue. My philosophy in regards to bots is simple: one operator, one flag. I have only really had issues with the NFCC#10c tagging because we had a very large percentage of users who did not know the policy/did not want to follow it. β 15:01, 18 April 2008 (UTC)