User talk:Jimbo Wales: Difference between revisions


Revision as of 08:58, 9 August 2016: Blakegripling ph (talk | contribs) m Reverted edits by 86.151.51.85 (talk) to last version by Blakegripling ph
Revision as of 09:01, 9 August 2016: Blakegripling ph (talk | contribs) m Reverted edits by Blakegripling ph (talk) to last version by 86.151.51.85

			== Misplaced Pages, we have a problem ==

			NeilN is the person who claims that a picture of a man in a pulpit in a mosque addressing a congregation of six is actually Muhammad seated on a camel on a mountaintop explaining to thousands of pilgrims how he is going to reform the calendar. 400,000 people signed a petition saying that the picture was not relevant to the article subject and that under the principle of least astonishment it should be removed. All that has happened is that the article has been protected permanently so that nobody can edit it.

			This is what Jimbo has said:

			* In general I would prefer that we be more lenient than normal on my user talk page, particularly when someone is upset about something they view as an incorrect application of the rules. If he posts it again, I'd rather we discuss it than revert it - to a point, of course. Possibly something useful will come of it.--Jimbo Wales (talk) 20:15, 27 November 2011 (UTC)

			That relates to an editor who was blocked. Jimbo has said many times that it is permissible for blocked or banned users to post here provided they do not use the page as an infinite soapbox to harass other editors. He opposes semi-protection for that reason, and also because the page has the function of being a place where whistleblowers can report abuse.

			WMF operates on the same basis. Functionaries (e.g. Philippe Beaudette) have given instructions that posts to their en:wiki talk pages are not to be removed, because they want to see them, even if the posters are blocked or banned. 188.220.247.11 (talk) 15:47, 5 August 2016 (UTC) — Preceding unsigned comment added by 82.18.244.238 (talk)

			:Zzuuzz has provided a link to an LTA report. Clicking on the link, it would appear to be a spoof report. Severity is described as "high"; the commencement date is given as 5 March 2010. So if the traffic is so bad that the instruction is "semi-protect the affected pages", why did it take 2,081 days for the report to be created?

			:The rant begins with the claim

			{{xt\|She regularly edit wars with Jc3s5h}}

			According to this analysis ] all she is doing is removing vandalism by Jc3s5h. ] (]) 05:38, 9 August 2016 (UTC)

			I’ve reverted ]’s deletion of this section made on the arrogant assumption that only he gets to decide what appears on Jimbo’s talk page per ] as he features in the report:

			176.In his propaganda Favonian says "This user knows that EVERYBODY LIES". Really? He also says he is "not a communist" then adds that he is a liar '''and''' a communist. He's quietly blocked me although the SPI remains open.
			] (]) 16:10, 23 July 2011 (UTC)

			194.Anyone can criticise someone else from the safety of the edit summary. ... isn't man enough to come down into the thread and make his claims there because they're ... there are no diffs. - 86.164.126.212 20:38, 18 August 2011. <small class="autosigned">— Preceding ] comment added by ] (]) </small><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->

	== Commons may be broken ==

Revision as of 09:01, 9 August 2016

Welcome to my talk page. Please sign and date your entries by inserting ~~~~ at the end.
Start a new talk topic.

Jimbo welcomes your comments and updates.
He holds the founder's seat on the Wikimedia Foundation's Board of Trustees.
The current trustees occupying "community-selected" seats until Wikimania 2017 are Pundit and Raystorm.
The Wikimedia Foundation's Director of Support and Safety is Maggie Dennis.

Sometimes this page is semi-protected and you will not be able to leave a message here unless you are a registered editor. In that case,
you can leave a message here

This is Jimbo Wales's talk page, where you can send them messages and comments.

Put new text under old text. Click here to start a new topic.
New to Misplaced Pages? Welcome! Learn to edit; get help.

Archives: Index, A to G, 1 to 252
Auto-archiving period: 1 day

This user talk page might be watched by friendly talk page stalkers, which means that someone other than me might reply to your query. Their input is welcome and their help with messages that I cannot reply to quickly is appreciated.


Book creator

This is getting embarrassing:

You can create and edit a Misplaced Pages book design using the Book Creator and upload it to an external rendering service:

MediaWiki2LaTeX provides a softcopy conversion service to pdf and other formats. It remains under active support and may be used online or installed locally.
Pedia Press offer final tidying and ordering of print-on-demand bound copies in (approximately) A5 format.

For help with downloading a single Misplaced Pages page as a PDF, see Help:Download as PDF.

Status last updated 23 August 2020.

— Cheers, Steelpillow (Talk) 05:57, 1 August 2016 (UTC)

I wonder if we shouldn't just remove those pages. Or are you arguing that the Wikimedia Foundation should invest resources in fixing the problem? I'm not opposed to that in principle, but I don't believe the tool was ever used much. I might be wrong about that, though, so if I am, then let me know!--Jimbo Wales (talk) 13:49, 1 August 2016 (UTC)

I'd really, really appreciate if Misplaced Pages's standard PDF export function (the clickable link in the left margin on each page – is that the same software as Book Creator?) would render tables. Only today I added this to the WP:CONTENTFORK guidance, partially based on the fact that tables are not exported. I'd rather not need a bypass for that guidance for such reason (a kind of guidance that is prone to inadvertent shadyness). Also it's quite frustrating, e.g. I've been putting some energy in List of repertoire pieces by Ferruccio Busoni lately: click on the PDF export function and *poof* almost nothing remains apart from four pages of references referencing something that isn't there. Yeah, imho, would be money well spent to get that sorted. --Francis Schonken (talk) 16:59, 1 August 2016 (UTC)

I hope someone has some usage statistics. I know I field a number of questions at OTRS about the tool (mostly bug reports, but they do substantiate some level of usage), so I know there is interest, but I don't have a clue about whether the usage is high enough to justify expense.--S Philbrick (Talk) 17:10, 1 August 2016 (UTC)

Usage statistics OK, but that cuts both ways (talking about the PDF export function, still not sure whether that's the same as Book Creator): I don't use it any more while it doesn't do what it should do, i.e. not maim an article when converting it to PDF. How can one extract insightful usage statistics from something that is avoided for its cumbersome MO? Use PDFcreator or some similar tool on the weblayout is what everyone says when I bring up the issue of the discarded tables, so I assume that's what most people do when they want to create PDFs – but the result is considerably different from what one gets with the built-in PDF export function (which has a better readability afaics). Current usage as such doesn't learn much... how many Misplaced Pages pages are sent to local software PDF generators? Wouldn't people prefer prints in "PDF export function" layout over "weblayout" generated by local software? --Francis Schonken (talk) 17:27, 1 August 2016 (UTC)

Meaningful usage statistics would have to predate the 2014 "update", I wouldn't know where or how to look. All I can say is that there is still a fair trickle of complaints at Help:Books/Feedback and it pretty much borks most new submissions to pediapress. For any one editor making their presence felt there, a standard rule of thumb is that there are 100 to 1,000 silent editors who just walked, and ten times as many visitors left with the impression that the whole business sucks. If nobody's gonna fix it, then I think it needs to be killed. OTOH if the copyrighting battle against rip-off artists is worth the fight, then book creation needs fixing up properly so pediapress and the rest of us can leverage it again. Either way, doing nothing is bad. — Cheers, Steelpillow (Talk) 19:00, 1 August 2016 (UTC)

How much does http://pediapress.com donate for the premium service of ? This press release from 2007 explains what happened. The reason is when people started selling PDFs on Amazon. EllenCT (talk) 18:41, 1 August 2016 (UTC)

It seems to me that this is a nice conclusion you've jumped to. From what I see, and I could be wrong, there are the following problems with your theory: 1) PediaPress seems to be focused on creation of physical paper books, not just files of Misplaced Pages content as anyone (should) be able to generate. 2) PediaPress seems to depend upon the same book creator that creation of files do. I don't know, but I wonder if that service now suffers from the same rendering problems that Book Creator does. And I wonder if you know by experience, or are you just speculating? But I actually am writing to say that I'm one of the silent masses who would really like the book creator to be fixed; it would be nice to have the ability to port collections of astronomy articles to a document file and be able to read them offline at our observatory. LaughingVulcan 17:27, 2 August 2016 (UTC)

I'm completely sure I remember when Misplaced Pages articles and collections started showing up on Amazon. I recommend simply asking PediaPress if they can make the nice PDFs you want, but don't be surprised if they charge you a token amount and add certain strings. @CAnanian (WMF): do you know the answers? You seem to be the only staff assigned to . EllenCT (talk) 19:14, 2 August 2016 (UTC)

OK, but my question to you was if you actually have knowledge and/or proof that the reason for Book Creator not working properly is that people started selling PDFs on Amazon (and implying PediaPress in the process)? It appears to me that you do not, and are merely speculating / fishing in the dark. Especially since the PediaPress thing apparently began in 2007 and apparently the breaking of Book Creator occurred after that. As mentioned above by Steelpillow and as I speculated, the breaking of Book Creator ALSO breaks PediaPress as well, as Book Creator is HOW one submits files to PediaPress in addition to creating PDF files for download. But you didn't know that, did you? Anyway, it's clear to me that you do not seem to know what you're talking about, as fixing it so PediaPress would work would also fix it so I can just download a PDF. But you don't seem to get that. Anyway, as I said, mark me down as one who sees Book Creator as important and would like it to be fixed so that tables, etc. render properly. Whether for personal use, or to submit to PediaPress. LaughingVulcan 19:31, 2 August 2016 (UTC)

Hold on there chaps, I get a sense of talking at cross-purposes. The reason for *what* is because books started appearing on Amazon? We are effectively trying to create Print on demand books and Amazon is a popular sales outlet for the printed volumes, whether published by PediaPress or anybody else. — Cheers, Steelpillow (Talk) 14:05, 4 August 2016 (UTC)

All I remember from the time is that the works were poor quality (tables would render, but type would break across page breaks) and there was a substantial outcry that they would tend to bring the project into disrepute. The problem became substantially worse in the years following 2007. See e.g. OmniScriptum#Misplaced Pages content duplication, and . EllenCT (talk) 20:16, 4 August 2016 (UTC)

There are two obvious questions here. First, why did WMF make a bad update that broke features, then refuse to fix it? And how did we go from "This technology is of key strategic importance to the cause of free education world-wide," said Sue Gardner, Executive Director of the Wikimedia Foundation. (2007) to saying that it was not worth having a management process and intentionally breaking the feature seven years later? I mean, if strategic means "totally unwanted in seven years" then there is no strategy at all and donors shouldn't be paying for overpaid Brahmins to work it out. Wnt (talk) 00:07, 3 August 2016 (UTC)

As a relative outsider I looked into this a little. It seems that the old, relatively functional code was Misplaced Pages-specific, in an unfashionable programming language and (ironically) not easily maintainable. A more maintainable core engine was pulled in from somewhere and what I can only describe as alpha software wrapped around it and gifted to us in place of the "unmaintainable" that had basically worked. The idea was to iron out the bugs and add the missing features from here on in. But that never happened because at that point the developer walked. Maybe it had all been done for free up until then, I don't know, but the folks at WMF apparently decided to spend their money and effort elsewhere and just leave the mess hanging. Quite why they trashed Sue's strategic vision is unclear to me. — Cheers, Steelpillow (Talk) 04:06, 3 August 2016 (UTC)

I think you've got it right as to what happened. I'm willing to advocate for investing in fixing it but only if we have some indication that it was actually being used by many people. It is entirely possible that upon release Sue thought it was going to be "of key strategic importance" but within a few months time it may have become apparent that it wasn't important at all. These things happen, and no one can really be blamed for it. But if a decision was made to deprioritize it to the point that broken software has been left in place for years, well, that's not good - better to just remove it completely I would imagine.--Jimbo Wales (talk) 13:52, 3 August 2016 (UTC)

The Misplaced Pages:Books page was created at the tail end of 2008 and Help:Books in 2009, around the time of Sue's vision statement. Browsing Category:Misplaced Pages books gives some idea of how much Book Creator had been used up until 2014. Another way is to browse the PediaPress website, although I don't know if any download/purchase stats exist. I can't imagine the usage stats could have been all that bad after, say, 2012, or a long-term maintainable rewrite would never have been kicked off in 2014. To me, the key question is whether WMF should care about the likes and ambitions of PediaPress any more, and if the answer is "yes" then the management process needs resurrecting if nothing else. Let that process decide whether to share or to shaft. Or, if "no", then can the whole thing. For my part, some of the moans on the feedback page give me the feeling that the 'press momentum was beginning to create a self-perpetuating marketplace in which academics were improving articles to publishable quality so they could provide better books in class. Is there a critical mass there to be sought? As I say, does the WMF care? — Cheers, Steelpillow (Talk) 16:02, 3 August 2016 (UTC)

I almost agree with that, except that I'd make the case the likes and ambitions of companies that monetize the use of it should take a distant second place behind those of us who would use it without commercial ambitions but rather for continuing learning for when we don't have internet connections. But the other thing I'd note is that I was frightened away from the warning above, only to find that while parts of it are broken, parts aren't as well. It still put a decent book together for me of Messier Objects, even as it borked the "List of Messier Objects" article/chapter because it is one big table. I think the creators of that announcement went a tiny amount Chicken Little - then again maybe it does just accurately describe the problem. The other question is, if the Foundation doesn't have the resources to create and maintain it, is it possible to crowdsource development of it? (Just whistling in the dark there.) LaughingVulcan 01:49, 4 August 2016 (UTC)

Hi everyone, WMDE's software development team is also currently looking into that issue, since adding tables to PDFs was one of the wishes of the German Community Wishlist. So far, our investigations have shown that it would take an enormous engineering effort (comparable to that of software companies that produce layout software) to add tables to the current LaTeX layout in a way that 80%-90% of the tables display correctly. 10%-20% would always be off due to the different capabilities of the two media (printed, laid-out page versus HTML). Therefore we will probably add another option to the page that appears when you click on "download as pdf", which allows you to download a PDF that looks more or less like the web page you see. On the plus side, it will contain all tables, images etc. that are present in the article; on the down side, it will not be as concise and nicely laid out as the LaTeX version. Therefore we would add this as a new option that you can choose depending on what you want. Wikibooks, however, would probably need a "print page" which includes all chapters for the new rendering service to work, which is not included in our initial plans. In general, the hope is that by moving towards a browser-based rendering service (which takes the web page as its basis) we will get more people to join in improving the layout that comes out of there, making it a more maintainable solution to the PDF creation problem. --Lea Voget (WMDE) (talk) 17:00, 4 August 2016 (UTC)

Offline Content Generator (OCG)

As the current default maintainer of the Collection extension, PDF export, plaintext export, and (soon) ePub and ZIM export, let me give a (short version of a) longish history. The Book Creator/Collection extension was originally created by Pediapress in 2008. Part of the service was hosted on WMF servers in our data center in Tampa, but if you actually ordered a printed book the request got bundled up and passed over to Pediapress' servers, which ran a similar version of the code but interfaced with their print-on-demand service. Pediapress made enough money from the print-on-demand service (apparently) to fund continued development of the service, which benefited all those who generated PDFs but did the printing themselves, and this mutually-beneficial arrangement persisted for a number of pleasant years.

However, the buglist grew over time. Pediapress did not invest much effort in internationalization, and support for non-roman-script languages was poor-to-nonexistent. Pediapress maintained their own bug tracking system, which grew to contain thousands of bugs. It *appears* that Pediapress was no longer making enough money from print-on-demand to fund their continued development and maintenance of the code base, and development stalled. No effort was made on the code base for a number of years, but the system "worked enough" (for European languages, at least) that things muddled on.

Unfortunately, the day came when WMF had to move out of its Tampa datacenter. The Pediapress code was literally the last thing running in Tampa, and it was costing the Foundation $1,000/day to keep that one server running ($30K/month). Worse, no one had written down how that server had been installed and there was no one who could recreate its configuration in our new datacenter. It looked like we were going to have to turn off Book Creator.

Matt Walker was passionate about Book Creator, however, and pulled in a skunkworks group of WMF folks to save the service, rewriting it in what was a state-of-the-art architecture at the time. We rebuilt it from scratch, documenting the process and installing it on modern server infrastructure, and were able to keep things going. The project had the support of Erik Moeller, and I was pulled in to provide support from the Parsoid side, eventually writing the PDF backend and a plaintext backend. As these things go, however, the new project had a different feature set -- it was much better at Indic and non-latin languages (thanks to XeLaTeX), had clickable hyperlinks, included enough license information to actually comply with our Creative Commons attribution requirements, etc -- but was missing some features. Tables and infoboxes are particularly hard, and those aren't particularly strong points for LaTeX either.

I don't need to recap the organizational struggles at the foundation in the following years. Suffice it to say that all the original participants in the skunkworks project, including Erik who had provided C-level support, have since left the foundation, leaving me as the last member of the original skunkworks. Further, the engineering reorganization which occurred toward the end of the Lila era left OCG homeless. OCG should rightly be part of the "reading" team, but its only remaining developer (me) is on the "editing" team. La la la. We don't generally let these sorts of things get in the way of actually doing good work, but they are relevant when deciding who to petition for additional resources...

We actually had a great Wikimania this year, with a lot of focus on the "Offline Content Generator" (as the architecture behind Book Creator, PDF export, the Collection extension, etc, is formally named). In fact, we had ZIM export and ePub export capabilities developed during the hackathon. Unfortunately, the code hasn't actually been submitted yet to me/the WMF, so we can't deploy it. :( But it exists, I've seen it running, and for the first time we had more-than-just-me working on OCG.

In addition, as the WMDE team above explained, the German Wikimedia chapter has adopted "tables in PDFs" as one of their feature development goals. The first part of this is https://gerrit.wikimedia.org/r/290417 . And I wrote basic support for tables a few years ago; see https://gerrit.wikimedia.org/r/107587 -- the problem is that my patch doesn't *always* work, and can in some cases cause the entire page to fail to render. At this level of support I judged it best to keep suppressing tables and get *some* output, rather than risk getting *no* output for many pages. (This is really a fault of LaTeX's limited table support, which prefers to fail when it sees something unexpected or unexpectedly wide, and requires semi-heroic measures to work around.) There are ways around the problem we can discuss. (Gabriel posted some phabricator links below.)
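The trade-off described above (some output without tables versus a risk of no output at all) can be illustrated with a small retry sketch. This is only one conceivable workaround, not a description of what OCG does: `RenderError`, `render_with_fallback` and the `render` callback are all hypothetical names invented for this illustration.

```python
from typing import Callable, Tuple


class RenderError(Exception):
    """Hypothetical error raised when a backend fails to typeset a page."""


def render_with_fallback(
    page: str, render: Callable[[str, bool], bytes]
) -> Tuple[bytes, bool]:
    """Try to render with tables; if that fails, retry with tables suppressed.

    `render(page, with_tables)` is an assumed callback standing in for a real
    PDF backend. Returns the output and a flag saying whether tables survived.
    """
    try:
        return render(page, True), True
    except RenderError:
        # Second attempt without tables: some output is better than none.
        return render(page, False), False
```

The cost of such an approach would be a second render for every page whose tables fail, which matters if rendering is already expensive.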

One final wrinkle is that the architecture which was state-of-the-art in 2014 is already looking a little dated in 2016. The "services" team here at WMF has standardized on a services architecture and the use of cassandra for storage, and in general we would like to use browser technologies to render the page more directly from the HTML DOM rather than use a LaTeX intermediary. In addition, we made some architecture compromises to maintain compatibility with the pediapress POD service, which are looking less wise (we still support the pediapress POD but we send a high-level description of the page to them now, so we don't need to maintain compatibility at lower levels in the stack). We could really use some help (a) modernizing the backend, and (b) working with modern CSS technologies to make browser output on par with the LaTeX output, so we can eventually remove the LaTeX backend. Sometimes discussions of OCG spiral off into tangents along these lines; some even suggesting that further investment in features on the LaTeX backend is a waste of time.

So. Yes, OCG is starved for resources. It is also sitting at an awkward place both in the org chart and in the overall services architecture of the foundation. As long as I am the only one working on OCG, it will continue to make slow progress, but there are in fact several useful improvements on the immediate horizon. The usage statistics are also available; the short version is that we generate about 10 PDFs a second currently. That's an order of magnitude less than the number of pageviews/second of our article web pages, but still quite a large number of users. C. Scott Ananian (talk) 21:55, 4 August 2016 (UTC)

Links to related tasks @Cscott: mentioned: Table support in PDFs, Options for browser-based PDF rendering. To gauge quality of browser-based rendering, we have set up an instance of a Chrome based third party render service (Electron) in labs. Example URL: https://pdf-electron.wmflabs.org/pdf?accessKey=secret&url=https://en.wikipedia.org/Barack_Obama
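For anyone wanting to try the example URL with other articles, here is a minimal sketch of how such a request URL could be assembled. The endpoint and the `accessKey` and `url` parameters are taken from the example above; the function name is made up, the query string is percent-encoded here (the example URL above is not), and the labs instance may not be reachable.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint copied from the example URL above; it was a labs test instance.
RENDER_ENDPOINT = "https://pdf-electron.wmflabs.org/pdf"


def build_render_url(page_url: str, access_key: str = "secret",
                     endpoint: str = RENDER_ENDPOINT) -> str:
    """Build a render request URL (hypothetical helper, no network access)."""
    query = urlencode({"accessKey": access_key, "url": page_url})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    url = build_render_url("https://en.wikipedia.org/Barack_Obama")
    print(url)
    # Round-trip check: the page URL survives the encoding.
    print(parse_qs(urlsplit(url).query)["url"][0])
```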

Wikimedia Germany is considering to use this for improving table & other complex content support for the "This page as PDF" feature. -- GWicke (talk) 22:07, 4 August 2016 (UTC)

Thank you Cscott for the update (not to mention for hanging in there). Chicken Little has now updated the warning template accordingly. — Cheers, Steelpillow (Talk) 09:30, 5 August 2016 (UTC)

I'm interested in hearing more about "missing math support"--OCG should actually be on par or better than the previous service on this regard, as they both use the native math support of LaTeX. If someone could chase down more details on this I'd appreciate it. C. Scott Ananian (talk) 15:26, 5 August 2016 (UTC)

I think it's more to do with sensible layout. Some longer equations do not fit in a two-column layout. For example try downloading the Grassmannian article as pdf and check out section 6 on the Plücker embedding - one equation runs right across both columns. Worse, a long equation in the second column has nowhere to run off to. The no-brainer answer is to allow selection of single-column, full-width layout. More sophisticated solutions might be to split the equation across multiple lines or to shrink the font size to fit. — Cheers, Steelpillow (Talk) 19:40, 5 August 2016 (UTC)

This already exists if you use the Book Creator. Single-column layout is one of the options available. What's needed is some way for an article to embed a hint that it looks better in a single-column layout, via a category or some such. C. Scott Ananian (talk) 14:28, 7 August 2016 (UTC)

How come Lea Voget (WMDE)'s prognoses seem bleaker than Cscott's? Or am I missing something? Lea's seem like "forget it", something that looks like steering for just taking the service off-line, while Cscott's rather looks like, "baby steps, but we're progressing and have prospects", and at least shows someone kinda managing the process (even from a somewhat awkward position that doesn't leave too much wiggle room). --Francis Schonken (talk) 15:53, 5 August 2016 (UTC)

Hi, in my own mediawiki2latex compiler, linked in the above template, I can handle tables correctly, as you can easily check by just running the exe file on the examples of your choice. Still, I must agree it was extremely hard for me to write that software, and I was driven by an extremely passionate hate of the economic system I happen to live in. If you want to pay someone to do it, it will be quite expensive I think, since people working for money never reach such a level of passion. I personally cannot help you with the development, since I have a permanent position at university now. Still, I will try to keep my software available so that anyone in need of the LaTeX source of Wikipedia articles or their respective PDF versions will have access to them. Also I must say that the process I developed needs lots of computational resources, so that the above mentioned cost of 10000$/day might be realistic if you wanted to use my software as the default renderer on Wikipedia. It's quite simple: you create 10 PDFs a second. My software needs 300 s per PDF on a current i3 desktop. So that's 3000 i3s you need to run the software Wikipedia-wide, which is not affordable. And of course I will get myself a t-shirt: "Semi-Hero of LaTeX OCG table rendering" Yours --Dirk Hünniger (talk) 16:53, 6 August 2016 (UTC)
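The estimate above is just arrival rate times service time. A minimal sketch of that arithmetic follows, assuming each machine renders one PDF at a time and ignoring queueing and bursts; the function name is made up for illustration.

```python
import math


def machines_needed(pdfs_per_second: float, seconds_per_pdf: float) -> int:
    """Machines required to keep up with a steady request rate.

    Assumes one render per machine at a time, so the number of renders in
    flight equals the request rate multiplied by the time per render.
    """
    return math.ceil(pdfs_per_second * seconds_per_pdf)


if __name__ == "__main__":
    # Figures from the discussion: 10 PDFs per second, 300 s per PDF.
    print(machines_needed(10, 300))
```

With the figures quoted in the thread this gives the same 3000 machines; a faster renderer or a cache of already rendered articles would change the result proportionally.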

Tx. Is there a place to continue this conversation somewhere centralized? Misplaced Pages:Offline Content Generator (WP:OCG)? Or some place at meta? --Francis Schonken (talk) 04:58, 7 August 2016 (UTC)

Also, is there a compelling reason why the computing power should be server-side? Can't the conversion to PDF be done client-side with a script? --Francis Schonken (talk) 04:35, 8 August 2016 (UTC)

If you want to stick with LaTeX you need a quite exhaustive LaTeX installation which needs several gigabytes, which you cannot easily transfer to the client, just to create a PDF. I tried hard to reduce the amount of software that needs to be installed, but was not successful. Furthermore the LaTeX compiler and some other auxiliary tools are binaries compiled to run on a PC, and cannot easily be turned into JavaScript. I do indeed offer binary releases of mediawiki2latex for download on SourceForge which can run stand-alone on a client PC, but they are just standard binary executables and have nothing to do with scripts running in a web browser. Yours Dirk Hünniger (talk) 06:12, 8 August 2016 (UTC)

Wikipedia, we have a problem

NeilN is the person who claims that a picture of a man in a pulpit in a mosque addressing a congregation of six is actually Muhammad seated on a camel on a mountaintop explaining to thousands of pilgrims how he is going to reform the calendar. 400,000 people signed a petition saying that the picture was not relevant to the article subject and that under the principle of least astonishment it should be removed. All that has happened is that the article has been protected permanently so that nobody can edit it.

This is what Jimbo has said:

In general I would prefer that we be more lenient than normal on my user talk page, particularly when someone is upset about something they view as an incorrect application of the rules. If he posts it again, I'd rather we discuss it than revert it - to a point, of course. Possibly something useful will come of it.--Jimbo Wales (talk) 20:15, 27 November 2011 (UTC)

That relates to an editor who was blocked. Jimbo has said many times that it is permissible for blocked or banned users to post here provided they do not use the page as an infinite soapbox to harass other editors. He opposes semi-protection for that reason, and also because the page has the function of being a place where whistleblowers can report abuse.

WMF operates on the same basis. Functionaries (e.g. Philippe Beaudette) have given instructions that posts to their en:wiki talk pages are not to be removed, because they want to see them, even if the posters are blocked or banned. 188.220.247.11 (talk) 15:47, 5 August 2016 (UTC) — Preceding unsigned comment added by 82.18.244.238 (talk)

Zzuuzz has provided a link to an LTA report. Clicking on the link, it would appear to be a spoof report. Severity is described as "high", commencement date is given as 5 March 2010. So if the traffic is so bad that the instruction is "semi-protect the affected pages", why did it take 2,081 days for the report to be created?

The rant begins with the claim

She regularly edit wars with Jc3s5h

According to this analysis Special:Diff/733643012#Your first half - century. Happy birthday! all she is doing is removing vandalism by Jc3s5h. 86.136.230.45 (talk) 05:38, 9 August 2016 (UTC)

I’ve reverted Favonian’s deletion of this section made on the arrogant assumption that only he gets to decide what appears on Jimbo’s talk page per WP:INVOLVED as he features in the report:

176. In his propaganda Favonian says "This user knows that EVERYBODY LIES". Really? He also says he is "not a communist" then adds that he is a liar and a communist. He's quietly blocked me although the SPI remains open. 78.145.26.194 (talk) 16:10, 23 July 2011 (UTC)

194. Anyone can criticise someone else from the safety of the edit summary. ... isn't man enough to come down into the thread and make his claims there because they're ... there are no diffs. - 86.164.126.212 20:38, 18 August 2011. — Preceding unsigned comment added by 86.146.168.159 (talk)

Commons may be broken

Jimbo, what do you think of this photo that's been on Commons for years? For (removed link to copyvio), too. - 72.78.244.41 (talk) 01:05, 7 August 2016 (UTC)

The one you saw was vandalism. But the image itself was a copyright violation. Now deleted. It happens. We miss a lot of images on Commons that shouldn't be there. Since it was a copyvio I have also removed the archive link from your post. --Majora (talk) 01:25, 7 August 2016 (UTC)

Who owned the legitimate copyright on the image? - 72.78.244.41 (talk) 01:36, 7 August 2016 (UTC)

A few people. It was a composite image. The background was owned by a news agency and the person was something else. --Majora (talk) 01:37, 7 August 2016 (UTC)

Do you have any evidence to support these conclusions? Was the image of "the person" a single image, or was it a composite (face, plus body)? - 72.78.244.41 (talk) 01:43, 7 August 2016 (UTC)

Of course I had evidence. To delete something as a copyvio without evidence would be stupid. The background was taken from http://mbdtv.com/khou-houston-tx-2/ (photoshopped to remove the logo and everything). As to the person it doesn't matter. One copyrighted piece equals copyvio equals delete. --Majora (talk) 01:52, 7 August 2016 (UTC)

Okay, so the person doesn't matter. So, if I have a non-copyvio background, and I want to photoshop a known woman's head onto a half-nude body of some other unknown woman, slowly spreading open her jacket to reveal a goodly portion of her breasts, and then publish the image to Commons with a file name that is exactly the known woman's real name, it will be okay for that file to sit on Commons for a few years, because the person doesn't matter. Got it. - 72.78.244.41 (talk) 02:11, 7 August 2016 (UTC)

That is not what I meant, that is not what I said, and you know that. So right now you seem like you are trolling. I said I didn't check where the woman came from since the background was copyvio and the rest didn't matter. One copyvio piece means delete. As for your ludicrous hypothetical, that would be out of Commons scope and a vandalism image and would be deleted on sight. Just like this one was deleted on sight. Just because it took a few years to "see" it doesn't mean it won't be deleted on sight. --Majora (talk) 02:16, 7 August 2016 (UTC)

It was brought up several days ago on a website frequented by several Wikipedia admins, and after 40 hours of inaction, a reminder was posted again. Still nothing. So after a couple more days, I decided to post it here to Jimbo's Talk page, because I know that he shares my belief that much of what goes on over at Wikimedia Commons is downright disgraceful. I think your dismissive response here is also somewhat disgraceful. The biggest problem here isn't that Modular Broadcast Design's copyrighted photo of a newsroom set was wrongfully copied; the biggest problem is that for TWO FLIPPING YEARS, Wikimedia Commons hosted a file with the name of a real newscaster, presenting her face on some stranger's body, showing off her tits in a come-hither pose, and thanks to Google, this became a high-ranking result in Image searches for the newscaster's name. This should not ever, ever happen on a publicly-funded charitable site that is exempt from taxation because of its supposed "educational" mission. It's a disgrace, and the fact that this hasn't been fixed, after over a decade's worth of time to implement some restraints, is grossly negligent. So, I don't give a flying fig if you want to call this "trolling". It's what you need to hear. - 72.78.244.41 (talk) 02:27, 7 August 2016 (UTC)

Please feel free to watch the feeds and tag whatever copyvios come up as copyvios. Complaining about it to Jimbo is not going to do anything about it. If you want to clean up Commons, go help clean up Commons. There are millions of images there and very few people who feel the need to mark copyvios. It was brought up you say? Why wasn't it tagged for deletion by the person who brought it up? Commons needs more people to help them tag images for removal. Everything on here is done by volunteers. So volunteer. --Majora (talk) 02:32, 7 August 2016 (UTC)

"Complaining about it to Jimbo is not going to do anything about it." You say that without any hint of self-awareness. After that file sat for years, something was finally done about 20 minutes after complaining about it to Jimbo. Amusing. - 72.78.244.41 (talk) 03:50, 7 August 2016 (UTC)

Touché. --Majora (talk) 04:04, 7 August 2016 (UTC)

This sounds like a major lapse on the part of Commons. A picture as described above is bad whether it violates any copyrights or not; it at least violates https://commons.wikimedia.org/Commons:Photographs_of_identifiable_people#Defamation . Saying "fine, it violates copyright, so we've deleted it" gives the impression that Commons is trying to find an excuse to delete this particular image without having to admit the seriousness of the lapse. Ken Arromdee (talk) 19:10, 7 August 2016 (UTC)

It's not the complaint to Jimbo that did something about it - any way you might have reported it would have had the same effect. Commons deletes a lot of files, too many I'd say, and from the description above this one certainly wouldn't have survived a call for deletion.

The real question is whether you want to allow users to upload content or not. If you do, inevitably there is abuse; this is true for any site with user-generated content on the Internet. You can say that someone ought to review it, but that only works if someone does - and if the person who uploads it can't arrange for a sock to do the review. You can say that someone must review it, but then the content becomes backlogged, and when people realize it's a waste of time to upload images because they won't get passed, they'll stop contributing, and stop reviewing, and the whole process will grind to a halt and we'll end up using links to external image servers for any new illustration in a Wikipedia article. You can propose having professionals do it, but then images become expensive, so there's a submission process and again people don't bother. No, the fact is, you either make a decision that freedom is really awful and it is worth paying any price, including giving up on illustrating Wikipedia, in order to avoid an occasional naughty parody being seen by a few hundred people before somebody complains -- or you don't. Which is it? Wnt (talk) 20:22, 7 August 2016 (UTC)

So many false premises in your argument, it's not worth responding. - 72.78.244.41 (talk) 01:13, 8 August 2016 (UTC)

That wasn't an "occasional naughty parody," that was a sexually harassing image of a living person that was seemingly intentionally dropped as a google-bomb, using Commons as the witting or unwitting but certainly no more than semi-competently operated vehicle. Carrite (talk) 10:52, 8 August 2016 (UTC)

You can "Google-bomb" with any image anywhere, 8chan or what have you. That doesn't change the fact that every person who saw the image had a chance to report it to a deletion discussion that would have done something about it, which is more than you can say for some of the other upload alternatives. No, Commons is not as robustly protected against photo pranks as, say, a paper copy of a hardcover book from a publisher on Printer's Row licensed by the Crown Censor and locked firmly under the glass cover of a coffee table behind velvet ropes in the Royal Museum. But Commons does what it's for, and despite what some people seem to imagine, preventing mischief before anybody much sees it is not what it's for. It matters more to have an archive that gets stuff out for our articles than to run around tearing your hair out because 50 people saw an apparently pretty obvious fake of the kind that come up when you do any image search for any celebrity. Wnt (talk) 16:18, 8 August 2016 (UTC)

Indeed, while things can always be improved by correcting problems sooner so that you won't even get the one in a million case where a problem has persisted for a few years in some dark corner, you cannot have a system that is guaranteed to never ever have problems without restricting freedom. And why would we care more about a remote probability of encountering an offending picture than, e.g., becoming a victim of a crime? We accept that living in a free society carries with it small risks; we don't think it's worth living in a less free society even if crime could be totally eliminated. Count Iblis (talk) 18:45, 8 August 2016 (UTC)

Buying accounts

I have found an account apparently run by two different people. One editor performs trivial edits every now and then. Moves some text around in a random article, adds articles to trivial categories, that sort of thing. The usual pattern is nothing for months and then a lot of edits in a short time. The sort of thing you'd do if you were keeping a sleeper account and wanted it to look reasonably active.

The other editor began a series of quite different edits on the day an IBAN came into play. This editor attacks one of the parties to the IBAN, reverts his edits, makes reports to admin boards, !votes the other way in RfCs and does his best to be annoying by doing all the things an IBAN disallows.

A silly, petty game to avoid a ban.

There are some similarities in behaviour between the IBAN editor and the "active" editor in this account. However, checkuser doesn't pick up any IP similarity, because care is taken to use IP addresses in China.

My question is twofold:

How widespread is this sort of thing?
What can be done to prevent this behaviour? — Preceding unsigned comment added by 2001:470:fd:3::40 (talk) 23:52, 7 August 2016 (UTC)

Why not give us the username, so that various people can look into your allegations? Cullen Let's discuss it 02:32, 8 August 2016 (UTC)

No, let's not give a name without any shred of evidence. --Floquenbeam (talk) 13:23, 8 August 2016 (UTC)

What about these Users?

Jimbo, would you take a look at User:Sippublicity and User:96.49.155.125? Seems quite clearly the accounts of a paid advocate for the subject -- a subject you recently took an editorial interest in. Since you are always willing to admonish paid advocacy editing, might you comment here publicly, to admonish Ms. Rafati for ever using such an unethical PR firm? Or does this one get a pass, since it was way back in 2012? - 72.78.244.41 (talk) 01:16, 8 August 2016 (UTC)

Ah, the old mysterious IP posting on Jimbo's talk page trick. Thanks for the pointer, though yes, a fish was wrapped in that newspaper four years ago, and there is of course no evidence of payment. Still, as a Protector Of The Wiki I'll jump at your suggestion. Drmies (talk) 01:19, 8 August 2016 (UTC)

Check the "Contact", Drmies. - 72.78.244.41 (talk) 01:45, 8 August 2016 (UTC)

Ha, why didn't you say so immediately? At any rate, I already blocked the account. Now, why do you want old Jimbo to respond to some four-year-old matter? Is he dating this person? Is BroadbandTV taking over Wikipedia? (All these things may well be possible--I'm out of touch.) Drmies (talk) 01:59, 8 August 2016 (UTC)
Hmm, the way you say my name, that tone of voice...has a familiar ring to it. Drmies (talk) 02:02, 8 August 2016 (UTC)

Are you implying, Drmies, that the IP editor is not an innocent newcomer who has decided out of the blue to offer Jimbo some exceptionally good advice? I'm shocked. Cullen Let's discuss it 02:37, 8 August 2016 (UTC)

What's better, a familiar ring, or an unfamiliar ring? North America 12:57, 8 August 2016 (UTC)

I don't know anything about this and I'm not very inclined to care. The person behind this ip address is well known for stalking my every edit and commenting on it. It's a very sad life he lives, I'm afraid.--Jimbo Wales (talk) 15:37, 8 August 2016 (UTC)

Wikipedia being open to all, if you work on building the encyclopedia for any length of time, you have the possibility of attracting your own personal stalker who considers pretty much anything you do a personal affront, and who considers it their sacred duty to "expose" the person they fixate on. It's really quite pathetic, but for some reason they just can't quite seem to figure out why no one else sees their actions as heroic. --Guy Macon (talk) 16:00, 8 August 2016 (UTC)

Yeah, there's plenty of those, Jimbo. But imagine what those poor people would be doing if Al Gore hadn't invented the internet... Drmies (talk) 00:44, 9 August 2016 (UTC)

Drmies, please don't joke about it or otherwise encourage this person in any way. If you recognize the usual banned editor, please just delete all of his comments, or just stay out of the way and let other people do it. He's been banned about 10 years now, so what's the point in giving him a voice here? Jimmy's been very clear multiple times that he is not welcome on this page. Smallbones_(smalltalk) 02:18, 9 August 2016 (UTC)