Wikipedia talk:WikiProject AI Cleanup/Archive 2

This is an archive of past discussions on Wikipedia:WikiProject AI Cleanup. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Links to AI-generated translation

A translation produced by ChatGPT of Tzetzes's commentary on Lycophron's Alexandra has been linked on 175 pages related to Greek mythology. [1] The translation itself is, suffice it to say, highly problematic, and shouldn't be linked on Wikipedia. Is there an effective automated method for removing these links en masse? Thanks, Michael Aurel (talk) 23:02, 15 November 2024 (UTC)

While something like AWB could "naively" remove the links themselves, it could be better to look at the articles individually to see whether the material already has good sourcing and the link can be safely removed, or if a substitute translation should be found and added instead. You could also drop a note at WP:RSN so editors can look at the wider website (https://topostext.org) to see if other similar translations are present. That way, the extent of the problem could be more accurately assessed, and future editors will be able to find it in the archives. Chaotic Enby (talk · contribs) 17:37, 18 November 2024 (UTC)

@Chaotic Enby: Thanks for your reply. Unfortunately, the work hasn't been translated into English by a scholar yet (or out of the original ancient Greek at all, I don't believe), so the only replacement link which we could really provide would be an old edition of the work in ancient Greek (e.g. [2] or [3]), and I imagine adding such links wouldn't be possible with automated tools. A discussion at WP:RSN might be useful, and could help to establish a consensus around how such translations ought to be handled, although I do note that a Google search for "chatgpt site:topostext.org" only brings up this translation, which would seem to indicate that this is the only AI-generated translation hosted at that website. (Also, these links were all added by one editor I believe, in good faith but unwittingly, who I contacted before starting this discussion, so hopefully this translation, once removed, won't be linked again.) So, given this, would you say an automated method of removal, while possible, is likely not preferable to a manual approach? Or perhaps someone familiar with AWB could remove the links, and I could go through each page afterwards and manually link a Greek edition, or find a secondary source? – Michael Aurel (talk) 22:44, 18 November 2024 (UTC)

I would say it is still way preferable to look individually at each use of the source. By the way, especially when dealing with medieval or ancient texts, more recent secondary sources are very much preferred. Tzetzes's commentary might be "secondary" with respect to Lycophron's Alexandra, but given the age of the source, it is indeed best to treat it as a primary document from a historiographical perspective, and to cite secondary sources that discuss it in context. Chaotic Enby (talk · contribs) 23:04, 18 November 2024 (UTC)

Alright, fair enough. And yes, secondary sources are of course always preferred when dealing with ancient texts. Tzetzes' work, while in some sense "secondary" to Lycophron's I suppose, is functionally a primary source, at least as far as Wikipedia is concerned; my suggestion to replace these with links to a Greek edition was only because in most instances there is almost certainly no secondary source which contains the cited information, due to the obscurity of Tzetzes' text, and its relative insignificance to Greek mythological study. – Michael Aurel (talk) 23:23, 18 November 2024 (UTC)

175 articles is quite a lot to check. I think we need to find out if the foundation is valid first. A chat at RSN could kick that off. We also need to find out if the translations are accurate, which is the core of it. If this doesn't answer, then they need to be removed. scope_creep^Talk 08:13, 19 November 2024 (UTC)

Posted a note to RSN. scope_creep^Talk 08:42, 19 November 2024 (UTC)

Thanks. I suppose I came here under the assumption that this sort of source wasn't considered acceptable, but perhaps the use of AI-generated translations isn't something which has actually been discussed before, so a precedent-setting discussion could certainly be helpful. – Michael Aurel (talk) 09:25, 19 November 2024 (UTC)

Yes, it could be. I'm new to this board but haven't heard anything about LLM translations being used as sources. scope_creep^Talk 10:08, 19 November 2024 (UTC)

Speaking of, this could be a good use case for the potential inline {{AI-generated source}} tag discussed above at #Article written based on an AI-generated source. While we can't automatically remove all 175 references without checking them, semi-automated tagging could help us get them in a tracking category. Chaotic Enby (talk · contribs) 11:19, 19 November 2024 (UTC)

Yes, could do. That would be an ideal testing ground for it. scope_creep^Talk 12:30, 19 November 2024 (UTC)

Just created it! I've added it to Hermes#Lovers, victims and children (the first result in the search) so we can see it in use. Chaotic Enby (talk · contribs) 12:47, 19 November 2024 (UTC)

@Chaotic Enby: What cat does it go in? Couldn't locate it. Found a couple of others including Category:Articles containing suspected AI-generated texts from November 2024. There are already 24 articles for November. scope_creep^Talk 14:21, 19 November 2024 (UTC)

Currently goes to Category:All articles lacking reliable references, although I would be open to making a new cat for it. Chaotic Enby (talk · contribs) 14:25, 19 November 2024 (UTC)

Interesting, this could certainly be a useful way of flagging the pages containing this source (and other such sources). Perhaps a new cat for pages containing this tag could be something along the lines of "Articles containing suspected AI-generated sources", as a specific tracking category for this seems as though it could be of use to this WikiProject, seeing as AI-generated sources are presumably only going to crop up more and more frequently. – Michael Aurel (talk) 16:47, 19 November 2024 (UTC)

Done, it now goes to Category:All articles containing suspected AI-generated sources! And now I'm wondering if "with" instead of "containing" would've been more concise.... Chaotic Enby (talk · contribs) 18:08, 19 November 2024 (UTC)

Nice! – Michael Aurel (talk) 18:11, 19 November 2024 (UTC)

I'll start monitoring it. I also see there are now 172 articles in the Articles containing suspected AI-generated texts category. scope_creep^Talk 07:51, 26 November 2024 (UTC)

To clarify here (as the RSN discussion has now been archived), is the idea to, in an automated manner, add these tags across all of the pages with this source? I've removed around fifty of the links so far (a decent start I suppose), but tagging these would allow this to be designated as an outstanding task, visible and open to others. – Michael Aurel (talk) 09:19, 26 November 2024 (UTC)

Yep, while removing references in a (semi-)automated way shouldn't be done, tagging them automatically so editors can look more closely at individual instances is definitely helpful. Chaotic Enby (talk · contribs) 12:32, 26 November 2024 (UTC)

When I was reviewing articles in that cat, "AI-generated texts", I sent several articles to draft, in effect an NPP review. I think about 6 of them went. One was really bad. scope_creep^Talk 12:51, 26 November 2024 (UTC)

Just noting that these are two different cats, "AI generated text" (when the articles themselves are AI-written) and "AI generated sources" (when they cite sources that are AI-written); the tag mentioned earlier puts articles in the latter category. Chaotic Enby (talk · contribs) 13:01, 26 November 2024 (UTC)

That sounds good to me, then. Anyone adept with the requisite tools, feel free to enact this mass tagging (I wouldn't know how). – Michael Aurel (talk) 19:57, 26 November 2024 (UTC)

Sounds like a job for AutoWikiBrowser! Chaotic Enby (talk · contribs) 20:03, 26 November 2024 (UTC)

Ah, that's good to know. Though, hmm, would it potentially be easier for you to do it, as you're no doubt experienced with AWB, and I'm assuming it wouldn't take all that long (maybe?) to add tags to this many pages? Though if I'm wrong on either count (or you think it would be better I do it), I'm willing to give it a go. – Michael Aurel (talk) 23:14, 26 November 2024 (UTC)

I'll try to give it a go! Chaotic Enby (talk · contribs) 23:58, 26 November 2024 (UTC)

Thanks! – Michael Aurel (talk) 00:14, 27 November 2024 (UTC)

Editors may be interested to see the continuation of this discussion at Talk:Lycaon (king of Arcadia)#ToposText. – Michael Aurel (talk) 22:27, 27 November 2024 (UTC)
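
The semi-automated tagging step agreed on above (tag the references, don't remove them) could be sketched roughly as follows. This is only an illustration in Python, not an AutoWikiBrowser configuration: the function name `tag_refs`, the `url_fragment` parameter, and the choice to place {{AI-generated source}} directly after the closing </ref> are all assumptions, and the exact URL of the translation is deliberately left as a placeholder.

```python
import re

# Matches a full <ref ...>...</ref> pair. Self-closing reuses such as
# <ref name="x" /> are skipped. Simplification: a ref whose attributes
# contain "/" (e.g. name="a/b") is not matched by this sketch.
REF = re.compile(r"<ref\b[^>/]*>.*?</ref>", re.DOTALL | re.IGNORECASE)

# Template name taken from the discussion; placement after </ref> is assumed.
TAG = "{{AI-generated source}}"


def tag_refs(wikitext, url_fragment):
    """Append TAG after every reference whose body contains url_fragment.

    url_fragment is a hypothetical parameter: pass whatever substring
    identifies the problematic link. References are never removed.
    """
    def add_tag(match):
        ref = match.group(0)
        # Skip references that already carry the tag, so reruns are harmless.
        already_tagged = wikitext.startswith(TAG, match.end())
        if url_fragment in ref and not already_tagged:
            return ref + TAG
        return ref

    return REF.sub(add_tag, wikitext)
```

Running it twice on the same text leaves the second result unchanged, which matters for a task that several editors might pick up independently. Each tagged page would still need the manual review described in the thread.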

Discussion at Wikipedia:Village pump (policy) § LLM/chatbot comments in discussions

You are invited to join the discussion at Wikipedia:Village pump (policy) § LLM/chatbot comments in discussions, which is within the scope of this WikiProject. jlwoodwa (talk) 07:12, 2 December 2024 (UTC)

how to join?

how can I join Skeletons are the axiom (talk) 14:18, 5 December 2024 (UTC)

Adding your name to the list of participants is enough to join! By the way, you can sign with ~~~~, which adds your name and the current time automatically. Chaotic Enby (talk · contribs) 15:31, 5 December 2024 (UTC)

is there a user infobox saying something like "this user is part of ai clean up"

and if not how would I make one Skeletons are the axiom (talk) 20:20, 5 December 2024 (UTC)

We have one, it's {{User WP AI Cleanup}}! It and all other templates we use are in the Resources tab! Chaotic Enby (talk · contribs) 20:22, 5 December 2024 (UTC)

Cleanup technique

It seems like the most effective way to clean up articles, going through the category of articles tagged as possibly AI-generated, is to just wholesale delete any uncited content, then spot-check sources to see if they support the content. If they don't, then they can be removed and if enough don't, the article can be stubbed as they probably all don't (this is useful when it is impossible to access all of the sources). If they do, the best available option seems to be to just delete the AI tag and presume it's good if the history isn't too suspicious.

This might be helpful to add to the guide. The main problem in fixing possibly AI-generated articles seems to be source access, where AI (possibly) can cite a source you can't access and it's impossible to check. Mrfoogles (talk) 00:58, 6 December 2024 (UTC)

Feel free to add it to the guide! Important emphasis on the fact that if AI-generated text cites inaccessible sources, it's pretty much guaranteed that the model didn't have access to these sources either, so it can be safely treated as unsourced. Chaotic Enby (talk · contribs) 11:34, 6 December 2024 (UTC)
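
For anyone who prefers to see the workflow above as a decision procedure, here is a minimal sketch in Python. The names `SpotCheck`, `Action` and `triage`, and the 50% `stub_threshold`, are hypothetical: the discussion only says "if enough don't", without fixing a number. Following the reply above, an inaccessible source is counted as a failed check.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    STUB = "stub the article"
    REMOVE_UNSUPPORTED = "remove the unsupported claims and citations"
    REMOVE_TAG = "remove the AI tag (if the history is not suspicious)"


@dataclass
class SpotCheck:
    source: str
    accessible: bool
    supports_text: bool = False  # only meaningful when accessible is True


def triage(checks, stub_threshold=0.5):
    """Suggest a next step from spot-checks made after uncited text is deleted.

    stub_threshold is an assumed cut-off for "enough sources fail".
    """
    if not checks:
        # Nothing cited survived the first pass, so only a stub is left.
        return Action.STUB
    # Inaccessible sources count as failures (treated as unsourced).
    failed = [c for c in checks if not (c.accessible and c.supports_text)]
    if not failed:
        return Action.REMOVE_TAG
    if len(failed) / len(checks) >= stub_threshold:
        return Action.STUB
    return Action.REMOVE_UNSUPPORTED
```

The result is only a suggestion; checking the page history for suspicious patterns remains a manual judgement.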

Editor with 1000+ edit count blocked for AI misuse

User:Jeaucques Quœure. See [4]. I do wonder if a WP:CCI-like process for poor AI contributions could be made. Ca ^{talk to me!} 13:02, 26 October 2024 (UTC)

Wow, I think that would be a quagmire if we were specifically looking for LLM text, as detection would be slow and ultimately questionable in many instances. We could go through and verify that the info added in those edits is verifiable, but I wouldn’t go beyond that, nor do I think there is a need to go beyond that. — rsjaffe 🗣️ 14:28, 26 October 2024 (UTC)

I checked the last 50 edits, and the problematic edits appear to have been taken care of. Ca ^{talk to me!} 14:55, 26 October 2024 (UTC)

Unfortunately this user's pattern of LLM use goes a lot further back. I've already started cleaning up Specific kinetic energy and Specific potential energy; I've also tagged the two sections he added to Molecular biology (which appear to be LLM-generated summaries of the linked main articles, they'll probably turn out to be OK as long as someone with subject matter knowledge can review and source them).

While this isn't how I found these pages (was following up on this user's non-AI-assisted bad edits), it's notable that Molecular_biology#Meselson–Stahl_experiment (added on 17 April) was a 100% AI match on GPTZero. I don't think that automated detection is reliable enough to justify straight-up banning people, but it's probably reliable enough to justify flagging repeat offenders for manual review. Preimage (talk) 12:39, 6 December 2024 (UTC)

owl party

I believe the OWL Party page is partly AI-written, so if one could check if it's accurate that would be great

also I feel it doesn't line up with Wikipedia's purely analytical tone

I don't know if this is how these things are done so if there's something wrong about this tell me :) Skeletons are the axiom (talk) 20:50, 5 December 2024 (UTC)

Yep, it definitely reads like ChatGPT's attempts at "quirky" humor. There's {{ai-generated}} as a tag you can add if you want. If you have more time, you can look at the history, revert the addition and message the user (either yourself, or Wikipedia:Twinkle has ready-made warnings for that matter). Chaotic Enby (talk · contribs) 21:38, 5 December 2024 (UTC)

added the tag! Skeletons are the axiom (talk) 13:41, 6 December 2024 (UTC)

Edits that need evaluation

See this thread at the Administrators' noticeboard. XOR'easter (talk) 03:49, 12 December 2024 (UTC)

Image looks off to me; 2nd opinion?

Something about File:May-Li Khoe.jpg, on new article May-Li Khoe, looks unreal to me, especially in comparison to the photos of the same person visible through Google image search [5]. Am I imagining things? —David Eppstein (talk) 23:08, 17 December 2024 (UTC)

I don't think this is AI-generated. I can't see any details that are strange, the focus seems relatively consistent, and it looks a lot like her, which is rare for someone who isn't that famous. Sam Walton (talk) 23:18, 17 December 2024 (UTC)

File:May-Li Khoe headshot 5.jpg looks like it was from the same photo session. Could have been touched up, but probably not AI. Apocheir (talk) 02:43, 18 December 2024 (UTC)

Ok, that one I believe, so I guess I have to believe the other one as well. Thanks for finding this! —David Eppstein (talk) 05:55, 18 December 2024 (UTC)

How can I help?

Hi all- As a website owner that has been using ChatGPT for years, I believe I can spot signs of AI-generated content pretty quickly. I have a full-time job but would love to assist (to ensure the truth remains true and for my own personal development.)

Thanks! Chris Aisavestheworld (talk) 21:09, 2 January 2025 (UTC)

Hello! A good start would be to install Wikipedia:Twinkle, which allows you to tag articles (including, in this case, with the {{AI-generated}} tag). You can tag pages that you encounter, or look for new additions in Special:RecentChanges! If you see users adding AI-generated content with clear issues (which for now is the vast majority of visible AI-generated content), you can warn them with {{uw-ai1}}. Chaotic Enby (talk · contribs) 21:23, 2 January 2025 (UTC)

Thanks very much! I'll do that. Aisavestheworld (talk) 16:15, 6 January 2025 (UTC)

@Aisavestheworld: Also have a go at servicing the Category:Articles containing suspected AI-generated texts category, where they end up, to clean the stuff up and remove the article content entries. Be bold and remove the stuff if you see it. This is the greatest literary/encyclopaedic project since the Library of Alexandria, so it's worth the time. If you're in the NPP/AFC group, post it back on the NPP queue, along with anything else you find troublesome, for example if there is an autopatrolled editor who is using it. If it's a draft under the 90-day limit, then redraft it and put a clear reason why it's been drafted. Speak to the editor and tell them why it is not acceptable to post AI slop. Explain it clearly so they realise it's not what's wanted, and tell them there is stormy weather ahead if they continue. Be soft, considerate, kind, responsive and helpful. But if you warn them and they don't comply after the four warnings, e.g. for disruptive editing, send them to WP:ANI, or here where we can have a group chat, e.g. coin. If that doesn't work out, then it's ANI. It is far too early to use AI effectively, seems to be the wide consensus, although I think it's probably going to be good for diagrams, for example medical diagrams, and physical illustrations, but not BLP portraits or any BLP. Hope that helps. scope_creep^Talk 16:48, 6 January 2025 (UTC)

Thank you @Scope creep - Can you help me get started here? I think I just need to know where to go and I can get started: "Category:Articles containing suspected AI-generated texts category". Aisavestheworld (talk) 18:29, 6 January 2025 (UTC)

@Aisavestheworld: I never realised you've only been on Wikipedia for a very short time. I would ignore that advice I gave you for at least a year or two until you're well established. scope_creep^Talk 18:36, 6 January 2025 (UTC)

Understood. Thanks again! Aisavestheworld (talk) 18:40, 6 January 2025 (UTC)

Talk:Intelligent_design#Intelligent_Design_and_the_Law

I learned in this thread that there are AI bias checkers. My knee-jerk reaction is, for WP-purposes, kill with fire. Gråbergs Gråa Sång (talk) 21:29, 6 January 2025 (UTC)

AI-touched-up images?

Sofronio Vasquez currently uses the image File:Sofronio P. Vasquez III in 2025 (Enhanced) (3).png, which has the rubbery, weirdly lit appearance of AI-generated images, but was extracted from this youtube video and then "digitally enhanced". (I verified that the scene actually appears in the video.) I asked User:HurricaneEdgar, who touched it up, what "digitally enhanced" meant but he didn't respond. Are AI-touching-up tools available, and do they have the same issues as other AI generation? Apocheir (talk) 23:28, 16 January 2025 (UTC)

Yes, AI-enhancing/upscaling tools definitely exist. In this case, the article should be tagged with {{Upscaled images}}, and the file should be flagged on Commons with {{AI upscaled}}. On the English Wikipedia, it is preferable to use the original picture rather than any AI-upscaled version. @HurricaneEdgar, if you still have the original (non-enhanced) image, it could be helpful to upload it so it can be used instead. Chaotic Enby (talk · contribs) 00:21, 17 January 2025 (UTC)

The pre-ChatGPT era

We may want to be more explicit that text from before ChatGPT was publicly released is almost certainly not the product of an LLM. For example, an IP editor had tagged Hockey Rules Board as being potentially AI-generated when nearly all the same text was there in 2007. (The content was crap, but it was good ol' human-written crap!) Maybe add a bullet in the "Editing advice" section along the lines of "Text that was present in an article before December 2022 is very unlikely to be AI-generated." Apocheir (talk) 00:57, 25 October 2024 (UTC)

This is probably a good idea. I'm sure they were around before then, but definitely not publicly. Symphony Regalia (talk) 01:42, 25 October 2024 (UTC)

Definitely a good idea, also agree with this. Just added a slightly edited version of it to "Editing advice", feel free to adjust it if you wish! Chaotic Enby (talk · contribs) 01:59, 25 October 2024 (UTC)

So far, I haven’t seen anything that I thought could be GPT-2 or older. But I did run into a few articles that seem to make many of the same mistakes as ChatGPT, except a decade earlier.

If old pages like that could be mistaken for AI because it makes the mistakes that we look for in AI text, that does still mean that’s a problematic find; maybe we should recommend other cleanup tags for these cases. 3df (talk) 22:53, 25 October 2024 (UTC)

I think that's very likely an instance of "bad writing". Human brains have very often produced analogous surface-level results! Remsense ‥ 论 23:05, 25 October 2024 (UTC)

Yes, I have to say, ChatGPT's output is a lot like how a lot of first- or second-year undergraduate students write when they're not really sure if they have any ideas. Arrange some words into a nice order and hope. Stick an "in conclusion" on the end that doesn't say much. A lot of early content on Wikipedia was generated by exactly this kind of person. (Those people grew out of it; LLMs won't.) -- asilvering (talk) 00:31, 26 October 2024 (UTC)

I ran this text from a 2017 version. GPTZero said 1% chance of AI.

FIH was founded on 7 January 1924 in Paris by Paul Léautey, who became the first president, in response to field hockey's omission from the programme of the 1924 Summer Olympics. First members complete to join the seven founding members were Austria, Belgium, Czechoslovakia, France, Hungary, Spain and Switzerland. In 1982, the FIH merged with the International Federation of Women's Hockey Associations (IFWHA), which had been founded in 1927 by Australia, Denmark, England, Ireland, Scotland, South Africa, the United States and Wales. The organisation is based in Lausanne, Switzerland since 2005, having moved from Brussels, Belgium. Map of the World with the five confederations. In total, there are 138 member associations within the five confederations recognised by FIH. This includes Great Britain which is recognised as an adherent member of FIH, the team was represented at the Olympics and the Champions Trophy. England, Scotland and Wales are also represented by separate teams in FIH sanctioned tournaments. Graywalls (talk) 00:03, 6 November 2024 (UTC)

There's probably more bad than good writing on the Internet, and all LLMs have been extensively trained on all this bad writing; that's why they are prone to be like it. 5.178.188.143 (talk) 14:23, 17 January 2025 (UTC)
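
The date heuristic proposed at the top of this section (text already present before December 2022 is very unlikely to be AI-generated) can be illustrated with a short Python sketch. The function name `predates_cutoff` and the input shape, a list of `(timestamp, text)` pairs that the caller has already fetched from the page history, are assumptions made for the example. The comparison is a plain substring match after whitespace normalisation, so reworded text will not be recognised.

```python
from datetime import datetime, timezone

# Cut-off taken from the wording suggested in this section.
CUTOFF = datetime(2022, 12, 1, tzinfo=timezone.utc)


def _normalise(text):
    # Collapse runs of whitespace so line wrapping does not hide a match.
    return " ".join(text.split())


def predates_cutoff(passage, revisions):
    """Return True if passage appears in any revision saved before CUTOFF.

    revisions: iterable of (timezone-aware datetime, revision text) pairs.
    """
    needle = _normalise(passage)
    return any(
        timestamp < CUTOFF and needle in _normalise(text)
        for timestamp, text in revisions
    )
```

A True result only says the passage is old, not that it is good; as noted above, human-written text can still need other cleanup tags.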

Idea: one of four categories added to images automatically during upload

Hi everyone! I have been thinking for a while about a solution regarding separating AI-generated content and human-made content and came up with my idea about the four categories. I would like to ask for your opinions and especially ideas and solutions on commons:Commons talk:AI-generated media#Four categories for images! Kind regards, S. Perquin (talk) – 21:31, 24 January 2025 (UTC)

Provisional AI-generated analysis of documentation (January 2025)

On the Dutch-language Wikipedia, we are currently also having a discussion about the use of AI in Wikipedia articles, including the use of AI-generated images. I have the impression that this topic is being considered more actively and thoroughly on the English-language Wikipedia than on the Dutch-language Wikipedia. If you don't mind, I would like to join the discussion here as well, to gather inspiration for the policy on the Dutch-language Wikipedia.

In any case, a great idea to keep documentation on which AI-generated images are removed and which are not. Because I was curious about what conclusion ChatGPT would draw from these findings, I gave it the following prompt: Analyze why AI-generated images were removed from the English-language Wikipedia based on this documentation and write only a conclusion in up to ten sentences. This was the outcome:

“

AI-generated images were removed from English-language Wikipedia articles due to concerns about their accuracy, quality, and relevance in non-AI contexts. Such images often fail to represent historical or real individuals authentically, risking misinformation. They can introduce errors or unrealistic portrayals that conflict with Wikipedia's goal of providing reliable, evidence-based content. Quality concerns also stem from inconsistencies in detail and fidelity, particularly in facial depictions or historical recreations. Where authentic photographs or verified illustrations exist, they are preferred, making AI-generated visuals unnecessary. The lack of proper sourcing for AI-generated content further complicates their verification, undermining Wikipedia's strict citation and credibility standards. Ethical considerations, such as misrepresentation of subjects or contexts, also played a role in their removal. The community has emphasized prioritizing verified, human-curated content over AI-generated media to maintain trust. Additionally, concerns about potential copyright ambiguities surrounding AI-generated works influenced this decision. These measures align with Wikipedia's principles of neutrality, reliability, and transparency in public knowledge dissemination.

”

One of the questions that comes to my mind is: aren't human-made images sometimes also just inaccurate, irrelevant and misleading? Take the picture on the right as an example. The article about the Secernosaurus provides this picture. However, according to a user on Wikimedia Commons, it would not be accurate because the hindquarters would lack musculature and fat. If the same image were made by AI, and it were many times better, could and should it be in the article? Doesn't this apply to all dinosaurs in principle? They might have been bigger and skinnier than previously thought. The same likely applies to many other artistic impressions. Exoplanets and stars might also look different than we think. I'm curious about what you think about, for example, artistic impressions on the English-language Wikipedia. Kind regards, S. Perquin (talk) – 09:16, 25 January 2025 (UTC)

If human-made images are inaccurate, they should also be removed. We do have WP:PALEOART and WP:DINOART for reviewing reconstructions of extinct animals. If you believe that this image of Gryposaurus (not Secernosaurus, despite it being used there) is inaccurate, it should be submitted there for review and removed from the article. I haven't seen any AI-generated reconstructions of dinosaurs that are many times better than this slightly skinny hadrosaur and don't introduce blatant inaccuracies, but yes, on principle, we don't have any guidelines specifically excluding AI-images for paleoart reconstructions (or anywhere beyond BLPs). However, we also shouldn't give more latitude to errors in AI-generated images either, even if the process is often more error-prone and less consistent with the paleontological data than human reconstructions. Chaotic Enby (talk · contribs) 14:17, 25 January 2025 (UTC)

Apparently, this image has already been reviewed (thus the tag on Commons), with the consensus being that it's too slim but not terribly inaccurate. Still, I've replaced it with a more plump reconstruction. Chaotic Enby (talk · contribs) 14:29, 25 January 2025 (UTC)

I handle extinct buildings rather than extinct animals, but similar discussions arise as to whether we should use a photo or a drawing, with one side saying the photo should always be preferred, and my side saying such prejudice has little value. My example is the extinct Bronx Borough Hall for which we have good drawings, and poor contemporary photos, and my own photos of the remnants. I had no trouble pushing my opinion that the best drawing we had was the best illustration, and it seems to me every time, it will be a judgement call. There are general arguments for preferring plain photos over retouched photos, over paintings and drawings by people, over AI renderings, but when it comes down to cases, we have to decide as best we can among what's actually available. A good AI will surely beat a bad illustration from another source, if those are what are available. Jim.henderson (talk) 16:34, 29 January 2025 (UTC)

Discussion at Wikipedia talk:Large language models § LLM-generated content

You are invited to join the discussion at Wikipedia talk:Large language models § LLM-generated content, which is within the scope of this WikiProject. Chaotic Enby (talk · contribs) 11:24, 31 January 2025 (UTC)

Bot request discussion

I've opened a thread at Wikipedia:Bot requests/Archive 87#Bot to track usage of AI images in articles to suggest a bot that detects when AI and AI-upscaled images are being used in articles (not in any clever deductive way, just using the Commons categories), outputting a list in the style of the currently hand-crafted Wikipedia:WikiProject AI Cleanup/AI images in non-AI contexts.

If anybody has any thoughts on that or expertise to share, please drop by. Belbury (talk) 15:57, 22 January 2025 (UTC)

That could be great indeed! If the bot can directly add them to the page, it could be even more practical! Chaotic Enby (talk · contribs) 20:38, 22 January 2025 (UTC)

User:Vanderwaalforces has now kindly set up User:DreamRimmer's script to run as a bot update every Sunday, adding a list of AI-affected files to Wikipedia:WikiProject AI Cleanup/VWF bot log. I'll check in occasionally and see whether anything on there needs an {{upscaled images}} template, or adding to Wikipedia:WikiProject_AI_Cleanup/AI_images_in_non-AI_contexts. --Belbury (talk) 09:46, 3 February 2025 (UTC)

citeturn0search0

I deleted a couple of spam pages, likely AI-generated, and noticed that in both cases, each section of text ended in citeturn0search0 – anyone know where that comes from? I'm guessing some sort of AI tool, but don't know. When I tried googling it (didn't find anything particularly useful, BTW), that square symbol turned into a 'hamburger' stack; no idea what character it's actually meant to be. -- DoubleGrazing (talk) 08:55, 20 February 2025 (UTC)

Definitely an artefact of ChatGPT, and maybe other models. If I get an answer with grey button external links at the ends of sentences, those become turn0search0 when I click the "Copy" button to put the response into my clipboard. I've also found that if ChatGPT returns an answer with some example images at the top, those images become iturn0image0turn0image1turn0image4turn0image5.

I'm not seeing a huge amount of this out there on the web, so maybe it's just a recent bug in how ChatGPT's interface renders markup to the clipboard. Belbury (talk) 10:06, 20 February 2025 (UTC)

Thanks, good to know. -- DoubleGrazing (talk) 10:10, 20 February 2025 (UTC)

is there a way to state that only the latest version is AI

I think the latest edit on Quantum Markov chain is AI-made, based on how unusually long it is for one edit, the fact that none of the new references are normal cites, and the fact that "citeturn0search0" (an AI artifact) is at the end. Skeletons are the axiom (talk) 16:34, 26 February 2025 (UTC)

In that case, the best thing to do is to revert to the previous version. However, if someone has time and is knowledgeable in that domain, it could be helpful to take a look at the references (especially the third and fourth ones which are linked) to see if there's any material in the article that they support. Chaotic Enby (talk · contribs) 17:35, 26 February 2025 (UTC)

AI catchphrases

I'm thinking about having that page's title changed to something along the lines of [Signs or Indicators] of (likely) [AI or ChatGPT] authorship, but I can't decide which words should be used.

Signs or Indicators?
AI or ChatGPT?
Should likely be included?

If you have any better title ideas, feel free to share your alternative proposals. – MrPersonHumanGuy (talk) 14:40, 3 February 2025 (UTC)

AI (or LLM) should be better than ChatGPT, as we should also have catchphrases indicating other large language models. Best to also add "likely". Not sure about "Signs" vs "Indicators", both are good although "Signs" might be more concise. Chaotic Enby (talk · contribs) 12:39, 20 February 2025 (UTC)

"Signs", "AI" and "likely" are all good ideas.

I've just added a section on markup (the turn0search0 issue noted below, plus a ?utm_source=chatgpt.com one I just encountered for the first time), which seem worth tracking but definitely aren't "catchphrases". Belbury (talk) 17:27, 27 February 2025 (UTC)

Great job! Regarding ?utm_source=chatgpt.com, there was a discussion at Wikipedia talk:Large language models/Archive 7#LLM-generated content regarding making an edit filter for that purpose, although it hasn't led to a concrete implementation yet. Chaotic Enby (talk · contribs) 17:35, 27 February 2025 (UTC)
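
As an illustration of how the two markup signs mentioned above could be searched for, here is a small Python sketch. It is not the edit filter discussed, which has no concrete implementation yet; the patterns are assumptions derived only from the strings quoted on this page (turn0search0, turn0image0 and ?utm_source=chatgpt.com), and the function name `find_llm_markup` is hypothetical. Any special characters that surround the markers in raw pasted output are not matched.

```python
import re

# Assumed patterns, based solely on the examples quoted in this discussion.
TURN_MARKER = re.compile(r"turn\d+(?:search|image)\d+")
UTM_MARKER = re.compile(r"[?&]utm_source=chatgpt\.com")


def find_llm_markup(wikitext):
    """Return (kind, matched text, offset) tuples, ordered by position."""
    hits = []
    for kind, pattern in (("turn-marker", TURN_MARKER), ("utm-source", UTM_MARKER)):
        for match in pattern.finditer(wikitext):
            hits.append((kind, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])
```

A hit is a reason to look at the edit, not proof of AI authorship: a tracking parameter only shows where a link was copied from.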

Possible AI article?

A friend of mine notified me of this article, 1 nm process, which they suspect might be written using an LLM. I am personally not good at figuring out this kind of stuff so I'm passing it on to here so that people here can check. ―Howard • 🌽³³ 00:23, 3 March 2025 (UTC)

Indeed. Nuked the parts that looked AI-generated (and were unsourced, anyway). Diverging Diamond (is Queen of Hearts's alt; talk) 00:27, 3 March 2025 (UTC)

RfC on banning AI-generated images

This was recently relisted with the broader scope. JoelleJay (talk) 22:15, 4 March 2025 (UTC)

Wikipedia:Computer-generated content listed at Requested moves

A requested move discussion has been initiated for Wikipedia:Computer-generated content to be moved to Wikipedia:AI-generated content. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. —RMCD bot 19:41, 5 March 2025 (UTC)

To opt out of RM notifications on this page, transclude {{bots|deny=RMCD bot}}, or set up Article alerts for this WikiProject.

Old Gods of Appalachia

I believe the episode summaries in Old Gods of Appalachia are AI generated. It looks like a large number of summaries were added in a single edit by an editor who has previously been warned for using AI generated content. It looks like someone else has also questioned whether it's AI generated content on the talk page. I'm looking for a second opinion, guidance on what to do, or assistance in cleaning it up. TipsyElephant (talk) 00:17, 16 March 2025 (UTC)

Some of them definitely sound like AI to me. In the first one alone: The narrative delves into, The prologue highlights the interconnectedness... Chaotic Enby (talk · contribs) 00:58, 16 March 2025 (UTC)

Likely AI contents scraping, but also likely public relations editing

This may be of interest to members here: Wikipedia:Conflict_of_interest/Noticeboard#User_Hifisamurai and https://commons.wikimedia.org/wiki/Special:Log/Hifisamurai Graywalls (talk) 09:24, 16 March 2025 (UTC)

Chatbot additions to VG (nerve agent)

This is being discussed by members of the chemistry project at WT:WikiProject Chemicals#Use of chatbot in VG (nerve_agent) but may be of wider interest. Please comment there, not here. Mike Turnbull (talk) 15:32, 16 March 2025 (UTC)

User rapidly creating long bios that GPTZero says are 100% probability AI-generated

Please see Special:Contributions/HRShami. I tested the first paragraph of Calin Belta § Career and the first paragraph of David L. Woodruff § Career and got a 100% AI-generated score from GPTZero in both cases, but the likelihood of AI generation is also suggested by the speed at which these articles are being generated. Sourcing quality is poor: many opinions about what the subjects have accomplished, mostly sourced to the publications of the subjects themselves; spot-checking the references in the Woodruff article found that they backed up maybe 1/3 of the claims in the text they purported to be references for. —David Eppstein (talk) 07:34, 27 February 2025 (UTC)

I have been writing articles pretty much the same way since pre-GPT era. It's a very standard Wikipedia way. The thought of checking my writing against GPTZero did not even occur to me because I absolutely despise AI generated writing. After your message I checked three articles on GPT Zero and it declared "moderately confident that writing is human" and "certainly human writing" on all three. In any writing, if you pick a very small part of it, no machine can tell correctly whether it is AI or human. You must check the whole writing. Even checking single paragraphs of my writing generated "human content" on GPT Zero for most of the paragraphs. If just one paragraph in an article with 8 or 9 paragraphs returns AI Generated, with the rest of the paragraphs returning "Human Content", I think we should accept the writing as human content. I don't know what you mean by speed. I have written a total of 10 articles in February and edited one article completely. If I use AI, I can easily generate 10 articles a day. I might have misplaced references in the Woodruff article, which is a human error. Sometimes, other editors point out that the reference is not correct for the preceding information and I fix it with the correct reference. I asked ChatGPT to generate the same Woodruff article. I suggest you do the same. Even after multiple prompts, the article generated by ChatGPT was nowhere near my writing. HRShami (talk) 10:05, 27 February 2025 (UTC)

Please don't accuse people of using AI based on GPTZero -- it is often wrong, to the point that its wrongness has made the news. Especially, as the person above says, if you only test certain paragraphs. It also might be better to ask first if someone is using AI before making a public accusation -- I don't imagine you'd like it either if someone called your articles AI-generated. Mrfoogles (talk) 06:07, 26 March 2025 (UTC)

Elkmont, Alabama

I'm not sure where the threshold is for the outright removal of AI generated text. At Elkmont, Alabama, an editor has stated--when asked if they are using AI--"I am using something to help me edit the text". I reverted their edit twice, because the tone was extremely formal and out of line with Wikipedia's voice. The input of others would be appreciated! Thanks. Magnolia677 (talk) 15:26, 23 March 2025 (UTC)

In this case, I would say that WP:NOTEVERYTHING and WP:INDISCRIMINATE apply, and that it is reasonable to revert the edits. I mean, these are all delightful:

Farmers were diligently planting corn, with hopes for a bountiful harvest if conditions remained favorable, while wheat and oat crops showed promise. The cotton market was active, and concerns arose over potential losses in the peach crop due to recent frosts
T. O. Bridgforth celebrated his 55th birthday with a large family reunion and dinner, which was described as one of the most sumptuous meals enjoyed since the end of a severe drought
The article closed with lighthearted local anecdotes, including a humorous mix-up involving a wheelbarrow and an umbrella

but not remotely encyclopedic. There are also some instances of external URLs in the content body, which violates WP:NOELBODY. You might politely point them in the direction of WP:LLM too, and if they must continue to use an LLM assistant, to add well-cited encyclopedic content in smaller chunks, so that each addition may be considered on its own merit, rather than as one huge swathe of text. Cheers, SunloungerFrog (talk) 16:08, 23 March 2025 (UTC)

Went in and deleted some text with fake citations -- if someone adds unsourced content, you have the right to challenge it, and if they can't source it (and it's not "the sky is blue") then it is reasonable to remove it. I've had that happen to me before (it was annoying but you know, lacking a source, I didn't try to put it back). And at the point where it has fake citations like ^[11], which could only have been added by an AI, it is definitely reasonable to delete it. Mrfoogles (talk) 06:15, 26 March 2025 (UTC)

If they continue to add the same unsourced content, that sounds like WP:Disruptive editing. See that page for guidance with how to deal with it. Mrfoogles (talk) 06:16, 26 March 2025 (UTC)

Free play

Do you think that Free play is AI-generated? See Talk:Free play for more context. GenericUser24 (talk) 01:46, 27 March 2025 (UTC)

It's possible, but it's also possibly a certain sociology/psychology style (that corpus might be where LLMs get some of their flair). Both possibilities are likely due to how the article seems to have been written as an essay, rather than built from sources. The resulting tonal issues have already been raised on the talkpage. CMD (talk) 06:03, 27 March 2025 (UTC)

Passive or active cleanup?

I'm interested and excited to help with this effort. I'm curious how folks here practice AI cleanup. Do you actively look for AI slop or are you passively aware of it while doing other tasks?

I spent some time this AM reviewing Special:RecentChanges expecting to find more instances of potentially AI generated content given the lengthy policy discussions on Village pump. I'm in tune with some of the quirks and language tendencies of popular chat models in other contexts so I guess I was surprised not to find anything obvious. I'm not an experienced editor by any means... Does anyone have any tips related to visual cues they look for in edit history summaries that merit a closer look? Zentavious (talk) 14:44, 20 March 2025 (UTC)

I would say I'm doing a mix of passive cleanup (cleaning it up while doing other tasks such as new page patrolling), semi-active cleanup (cleaning articles reported by other users as potentially AI-generated), and behind-the-scenes technical work. Regarding history and edit summary alone, there's often less to work with, but two clues are long, structured edit summaries (often generated by LLMs, although humans can also take care of writing good edit summaries!), and repeated long additions by the same user in a short time, especially on different articles. That last one is particularly telling: if the same editor makes 5000 bytes additions every five minutes, they likely haven't written everything by themselves. Chaotic Enby (talk · contribs) 17:37, 20 March 2025 (UTC)
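
The "repeated long additions in a short time" clue above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name rapid_large_additions, the input format (a list of timestamp and bytes-added pairs for one editor), and the min_run threshold of three are all hypothetical, while the 5000 bytes and five minutes come from the comment above.

```python
from datetime import datetime, timedelta


def rapid_large_additions(edits, min_bytes=5000,
                          max_gap=timedelta(minutes=5), min_run=3):
    """Return True if one editor made min_run large additions in quick succession.

    edits: list of (datetime, bytes_added) tuples for a single editor.
    Only additions of at least min_bytes are considered; smaller edits in
    between are ignored. A run continues while each large addition comes
    no more than max_gap after the previous large one.
    """
    large = sorted(stamp for stamp, size in edits if size >= min_bytes)
    run = 1 if large else 0
    for previous, current in zip(large, large[1:]):
        run = run + 1 if current - previous <= max_gap else 1
        if run >= min_run:
            return True
    return run >= min_run
```

A True result is a reason to take a closer look, not proof of LLM use, since an editor may also be pasting in drafts written offline.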

Thank you much for the tips. The structured summaries note is a great suggestion. Cheers, Zentavious (talk) 14:29, 25 March 2025 (UTC)

If you're trying to find suspicious articles more easily, Category:Articles_containing_suspected_AI-generated_texts is a good place to start. In a sense I guess it's a combination of active and passive -- passively, articles are tagged, and people who feel like being active try to fix them. I'm not surprised, given AI isn't that common, that you didn't find much at recent changes, though. Mrfoogles (talk) 06:11, 26 March 2025 (UTC)

Is the tag intended to only mark AI content that is not acceptable and/or constructive? Or is it intended to disclose the use of AI universally, including above the bar AI-assisted edits? Zentavious (talk) 13:49, 27 March 2025 (UTC)