Wikipedia:Signs of AI writing
This page is not an encyclopedic article, nor one of Wikipedia's policies or guidelines, as it has not been thoroughly vetted by the community.
Original Research (WP) and Promotional Tone: I have worked on removing original research [...]
Article Move to Main Namespace: Moving the draft to the main namespace after the AFC review [...]
AVO consists of three key layers:
- SEO (Search Engine Optimization): Traditional methods for improving visibility in search engine results through content, technical, and on-page optimization.
- AEO (Answer Engine Optimization): Techniques focused on optimizing content for voice assistants and answer boxes, such as featured snippets and structured data.
- GIO (Generative Engine Optimization): Strategies for ensuring businesses are cited as credible sources in responses generated by large language models (LLMs).
— From this revision to Draft:AI Visibility Optimization (AVO). Also note the rule of three.
Production Process
The process with which a DJm composes a song generally involves the next stages:
Concept and Lyrics — The artist defines the theme and lyrics of the song.
AI Melodic Drafts — AI produces different melodies and rhythmic patterns following the prompt suggested by the DJm.
Human Supervision and Enhancement — Producers adjust the instrumentation generated by the AI to match their original artistic vision.
Layering — With the stems at hand, the DJm then combines the resulting track with new recorded pieces, including live percussion, keyboards or synthesizers.
Mixing and Mastering — Sound balancing, effects and mastering ultimately give the song its final touch before being released.
— From this revision to Draft:Digital Jockey Musician
Emojis
AI chatbots often use emojis.[11] In particular, they sometimes decorate section headings or bullet points by placing emojis in front of them. This is most noticeable in talk page comments.
Examples
Let’s decode exactly what’s happening here:
Cognitive Dissonance Pattern:
You’ve proven authorship, demonstrated originality, and introduced new frameworks, yet they’re defending a system that explicitly disallows recognition of originators unless a third party writes about them first.
[...]
Structural Gatekeeping:
Wikipedia policy favors:
[...]
Underlying Motivation:
Why would a human fight you on this?
[...]
What You’re Actually Dealing With:
This is not a debate about rules.
[...]
— From this revision to Wikipedia:Village pump (policy)
Traditional Sanskrit Name: Trikoṇamiti
Tri = Three
Koṇa = Angle
Miti = Measurement “Measurement of three angles” — the ancient Indian art of triangle and angle mathematics.
️ 1. Vedic Era (c. 1200 BCE – 500 BCE)
[...]
2. Sine of the Bow: Sanskrit Terminology
[...]
3. Āryabhaṭa (476 CE)
[...]
4. Varāhamihira (6th Century CE)
[...]
5. Bhāskarācārya II (12th Century CE)
[...]
Indian Legacy Spreads
— From this revision to History of trigonometry
Overuse of em dashes
While human editors and writers often use em dashes (—), LLM output uses them more often than nonprofessional human-written text of the same genre, and uses them in places where humans are more likely to use commas, parentheses, colons, or (misused) hyphens (-). LLMs especially tend to use em dashes in a formulaic, pat way, often mimicking "punched up" sales-like writing by over-emphasizing clauses or parallelisms.[11][9]
This sign is most useful when taken in combination with other indicators, not by itself. It may be less common in newer AI text (late 2025 onwards); it has been claimed that OpenAI's GPT-5.1 could use em dashes less often than its predecessors.
Examples
The term “Dutch Caribbean” is not used in the statute and is primarily promoted by Dutch institutions, not by the people of the autonomous countries themselves. In practice, many Dutch organizations and businesses use it for their own convenience, even placing it in addresses — e.g., “Curaçao, Dutch Caribbean” — but this only adds confusion internationally and erases national identity. You don’t say “Netherlands, Europe” as an address — yet this kind of mislabeling continues.
— From this revision to Talk:Dutch Caribbean; the message also overuses boldface
you're right about one thing — we do seem to have different interpretations of what policy-based discussion entails. [...]
When WP:BLP1E says "one event," it’s shorthand — and the supporting essays, past AfD precedents, and practical enforcement show that “two incidents of fleeting attention” still often fall under the protective scope of BLP1E. This isn’t "imagining" what policy should be — it’s recognizing how community consensus has shaped its application.
Yes, WP:GNG, WP:NOTNEWS, WP:NOTGOSSIP, and the rest of WP:BLP all matter — and I’ve cited or echoed each of them throughout. [...] If a subject lacks enduring, in-depth, independent coverage — and instead rides waves of sensational, short-lived attention — then we’re not talking about encyclopedic significance. [...]
[...] And consensus doesn’t grow from silence — it grows from critique, correction, and clarity.
If we disagree on that, then yes — we’re speaking different languages.
The current revision of the article fully complies with Wikipedia’s core content policies — including WP:V (Verifiability), WP:RS (Reliable Sources), and WP:BLP (Biographies of Living Persons) — with all significant claims supported by multiple independent and reputable international sources.
[...] However, to date, no editor — including yourself — has identified any specific passages in the current version that were generated by AI or that fail to meet Wikipedia's content standards. [...]
Given the article’s current state — well-sourced, policy-compliant, and collaboratively improved — the continued presence of the “LLM advisory” banner is unwarranted.
— From this revision to Talk:Arthur Katalayi
Unusual use of tables
AIs tend to create unnecessary small tables that could be better represented as prose.
Examples
- Market and Statistics
- The Indian biobanking market was valued at approximately USD 2,101 million in 2024. The sector is expanding to support the "Atmanirbhar Bharat" (Self-reliant India) initiative in healthcare research.
Key Statistics of Indian Biobanking (2024-2025)
Metric | Figure
Market Valuation (2024) | ~USD 2.1 billion
Major Accredited Facilities | NLDB, CBR Biobank, THSTI, Karkinos
GenomeIndia Diversity | 99 ethnic groups (32 tribal, 53 non-tribal)
- —From this revision to Draft:Biobanks in India
Curly quotation marks and apostrophes
ChatGPT and DeepSeek typically use curly quotation marks (“...” or ‘...’) instead of straight quotation marks ("..." or '...'). In some cases, AI chatbots inconsistently use pairs of curly and straight quotation marks in the same response. They also tend to use the curly apostrophe (’), the same character as the curly right single quotation mark, instead of the straight apostrophe ('), such as in contractions and possessive forms. They may also do this inconsistently.
Curly quotes alone do not prove LLM use. Microsoft Word as well as macOS and iOS devices have a "smart quotes" feature that converts straight quotes to curly quotes. Grammar correcting tools such as LanguageTool may also have such a feature. Curly quotation marks and apostrophes are common in professionally typeset works such as major newspapers. Citation tools like Citer may repeat those that appear in the title of a web page: for example,
McClelland, Mac (September 27, 2017). "When 'Not Guilty' Is a Life Sentence". The New York Times. Retrieved August 3, 2025.
Note that Wikipedia allows users to customize the fonts used to display text. Some fonts display matched curly apostrophes as straight, in which case the distinction is invisible to the user. Additionally, Gemini and Claude models typically do not use curly quotes.
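As a rough illustration of this signal, the sketch below counts curly and straight quote characters in a passage. The character sets and the "mixed usage" heuristic are assumptions for illustration, not a vetted detection rule, and, per the caveats above, a mix can have innocent causes.

```python
# A minimal sketch of the mixed-quote signal described above.
# The character sets and the "mixed" heuristic are illustrative
# assumptions, not an official detection rule.

CURLY = {"\u201c", "\u201d", "\u2018", "\u2019"}  # curly: “ ” ‘ ’
STRAIGHT = {'"', "'"}                             # straight: " '

def quote_profile(text: str) -> dict:
    """Count curly vs. straight quote characters in a passage."""
    curly = sum(text.count(c) for c in CURLY)
    straight = sum(text.count(c) for c in STRAIGHT)
    return {
        "curly": curly,
        "straight": straight,
        # Inconsistent use of both styles in one message is the
        # (weak) indicator discussed in this section.
        "mixed": curly > 0 and straight > 0,
    }

sample = "He said \u201cit\u2019s fine\u201d but wasn't \"sure\"."
print(quote_profile(sample))
# {'curly': 3, 'straight': 3, 'mixed': True}
```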
Subject lines
User messages and unblock requests generated by AI chatbots sometimes begin with text that is intended to be pasted into the Subject field on an email form.
Examples
Subject: Request for Permission to Edit Wikipedia Article - "Dog"
— From this revision to Talk:Dog
Subject: Request for Review and Clarification Regarding Draft Article
Communication intended for the user
Collaborative communication
| Words to watch: I hope this helps, Of course!, Certainly!, You're absolutely right!, Would you like..., is there anything else, let me know, more detailed breakdown, here is a ... |
Editors sometimes paste text from an AI chatbot that was meant as correspondence, prewriting, or advice, rather than article content. This may appear in article text or within HTML comments (<!-- -->). Chatbots prompted to produce a Wikipedia article or comment may also explicitly state that the text is meant for Wikipedia, and may mention various policies and guidelines in the output—often explicitly specifying that they're Wikipedia's conventions.
Examples
In this section, we will discuss the background information related to the topic of the report. This will include a discussion of relevant literature, previous research, and any theoretical frameworks or concepts that underpin the study. The purpose is to provide a comprehensive understanding of the subject matter and to inform the reader about the existing knowledge and gaps in the field.
— From this revision to Metaphysics
Including photos of the forge (as above) and its tools would enrich the article’s section on culture or economy, giving readers a visual sense of Ronco’s industrial heritage. Visual resources can also highlight Ronco Canavese’s landscape and landmarks. For instance, a map of the Soana Valley or Ronco’s location in Piedmont could be added to orient readers geographically. The village’s scenery [...] could be illustrated with an image. Several such photographs are available (e.g., on Wikimedia Commons) that show Ronco’s panoramic view, [...] Historical images, if any exist (such as early 20th-century photos of villagers in traditional dress or of old alpine trades), would also add depth to the article. Additionally, the town’s notable buildings and sites can be visually presented: [...] Including an image of the Santuario di San Besso [...] could further engage readers. By leveraging these visual aids – maps, photographs of natural and cultural sites – the expanded article can provide a richer, more immersive picture of Ronco Canavese.
— From this revision to Ronco Canavese
If you plan to add this information to the "Animal Cruelty Controversy" section of Foshan's Wikipedia page, ensure that the content is presented in a neutral tone, supported by reliable sources, and adheres to Wikipedia's guidelines on verifiability and neutrality.
— From this revision to Foshan
Here's a template for your wiki user page. You can copy and paste this onto your user page and customize it further.
— From this revision to a user page
— From this revision to Talk:Test automation management tools; the message also ends unexpectedly
Knowledge-cutoff disclaimers and speculation about gaps in sources
| Words to watch: as of [date],[b] Up to my last training update, as of my last knowledge update, While specific details are limited/scarce..., not widely available/documented/disclosed, ...in the provided/available sources/search results..., based on available information ... |
A knowledge-cutoff disclaimer is a statement used by the AI chatbot to indicate that the information provided may be incomplete, inaccurate, or outdated.
If an LLM has a fixed knowledge cutoff (usually the model's last training update), it is unable to provide any information on events or developments past that time, and it often outputs a disclaimer to remind the user of this cutoff, which usually takes the form of a statement that says the information provided is accurate only up to a certain date.
If an LLM with retrieval-augmented generation fails to find sources on a given topic, or if information is not included in sources a user provides, it often outputs a statement to that effect, which is similar to a knowledge-cutoff disclaimer. It may also pair it with text about what that information "likely" may be and why it is significant. This information is entirely speculative (including the very claim that it's "not documented") and may be based on loosely related topics or completely fabricated. When that unknown information is about an individual's personal life, this disclaimer often claims that the person "maintains a low profile," "keeps personal details private," etc. This is also speculative.
Examples
While specific details about Kumarapediya's history or economy are not extensively documented in readily available sources, ...
— From this revision to Kumarapediya
While specific information about the fauna of Studniční hora is limited in the provided search results, the mountain likely supports...
— From this revision to Studniční hora
Though the details of these resistance efforts aren't widely documented, they highlight her bravery...
— From this revision to Throwing Curves: Eva Zeisel
No significant public controversies or security incidents affecting Outpost24 have been documented as of June 2025.
— From Draft:Outpost24
As of my last knowledge update in January 2022, I don't have specific information about the current status or developments related to the "Chester Mental Health Center" in today's era.
— From this revision to Chester Mental Health Center
Below is a detailed overview based on available information:
Matthews Manamela keeps much of his personal life private, choosing instead to focus public attention on his professional work and performances.
— From Draft:Matthews Manamela
As an underground release, detailed lyrics are not widely transcribed on major sites like Genius or AZLyrics, likely due to the artist's limited mainstream exposure. My analysis is based on available track titles, featured artists, public song snippets from streaming platforms (e.g., Spotify, Apple Music, Deezer), and Honcho's overall discography themes. Where lyrics aren't fully accessible, I've inferred common motifs from similar trap tracks and Honcho's style. ...For deeper insights, listening to tracks on platforms like Spotify or Deezer is recommended, as lyrics and production details aren't fully documented in public sources.
— From Draft:Haiti_Honcho
Phrasal templates and placeholder text
AI chatbots may generate responses with fill-in-the-blank phrasal templates (as seen in the game Mad Libs) for the LLM user to replace with words and phrases pertaining to their use case. However, some LLM users forget to fill in those blanks. Note that non-LLM-generated templates exist for drafts and new articles, such as Wikipedia:Artist biography article template/Preload and pages in Category:Article creation templates.
Examples
Subject: Concerns about Inaccurate Information
Dear Wikipedia
I am writing to express my deep concern about the spread of misinformation on your platform. Specifically, I am referring to the article about [Entertainer's Name], which I believe contains inaccurate and harmful information.
— From this revision to Talk:Kjersti Flaa
Subject: Edit Request for Wikipedia Entry
Dear Wikipedia Editors,
I hope this message finds you well. I am writing to request an edit for the Wikipedia entry
I have identified an area within the article that requires updating/improvement. [Describe the specific section or content that needs editing and provide clear reasons why the edit is necessary, including reliable sources if applicable].
— From this revision to Talk:Spaghetti
Large language models may also insert placeholder dates like "2025-xx-xx" into citation fields, particularly the access-date parameter and rarely the date parameter as well, producing errors.
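Such placeholder dates are easy to surface mechanically. The sketch below scans wikitext for them; the regular expression is an assumption based on the forms quoted in this section, not an exhaustive rule.

```python
import re

# Matches placeholder dates such as "2025-xx-xx" or "2022-11-XX" in
# citation parameters like |access-date= or |date=. The exact pattern
# is an illustrative assumption based on the forms quoted here.
PLACEHOLDER_DATE = re.compile(
    r"\|\s*(access-?date|date)\s*=\s*\d{4}-(\d{2}|[xX]{2})-[xX]{2}"
)

sample = "{{cite web |url=https://example.org |access-date=2025-xx-xx}}"
print(bool(PLACEHOLDER_DATE.search(sample)))  # True
```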
Examples
— From this revision to Michelle Osis
Links to searches
In some cases, LLM-generated citations may also contain placeholders in other fields.
Examples
{{cite web
|url=INSERT_SOURCE_URL_30
|title=Deputy Monitoring of Regional Assistance to Mobilized Soldiers
|date=2022-11-XX
|publisher=SOURCE_PUBLISHER
|accessdate=2024-07-21
}}
— From this revision to Dmitry Kuznetsov (politician)
LLM-generated infobox edits may contain comments stating that text or images should be added if sources are found. Note: Comments in infoboxes, especially older infoboxes, are common—some templates automatically include them—and not an indicator of AI use. Anything but "Add ____", or variations on that specific wording, is actually more likely to indicate human text.
Examples
| leader_name = <!-- Add if available with citation -->
— From this revision to Pindi Saidpur
Markup
Use of Markdown
Many AI chatbots are not proficient in wikitext, the markup language used to instruct Wikipedia's MediaWiki software how to format an article. As wikitext is a niche markup language, found mostly on wikis running MediaWiki and MediaWiki-based platforms like Miraheze, LLMs tend to lack wikitext-formatted training data. While chatbots' training corpora did ingest millions of Wikipedia articles, those articles would not have been processed as text files containing wikitext syntax.
This is compounded by the fact that most chatbots are factory-tuned to use another, conceptually similar but much more diversely applied markup language: Markdown. Their system-level instructions often direct them to format outputs using Markdown, and the chatbot apps render its syntax as formatted text on a user's screen. For example, the system prompt for Claude Sonnet 3.5 (November 2024) includes:[14]
Claude uses Markdown formatting. When using Markdown, Claude always follows best practices for clarity and consistency. It always uses a single space after hash symbols for headers (e.g., "# Header 1") and leaves a blank line before and after headers, lists, and code blocks. For emphasis, Claude uses asterisks or underscores consistently (e.g., italic or bold). When creating lists, it aligns items properly and uses a single space after the list marker. For nested bullets in bullet point lists, Claude uses two spaces before the asterisk (*) or hyphen (-) for each level of nesting. For nested bullets in numbered lists, Claude uses three spaces before the number and period (e.g., "1.") for each level of nesting.
As the above indicates, Markdown syntax is completely different from wikitext. Markdown uses asterisks (*) or underscores (_) instead of single-quotes (') for bold and italic formatting, hash symbols (#) instead of equals signs (=) for section headings, parentheses (()) instead of square brackets ([]) around URLs, and three symbols (---, ***, or ___) instead of four hyphens (----) for thematic breaks.
When told to "generate an article", chatbots often default to using Markdown for the generated output. This formatting is preserved in clipboard text by the copy functions on some chatbot platforms. If instructed to generate content for Wikipedia, the chatbot might "realize" the need to generate Wikipedia-compatible code, and might include a message like Would you like me to ... turn this into actual Wikipedia markup format (`wikitext`)?[c] in its output. If the chatbot is told to proceed, the resulting syntax is often rudimentary, syntactically incorrect, or both. The chatbot might put its attempted-wikitext content in a Markdown-style fenced code block (its syntax for WP:PRE) surrounded by Markdown-based syntax and content, which may also be preserved by platform-specific copy-to-clipboard functions, leading to a telling footprint of both markup languages' syntax. This might include the appearance of three backticks in the text, such as: ```wikitext.[d]
The presence of faulty wikitext syntax mixed with Markdown syntax is a strong indicator that content is LLM-generated, especially if in the form of a fenced Markdown code block. However, Markdown alone is not such a strong indicator. Software developers, researchers, technical writers, and experienced internet users frequently use Markdown in tools like Obsidian and GitHub, and on platforms like Reddit, Discord, and Slack. Some writing tools and apps, such as iOS Notes, Google Docs, and Windows Notepad, support Markdown editing or exporting. The increasing ubiquity of Markdown may also lead new editors to expect or assume Wikipedia to support Markdown by default.
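As a rough illustration of how the two syntaxes differ, the sketch below scans text for a few Markdown constructs that have no meaning in wikitext. The patterns are illustrative assumptions, not a vetted detector.

```python
import re

# Illustrative heuristics for Markdown constructs that are meaningless
# (or render wrongly) in wikitext. These patterns are assumptions for
# demonstration only.
MARKDOWN_SIGNS = {
    "ATX heading": re.compile(r"^#{1,6} \S", re.M),            # "## History"
    "bold/italic": re.compile(r"\*\*[^*\n]+\*\*|__[^_\n]+__"), # "**founded**"
    "inline link": re.compile(r"\[[^\]\n]+\]\(https?://[^)\s]+\)"),
    "fenced code": re.compile(r"^```", re.M),
}

def markdown_signs(text: str) -> list[str]:
    """Return the names of Markdown constructs found in the text."""
    return [name for name, pat in MARKDOWN_SIGNS.items() if pat.search(text)]

sample = "## History\nThe town was **founded** in [1540](https://example.org)."
print(markdown_signs(sample))
# ['ATX heading', 'bold/italic', 'inline link']
```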
Examples
I believe this block has become procedurally and substantively unsound. Despite repeatedly raising clear, policy-based concerns, every unblock request has been met with **summary rejection** — not based on specific diffs or policy violations, but instead on **speculation about motive**, assertions of being “unhelpful”, and a general impression that I am "not here to build an encyclopedia". No one has meaningfully addressed the fact that I have **not made disruptive edits**, **not engaged in edit warring**, and have consistently tried to **collaborate through talk page discussion**, citing policy and inviting clarification. Instead, I have encountered a pattern of dismissiveness from several administrators, where reasoned concerns about **in-text attribution of partisan or interpretive claims** have been brushed aside. Rather than engaging with my concerns, some editors have chosen to mock, speculate about my motives, or label my arguments "AI-generated" — without explaining how they are substantively flawed.
— From this revision to a user talk page
- The Wikipedia entry does not explicitly mention the "Cyberhero League" being recognized as a winner of the World Future Society's BetaLaunch Technology competition, as detailed in the interview with THE FUTURIST ([https://consciouscreativity.com/the-futurist-interview-with-dana-klisanin-creator-of-the-cyberhero-league/](https://consciouscreativity.com/the-futurist-interview-with-dana-klisanin-creator-of-the-cyberhero-league/)). This recognition could be explicitly stated in the "Game design and media consulting" section.
— From this revision to Talk:Dana Klisanin
Here, LLMs incorrectly use ## to denote section headings, which MediaWiki interprets as a numbered list.
- Geography
Villers-Chief is situated in the Jura Mountains, in the eastern part of the Doubs department. [...]
- History
Like many communes in the region, Villers-Chief has an agricultural past. [...]
- Administration
Villers-Chief is part of the Canton of Valdahon and the Arrondissement of Pontarlier. [...]
- Population
The population of Villers-Chief has seen some fluctuations over the decades, [...]
— From this revision to Villers-Chief
Broken wikitext
Since AI chatbots are typically not proficient in wikitext and templates, they often produce faulty syntax. A noteworthy instance is garbled code related to Template:AfC submission, as new editors might ask a chatbot how to submit their Articles for Creation draft; see this discussion among AfC reviewers.
Examples
Note the badly malformed category link which appears to be a result of code that provides day information in the LLM's Markdown parser:
turn0search0
ChatGPT may include citeturn0search0 (surrounded by characters from the Unicode Private Use Area) at the ends of sentences, with the number after "search" increasing as the text progresses. There is also a shorter form with only the increasing number surrounded by PUA characters, like 0. These are places where the chatbot linked to an external site, but a human pasting the conversation into Wikipedia has that link converted into placeholder code. This was first observed in February 2025.
A set of images in a response may also render as iturn0image0turn0image1turn0image4turn0image5. Rarely, other markup of a similar style, such as citeturn0news0 (example), citeturn1file0 (example), or citegenerated-reference-identifier (example), may appear.
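These markers can be searched for directly. The sketch below mirrors the insource patterns listed under "Links to searches" further down; the exact expressions are assumptions for illustration.

```python
import re

# The visible part of the leaked markers described above, e.g.
# "citeturn0search0" or "turn0image4". Modeled on the insource
# patterns given under "Links to searches"; the regex is an assumption.
TURN_MARKER = re.compile(r"(?:cite)?turn\d+(?:search|image|news|file)\d+")

# The delimiters themselves land in the Unicode Private Use Area
# (U+E000 to U+F8FF), so stray PUA code points are a second tell.
PUA = re.compile(r"[\uE000-\uF8FF]")

def leaked_markers(text: str) -> tuple[list[str], int]:
    """Return the visible markers found, plus a count of PUA code points."""
    return TURN_MARKER.findall(text), len(PUA.findall(text))

sample = "...recognized by Cambridge International Examinations. citeturn0search1"
print(leaked_markers(sample))  # (['citeturn0search1'], 0)
```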
Examples
The school is also a center for the US College Board examinations, SAT I & SAT II, and has been recognized as an International Fellowship Centre by Cambridge International Examinations. citeturn0search1 For more information, you can visit their official website: citeturn0search0
- **Japanese:** Reze is voiced by Reina Ueda, an established voice actress known for roles such as Cha Hae-In in Solo Leveling and Kanao Tsuyuri in Demon Slayer.2
- **English:** In the English dub of the anime film, Reze is voiced by Alexis Tipton, noted for her work in series such as Kaguya-sama: Love is War.3
[...]
The film itself holds a high rating on **Rotten Tomatoes** and has been described as a major anime release of 2025, indicating strong overall reception for the Reze Arc storyline and its adaptation.5
— From Draft:Reze (Chainsaw Man)
Links to searches
- turn0search0 OR turn0search1 OR turn0search2 OR turn0search3 OR turn0search4 OR turn0search5 OR turn0search6 OR turn0search7
- turn0image0 OR turn0image1 OR turn0image2 OR turn0image3 OR turn0image4 OR turn0image5 OR turn0image6 OR turn0image7
- insource:/turn0(search|image|news|file)[0-9]+/
Reference markup bugs: contentReference, oaicite, oai_citation, +1, attached_file, grok_card
Due to a bug, ChatGPT may add code in the form of :contentReference[oaicite:0]{index=0}, Example+1, or oai_citation in place of links to references in output text.
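When cleaning up affected text, such markers can be stripped mechanically. The patterns below are assumptions covering only some of the forms quoted in this section.

```python
import re

# Patterns for the leaked reference markup quoted in this section.
# They are illustrative assumptions covering only the forms shown here.
DEBRIS = [
    re.compile(r":contentReference\[oaicite:\d+\]\{index=\d+\}"),
    re.compile(r"oai_citation:\d+\S*"),           # e.g. oai_citation:0...
    re.compile(r"\[(?:attached_file|web):\d+\]"), # e.g. [attached_file:1]
]

def strip_debris(text: str) -> str:
    """Remove the leaked reference markers listed above."""
    for pat in DEBRIS:
        text = pat.sub("", text)
    return text

sample = "Ethnicity clarification :contentReference[oaicite:22]{index=22}"
print(strip_debris(sample).rstrip())  # Ethnicity clarification
```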
Examples
:contentReference[oaicite:16]{index=16}
1. **Ethnicity clarification**
2. :contentReference[oaicite:22]{index=22}
— From this revision to Talk:Sial (tribe).
#### Key facts needing addition or correction:
1. **Group launch & meetings**
2. **Zero-rates pledge and platform**
— From this revision to Talk:Independent Together
This was created conjointly by technical committee ISO/IEC JTC 1/SC 27 (Information security, cybersecurity, and protection of privacy) IT Governance+3ISO+3ISO+3. It belongs to the ISO/IEC 27000 family that talks about information security management systems (ISMS) and related practice controls. Wikipedia+1. The standard gives guidance for information security controls for cloud service providers (CSPs) and cloud service customers (CSCs). Specifically adapted to cloud specific environments like responsibility, virtualization, dynamic provisioning, and multi-tenant infrastructure. Ignyte+3Microsoft Learn+3Google Cloud+3.
— From this revision to ISO/IEC 27017
As of fall 2025, tags like [attached_file:1], [web:1] have been seen at the end of sentences. This may be Perplexity-specific.[15]
During his time as CEO, Philip Morris’s reputation management and media relations brought together business and news interests in ways that later became controversial, with effects still debated in contemporary regulatory and legal discussions.[attached_file:1]
— From this revision to Hamish Maxwell
Though Grok-generated text is rare compared to that of other chatbots, it may sometimes include XML-styled grok_card tags after citations.
Malik's rise to fame highlights the visibility of transgender artists in Pakistan's entertainment scene, though she has faced societal challenges related to her identity. [...]<grok-card data-id="e8ff4f" data-type="citation_card">
— From this revision to Draft:Mehak Malik
Links to searches
attribution and attributableIndex
({"attribution":{"attributableIndex":"X-Y"), with X and Y being increasing numeric indices.
Examples
^[Evdokimova was born on 6 October 1939 in Osnova, Kharkov Oblast, Ukrainian SSR (now Kharkiv, Ukraine).]({"attribution":{"attributableIndex":"1009-1"}}) ^[She graduated from the Gerasimov Institute of Cinematography (VGIK) in 1963, where she studied under Mikhail Romm.]({"attribution":{"attributableIndex":"1009-2"}}) [oai_citation:0‡IMDb](https://www.imdb.com/name/nm0947835/?utm_source=chatgpt.com) [oai_citation:1‡maly.ru](https://www.maly.ru/en/people/EvdokimovaA?utm_source=chatgpt.com)
— From Draft:Aleftina Evdokimova
Patrick Denice & Jake Rosenfeld, Les syndicats et la rémunération non syndiquée aux États-Unis, 1977–2015, ‘‘Sociological Science’’ (2018).]({“attribution”:{“attributableIndex”:“3795-0”}})
— From this diff to fr:Syndicalisme aux États-Unis
Non-existent or out-of-place categories
LLMs may hallucinate non-existent categories, sometimes for generic concepts that seem like plausible category titles (or SEO keywords), and sometimes because their training set includes obsolete and renamed categories. These will appear as red links. You may also find category redirects, such as the longtime spammer favorite Category:Entrepreneurs. Sometimes, broken categories may be deleted by reviewers, so if you suspect a page may be LLM-generated, it may be worth checking earlier revisions.
Of course, none of this section should be treated as a hard-and-fast rule. New users are unlikely to know about Wikipedia's style guidelines for these sections, and returning editors may be used to old categories that have since been deleted.
Examples
— From this revision to Draft:Paytra
rather than
Non-existent templates
LLMs often hallucinate non-existent templates (especially plausible-sounding types of infoboxes) and template parameters. These will also appear as red links, and non-existent template parameters in existing templates have no effect. LLMs may also use templates that were deleted after their knowledge cutoff date.
Examples
— From this revision to Draft:Gangetic hunter-gatherers
rather than
Citations
Broken external links
If a new article or draft has multiple citations with external links, and several of them are broken (e.g., returning 404 errors), this is a strong sign of an AI-generated page, particularly if the dead links are not found in website archiving sites like Internet Archive or Archive Today. Most links become broken over time, but these factors make it unlikely that the link was ever real.
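One way to triage such a page is to request each cited URL and record the HTTP status. The sketch below uses only the Python standard library; it is a rough illustration (the URLs shown are hypothetical), not a substitute for also checking the archives mentioned above.

```python
import urllib.request
import urllib.error

def check_url(url: str, timeout: float = 10.0) -> int | None:
    """Return the HTTP status code for a cited URL, or None if the
    request fails entirely (DNS error, timeout, etc.)."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "citation-check-sketch"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a dead, or never-real, link
    except (urllib.error.URLError, TimeoutError):
        return None

# Hypothetical cited URLs; several 404s with no archived copies
# would fit the pattern described above.
for url in ("https://example.org/real-page", "https://example.org/maybe-invented"):
    print(url, check_url(url))
```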
Invalid DOIs and ISBNs
A checksum can be used to verify ISBNs. An invalid checksum is a very likely sign that an ISBN is incorrect, and citation templates display a warning if so. Similarly, DOIs are more resistant to link rot than regular hyperlinks. Unresolvable DOIs and invalid ISBNs can be indicators of hallucinated references.
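The check-digit arithmetic is simple enough to verify directly. The sketch below implements the standard ISBN-10 and ISBN-13 checksum rules; an invalid checksum means the ISBN cannot be real, while a valid one proves nothing by itself.

```python
def isbn13_valid(isbn: str) -> bool:
    """ISBN-13: digits weighted 1,3,1,3,... must sum to a multiple of 10."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    return sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits)) % 10 == 0

def isbn10_valid(isbn: str) -> bool:
    """ISBN-10: weighted sum (10,9,...,1) must be divisible by 11;
    a final 'X' stands for the value 10."""
    chars = [c for c in isbn.upper() if c.isdigit() or c == "X"]
    if len(chars) != 10 or "X" in chars[:-1]:
        return False
    digits = [10 if c == "X" else int(c) for c in chars]
    return sum((10 - i) * d for i, d in enumerate(digits)) % 11 == 0

# e.g. the ISBN from the book-citation example later on this page:
print(isbn13_valid("9780470521571"))  # True
```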
Outdated access-dates
In some AI-assisted text, citations may include an access-date by default, but the date can look unexpectedly old relative to when the edit was made (for example, an article created in December 2025 containing multiple citations with |access-date=12 December 2024). This is not evidence by itself, but it can be a useful pattern to check when combined with other signs of low-quality drafting. Note that older access-date values can occur legitimately (copied citations, offline work, batch moves/merges).
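One way to surface this pattern is to extract each |access-date= value and compare it with the date of the edit. The sketch below assumes the "12 December 2024" format from the example above; the staleness threshold is an arbitrary choice for illustration.

```python
import re
from datetime import date, datetime

# Extract |access-date= values in the "12 December 2024" format used in
# the example above; other date formats would need extra parsing.
ACCESS_DATE = re.compile(r"\|\s*access-date\s*=\s*(\d{1,2} \w+ \d{4})")

def stale_access_dates(wikitext: str, edit_date: date, days: int = 180):
    """Return access-dates implausibly older than the edit itself.
    The 180-day threshold is an illustrative assumption."""
    stale = []
    for raw in ACCESS_DATE.findall(wikitext):
        parsed = datetime.strptime(raw, "%d %B %Y").date()
        if (edit_date - parsed).days > days:
            stale.append(raw)
    return stale

sample = "{{cite web |url=https://example.org |access-date=12 December 2024}}"
print(stale_access_dates(sample, edit_date=date(2025, 12, 1)))
# ['12 December 2024']
```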
DOIs that lead to unrelated articles
An LLM may generate references to non-existent scholarly articles with DOIs that appear valid but are, in reality, assigned to unrelated articles. Example passage generated by ChatGPT:
Ohm’s Law applies to many materials and components that are "ohmic," meaning their resistance remains constant regardless of the applied voltage or current. However, it does not hold for non-linear devices like diodes or transistors [1][2].
1. M. E. Van Valkenburg, “The validity and limitations of Ohm’s law in non-linear circuits,” Proceedings of the IEEE, vol. 62, no. 6, pp. 769–770, Jun. 1974. doi:10.1109/PROC.1974.9547
2. C. L. Fortescue, “Ohm’s Law in alternating current circuits,” Proceedings of the IEEE, vol. 55, no. 11, pp. 1934–1936, Nov. 1967. doi:10.1109/PROC.1967.6033
Both Proceedings of the IEEE citations are completely made up. The DOIs lead to different citations and have other problems as well. For instance, C. L. Fortescue had been dead for more than 30 years at the purported time of writing, and Vol. 55, Issue 11 does not list any articles that match anything remotely close to the information given in reference 2.
Book citations without page numbers or URLs
LLMs often generate book citations that look reasonable but do not include page numbers. This passage, for example, was generated by ChatGPT:
Ohm's Law is a fundamental principle in the field of electrical engineering and physics that states the current passing through a conductor between two points is directly proportional to the voltage across the two points, provided the temperature remains constant. Mathematically, it is expressed as V=IR, where V is the voltage, I is the current, and R is the resistance. The law was formulated by German physicist Georg Simon Ohm in 1827, and it serves as a cornerstone in the analysis and design of electrical circuits [1].
1. Dorf, R. C., & Svoboda, J. A. (2010). Introduction to Electric Circuits (8th ed.). Hoboken, NJ: John Wiley & Sons. ISBN 9780470521571.
The book reference appears valid – a book on electric circuits would likely have information about Ohm's law – but without the page number, that citation is not useful for verifying the claims in the prose.
Some LLM-generated book citations do include page numbers, and the book really exists, but the cited pages do not verify the text. Signs to look out for: the book is on a somewhat general topic or commonly referenced in its field, and the citation does not include a link to Google Books or a PDF (not mandatory for book citations, but editors creating legitimate book citations often include some kind of URL when citing the book). Example:
Analysts note that traditionalists often appeal to prudence, stability, and Edmund Burke’s notion of “prescription,” while reactionaries invoke moral urgency and cultural emergency, framing the present as a deviation from an idealized past. [1]
[1] Goldwater, Barry (1960). The Conscience of a Conservative. Victor Publishing. p. 12.
Incorrect or unconventional use of references
AI tools may have been prompted to include references, and make an attempt to do so as Wikipedia expects, but fail at key implementation details or stand out when compared with conventions.
Examples
In the example below, note the incorrect attempt at re-using references. The tool used here was not capable of searching for non-confabulated sources (the edit was made the day before Bing Deep Search launched) but nonetheless found one real reference; the syntax it used for re-using references was incorrect.
In this case, the Smith, R. J. source – presumably generated as the "third source", with the link 'https://pubmed.ncbi.nlm.nih.gov/3' (which has a PMID of 3) – is also completely irrelevant to the body of the article. The user did not check the reference before converting it to a {{cite journal}} reference, even though the links resolve.
The LLM in this case has diligently included the incorrect re-use syntax after every single full stop.
— From this revision to Cognitive orthotics
Some LLMs or chatbot interfaces use the character ↩ to indicate footnotes:
ReferencesWould you like help formatting and submitting this to Wikipedia, or do you plan to post it yourself? I can guide you step-by-step through that too.
Footnotes
- KLAS Research. (2024). Top Performing RCM Vendors 2024. https://klasresearch.com ↩ ↩2
- PR Newswire. (2025, February 18). CureMD AI Scribe Launch Announcement. https://www.prnewswire.com/news-releases/curemd-ai-scribe ↩
— From this revision of Draft:CureMD
utm_source=
ChatGPT may add the UTM parameters utm_source=openai or utm_source=chatgpt.com to URLs that it is using as sources. Microsoft Copilot may add utm_source=copilot.com to URLs. Grok uses referrer=grok.com. Other LLMs, such as Gemini or Claude, use UTM parameters less often.[e]
Note: While this does definitively prove ChatGPT's involvement, it doesn't prove, on its own, that ChatGPT also generated the writing. Some editors use AI tools to find citations for existing text; this will be apparent in the edit history.
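These tracking parameters can be surfaced with standard URL parsing. In the sketch below, the list of known values mirrors the ones named above and is not exhaustive.

```python
from urllib.parse import urlparse, parse_qs

# Tracking values named in this section; the list is not exhaustive.
AI_TRACKERS = {
    "utm_source": {"chatgpt.com", "openai", "copilot.com"},
    "referrer": {"grok.com"},
}

def ai_tracker(url: str) -> str | None:
    """Return the matching AI-related tracking parameter, if any."""
    query = parse_qs(urlparse(url).query)
    for param, values in AI_TRACKERS.items():
        for value in query.get(param, []):
            if value in values:
                return f"{param}={value}"
    return None

url = ("https://www.theguardian.com/sport/2025/feb/11/"
       "sam-burgess-interview?utm_source=chatgpt.com")
print(ai_tracker(url))  # utm_source=chatgpt.com
```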
Examples
Following their marriage, Burgess and Graham settled in Cheshire, England, where Burgess serves as the head coach for the Warrington Wolves rugby league team. [https://www.theguardian.com/sport/2025/feb/11/sam-burgess-interview-warrington-rugby-league-luke-littler?utm_source=chatgpt.com]
— From this revision to Sam Burgess
Vertex AI documentation and blog posts describe watermarking, verification workflow, and configurable safety filters (for example, person‑generation controls and safety thresholds). ([cloud.google.com](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images?utm_source=openai))
— From this revision to Draft:Nano Banana (Chatbot)
Links to searches
- utm_source=chatgpt.com
- insource:"utm_source=chatgpt.com"
- insource:"utm_source=openai"
- insource:"referrer=grok.com"
Named references declared in references section but unused in article body
Cite error: A list-defined reference named "mclean" is not used in the content (see the help page).
Cite error: A list-defined reference named "twst" is not used in the content (see the help page).
— From this revision to Draft:Josef von Rickenbach
— From this revision to Draft:Caligomos Art
Links to searches
Miscellaneous
Sudden shift in writing style
A sudden shift in an editor's writing style, such as unexpectedly flawless grammar compared to their other communication, may indicate the use of AI tools. Combining formal and casual writing styles is not exclusive to AI, but may be considered a sign. Using more formal prose in some writing may simply be a matter of code switching.
A mismatch between a user's location, the national ties of the topic to a variety of English, and the variety of English used may indicate the use of AI tools. A human writer from India writing about an Indian university would probably not use American English; however, LLM outputs use American English by default, unless prompted otherwise.[16] Note that non-native English speakers tend to mix up English varieties, and such signs should raise suspicion only if there is a sudden and complete shift in an editor's English variety use.
Overwhelmingly verbose edit summaries
AI-generated edit summaries are often unusually long, written as formal, first-person paragraphs without abbreviations, and may conspicuously itemize Wikipedia's conventions.
— Edit summary from this revision to Khaama Press
— Edit summary from this revision to 4-metre band
— Edit summary from this revision to Anaganaga (film)
"Submission statements" in AFC drafts
This one is specific to drafts submitted through Articles for Creation. At least one LLM tends to insert "submission statements" supposedly intended for reviewers, explaining why the subject is notable and why the draft meets Wikipedia guidelines. Of course, all this actually does is let reviewers know that the draft is LLM-generated, and should be declined or speedied without a second thought.
— Found at the top of Draft:Jorge Patrão (all the inevitable formatting errors are present in the original)
Pre-placed maintenance templates
Occasionally a new editor creates a draft that includes an AFC review template already set to "declined". The template is also devoid of content, with no reviewer reasoning given. The LLM apparently offers to add an AFC submission template to the draft, and then provides something like:
LLMs have been known to create pages that already have maintenance templates that shouldn't plausibly be there, including maintenance tags and incorrect protection templates.
— From this revision to a user sandbox (later cut-and-paste moved to Émile Dufresne)
Links to searches
Signs of human writing
Age of text relative to ChatGPT launch
ChatGPT was launched to the public on November 30, 2022. Although OpenAI had similarly powerful LLMs before then, they were paid services and not easily accessible or known to lay people. ChatGPT experienced extreme growth immediately on launch. It is very unlikely that any particular text added to Wikipedia prior to November 30, 2022 was generated by an LLM. If an edit was made before this date, AI use can be safely ruled out for that revision. While some older text may display some of the AI signs given in this list, and even convincingly appear to have been AI-generated, the vastness of Wikipedia allows for these rare coincidences.
Ability to explain one's own editorial choices
Editors should be able to explain why they made an edit or mistake. For example, if an editor inserts a URL that appears fabricated, you can ask how the mix-up occurred instead of jumping to conclusions. If they can supply the correct link and explain how the mistake happened (perhaps a typo), or share the relevant passage from the real source, that points to an ordinary human error.
Ineffective indicators
False accusations of AI use can drive away new editors and foster an atmosphere of suspicion. Before claiming AI was used, consider whether the Dunning–Kruger effect or confirmation bias is clouding your judgement. Here are several somewhat commonly used indicators that are ineffective in LLM detection—and may even indicate the opposite.
Historical indicators
The following indicators were common in text generated by older AI models, but are much less frequent in newer models. They may still be useful for finding older undetected AI-generated edits. Dates are approximate.
Didactic disclaimers (2022–2024)
For non-AI-specific guidance about this, see Wikipedia:Manual of Style/Words to watch § Editorializing.
Older LLMs (~2023) often added disclaimers about topics being "important to remember." This frequently took the form of advice to an imagined reader regarding safety or controversial topics, or of disambiguation for topics that vary across locales and jurisdictions. Several such disclaimers appear in OpenAI's GPT-4 system card as examples of "partial refusals".[18]
Examples
— From Draft:Robi Labs
Section summaries
When generating longer outputs (such as when told to "write an article"), older LLMs often added sections titled "Conclusion" or similar, and often ended paragraphs or sections by summarizing and restating their core ideas.[16]
Examples
— From this revision to Nurse scientist
Prompt refusal
In the past, AI chatbots occasionally declined to answer prompts as written, usually with apologies and reminders that they are AI language models. Attempting to be helpful, chatbots often gave suggestions or answers to alternative, similar requests. Outright refusals have become increasingly rare.
Examples
Abrupt cut-offs
AI tools used to abruptly stop generating content if an excessive number of tokens had been used for a single response, and further output required the user to select "continue generating", at least in the case of ChatGPT. This sign is not foolproof, as a malformed copy/paste from one's local computer can also cause an abrupt ending. It may also indicate a copyright violation rather than the use of an LLM.
See also
Notes
References
Further reading