AI Scanners – How Reliable Are They?

How do AI detectors work? And how reliable are their results? For this guest post, scientific editor Merle-Sophie Lösing has tested the reliability of AI scanners and shares the results.

Detecting AI-Generated Texts

The creation of texts with AI has been booming for some time now. It’s no wonder – almost everyone has realized how convenient it is to have a text written with the help of AI. Whether it’s a blog post, scientific paper, or business plan, the results often sound so good that we’re almost convinced: No one can tell that this text wasn’t written by a human. But is that really the case? Are there tricks to reliably detect AI usage? Could AI detectors be the solution? The following blog post aims to answer these questions.

What Are AI Scanners and What Do They Check?

AI scanners, also called AI detectors, are tools designed to determine whether AI was involved in creating a text. They work by pattern recognition: the programs have been fed numerous AI-generated texts and trained to recognize what those texts have in common. Typical markers of AI-generated content include certain text structures, specific words, word repetitions, sentence lengths, and recurring phrases and expressions. Anyone who works with AI will notice that it repeatedly uses formulations that humans wouldn’t typically choose, whereas human language is more varied. The coherence of the text is also examined: logical jumps, false statements, and frequent summaries are typical of AI, especially when the text was generated to a set word count. It’s important to note that scanners usually don’t answer the question of whether a text was AI-generated with a simple “yes” or “no”; instead, they provide a probability in the form of a score.

No scanner provider advertises 100% accuracy; the aim is to identify tendencies. The assessment rests on algorithms and statistical models developed from the training material. Interestingly, scanners often trigger when texts contain few or no linguistic errors, as if to say, “Mistakes are human.”
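
To make the idea of a probability score concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how any real detector is built: the feature names, the weights, and the bias are all invented for illustration, whereas a real tool would learn such values from its training material.

```python
import math

# Hypothetical, hand-picked weights (assumptions for illustration only).
# Positive weights push the score toward "AI", negative ones toward "human".
WEIGHTS = {
    "stock_phrase_rate": 2.0,  # stock phrases per 100 words
    "repetition_rate": 1.5,    # share of repeated words, 0..1
    "typo_rate": -4.0,         # share of misspelled words, 0..1
}
BIAS = -1.0  # hypothetical baseline when no feature is present


def ai_score(features):
    """Map feature values to a score between 0 and 100 (percent)."""
    z = BIAS + sum(
        weight * features.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    # The logistic function squashes any number into the range 0..1.
    return 100.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    polished = {"stock_phrase_rate": 1.5, "repetition_rate": 0.3, "typo_rate": 0.0}
    sloppy = {"stock_phrase_rate": 1.5, "repetition_rate": 0.3, "typo_rate": 0.2}
    print(f"polished text: {ai_score(polished):.0f}%")
    print(f"same text with typos: {ai_score(sloppy):.0f}%")
```

Because the typo feature carries a negative weight in this sketch, the same text scores lower once it contains errors, which mirrors the “Mistakes are human” behavior described above.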

How Reliable Are AI Detectors?

Perhaps you’ve had this experience: You’ve written a text yourself, and yet the AI detector flags it and even assigns a relatively high score. This can unfortunately happen, especially with factual texts with technical content. In such cases, there’s often not much room for variation in formulations and technical vocabulary. The programs then assess the text as AI-generated – because the AI would likely write it almost the same way. The following example illustrates this issue:

If you were to describe in three sentences how a light bulb works, the content might look something like this:

A light bulb works by electric current passing through a thin wire inside the bulb. The wire heats up due to resistance, causing it to glow and emit light. The glass casing of the bulb protects the glowing wire from oxygen, preventing it from burning out and allowing the generated light to be emitted outward.

When we enter this text into a common AI detector, it reacts strongly and is almost 100% sure that AI was involved:

Screenshot: the AI scanner reports a 100% AI score.
The AI detector ZeroGPT is convinced that the light bulb text was generated by AI.

However, there’s a simple way to lower the score: just introduce a few spelling errors into the text! For example, let’s examine this version:

A ligh buld work by electric current passin throgh a thin wire inside the buld. The wire heats up due to resistnce, causing it to glow and emit ligh. The glass casing of the buld protects the glowing wire from oxygen, preventing it from burning out and allowing the generated ligh to be emitted outward.

Here, several spelling errors have been introduced – and suddenly, the scanner is 100% sure that the text was written by a human:

Screenshot: the AI scanner reports a 0% AI score.
Minimal changes are enough for the AI detector to classify the text as “human written.”

This example shows that the reliability of AI detectors is questionable. How well a scanner assesses a text depends on the algorithms used and the quality of the training data. Various tests show that some scanners are more reliable than others, but in summary we can state: There is no 100% accuracy in AI detection. Moreover, human-written texts are more likely to be mistakenly identified as AI-generated than the other way around. In hybrid texts, where parts are written by AI and parts by humans, the detectors’ assessments are even less reliable.
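
The misspelling effect can be illustrated with a deliberately crude stand-in for one signal a detector might pick up: the share of words it does not recognize. The following Python sketch is an assumption about the kind of statistic involved, not a description of ZeroGPT’s internals. The function names are hypothetical, and the clean light bulb text doubles as the word list.

```python
import re

CLEAN = (
    "A light bulb works by electric current passing through a thin wire "
    "inside the bulb. The wire heats up due to resistance, causing it to "
    "glow and emit light. The glass casing of the bulb protects the glowing "
    "wire from oxygen, preventing it from burning out and allowing the "
    "generated light to be emitted outward."
)
MISSPELLED = (
    "A ligh buld work by electric current passin throgh a thin wire "
    "inside the buld. The wire heats up due to resistnce, causing it to "
    "glow and emit ligh. The glass casing of the buld protects the glowing "
    "wire from oxygen, preventing it from burning out and allowing the "
    "generated ligh to be emitted outward."
)


def tokenize(text):
    """Lowercase the text and return its words (letters only)."""
    return re.findall(r"[a-z]+", text.lower())


def unknown_share(text, known_words):
    """Return the fraction of words in `text` missing from `known_words`."""
    words = tokenize(text)
    if not words:
        return 0.0
    unknown = [word for word in words if word not in known_words]
    return len(unknown) / len(words)


# Assumption: the clean text serves as a tiny dictionary of known words.
KNOWN_WORDS = set(tokenize(CLEAN))

if __name__ == "__main__":
    print(f"clean:      {unknown_share(CLEAN, KNOWN_WORDS):.1%} unknown words")
    print(f"misspelled: {unknown_share(MISSPELLED, KNOWN_WORDS):.1%} unknown words")
```

Ten altered words out of 55 are enough to move this single statistic from zero to roughly 18%. A detector that weighs such a signal heavily can flip its verdict even though the content of the text has not changed at all.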

Currently, there is little indication that AI detectors will become significantly more reliable in the future. The fundamental problem is that AI is also getting better at imitating human language. Therefore, blind trust in AI scanners is inappropriate. Even in academic contexts, such scanners may be used to identify tendencies, but legally proving the use of AI with them is not possible at this time.

It is also frequently mentioned that OpenAI could potentially embed invisible watermarks in texts generated by ChatGPT, allowing their origin to be traced. Although there are technical possibilities for this, the method is still in the testing phase, and it’s questionable whether paying subscribers would support this change.

Are Paid AI Scanners Better Than Free Versions?

There are now various AI detectors on the market, including many free solutions. However, research shows that no AI detector can be expected to have 100% accuracy, and none consistently outperforms the competition. Furthermore, the detectors are constantly being trained. As with AI tools, this means that the performance of individual programs is subject to constant change.

Paid scanners, however, often offer additional features and at least claim higher accuracy, as they are generally based on more extensive datasets and use more advanced algorithms. They are also often more convenient: for example, they allow more words to be checked at once. Still, don’t rely too much on the scanners. It’s better to read the texts in question thoroughly yourself and check whether they sound like AI.

What Other Options Do I Have to Check a Text for AI?

In addition to conventional AI scanners, your own expertise can already provide a good assessment. To repeat: AI scanners work only with pattern recognition, and we can easily recognize some of these patterns ourselves. For example, you may have noticed while browsing recently that certain phrases appear again and again, even though you wouldn’t intuitively use them. This can indicate that the texts in question were AI-generated. In the English-speaking world, for example, the words “tapestry” and “reimagined” are increasingly appearing in texts. The correlation is striking: studies have shown that GPT uses the word “reimagined” about 1,000 times more often than humans do. Phrases like “It is important to note that…” and “In summary, …” are also increasingly common. If you notice such phrases, it could be a first clue that AI was used.
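
If you want to check a longer text for such phrases without reading every line, a few lines of Python are enough. This is a minimal sketch: the marker list contains only the examples named above, the function names are hypothetical, and a hit is a hint to look more closely, not evidence of AI use.

```python
import re

# Marker words and phrases taken from the examples in this post.
MARKERS = [
    "tapestry",
    "reimagined",
    "it is important to note that",
    "in summary",
]


def marker_counts(text):
    """Count how often each marker occurs, ignoring case and line breaks."""
    normalized = re.sub(r"\s+", " ", text.lower())
    counts = {}
    for marker in MARKERS:
        pattern = r"\b" + re.escape(marker) + r"\b"
        counts[marker] = len(re.findall(pattern, normalized))
    return counts


def markers_per_1000_words(text):
    """Return the total number of marker hits per 1,000 words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 1000 * sum(marker_counts(text).values()) / len(words)


if __name__ == "__main__":
    sample = (
        "In summary, the city is a rich tapestry of cultures. "
        "It is important to note that the museum was reimagined in 2020."
    )
    print(marker_counts(sample))
    print(round(markers_per_1000_words(sample), 1))
```

Normalizing by text length matters here: a single “In summary” in a 5,000-word report says far less than four stock phrases in two sentences.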

AI texts also often exhibit very high consistency – sentences are all similarly long, there are fewer complex sentences, and the choice of words varies little. Frequent word repetitions are another strong indication of AI usage. In contrast, you will rarely find grammatical and spelling errors or very original formulations in AI texts.
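
Two of these signals, similar sentence lengths and low word variety, are easy to measure yourself. The Python sketch below is an illustration with hypothetical function names: it reports the spread of sentence lengths (standard deviation) and the share of distinct words (type-token ratio). The sentence splitting is deliberately naive, and no threshold is implied, since what counts as “too uniform” depends on the genre.

```python
import re
import statistics


def sentence_lengths(text):
    """Split on ., ! and ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]


def uniformity_report(text):
    """Return simple statistics on sentence length and word variety."""
    lengths = sentence_lengths(text)
    words = re.findall(r"[a-z']+", text.lower())
    if not lengths or not words:
        return None  # nothing to measure
    return {
        "mean_sentence_length": statistics.mean(lengths),
        # A spread of 0 means every sentence has the same number of words.
        "sentence_length_spread": statistics.pstdev(lengths),
        # Distinct words divided by all words; lower means more repetition.
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    uniform = "The cat sat down. The dog sat down. The cat sat down."
    varied = (
        "Stop. The old lighthouse keeper counted every ship that "
        "passed the rocky coast at night."
    )
    print(uniformity_report(uniform))
    print(uniformity_report(varied))
```

A low spread combined with a low type-token ratio matches the description above, but short human-written texts can show the same profile, so the numbers are best read as one clue among several.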

The content can also give clues about whether AI was used. Currently, AI often struggles to coherently explain topics and maintain a good reading flow. Everything sounds rather generic, and questions that arise while reading often go unanswered because only the initial question was addressed. There are also often jumps and contradictions, which indicate the use of AI. Occasionally, existing texts have been copied. Therefore, it can also be helpful to enter suspicious passages into search engines or simply ask an AI tool the questions formulated in the headings.

Dealing with AI detectors and AI texts requires some sensitivity. Even if a text convinces you logically and linguistically and is formulated in a varied way, you can’t rule out that AI was used to generate it; if it was, the text likely underwent comprehensive revision.

If you feel unsure about checking for AI yourself, ACAD WRITE offers AI proofreading by experienced experts, where texts are optimized with human expertise.

This post is also available in German.
