Researchers tested whether scientists could distinguish summaries of scientific papers generated by GPT from those written by humans. The AI proved surprisingly convincing.
No wonder the average person can easily fall for a text written by artificial intelligence. In the flood of content we read every day, we rarely focus on the details and distinctive patterns that give away an AI-generated publication. However, GPT turns out to be effective enough to fool even medical scientists. And that’s kind of scary.
GPT is getting harder and harder to surprise
We already know that teachers are losing the battle against ChatGPT, which students eagerly use as a text generator when writing papers. The robot gets “smarter” with each question and copes better and better even with tricky problems.
Riding the wave of the tool’s popularity, researchers decided to test it under controlled conditions to see whether professional scientists could distinguish human-written research abstracts from those a robot generates in a matter of seconds. The results of the experiment are surprising.
AI Beats Reviewers of Medical Publications
Researchers at Northwestern University used GPT to generate 50 abstracts styled after professional publications from medical journals. They then invited four reviewers to pick out the human-written texts and reject those created by the bot. The participants were divided into two groups of two, each receiving 25 texts, and were randomly assigned a mix of fake and real abstracts to check.
The reviewers correctly identified only 68% of the AI-generated texts and 86% of the real abstracts. In other words, 32% of the texts written by GPT were taken for human work, and 14% of the genuine abstracts were taken for AI-generated text. It is worth noting that the reviewers were not a random group of people but specialists in their fields who review publications for scientific journals.
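For readers who want to check the arithmetic, here is a minimal sketch of how the error rates follow from the two reported figures. The function and variable names are made up for illustration, and the averaged figure at the end is a simple mean of the two reported rates, not a number taken from the study.

```python
def misjudged_percent(correct_percent: int) -> int:
    """Share of texts judged wrongly, given the share judged correctly."""
    if not 0 <= correct_percent <= 100:
        raise ValueError("a percentage must lie between 0 and 100")
    return 100 - correct_percent


# The two figures reported in the article.
detected_fake = 68    # % of AI-generated abstracts correctly flagged as fake
recognized_real = 86  # % of genuine abstracts correctly recognized as real

missed_fake = misjudged_percent(detected_fake)     # AI texts taken for human work
flagged_real = misjudged_percent(recognized_real)  # real texts taken for AI output

# Illustrative only: an unweighted mean of the two rates (an assumption,
# since the article does not report an overall accuracy).
mean_correct = (detected_fake + recognized_real) / 2

print(f"AI texts taken for human work: {missed_fake}%")
print(f"Real texts taken for AI output: {flagged_real}%")
print(f"Unweighted mean of correct verdicts: {mean_correct}%")
```

The two error rates are just the complements of the reported success rates, which is why they come to 32% and 14%.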
The participants reported that it was very difficult to tell the abstracts apart, mainly because of the specialized vocabulary and precise figures, which they did not expect from the AI. And “expect” is the key word here: the scientists were informed of the rules in advance, which made them more suspicious. Had they not been, they would probably have reached even more wrong verdicts.
The study raised many concerns among both participants and organizers. If specialists have so much trouble distinguishing AI texts from human ones, what chance does an ordinary reader unfamiliar with the field’s terminology have? Researchers fear that the lack of regulation of text generators may contribute to the mass distribution of fake news and fabricated medical documents.
However, the researchers also see certain advantages in the use of artificial intelligence. They believe it could, somewhat paradoxically, be used to verify scientific publications written by robots or simply to detect plagiarism, a problem universities face. Some also argue that artificial intelligence could write summaries of human-written articles, summaries with no scientific value of their own, which would make researchers’ work easier, but this idea is not yet fully accepted by the academic community.