Exam submissions generated by artificial intelligence (AI) can not only evade detection but also earn higher grades than those submitted by university students, a real-world test has shown.
The findings come as concerns mount about students submitting AI-generated work as their own, with questions being raised about the academic integrity of universities and other higher education institutions.
The study also shows that even experienced markers could struggle to spot answers generated by AI, the University of Reading academics said.
Peter Scarfe, an associate professor at Reading’s School of Psychology and Clinical Language Sciences, said the findings should serve as a “wake-up call” for educational institutions as AI tools such as ChatGPT become more advanced and widespread.
He said: “The data in our study shows it is very difficult to detect AI-generated answers.
“There has been quite a lot of talk about the use of so-called AI detectors, which are themselves another form of AI, but (the scope here) is limited.”
For the study, published in the journal PLOS ONE, Prof Scarfe and his team generated answers to exam questions using GPT-4 and submitted these on behalf of 33 fake students.
Exam markers at Reading’s School of Psychology and Clinical Language Sciences were unaware of the study.
Answers submitted across several undergraduate psychology modules went undetected in 94% of cases and, on average, received higher grades than real student submissions, Prof Scarfe said.
He said AI did particularly well in the first and second years of study but struggled more in the final-year module.
Last year Russell Group universities, which include Oxford, Cambridge, Imperial College London and other top universities, pledged to allow ethical use of AI in teaching and assessments, with many others following suit.
But Prof Scarfe said the education sector will need to constantly adapt and update its guidance as generative AI continues to evolve and become more sophisticated.
He said universities should focus on working out how to embrace the “new normal” of AI in order to enhance education.
Prof Scarfe added that reverting to in-person, sit-down exam assessments would “be a step backwards in many ways”.
He said: “Many institutions have moved away from traditional exams to make assessment more inclusive.
“Our research shows it is of international importance to understand how AI will affect the integrity of educational assessments.
“We won’t necessarily go back fully to hand-written exams, but the global education sector will need to evolve in the face of AI.”
Study co-author Professor Etienne Roesch, of Reading’s School of Psychology and Clinical Language Sciences, added: “As a sector, we need to agree how we expect students to use and acknowledge the role of AI in their work.
“The same is true of the wider use of AI in other areas of life to prevent a crisis of trust across society.
“Our study highlights the responsibility we have as producers and consumers of information.
“We need to double down on our commitment to academic and research integrity.”