ChatGPT could not master this test, but experts warn that it may soon outdo humans

ChatGPT has raised the bar for what we believe machines can do since it was unveiled in November last year. Since then, it has passed the US Medical Licensing Exam and a Wharton MBA Exam. So is there anything that it can’t do better than humans? Yes, accounting, as it turns out.

In a study published in Issues in Accounting Education, a journal of the American Accounting Association, researchers put ChatGPT to the test with accounting exam questions.

The study had 327 co-authors from 186 institutions in 14 countries, who contributed 25,181 classroom accounting exam questions. The researchers also recruited undergraduate students to feed another 2,268 textbook test bank questions to the chatbot.

“When this technology first came out, everyone was worried that students could now use it to cheat. But opportunities to cheat have always existed. So for us, we’re trying to focus on what we can do with this technology now that we couldn’t do before to improve the teaching process for faculty and the learning process for students. Testing it out was eye-opening,” said lead study author David Wood, a professor of accounting at Brigham Young University (BYU), in a press statement.

According to BYU, ChatGPT’s performance was impressive, but students performed better. Students scored an overall average of 76.7 per cent, compared to ChatGPT’s score of 47.4 per cent.

ChatGPT did score higher than the student average on 11.3 per cent of the questions, doing particularly well in accounting information systems and auditing. But the AI bot did worse on tax, financial and managerial assessments. This could be because it struggled with the mathematical processes required for those subjects.

By question type, ChatGPT did better on true-or-false and multiple-choice questions but struggled with short-answer questions. It also did worse on higher-order questions. Interestingly, it sometimes provided authoritative-sounding written answers that were incorrect.

Despite this, the study authors believe that GPT-4 will perform much better on accounting exams, resolving the issues that the earlier version of ChatGPT struggled with.


