ChatGPT Fails the Turing Test: Implications for Communicating Ideas Using Natural Language Processing

Mrs Emma Felman1

1Monash University, Melbourne, Australia

Biography:

Emma Felman is a Teaching Fellow in the Monash University Faculty of Law and School of Medicine. She is currently a Member of the Victorian Voluntary Assisted Dying Review Board, the Australian Medical Council Cosmetic Surgery Accreditation Advisory Committee, the Consumer Affairs Victoria Motor Car Traders Claims Committee, and HRECs at The University of Melbourne and Bendigo Health. Emma’s PhD thesis and research interests focus on the legal, ethical and philosophical impact of digital technologies on patient care in medicine. Previously, she worked as a litigation lawyer managing complex health law matters in the County and Supreme Courts of Victoria.

Abstract:

Aims:

The aim of this study was to examine whether ChatGPT passes the Turing test: that is, whether the automated texts it produces are indistinguishable, on the basis of form or content, from those of genuine human thinkers.

Methods:

To test our hypothesis that ChatGPT does not pass the Turing test, a series of questions was posed to eight pre-eminent thinkers from different disciplines around the world, and their answers were formally compared with the answers ChatGPT provided to the same questions. The formal comparison involved developing a framework of analysis to assess each response. The framework was derived and adapted from published instruments within various disciplines, followed by expert review and validation. Six variables were proposed to identify potential differences between the ChatGPT and human responses: 1) “Logicality”; 2) “Clarity”; 3) “Originality”; 4) “Personality”; 5) “Emotionality”; and 6) “Creativity”. These categories comprised a total of 23 questions.
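The abstract reports the framework-based comparison but does not specify the statistical procedure used. As an illustration only (not the authors' actual analysis), blinded assessor ratings on one framework variable could be compared between the human and ChatGPT response groups with a simple two-sample permutation test; the ratings below are hypothetical.

```python
import random
from statistics import mean

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference of mean ratings.

    Repeatedly shuffles the pooled ratings and counts how often a random
    split produces a mean difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            count += 1
    return count / n_perm

# Hypothetical "Originality" ratings (1-5 scale) from five blinded
# assessors; these numbers are illustrative, not study data.
human_scores = [5, 4, 5, 4, 4]
chatgpt_scores = [2, 3, 2, 3, 2]

p = permutation_test(human_scores, chatgpt_scores)
print(f"p-value: {p:.4f}")
```

A permutation test is a natural fit here because it makes no distributional assumptions about small samples of ordinal ratings.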

Results:

Five assessors reviewed the blinded ChatGPT and human responses and applied the 23-question framework to each response. The results show that, to a high level of statistical significance under the formal terms of the analysis, ChatGPT is unable to match the human responses in the domains relating to originality, creativity and affective expression.

Conclusions:

The results of this study demonstrate that, contrary to claims to the contrary, ChatGPT does not pass the Turing test; that is, the form and content of its automated output are clearly distinguishable from human thought. The implications of these results will be further explored in this presentation.
