Speaker 1: These researchers created this paper using only AI and this data set. They say that this is autonomous research, data to paper, but there's so much more to this story, and this is it.

So this story starts with this tweet from Roy Kishony, and it says: introducing data-to-paper, autonomous AI research. Now, this sounds so great that it caught my interest, but this is the thing I want you to remember. It says: we've let it play with a large data set, went to lunch, came back, and it had already chosen several research topics, written data analysis code, interpreted the results, and written five transparent, reproducible papers, and this is an image of the front. Now, to me, that just sounds incredible. But these are papers; are they peer-reviewable? Will they pass peer review? Is it completely autonomous? When you delve into the details, it gets a little bit more murky. I'm not saying this isn't a great little exploration of AI. What I'm saying is, maybe they're lying a little bit.

Okay, so this team, which is a supervisor and their student as far as I can tell, used this data set. They looked at 253,000 survey respondents in the diabetes health indicators data, with all of the variables that it contains. Clearly, AI can really help clean up all of this data and look for the stuff that maybe researchers have missed before. That, to me, is very exciting.

So what I can work out, from what has been released online at the moment, is that they used a chaining process of prompts in good old-fashioned ChatGPT. Nothing crazy or fancy, just Python code, ChatGPT, and a methodical approach to prompting. That means getting the results, asking it to put them in a table, asking it to explain the data in that table, then writing the introduction, doing a small literature review, going to get references, all of these little steps one after another until you build up a paper. I'll sketch roughly what that kind of chaining loop might look like in a moment.

Now, don't get me wrong, they have managed to create a paper, but there are so many issues with it that we're going to delve into them right now, and the details don't match up with that initial tweet. So let's take a little look at the paper that AI apparently autonomously produced. We've got a nice title and, overall, not a bad layout. On the face of it, it looks like a credible paper that you could submit. But when you start reading it, you start realizing that this is AI. Surely it should be able to do things far above what a human researcher could do, like finding the small, surprising relationships that are novel and new, and I don't think it's done any of that. Here we can see that it says: this study addresses a gap in the literature by providing evidence on the protective effects of fruit and vegetable consumption and physical activity in relation to diabetes risk. It's got an introduction, results, a nice table, an explanation of the results, another table, and a discussion with references. It's also got descriptive statistics of the data set it used, and it's got methods, and that's it.
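Now, just to make that chaining idea concrete before we go on: this is not the authors' actual pipeline, just my minimal sketch of what "chain prompts, run the generated code, feed the errors back" might look like. It assumes the OpenAI Python client, a generic chat model, and a hypothetical diabetes_data.csv file, so treat every name here as illustrative.

# Minimal sketch only -- NOT the data-to-paper authors' real code.
# Assumes the OpenAI Python client (pip install openai) with an API key
# in the environment. Illustrates two ideas: (1) chaining prompts so each
# step builds on the previous step's output, and (2) pasting Python
# tracebacks back into the chat until the generated script actually runs.
import subprocess
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(history, prompt):
    """One chat turn; the shared history is what 'chains' the steps."""
    history.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(model="gpt-4", messages=history)
    text = reply.choices[0].message.content
    history.append({"role": "assistant", "content": text})
    return text

def run_until_it_works(history, task, max_tries=5):
    """Ask for analysis code, execute it, and feed errors back on failure.
    WARNING: running model-generated code blindly is risky; sandbox it."""
    code = ask(history, f"Write a Python script to {task}. Return only code.")
    for _ in range(max_tries):
        result = subprocess.run(["python", "-c", code],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return code, result.stdout
        code = ask(history, "That script failed with this error; return the "
                            f"full corrected script only:\n{result.stderr}")
    raise RuntimeError("generated code never ran cleanly")

history = [{"role": "system", "content": "You are a data-analysis assistant."}]
code, output = run_until_it_works(
    history, "load diabetes_data.csv and test fruit/veg intake vs diabetes")
table = ask(history, f"Put these results in a table:\n{output}")
explanation = ask(history, "Explain the data in that table.")
methods = ask(history, "Draft a methods section describing the analysis.")
# ...and so on: introduction, mini literature review, discussion, references.

The whole trick is that one conversation history carries through every step, so the table prompt sees the results, the explanation prompt sees the table, and so on until a paper has been built up.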
Coming back to the paper: it's got 10 references, which I think for this sort of paper is a little bit too low. It's also got all of this supplementary information: a data description, data exploration, the code that it used, a code description, and code output. So it certainly has produced a lot of material, and it's presenting it in quite a logical fashion.

Now let's have a look at what a typical paper in this field looks like. If you go over here, you can see that this one is from 2019, and a paper of this sort tends to have much more data and much more in-depth analysis. It also has 22 references, twice as many, and they tend to be more up to date.

Now, that tweet makes you think they went to lunch, came back, it had chosen several research topics, blah, blah, blah, and it was just done. I've got no reason to think they were lying, but look at what they said in the Nature news article. By the way, little side tip: I couldn't access this without paying money, but if you go to archive.is and put in the URL, you end up getting access to all of it. Cheeky side tip for you, bonus, that one's free.

So I think the first issue is the code. Here they said that on its first attempt, the chatbot generated code that was riddled with errors, and what they did was feed the errors back into ChatGPT until it worked, exactly the kind of repair loop I sketched above. They composed the paper from the outputs of many prompts, every step building on the products of the previous step. So okay, they composed the paper from many, many prompts. I'm under the impression that it was a largely autonomous Python pipeline, but it still looks like there's a lot of human oversight before this final paper is produced. And one thing the researchers admitted is that you did have to keep a really close eye on its output, because otherwise you could end up creating complete lies. More on that in a moment.

All right, so did this AI create something that could be submitted for peer review? I mean, yes, you could submit it, and on the face of it, if you look at the abstract, you can see that it says this study addresses a gap in the literature. But is there a gap? Does it provide any new, novel, interesting addition to the literature that's already out there? If we head over to elicit.org, you can see loads of papers with a very, very similar title and output. This isn't my research field, but I can assure you that unless this paper has something completely new, different, or novel about it, it will be rejected at peer review in a heartbeat. And that is what another researcher says: it is not something that's going to surprise any medical experts; it's not close to being novel.

So here we've given ChatGPT a load of data, and really, we're looking to use AI to find the little associations that humans miss because we're not as creative with the data. But here, AI has gone for the easy option, and it's just presenting data that everyone already knows about. So it's not a win in this case for AI. That doesn't mean AI is useless for research; it just means that in this case, it hasn't been able to find anything new that wasn't already known. I think with more prompting it could get a little bit better, but overall, a big thumbs down for AI here. And if you think that's bad, wait until the end of this video, because it gets a lot worse. Another thing that AI is really bad at is creating citations.
So if we look at the paper again, we can see that it's got only 10 citations. I put those 10 citations into Trinka, and this is Trinka's citation checker report, and you can see that overall it's done an okay job. First of all, there aren't many references; secondly, some are really old; some are poorly cited; there's journal bias; and about six out of the 10 are good to use. So overall, it's not done a fantastic job, and the references aren't up to date. The latest one, I think, is 2018, maybe 2019. These are old papers, and that's something ChatGPT is really bad at, because its training data has a cutoff of September 2021. ChatGPT also has a tendency to hallucinate papers: it's more of a plausibility machine than an actual fact-generating one. So what you end up with is a really convincing-sounding set of references that, unless you check every single one to make sure it exists, could get through peer review.

Now, that's not to say AI is rubbish at everything. I think there are some great glimpses of the AI future for science and research. Here we've got ChemCrow, augmenting large language models with chemistry tools, and if you look at the GitHub, it says it's an open source package for the accurate solution of reasoning-intensive chemical tasks. This has actually already produced some novel synthetic pathways to industrially interesting chemicals. But overall, for large language models producing papers and analyzing data, I think it's just not creative enough.

The last issue that I can see is p-hacking. With something like ChatGPT doing research analysis, the problem is that p-hacking will increase. P-hacking is where you take a large data set, run a number of hypotheses against it, and only report the ones that come out significant on that data. I'll show a tiny sketch of just how easy this is at the end of this section. So that is another issue with this approach: editors and peer reviewers are going to have to be extra cautious and on the lookout when AI tools meet large data sets, because it will become easier to pull off sneaky little tactics like p-hacking. If you're peer-reviewing papers, just make sure that, at the moment, you're extra diligent if there's a large data set and they're running an AI model against it.

All right, so where does that leave us? Well, essentially, this tweet falls flat on its face. Unfortunately, we are very far away from data in, paper out. There are so many steps in the middle that still need human oversight, intuition, and knowledge, and it's not reasonable to expect AI to do this at the moment. Now, mark my words, we will get closer to that as AI improves and people start producing tools specifically for this, but right now it still needs a lot of human intervention. At the moment, ChatGPT can act as a really cool co-pilot for science and research. If in doubt, put your data into ChatGPT and ask it questions about your output, about what to write, about how to write it. It is there to make your writing experience easier, faster, and more productive. But you will always, always need to check it over before submitting it for peer review, for marking, or wherever you're sending that final output. You have to check it.
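And here's that p-hacking sketch I promised. This is my own toy demo, nothing from the paper: it fabricates a data set of pure noise, runs 100 "hypotheses" against it, and shows that a handful come out "significant" at p < 0.05 purely by chance, which is exactly why reviewers need to ask how many tests were actually run.

# Tiny p-hacking illustration: test enough random "hypotheses" against a
# big data set and some will look significant by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_respondents, n_variables = 5000, 100

# Pure noise: by construction, no variable truly relates to the outcome.
outcome = rng.normal(size=n_respondents)
variables = rng.normal(size=(n_respondents, n_variables))

p_values = []
for j in range(n_variables):
    _, p = stats.pearsonr(variables[:, j], outcome)
    p_values.append(p)

hits = [p for p in p_values if p < 0.05]
# With 100 tests at alpha = 0.05, expect roughly five false positives.
print(f"'significant' findings from pure noise: {len(hits)}")

# A multiple-comparisons correction (Bonferroni here) reins this back in.
corrected = [p for p in p_values if p < 0.05 / n_variables]
print(f"after Bonferroni correction: {len(corrected)}")

Report only those few lucky hits and you have a publishable-looking result built entirely from noise, which is why "how many hypotheses did you run?" is the question to ask of any AI-assisted analysis of a large data set.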
So at the moment, it's really far away from autonomous research, but it's a great little assistant. If you liked this video, remember to go check out this one, where I talk about the $2 billion scandal happening right now at Wiley.

So there we have it: that's everything you need to know about autonomous AI research and data-to-paper. It's not quite there yet. Let me know in the comments what you think, and also remember to sign up to my newsletter. Head over to andrewstapleton.com.au forward slash newsletter. The link is in the description, and when you sign up, you'll get five emails over about two weeks: everything from the tools I've used, the podcasts I've been on, how to write the perfect abstract, and more. It's exclusive content, available for free, so go check it out now. Also go check out academiainsider.com. It's my project where I've got my eBooks, my resource pack, a blog, and a forum. Everything is over there to make academia work for you. All right then, I'll see you in the next video.