How to discover plagiarism on ChatGPT

These days, chatbots are all the rage, and the most prominent of them is ChatGPT. But because its responses are so effective and humanlike, academics, educators, and editors are now coping with a rising tide of AI-generated plagiarism and cheating. The antiquated tools you use to detect plagiarism may not be able to tell genuine content from fabricated content.
In this post, I look at the darker side of AI chatbots, try out several free online tools that identify AI-generated text, and examine how serious the problem has become.
How to discover plagiarism on ChatGPT
OpenAI’s release of ChatGPT in November 2022 thrust chatbot capabilities into the spotlight. It let any average person, as well as any specialist, produce insightful, readable articles or essays and solve math problems from a text prompt. To a naïve or inexperienced reader, the AI-generated content can easily pass as a legitimate piece of writing, which is why students love it and teachers despise it.
Even when the underlying information comes from a database, AI writing tools can produce nearly one-of-a-kind pieces of writing with natural language and correct syntax, which presents a significant obstacle for the developers of plagiarism-detection tools. The race to beat AI-based cheating has officially begun. Below, I’ve compiled a list of several solutions that are freely available right now.
The GPT-2 Output Detector is a demonstration tool created by OpenAI, the company that developed ChatGPT, to show that it has a bot able to recognize chatbot text. The Output Detector is simple to use: paste text into the box, and the tool immediately delivers its estimate of how likely it is the content came from a human.
Writer AI Content Detector and Content at Scale are two further tools, both with clean user interfaces. You can paste text manually or supply a URL whose content will be scanned (the URL option is available only in Writer). The findings are reported as a percentage score indicating the likelihood that the content was generated by a human.
GPTZero is a home-brewed beta tool created by Edward Tian, a student at Princeton University, and hosted on Streamlit. Its model, aimed at “algiarism” (AI-assisted plagiarism), delivers its findings differently from the other tools, splitting its measurements into two categories: perplexity and burstiness. Perplexity measures the randomness of an individual sentence, while burstiness measures the variation in that randomness across the text as a whole. The tool assigns a numerical value to both, and the lower the numbers, the greater the likelihood that the content was generated by a bot.
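GPTZero’s actual model is not public, so the sketch below is only a toy illustration of the two metrics: a simple add-one-smoothed unigram model stands in for a real language model, and the `score_text` helper, training corpus, and sample sentences are all hypothetical. Here burstiness is read as the spread of per-sentence perplexity.

```python
import math
import re
from collections import Counter

def unigram_logprobs(corpus_tokens):
    """Build an add-one-smoothed unigram language model from a token list."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen tokens
    logprobs = {w: math.log((c + 1) / (total + vocab)) for w, c in counts.items()}
    unk_logprob = math.log(1 / (total + vocab))  # probability of any unseen token
    return logprobs, unk_logprob

def perplexity(tokens, logprobs, unk_logprob):
    """exp of the average negative log-probability per token."""
    if not tokens:
        return float("inf")
    nll = -sum(logprobs.get(t, unk_logprob) for t in tokens) / len(tokens)
    return math.exp(nll)

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def score_text(text, corpus):
    """Return (overall perplexity, burstiness) for `text` under a unigram
    model trained on `corpus`. Burstiness is taken as the population std
    dev of per-sentence perplexity: one simple reading of GPTZero's idea."""
    logprobs, unk = unigram_logprobs(tokenize(corpus))
    overall = perplexity(tokenize(text), logprobs, unk)
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    per_sent = [perplexity(tokenize(s), logprobs, unk) for s in sentences]
    mean = sum(per_sent) / len(per_sent)
    burst = math.sqrt(sum((p - mean) ** 2 for p in per_sent) / len(per_sent))
    return overall, burst

corpus = "the cat sat on the mat. the dog sat on the rug. a bird flew over the mat."
flat = "the cat sat on the mat. the dog sat on the rug."        # uniform, predictable
spiky = "the cat sat on the mat. quantum zeppelins rhapsodize!"  # one wild sentence
flat_ppl, flat_burst = score_text(flat, corpus)
spiky_ppl, spiky_burst = score_text(spiky, corpus)
print(flat_burst < spiky_burst)  # True: uniform sentences -> lower burstiness
```

Notice how one out-of-place sentence sharply raises burstiness: roughly the gap GPTZero looks for between uniform machine prose and uneven human prose.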
I thought it would be amusing to include the Giant Language Model Test Room (GLTR), built by researchers from the MIT-IBM Watson AI Lab and the Harvard Natural Language Processing Group. Like GPTZero, it does not reduce its verdict to a clear “human” or “bot” label. Because bots are less likely than humans to pick unexpected words, GLTR essentially uses a language model to recognize material created by one: the findings are displayed as color-coded text and histograms that rank how predictable each word is. The more uncertain (unexpected) words a text contains, the more likely it is to have come from a human.
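GLTR’s trick can also be sketched in miniature. The real tool asks GPT-2 how highly each actual word ranked among the model’s predictions at that position; the toy below substitutes simple corpus-frequency rank, and the cutoffs, function names, and sample texts are all illustrative assumptions.

```python
import re
from collections import Counter

CUTOFFS = (2, 5, 10)  # toy stand-ins for GLTR's top-10/top-100/top-1000 buckets

def rank_table(corpus):
    """Map each word to its frequency rank (0 = most common) in the corpus."""
    counts = Counter(re.findall(r"[a-z']+", corpus.lower()))
    return {w: r for r, (w, _) in enumerate(counts.most_common())}

def color_tokens(text, ranks):
    """Return (token, bin) pairs: bin 0 = very predictable word,
    bin 3 = rare or unseen (the 'uncertain' words GLTR highlights)."""
    out = []
    for tok in re.findall(r"[a-z']+", text.lower()):
        r = ranks.get(tok, max(CUTOFFS))  # unseen words land in the top bin
        out.append((tok, sum(r >= c for c in CUTOFFS)))
    return out

def human_likelihood(text, ranks):
    """Fraction of tokens in the two most surprising bins. Higher suggests
    a human author, under GLTR's 'bots pick expected words' reasoning."""
    bins = [b for _, b in color_tokens(text, ranks)]
    return sum(b >= 2 for b in bins) / len(bins)

ranks = rank_table("the cat sat on the mat and the dog sat on the mat too")
predictable = "the cat sat on the mat"
surprising = "the iridescent axolotl serenaded the mat"
print(human_likelihood(predictable, ranks) < human_likelihood(surprising, ranks))  # True
```

The text full of expected words scores low, while the one with surprising word choices scores high: the same signal GLTR renders as its color-coded histogram.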
Putting the detectors to the test
With all of these options, you might get the impression that our AI detection capabilities are solid. To find out whether any of these tools are actually useful, I decided to experiment with them myself: I ran a couple of example paragraphs I had written through each one, alongside ChatGPT’s answers to the same prompts.
My initial inquiry was a straightforward one: what exactly is the problem with purchasing a prebuilt personal computer? The following table compares my response with the one ChatGPT provided.
| Tool | My real writing | ChatGPT |
| --- | --- | --- |
| GPT-2 Output Detector | 1.18% fake | 36.57% fake |
| Writer AI | 100% human | 99% human |
| Content at Scale | 99% human | 73% human |
| GPTZero | 80 perplexity | 50 perplexity |
| GLTR | 12 of 66 words likely by human | 15 of 79 words likely by human |
As you can see, most of these apps determined that my words were authentic, with the first three proving the most reliable. However, ChatGPT’s response fooled the majority of them as well: the Writer AI Content Detector rated it 99% human, and the GPT-2-based detector judged it only 36.57% fake. The most egregious miss came from GLTR, which rated my words as no more likely to be human-written than ChatGPT’s.
I decided to give it one more go, however, and the results were noticeably different this time. I asked ChatGPT for a synopsis of research conducted at the Swiss Federal Institute of Technology on using gold particles as an anti-fogging agent. In this example, the detector apps did a significantly better job of recognizing ChatGPT and validating my own response.
| Tool | My real writing | ChatGPT |
| --- | --- | --- |
| GPT-2 Output Detector | 9.28% fake | 99.97% fake |
| Writer AI | 95% human | 2% human |
| Content at Scale | 92% human | 0% human (obviously AI) |
| GPTZero | 41 perplexity | 23 perplexity |
| GLTR | 15 of 79 words likely by human | 4 of 98 words likely by human |
The effectiveness of the top three tools was clearly demonstrated in this response. And although GLTR still had trouble recognizing my own writing as human, it did an excellent job of catching ChatGPT this time.
Conclusion
The responses to each inquiry make it abundantly clear that internet plagiarism detectors are not foolproof. With more complicated answers or pieces of writing (as with my second prompt), these apps recognize AI-based writing fairly easily, but with simpler responses they struggle to discern it. That is not what I would call dependable. These detector tools also occasionally mislabel human-written articles or essays as ChatGPT’s work, which is a real problem for instructors and editors who want to rely on them to catch students who cheat.
Developers are always working to improve accuracy and reduce the number of false positives, but they are also preparing for the release of GPT-4, which promises a considerably expanded dataset and more advanced features than GPT-3.5, the model family ChatGPT is built on (and a far cry from the older GPT-2 that tools like the Output Detector were trained to recognize).
At this stage, editors and educators will need to combine judiciousness and a little human intuition with one (or more) of these AI detectors in order to identify AI-generated content. And if you have used, or are inclined to use, a chatbot like Chatsonic, ChatGPT, Notion, or YouChat to try to pass off its output as your own work, I beg you to refrain. No matter how you look at it, repurposing content generated by a bot (which pulls information from sources inside its training data) is still plagiarism.
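One way to apply that “detector plus human judgment” advice is to require agreement among several detectors before flagging anything, which limits false accusations. The sketch below is a hypothetical wrapper: the detector names, the score scale (probability the text is human, 0 to 1), and the thresholds are all assumptions, with numbers shaped like my second test.

```python
def flag_as_ai(human_scores, flag_threshold=0.5, quorum=0.75):
    """Flag text as likely AI-generated only when at least `quorum` of the
    detectors report a human-probability below `flag_threshold`.
    `human_scores` maps detector name -> probability the text is human.
    A conservative quorum keeps one noisy detector from triggering
    a false accusation on its own."""
    votes = [score < flag_threshold for score in human_scores.values()]
    return sum(votes) / len(votes) >= quorum

# Hypothetical scores modeled on the second test above (human-probability, 0-1):
chatgpt_answer = {"gpt2_detector": 0.0003, "writer_ai": 0.02, "content_at_scale": 0.0}
my_answer = {"gpt2_detector": 0.91, "writer_ai": 0.95, "content_at_scale": 0.92}
print(flag_as_ai(chatgpt_answer))  # True: every detector agrees it looks machine-made
print(flag_as_ai(my_answer))       # False: no detector doubts the human text
```

Raising `quorum` trades missed bot text for fewer wrongly accused students, which is usually the right direction for classroom use.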
Would you like to read more articles like this one? Take a look at our other tech topics before you leave!