The Dark Side of GPT-3: How Large Neural Language Models are Paving the Way for Machine-Paraphrased Plagiarism

Jan Philip Wahle

J.P. Wahle / 31.10.2022

5 min read

In recent years, the average rate of copy-pasting in academic assignments has risen by 10%. According to a survey published in the Psychological Record, over 35% of undergraduate students admit to having committed plagiarism at least once.

Ready access to advanced online paraphrasing tools has made compiling convincing plagiarized texts easier than ever before. Most of these tools (SpinBot and SpinnerChief are popular examples) apply fairly simple heuristics, such as replacing keywords with synonyms, yet they already bypass most plagiarism detection software.
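To make the heuristic concrete, here is a minimal sketch of synonym replacement in Python, using NLTK's WordNet as the synonym source. The function and the lookup strategy are our own illustration, not the actual implementation of SpinBot or SpinnerChief.

```python
# Minimal sketch of the synonym-replacement heuristic behind simple
# paraphrasing tools (illustrative only, not a real tool's code).
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

def spin(text: str) -> str:
    out = []
    for word in text.split():
        replacement = word
        # Take the first WordNet synonym that differs from the word.
        for syn in wordnet.synsets(word):
            for lemma in syn.lemmas():
                candidate = lemma.name().replace("_", " ")
                if candidate.lower() != word.lower():
                    replacement = candidate
                    break
            if replacement != word:
                break
        out.append(replacement)
    return " ".join(out)

print(spin("Plagiarism detection software compares documents for overlap"))
```

Even this crude word-by-word substitution changes enough surface tokens to defeat detectors that rely on exact or near-exact string matching.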

This trend poses a real threat to the scientific community. After all, academic plagiarism represents grave misconduct on several fronts: perpetrators can unjustly advance their careers, secure research funding that could be allocated more effectively, and erode trust in science if their deception goes undetected.

However, online paraphrasing tools are only a drop in the ocean compared to what large language models can do: they produce realistic, high-quality paraphrases at scale. We recently published a research paper studying this phenomenon (read here), and we summarize the findings in today's blog post.

Large Language Models: New Threats on the Horizon

Our EMNLP paper explored using T5 and GPT-3 for machine-paraphrase generation on text taken from Wikipedia articles, arXiv papers, and student theses. We evaluated the detection capabilities of six automated solutions and one commercial plagiarism detection software system. We also conducted a human study with 105 participants to determine whether they could accurately identify the generated examples.

We used Amazon’s Mechanical Turk (AMT) service to obtain human assessments for paraphrased text classification. For the purposes of this study, we recruited participants with a higher education degree and experts who had regularly published work in the plagiarism detection domain over the past five years.

Our results clearly demonstrated that large models can rewrite text that human experts have difficulty identifying as machine-paraphrased (their mean accuracy was just 53%). Interestingly, these experts rated the quality of paraphrases generated by GPT-3 as highly as that of original texts. Even the best-performing detection model in our study (GPT-3) achieved a mere 66% F1-score in detecting paraphrases.

Limitations of Machine Learning Methods to Detect Plagiarism

With recent advances in artificial intelligence (AI), particularly in natural language processing (NLP), plagiarism detection methods increasingly depend on dense text representations and machine learning classifiers. However, these techniques often fail to pick up on extensive paraphrasing by neural language models.
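To illustrate that pipeline, here is a hedged sketch of a dense-representation detector: a pretrained sentence encoder produces embeddings, and a lightweight classifier is trained on top. The encoder checkpoint, example texts, and classifier choice are placeholders, not the systems evaluated in the paper.

```python
# Sketch of a dense-representation detector: encode documents,
# then train a simple classifier on the embeddings.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder checkpoint

# Placeholder training data: 0 = original, 1 = machine-paraphrased.
train_texts = [
    "an original passage written by a human author",
    "a passage rewritten by a paraphrasing model",
]
train_labels = [0, 1]

X = encoder.encode(train_texts)
clf = LogisticRegression().fit(X, train_labels)

# Score a new document.
probability = clf.predict_proba(encoder.encode(["a new, unseen paragraph"]))
print(probability)  # [[P(original), P(machine-paraphrased)]]
```

Such a classifier only learns the surface statistics present in its training data, which is exactly why it struggles once a paraphraser produces fluent text it has never seen.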

In fact, popular paid plagiarism detection tools (including PlagScan and Turnitin) often struggle to identify even the most basic paraphrasing methods. Simple synonym replacements can be detected reliably with state-of-the-art neural methods, but large autoregressive models have raised the bar considerably.

Large autoregressive language models such as GPT-3 (released by OpenAI in May 2020 and one of the largest neural networks to date) can produce human-like paraphrased content that is almost indistinguishable from original work while preserving the key ideas and messages of the source material.
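For illustration, the sketch below prompts GPT-3 through the OpenAI completion API as it existed in 2022. The prompt wording, model name, and sampling parameters are our assumptions, not the exact configuration used in the paper.

```python
# Illustrative paraphrasing call against the OpenAI completion API
# (openai-python 0.x era). Prompt and parameters are assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: supplied by the user

source = "Academic plagiarism represents grave misconduct on several fronts."

response = openai.Completion.create(
    model="text-davinci-002",  # a GPT-3 family model; placeholder choice
    prompt=f"Paraphrase the following text:\n\n{source}\n\nParaphrase:",
    max_tokens=100,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```

Because the model rewrites entire sentences rather than swapping individual words, the output shares little surface overlap with the source, which is what makes it so hard to flag.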

Methods Employed in the Study

Our study used three paraphrasing methods. SpinnerChief is an online paraphrasing tool that attempts to replace every fourth word with a synonym. As an autoencoding baseline, we used BERT with the masking probability fixed at 15%, following Wahle et al. (2021). As a large autoregressive model, we used GPT-3 with 175B parameters, which has been shown to score highly on automated similarity metrics and to fool human raters (see here).
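To give a concrete picture of the autoencoding baseline, this sketch masks roughly 15% of a sentence's tokens and lets BERT fill them in. It is a minimal illustration of the idea, not the exact pipeline of Wahle et al. (2021).

```python
# Sketch of BERT-based paraphrasing: mask ~15% of tokens and replace
# each mask with BERT's top prediction.
import random
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

text = "Paraphrasing tools rewrite documents to evade plagiarism detection."
ids = tokenizer(text, return_tensors="pt")["input_ids"][0]

# Mask roughly 15% of the non-special tokens (skip [CLS] and [SEP]).
candidates = list(range(1, len(ids) - 1))
for pos in random.sample(candidates, max(1, int(0.15 * len(candidates)))):
    ids[pos] = tokenizer.mask_token_id

with torch.no_grad():
    logits = model(input_ids=ids.unsqueeze(0)).logits[0]

# Fill each masked position with the highest-scoring vocabulary token.
for pos in (ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]:
    ids[pos] = logits[pos].argmax()

print(tokenizer.decode(ids[1:-1]))
```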

For SpinnerChief's paraphrased plagiarism, human observers achieved between 79% and 85% accuracy on average. PlagScan performed up to 7 percentage points above the random baseline for Wikipedia articles, but close to random for student theses. Autoencoding models gave significantly better results, often achieving over 80% F1.

For content produced by GPT-3, all detectors performed significantly worse. Humans, plagiarism detection software, and autoencoders alike fared hardly better than random chance, which further highlights the realism of paraphrases generated by large autoregressive models.

By contrast, T5 and GPT-3 used as detectors obtained low but reasonable results of 60-63% (T5) and 64-66% (GPT-3) F1-macro. This finding indicates that scaling and training strategies can help identify machine-paraphrased plagiarism.
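As a rough sketch of how such a detector can be trained, the snippet below fine-tunes a pretrained transformer for binary sequence classification (original vs. machine-paraphrased). The checkpoint, data, and hyperparameters are placeholders and differ from the T5 and GPT-3 setups evaluated in the paper.

```python
# Sketch of framing paraphrase detection as binary classification.
# Checkpoint, data, and hyperparameters are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Placeholder data: 0 = original, 1 = machine-paraphrased.
texts = ["an original paragraph", "a machine-paraphrased paragraph"]
labels = torch.tensor([0, 1])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few illustrative training steps
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
print(loss.item())
```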

Conclusion and Outlook

The difficulty of identifying machine paraphrases makes legal decisions on plagiarism cases unusually complicated. Plagiarism is, of course, unethical and can carry legal consequences. At the same time, a false-positive verdict can destroy the career of a wrongly accused researcher, so every single case must be evaluated painstakingly. It is vital that the scientific community stays on top of emerging threats to its integrity.

We consider this study a first stepping stone in the ongoing effort to better understand how large language models can be exploited for illicit activities in science. In future work, we intend to further study the similarities and differences between human- and machine-generated paraphrases to determine the extent to which humans have trouble detecting paraphrases across the board.