Imagine building a language model that can write fluent English text about a topic of your choice. A language model is just a fancy name for an artificial intelligence system specialized in generating synthetic text. It turns out that with enough data and training, a simple language model can grow into a powerful system capable of multitasking. Beyond generating human-like written text, you can challenge such a system with far more demanding requests. For example, you could ask it to summarize your text or to translate an English expression into another language. In this article we’ll briefly describe our attempts to test the paraphrasing ability of large pretrained language models designed for text generation. We challenge the system to rephrase verses from the Bible and observe its behaviour on this unfamiliar task.
Before we discuss our experiments, let’s spend a few more words on language models and how they work. A language model is a machine learning model that receives some text as input and produces new text as output. Like any other machine learning model, a language model needs training and, more importantly, a considerable amount of training data, the Holy Grail of any machine learning system. Gigabytes of text of varying content are collected from the web and used to teach the model to understand the language, and ultimately to make it capable of generating text that is almost indistinguishable from human-written text. The result of a successful training is a complex system boasting the linguistic abilities of a competent speaker. Not only does it learn the grammar: the model internalizes the pragmatic usage that humans make of the language itself. In addition, the system is enriched with other information specific to the training data. This can be domain-specific facts, like the year of birth of William Shakespeare, or common-sense knowledge encoded in the text. All these notions build up the model’s understanding of the real world, which can be effectively used on a range of tasks that the language model was not designed to solve in the first place.
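To make this concrete, here is a minimal sketch of what “receiving some text and producing another text” looks like in practice. It assumes the Hugging Face transformers library and the publicly released gpt2 checkpoint, a choice we make purely for illustration; the idea is not tied to this particular code.

```python
# A minimal sketch of text generation with a pretrained language model,
# assuming the Hugging Face `transformers` library and the public `gpt2`
# checkpoint (our choice for illustration, not a prescribed setup).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Prime the model with an arbitrary prompt; it continues the text by
# repeatedly predicting the next most likely word.
result = generator("William Shakespeare was born in", max_new_tokens=20)
print(result[0]["generated_text"])
```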
Among different NLP tasks, automatic text paraphrasing has been receiving increasing attention lately.[1] Paraphrasing has proven a useful strategy for procuring additional data to train intelligent systems, and more data is always good news for data-hungry AI models. In our experiment, we asked a language model to paraphrase verses from the Holy Scriptures. Normally, an ad-hoc system specialized in text paraphrasing would be used in such a scenario. However, we wanted to see whether we could make a general-purpose language model emulate the behaviour of a task-specific system. For this purpose, we used an OpenAI release of GPT-2 [2], a large language model trained to predict the next word on 40GB of Internet text. GPT-2 and its successors are renowned for their astonishing narrative capabilities. They excel at storytelling: primed with an arbitrary input text, these models adapt to its style and content, generating a realistic continuation in natural language. To use a pretrained language model as a paraphraser, we need to modify its usual behaviour. Any time we prompt GPT-2 with a passage from the Bible, we explicitly ask it to rephrase the text, expecting the language model to play along and stick to the request.
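As an illustration, the sketch below shows one way to phrase such a request in code. The exact prompt wording is our own example rather than a prescribed recipe, and again we assume the Hugging Face transformers library and the public gpt2 checkpoint.

```python
# A hypothetical prompting scheme: wrap the verse in an explicit
# rephrasing request and let GPT-2 continue the text. The prompt format
# below is illustrative; other phrasings work too.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

verse = ("I was talking about your own people who are immoral or greedy "
         "or worship idols or curse others or get drunk or cheat")
prompt = f'Original: "{verse}"\nParaphrase: "'

# Sampling yields a different continuation on every run; the model is free
# to ignore the request, so candidate outputs need manual inspection.
output = generator(prompt, max_new_tokens=40, do_sample=True, top_p=0.9)
print(output[0]["generated_text"])
```

Because the model was never trained for paraphrasing, not every sample honours the request; generating a few candidates per verse and filtering them by hand is a reasonable workflow.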
By prompting the model in the right way, we were able to obtain surprisingly reasonable paraphrases. For instance, after reading the verse “I was talking about your own people who are immoral or greedy or worship idols or curse others or get drunk or cheat” (1 Cor. 5:11), our language model comes up with, “I’m talking about your people that are morally corrupt.”
Sometimes GPT-2 can sound quite provocative. Asked to rephrase “But if you don’t have enough self-control, then go ahead and get married. After all, it is better to marry than to burn with desire” (1 Cor. 7:9), the model replies: “I don’t know what you are talking about, but if you can’t control yourself then go ahead and marry.”
The paraphrase “Jews ask for miracles, while Greeks don’t want anything that sounds foolish” closely restates the original passage, “Jews ask for miracles, and Greeks want something that sounds wise” (1 Cor. 1:22).
If you keep prompting the model with the same verse, it becomes chatty and can bless you with a personal statement like the following: “Yes, I think it is a good idea to ask for miracles.”
So far so good! We could access high-quality paraphrases for free, without spending time and resources on a specialized paraphraser. Our paraphrasing exercise shows once again that language models are self-taught systems: they can learn to perform tasks without receiving any task-specific training. The source of their intelligence resides in the terabytes of varied linguistic data available online, which feed the system’s knowledge and determine the extent of its potential applications. It’s possible to think of many other language tasks that can benefit from the knowledge accumulated by large language models: reading comprehension, machine translation, question answering and summarization, to name a few. The very idea of transferring knowledge across tasks and domains is what makes this technology so appealing. Hundreds of commercial applications leverage pretrained language models to deliver advanced AI features. With larger language models being released over time and companies growing interested in discovering new potential use cases, a natural question comes to mind.
What else can we expect from these models?
[1] Intrinsic and Extrinsic Automatic Evaluation Strategies for Paraphrase Generation Systems. Journal of Computer and Communications, 2020.
[2] Radford et al. Language Models are Unsupervised Multitask Learners. OpenAI, 2019.