
Comparative Analysis of Transformer based Language Models
Author(s) - Aman Pathak
Publication year - 2021
Publication title - Computer Science and Information Technology (CS and IT)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5121/csit.2021.110111
Subject(s) - computer science , transformer , artificial intelligence , language model , benchmark (surveying) , natural language processing , natural language , machine learning , architecture , natural language understanding , engineering , art , geodesy , voltage , geography , electrical engineering , visual arts
Natural language processing (NLP) has witnessed many substantial advancements in the past three years. With the introduction of the Transformer and the self-attention mechanism, language models are now able to learn better representations of natural language. These attention-based models have achieved state-of-the-art results on various NLP benchmarks. One of the contributing factors is the growing use of transfer learning. Models are pre-trained on unsupervised objectives using rich datasets to develop fundamental natural language abilities, and are then fine-tuned on supervised data for downstream tasks. Surprisingly, recent research has led to a new era of powerful models that no longer require fine-tuning. The objective of this paper is to present a comparative analysis of some of the most influential language models. The models are compared on problem-solving methodology, model architecture, compute power, accuracy on standard NLP benchmarks, and shortcomings.