Open Access
Comparative Analysis of Transformer based Language Models
Author(s) - Aman Pathak
Publication year - 2021
Publication title - computer science and information technology (cs and it)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5121/csit.2021.110111
Subject(s) - computer science , transformer , artificial intelligence , language model , benchmark (surveying) , natural language processing , natural language , machine learning , architecture , natural language understanding , engineering , art , geodesy , voltage , geography , electrical engineering , visual arts
Natural language processing (NLP) has seen substantial advances in the past three years. With the introduction of the Transformer and the self-attention mechanism, language models can now learn better representations of natural language. These attention-based models have achieved state-of-the-art results on various NLP benchmarks. One contributing factor is the growing use of transfer learning: models are pre-trained on unsupervised objectives using rich datasets to develop fundamental natural language abilities, and are then fine-tuned on supervised data for downstream tasks. Surprisingly, recent research has led to a new era of powerful models that no longer require fine-tuning. The objective of this paper is to present a comparative analysis of some of the most influential language models. The criteria of comparison are problem-solving methodology, model architecture, compute power, accuracy on standard NLP benchmarks, and shortcomings.
