Open Access
VIOLET: Vectorized Invariance Optimization for Language Embeddings Using Twins
Author(s) - Mikhail E. Ram, G Manju
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3590971
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
We present VIOLET, a novel positive-pair information-maximisation strategy for fine-tuning BERT to generate robust, invariant, and semantically meaningful sentence embeddings. VIOLET extends the Barlow Twins framework by addressing both redundancy reduction and invariance preservation within the embedding space. This is achieved through a combination of text-specific augmentations tailored to the nuances of natural language and a mixup-based regularisation mechanism that promotes smoother representation learning. Unlike conventional contrastive learning methods, which rely on large batch sizes and hard negative mining to achieve strong performance, VIOLET operates exclusively on positive pairs, eliminating the need for complex sampling strategies and significantly reducing training overhead. A key strength of VIOLET is its consistent and robust performance even at small batch sizes, making it an appealing choice for training under limited computational resources. Empirical results on the Semantic Textual Similarity Benchmark (STS-B) show that VIOLET achieves correlation scores on par with or exceeding several state-of-the-art sentence embedding models. These findings underscore the method's effectiveness, scalability, and practical utility across a wide range of downstream natural language understanding tasks, particularly in settings where efficiency and stability are critical. Our implementation is available at https://github.com/mikhail-ram/VIOLET.
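To make the abstract's description of redundancy reduction and invariance preservation on positive pairs concrete, the following is a minimal PyTorch sketch of a Barlow Twins-style objective of the kind VIOLET builds on. The function name, the lambda_offdiag weight, and the per-dimension standardisation details are illustrative assumptions, not VIOLET's actual code; the sketch also omits the text-specific augmentations and the mixup regulariser, which the abstract mentions but does not specify.

import torch

def barlow_twins_style_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-9):
    # z_a, z_b: (batch_size, dim) sentence embeddings of two augmented
    # views of the same sentences (one positive pair per row).
    n = z_a.size(0)

    # Standardise each embedding dimension across the batch.
    z_a = (z_a - z_a.mean(dim=0)) / (z_a.std(dim=0) + eps)
    z_b = (z_b - z_b.mean(dim=0)) / (z_b.std(dim=0) + eps)

    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = (z_a.T @ z_b) / n

    diag = torch.diagonal(c)
    # Invariance term: pull diagonal entries toward 1 so corresponding
    # embedding dimensions agree across augmentations.
    invariance = (diag - 1).pow(2).sum()
    # Redundancy-reduction term: push off-diagonal entries toward 0 so
    # different embedding dimensions carry non-redundant information.
    redundancy = c.pow(2).sum() - diag.pow(2).sum()

    return invariance + lambda_offdiag * redundancy

In a VIOLET-like setup, z_a and z_b would be produced by a BERT encoder applied to two augmented versions of the same batch of sentences. Because the objective depends only on positive pairs and per-dimension correlations, it needs neither negative sampling nor large batches, which is consistent with the efficiency and small-batch stability claims above.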
