Revolutionizing Language Model Fine-Tuning: Achieving Unprecedented Gains with NEFTune’s Noisy Embeddings

Instruction fine-tuning is the process of training an LLM on a small, curated dataset of instructions so that the model performs well on instruction-following tasks. It offers advantages such as better interpretability, reduced bias, and enhanced task performance. Instruction fine-tuning is therefore vital to harnessing the full potential of LLMs, which makes improving the outcome of the process essential.

The authors of this research paper propose a new method called NEFTune (Noisy Embedding Instruction Fine-Tuning) to improve model performance on instruction-based tasks. They show that adding random noise to the embedding vectors of the training data during the forward pass of fine-tuning can significantly improve the model's performance without requiring extra computational resources or additional data. NEFTune leads to a surprising increase in the LLM's performance on conversational tasks while maintaining its factual question-answering performance.
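To make the mechanism concrete, here is a minimal sketch of noisy-embedding fine-tuning in PyTorch. The uniform noise scaled by α/√(Ld), where L is the sequence length and d the embedding dimension, follows the scaling scheme described in the paper; the function name and the toy tensors below are illustrative, not the authors' implementation.

```python
import torch

def add_neftune_noise(embeddings: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """Add scaled uniform noise to token embeddings (training time only).

    embeddings: (batch, seq_len, dim) output of the model's embedding layer.
    alpha: noise-scale hyperparameter (the paper reports values such as 5, 10, and 15).
    """
    _, seq_len, dim = embeddings.shape
    # Sample noise uniformly from [-1, 1] and scale it by alpha / sqrt(L * d),
    # so its magnitude stays comparable across sequence lengths and model widths.
    scale = alpha / (seq_len * dim) ** 0.5
    return embeddings + torch.empty_like(embeddings).uniform_(-1.0, 1.0) * scale

# Toy demonstration on random "embeddings"; in practice this would wrap the
# embedding layer's output during fine-tuning and be skipped at inference time.
embeds = torch.randn(2, 16, 4096)           # (batch, seq_len, dim)
noisy = add_neftune_noise(embeds, alpha=5.0)
print((noisy - embeds).abs().max().item())  # bounded by alpha / sqrt(16 * 4096)
```

Because the noise is applied only during training and switched off at inference, the method adds essentially no cost at deployment time.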

The researchers conducted most of their experiments on roughly 7B-parameter LLMs such as LLaMA-1, LLaMA-2, and OPT-6.7B, fine-tuned on datasets like Alpaca and ShareGPT. The results were evaluated with AlpacaEval to calculate the win rate: the rate at which the LLM's responses are preferred over those of OpenAI's Text-Davinci-003 model, as judged by the evaluator, GPT-4.
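For intuition about the metric (this is a simplified illustration, not AlpacaEval's actual implementation), a win rate can be computed from per-prompt judgments as below; one common convention counts ties as half a win.

```python
def win_rate(judgments: list[str]) -> float:
    """judgments: "win", "tie", or "loss" for each prompt, from the GPT-4 judge."""
    score = judgments.count("win") + 0.5 * judgments.count("tie")
    return 100.0 * score / len(judgments)

print(win_rate(["win", "win", "tie", "loss"]))  # 62.5
```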

Results show that training these models with NEFT significantly increases conversational ability and answer quality. When fine-tuned with noisy embeddings, the win rate of LLaMA-2 7B jumped from 29.8% to 64.7%, and the average win rate across all models rose by around 15 percentage points. Alongside LLM-based evaluation, the researchers also used human annotators: NEFT was preferred in 88 instances, and 22 instances were a draw, corresponding to a win score of roughly 74% for NEFT.

In one of the experiments, LLaMA-2 was trained on Alpaca with and without NEFT and prompted with a question about quantum computing. The response from the NEFT-trained model was noticeably more fluent, explaining complex concepts like superposition and quantum entanglement more clearly.

The researchers hypothesize that introducing noise into the embeddings at training time makes the model less prone to overfitting. Instead of latching onto the specifics of the instruction dataset, such as formatting details, text length, and exact wording, the model produces answers that draw on the knowledge and behaviors of the pre-trained base model.

Given the importance of instruction fine-tuning, researchers have introduced many models and methods over the years, and NEFT is not the first to improve performance using noisy embeddings. However, it significantly improves the performance of LLMs on conversational tasks, yielding more detailed and clearer explanations of complex topics like quantum computing. Most importantly, the method requires no additional computational resources, which is why the authors describe it as a "free lunch" for fine-tuning LLMs. NEFTune has the potential to be widely adopted, making it a promising tool for enhancing LLMs' capabilities across a wide range of real-world tasks.


Check out the paper. All credit for this research goes to the researchers on this project.
