Introduction
Welcome to the world of Large Language Models (LLMs). For a long time, transfer learning was a technique associated mostly with computer vision. In 2018, however, the “Universal Language Model Fine-tuning for Text Classification” paper changed the landscape of Natural Language Processing (NLP). The paper showed that a pre-trained language model can be fine-tuned effectively for downstream tasks.
LLAMA2 is one of the most popular LLMs for text generation. In this guide, we will explore an automated process for fine-tuning a LLAMA2 model on personal data. All of this is powered by Gradient AI, a cloud platform that provides a Python SDK for creating, testing, and managing models.
This process is going to take a long time! So let’s get started and get ready!
Learning Objectives
- Understand LLAMA2 and its key features and use cases.
- Explore Gradient AI, including its key features, use cases, and how it compares with other platforms.
- Gain knowledge of modular coding concepts to improve productivity and make your code reusable.
- Acquire knowledge about transfer learning with LLAMA2, including model initialization and fine-tuning.
- Learn the basics of working with Gradient AI, such as creating workspace IDs and access tokens.
- Learn Streamlit to create interactive and user-friendly UIs for machine-learning applications.
This article was published as a part of the Data Science Blogathon.
Table of contents
- What is LLAMA2?
- What is Gradient AI Cloud
- Creating Workspace ID and Access Token
- Building Automated Fine Tuning App Using Modular Coding
- Project Architecture Diagram
- Fine Tune Process Diagram
- Step-by-Step Project Setup
- Creating Logger and Exception
- Creating Samples
- Creating Constants
- Creating fine_tune.py
- Creating Streamlit Application(app.py)
- Frequently Asked Questions
- Resources and Further Learning
What is LLAMA2?
LLAMA2 (LLaMA stands for Large Language Model Meta AI) belongs to the category of Large Language Models (LLMs). Developed by Meta (formerly Facebook), it is designed to serve a wide range of natural language processing (NLP) applications. The original LLaMA model was the starting point of the series, and LLAMA2 is its improved successor.
As I mentioned in the intro, the pivotal moment came in 2018 with the paper ‘Universal Language Model Fine-tuning for Text Classification.’ This paper revolutionized the field of NLP through the techniques of deep learning and pre-training methods, greatly improving performance across different NLP applications.
Key Features:
- Versatility: LLAMA2 is a powerful model capable of handling diverse tasks with high accuracy and efficiency.
- Contextual Understanding: Sequence-to-sequence learning deals with phonemes, morphemes, lexemes, syntax, and context. LLAMA2 captures these contextual nuances well.
- Transfer Learning: LLAMA2 is a robust model that benefits from extensive pre-training on a large dataset. Transfer learning lets it adapt quickly to specific tasks.
- Open-Source: Community is a key aspect of data science. Openly available models allow researchers, developers, and communities to explore, adapt, and integrate them into their projects.
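To make the transfer-learning idea concrete, here is a toy sketch. It is not LLAMA2 and not the Gradient AI API: a single "pre-trained" weight stays frozen while a small trainable adapter term is fitted to new data. The names `BASE_WEIGHT`, `predict`, and `train_adapter`, and all the numbers, are illustrative assumptions.

```python
# Toy illustration of adapter-style fine-tuning (a sketch under stated
# assumptions, not how LLAMA2 or Gradient AI works internally).

BASE_WEIGHT = 2.0  # stands in for frozen pre-trained knowledge


def predict(x, adapter):
    # The effective weight is the frozen base plus the trainable adapter.
    return (BASE_WEIGHT + adapter) * x


def train_adapter(data, lr=0.05, steps=200):
    adapter = 0.0  # start from the pre-trained behaviour
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the adapter only.
        grad = sum(2 * (predict(x, adapter) - y) * x for x, y in data) / len(data)
        adapter -= lr * grad
    return adapter


# "Task-specific" data generated by y = 2.5 * x.
task_data = [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5)]
adapter = train_adapter(task_data)
print(f"learned adapter: {adapter:.3f}")
```

The base weight never changes; only the adapter moves, which mirrors the idea of adapting a pre-trained model to a new task with a small number of examples.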
Use Cases:
- LLAMA2 can help with text-generation tasks such as story writing and content creation.
- Zero-shot learning is important in practice, so we can use LLAMA2 for question-answering tasks, similar to ChatGPT, and get relevant responses.
- For language translation, commercial APIs usually require a subscription, whereas LLAMA2 can be used for translation without API fees.
- LLAMA2 is easy to use and an excellent choice for developing chatbots.
Comparison with Other Platforms:
Model | Key Characteristics | Strengths |
---|---|---|
LLAMA2 | Versatility across applications. Adaptable with transfer learning. Context-aware responses. | Strong contextual understanding. Effective across different NLP tasks. |
BERT (Bidirectional Encoder Representations from Transformers) | Bidirectional context understanding. Pre-trained on a large corpus. | Excellent for tasks requiring deep contextual understanding. Effective in question answering, and more. |
GPT (Generative Pre-trained Transformer) | Focuses on generating coherent, contextually relevant text. Autoregressive training methods. | Ideal for creative text generation and language understanding. Strong performance in language modeling tasks. |
XLNet | Permutation language modeling objective. Considered a hybrid model. | Achieves bidirectional context understanding. Strong across NLP benchmarks in different domains. |
What is Gradient AI Cloud
Gradient AI is a cloud platform that offers versatile tools for users to easily build, test, and update models. Utilizing such tools is a common method, as many industries leverage cloud infrastructure for model creation and testing. The platform streamlines the processes of building, training, and deploying models, providing test cases. This offers a convenient solution for users, researchers, and enterprises.
Key Features:
- Scalability: A cloud platform must be able to scale services up and down on demand, and Gradient AI provides this kind of on-demand scaling.
- Ease of Use: Gradient AI’s UI is very user-friendly. Users can easily create IDs and keys for model creation. The UI is designed for ease of use, especially for new users.
- Collaboration: The platform supports collaboration by providing shared workspaces, version control, and collaboration tools, fostering teamwork in machine learning or GenAI projects.
- Diverse Framework Support: Gradient AI Cloud supports a variety of machine-learning frameworks, allowing users to work with popular libraries such as TensorFlow, PyTorch, and scikit-learn.
Use Cases:
- We can create models using the Python SDK and easily train them. Additionally, models can be created using the UI for simple training. This helps optimize computational resources.
- The platform is suitable for fine-tuning pre-trained models, enabling users to adapt models to specific tasks or domains.
- Gradient AI Cloud simplifies the deployment and hosting of machine learning models, providing infrastructure for serving predictions in real time.
- Gradient AI Cloud supports end-to-end data science workflows, from data preparation to model training and deployment.
Comparison with Other Platforms:
Platform | Key Characteristics | Strengths |
---|---|---|
Gradient AI Cloud | Comprehensive features and resources. User-friendly interfaces. Support for various frameworks. | Scalability for machine learning tasks. Simplified deployment of machine learning models. Collaboration features for teamwork. |
Google Colab | Free access to GPU resources for Jupyter notebooks. Limited features compared to paid cloud platforms. | Quick access for experimenting with machine learning code. Suitable for educational and personal projects. |
AWS SageMaker | Provides similar machine learning capabilities. Extensive suite of tools for end-to-end ML workflows. | Integration with other AWS services for seamless workflows. Scalability and flexibility with AWS infrastructure. |
Azure Machine Learning | Azure’s cloud-based machine learning platform. Support for different ML frameworks. | Integration with Azure services for comprehensive solutions. Seamless collaboration. |
Creating Workspace ID and Access Token
Creating GRADIENT_WORKSPACE_ID and GRADIENT_ACCESS_TOKEN involves obtaining the necessary credentials from the Gradient AI Cloud platform. Below are the steps to create these variables:
1. Workspace ID (GRADIENT_WORKSPACE_ID):
- Log in to your Gradient AI account.
- Navigate to the workspace or project for which you want to obtain the ID.
- Look for the workspace ID in the URL. It typically appears as a long alphanumeric string.
- Copy the ID and keep it for the coding part, where it goes into the .env file.
Fig: UI Of Gradient AI (Workspace)
Fig: UI Of Gradient AI (Authentication KEY)
2. Access Token (GRADIENT_ACCESS_TOKEN):
- Click the Access Tokens option on the right side.
- Copy the key and keep it for the coding part, where it goes into the .env file.
Fig: UI Of Gradient AI (Authentication KEY)
Building Automated Fine Tuning App Using Modular Coding
Building an automated fine-tuning app involves multiple steps, so we establish a structured workflow. A core element of modular coding is the creation of logger and exception scripts, which are responsible for capturing logs and errors. Finally, we integrate a Streamlit application as a user-friendly UI that presents the components visually, so anyone can test the application. Here’s a high-level overview of the code structure.
Project Structure
project_root
│
├── configs
│
├── research
│   └── trials.ipynb
├── logs
│
├── src
│   └── lama2FineTune
│       ├── component
│       │   └── fine_tune.py
│       ├── constants
│       │   └── env_variable.py
│       ├── exception
│       ├── logger
│       ├── utils
│       │   └── main_utils.py
│       └── __init__.py
│
├── venv
├── .env
├── .gitignore
├── init_setup.sh
├── params.yaml
├── app.py
├── README.md
├── requirements.txt
└── setup.py
Fig: UI Of Coding Structure
Project Architecture Diagram
Fig: project architecture
- The Streamlit application provides a user interface with a “Fine-tune” button.
- When the button is pressed, the Streamlit application triggers the FineTuner class.
- The FineTuner class initializes the LLAMA2 model using Gradient AI, creates or loads the model, fine-tunes it, and saves the fine-tuned model locally.
- The fine-tuned model can be uploaded to the Gradient AI platform for further deployment and management.
- The local machine can save and load both the basic and fine-tuned models.
- Gradient AI on the cloud handles model serving, resource management, scalability, and collaboration.
This architecture allows for the efficient fine-tuning of the LLAMA2 model and seamless integration with the Gradient AI platform.
Fine Tune Process Diagram
Fig: Diagram of the LLAMA2 fine-tuning process
The Diagram integrates a Streamlit app for user interaction, FineTuner class for LLAMA2 fine-tuning, Gradient SDK for cloud communication, and modular coding components, ensuring a streamlined process of customizing and deploying the LLAMA2 model on Gradient AI.
Step-by-Step Project Setup
Step-1 Clone the GitHub repo
git clone https://github.com/SuyodhanJ6/Fine-Tune-LLAMA2.git
Step-2 Change the Directory
ls
o/p : Fine-Tune-LLAMA2
cd Fine-Tune-LLAMA2
Step-3 Creating a virtual environment
- Python Installation: Ensure Python is installed on your machine. You can download and install Python from the official Python website.
- Virtual Environment Creation: Create a virtual environment
conda create -p ./venv python=3.9 -y
Step-4 Virtual Environment Activation
- Activate ./venv (make sure the venv folder is present in your current directory).
conda activate ./venv
Step-5 Dependency Installation
- To install the required packages listed in the requirements.txt file, you can use the following command in your terminal or command prompt:
pip install -r requirements.txt
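After installing, a quick check like the one below can confirm that the main packages are importable. This is a minimal sketch: the helper name `missing_modules` is hypothetical, and the list of import names (`gradientai`, `streamlit`, `dotenv`) is assumed from the imports used in this article's code rather than read from requirements.txt.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names assumed from the code shown in this article.
REQUIRED = ["gradientai", "streamlit", "dotenv"]

missing = missing_modules(REQUIRED)
print("Missing modules:", missing if missing else "none")
```

If anything is reported as missing, re-run the pip command inside the activated environment.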
Step-6 Create a .env file and Edit the .env
- Creating .env: Open the terminal (Ubuntu) or bash (Windows) and type the command below.
touch .env
- Updating the API Key (.env)
GRADIENT_WORKSPACE_ID=Paste your workspace ID
GRADIENT_ACCESS_TOKEN=Paste your access token
Creating Logger and Exception
lama2FineTune
│ ├── exception
│ │ └── __init__.py
│ ├── logger
│ │ └── __init__.py
Logger File:
The logger file records and stores information about the running code (function, class, and script names) and serves several crucial functions:
- Debugging: Provides detailed logs of events during program execution, aiding in identifying and resolving issues.
- Performance Monitoring: Tracks application performance, assisting in code optimization and efficiency improvement.
- Error Tracing: Enables efficient tracing of errors, leading to faster troubleshooting and resolution.
- Audit Trail: Serves as a record of important system events and actions.
- Real-time Monitoring: Facilitates real-time monitoring of the application’s behaviour, contributing to proactive issue detection.
import logging
import os
from datetime import datetime

# One log file per run, named after the start timestamp.
LOG_FILE = f"{datetime.now().strftime('%m_%d_%Y_%H_%M_%S')}.log"

# Create the logs/ directory itself, not a directory named after the log file.
logs_path = os.path.join(os.getcwd(), "logs")
os.makedirs(logs_path, exist_ok=True)

LOG_FILE_PATH = os.path.join(logs_path, LOG_FILE)

logging.basicConfig(
    filename=LOG_FILE_PATH,
    format="[ %(asctime)s ] %(lineno)d %(name)s - %(levelname)s - %(message)s",
    level=logging.INFO,
)
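To see what a log line looks like with this format string, the sketch below writes one message and reads it back. It assumes a temporary directory in place of the project's logs folder, and it passes `force=True` (Python 3.8+) so the configuration applies even if logging was already set up. In the project itself, other modules would simply import `logging` from the logger package, as fine_tune.py does.

```python
import logging
import os
import tempfile

# Assumption: a temporary directory stands in for the project's logs/ folder.
log_dir = tempfile.mkdtemp()
log_file = os.path.join(log_dir, "example.log")

logging.basicConfig(
    filename=log_file,
    format="[ %(asctime)s ] %(lineno)d %(name)s - %(levelname)s - %(message)s",
    level=logging.INFO,
    force=True,  # replace any handlers configured earlier
)

logging.info("Created model adapter")
logging.shutdown()  # flush and close handlers so the file can be read back

with open(log_file) as f:
    log_text = f.read()
print(log_text)
```

Each line carries the timestamp, the line number of the logging call, the logger name, the level, and the message.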
Exception File:
The exception file is designed to manage unexpected events or errors while the program runs. Its key benefits are:
- Error Handling: Captures and manages errors, preventing abrupt program termination.
- User Feedback: Provides meaningful error messages that include the script name and the line number where the error occurred.
- Root Cause Analysis: Aids in identifying the root cause of issues and guides developers in making necessary improvements.
import sys


def error_message_detail(error, error_detail: sys):
    """
    Method Name : error_message_detail
    Description : Format and return an error message with traceback details.
    Return : str
    Args :
        error (Exception): The error object or message.
        error_detail (sys): The sys module, used to read the active traceback.
    """
    _, _, exc_tb = error_detail.exc_info()
    file_name = exc_tb.tb_frame.f_code.co_filename
    error_message = (
        "Error occurred in python script name [{0}] at line number [{1}]. "
        "Error message: {2}".format(file_name, exc_tb.tb_lineno, str(error))
    )
    return error_message


class Llama2Exception(Exception):
    """
    Custom exception class for handling errors in the fine-tuning pipeline.
    """

    def __init__(self, error_message, error_detail: sys):
        """
        Method Name : __init__
        Description : Initialize the Llama2Exception exception.
        Return : None
        Args :
            error_message (str): The main error message.
            error_detail (sys): Additional details about the error.
        """
        super().__init__(error_message)
        # Build the detailed message now, while the traceback is still active.
        self.error_message_detail = error_message_detail(error_message, error_detail)

    def __str__(self):
        """
        Method Name : __str__
        Description : Return a string representation of the
                      Llama2Exception exception.
        Return : str
        Args : None
        """
        return str(self.error_message_detail)
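The sketch below shows the same idea in a compact, self-contained form: an error is caught, wrapped in a custom exception, and printed with the script name and line number. `PipelineError` and `describe_current_error` are hypothetical stand-ins for the article's exception class and helper, not the project code itself.

```python
import sys


def describe_current_error(error):
    """Format the exception currently being handled with file and line info."""
    _, _, exc_tb = sys.exc_info()
    file_name = exc_tb.tb_frame.f_code.co_filename
    return (
        "Error occurred in python script name [{0}] at line number [{1}]. "
        "Error message: {2}".format(file_name, exc_tb.tb_lineno, error)
    )


class PipelineError(Exception):
    """Hypothetical stand-in for the article's custom exception."""

    def __init__(self, error):
        super().__init__(str(error))
        # Must be created inside an except block so a traceback is active.
        self.detail = describe_current_error(error)

    def __str__(self):
        return self.detail


try:
    try:
        1 / 0
    except Exception as e:
        raise PipelineError(e) from e
except PipelineError as wrapped:
    message = str(wrapped)
    print(message)
```

Note that the wrapper has to be raised from inside the `except` block; outside of one, `sys.exc_info()` has no traceback to report.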
Creating Samples
1. RESPONSE_JSON
lama2FineTune
│ ├── config
│ │ └── __init__.py
Note: Add all the samples you want to the config/__init__.py file shown above.
SAMPLES = [
    {
        "inputs": (
            "### Instruction: Who is Prashant Malge? \n\n### Response: "
            "Prashant Malge is a dedicated fourth-year Computer Science student "
            "at DPCOE - Dhole Patil College Of Engineering Pune, with a strong "
            "passion for data science. He has a solid foundation in programming, "
            "databases, and machine learning."
        )
    },
    {
        "inputs": (
            "### Instruction: Tell me about Prashant Malge's academic "
            "journey. \n\n### Response: Prashant Malge's academic journey has "
            "equipped him with a solid foundation in programming, databases, and "
            "machine learning. He is currently a fourth-year student at DPCOE - "
            "Dhole Patil College Of Engineering Pune."
        )
    },
    {
        "inputs": (
            "### Instruction: What skills does Prashant Malge possess? "
            "\n\n### Response: Prashant Malge possesses skills in statistical "
            "analysis, machine learning, and data visualization. As a Data "
            "Scientist, he leverages his expertise to derive insights and make "
            "informed decisions."
        )
    },
    {
        "inputs": (
            "### Instruction: Where has Prashant Malge gained hands-on "
            "experience in data science? \n\n### Response: Prashant Malge gained "
            "hands-on experience in data science through two internships at "
            "inuron.ai. During these internships, he addressed complex data "
            "problems and collaborated effectively with stakeholders."
        )
    },
    {
        "inputs": (
            "### Instruction: How does Prashant Malge approach data "
            "problems? \n\n### Response: Prashant Malge excels in addressing "
            "complex data problems and is known for his collaborative approach. "
            "He works effectively with stakeholders to deliver scalable and "
            "secure data solutions."
        )
    },
    {
        "inputs": (
            "### Instruction: What are Prashant Malge's interests "
            "outside of data science? \n\n### Response: Beyond his technical "
            "pursuits, Prashant Malge has a deep love for tea and a passion for "
            "sports. He has played softball for years, representing Kolhapur in "
            "state-level competitions three times."
        )
    },
    {
        "inputs": (
            "### Instruction: Can you share Prashant Malge's "
            "personal website? \n\n### Response: Prashant Malge's personal "
            "website is available at https://suyodhanj6.github.io/"
        )
    },
]
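Every sample follows the same prompt template, so the list can also be built from plain question and answer pairs. The sketch below is one way to do that; the helper names `make_sample` and `make_query` are hypothetical, and the two pairs are shortened from the samples above.

```python
def make_sample(instruction, response):
    """Build one training sample in the Instruction/Response prompt format."""
    return {"inputs": f"### Instruction: {instruction} \n\n### Response: {response}"}


def make_query(instruction):
    """Build an inference prompt that stops where the model should answer."""
    return f"### Instruction: {instruction} \n\n### Response:"


qa_pairs = [
    (
        "Who is Prashant Malge?",
        "Prashant Malge is a dedicated fourth-year Computer Science student.",
    ),
    (
        "Can you share Prashant Malge's personal website?",
        "Prashant Malge's personal website is available at https://suyodhanj6.github.io/",
    ),
]

samples = [make_sample(question, answer) for question, answer in qa_pairs]
print(samples[0]["inputs"])
print(make_query("Who is Prashant Malge?"))
```

Keeping the template in one place helps ensure that the prompt used at inference time matches the format the model was fine-tuned on.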
Creating Constants
lama2FineTune
│ ├── constants
│ │ ├── env_variable.py
│ │ └── __init__.py
- First, load the .env variables in the env_variable.py script.
# Load the Gradient AI credentials
from dotenv import load_dotenv
import os

# Take environment variables from .env.
load_dotenv()

# Workspace ID of the Gradient AI workspace
GRADIENT_WORKSPACE_ID = os.getenv("GRADIENT_WORKSPACE_ID")

# Access token for Gradient AI
GRADIENT_ACCESS_TOKEN = os.getenv("GRADIENT_ACCESS_TOKEN")
- In the __init__.py script, define the project constants that we use in the pipeline.
# Other constants from params.yaml
MODEL_ADAPTER_NAME = "PrashantModelAdapter"
NUM_EPOCHS = 3
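Note that `os.getenv` returns `None` when a variable is missing, so a typo in the .env file only surfaces later as an authentication failure. A small guard like the sketch below fails early with a clearer message. The helper `require_env` and the `EXAMPLE_*` variable names are hypothetical and used only for the demonstration.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; check your .env file."
        )
    return value


# Demo only: set one variable and make sure the other one is absent.
os.environ["EXAMPLE_TOKEN"] = "dummy-value"
os.environ.pop("EXAMPLE_MISSING_VARIABLE", None)

print(require_env("EXAMPLE_TOKEN"))
try:
    require_env("EXAMPLE_MISSING_VARIABLE")
except RuntimeError as err:
    error_text = str(err)
    print(error_text)
```

In env_variable.py, the same guard could wrap the two Gradient AI variables after `load_dotenv()` has run.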
Creating fine_tune.py
lama2FineTune
│ ├── component
│ │ └── fine_tune.py
import sys

from gradientai import Gradient

from lama2FineTune.config import SAMPLES
from lama2FineTune.constants.env_variable import (
    GRADIENT_WORKSPACE_ID,
    GRADIENT_ACCESS_TOKEN,
)
from lama2FineTune.logger import logging
from lama2FineTune.exception import Llama2Exception


class FineTuner:
    def __init__(self, model_name, num_epochs):
        self.model_name = model_name
        self.num_epochs = num_epochs
        self.gradient = None
        self.model_adapter = None

    def initialize_gradient(self):
        # Initialize Gradient AI Cloud with the credentials from .env
        self.gradient = Gradient(
            workspace_id=GRADIENT_WORKSPACE_ID,
            access_token=GRADIENT_ACCESS_TOKEN,
        )

    def create_model_adapter(self):
        # Create a model adapter with the specified name on top of the base model
        base_model = self.gradient.get_base_model(base_model_slug="nous-hermes2")
        model_adapter = base_model.create_model_adapter(name=self.model_name)
        return model_adapter

    def fine_tune_model(self, samples):
        # Each sample holds the full prompt (instruction and response) under
        # "inputs", so the whole list is passed to the adapter once per epoch.
        for epoch in range(self.num_epochs):
            logging.info(f"Fine-tuning epoch {epoch + 1} of {self.num_epochs}")
            self.model_adapter.fine_tune(samples=samples)

    def cleanup(self):
        # Delete the adapter and close the client once they are no longer needed
        if self.model_adapter:
            self.model_adapter.delete()
            self.model_adapter = None
        if self.gradient:
            self.gradient.close()
            self.gradient = None

    def fine_tune(self):
        try:
            # Initialize Gradient AI Cloud
            self.initialize_gradient()
            # Create model adapter
            self.model_adapter = self.create_model_adapter()
            logging.info(f"Created model adapter with id {self.model_adapter.id}")
            # Fine-tune the model
            self.fine_tune_model(SAMPLES)
        except Exception as e:
            # Release cloud resources on failure, then re-raise with details.
            # On success the adapter stays alive so the app can query it.
            self.cleanup()
            raise Llama2Exception(e, sys) from e


# if __name__ == "__main__":
#     # Example usage
#     fine_tuner = FineTuner(model_name=MODEL_ADAPTER_NAME, num_epochs=NUM_EPOCHS)
#     fine_tuner.fine_tune()
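The epoch loop can be tried out without cloud access by swapping in a stand-in object. The sketch below assumes the adapter's `fine_tune` method accepts a list of samples through a `samples` keyword; `FakeAdapter` and `run_epochs` are hypothetical names used only for this dry run and are not part of the Gradient SDK.

```python
class FakeAdapter:
    """Hypothetical stand-in for a model adapter; it only records calls."""

    def __init__(self):
        self.calls = []

    def fine_tune(self, samples):
        # Record how many samples were sent in this call.
        self.calls.append(len(samples))


def run_epochs(adapter, samples, num_epochs):
    """Pass the full sample list to the adapter once per epoch."""
    for _ in range(num_epochs):
        adapter.fine_tune(samples=samples)


demo_samples = [
    {"inputs": "### Instruction: Hi \n\n### Response: Hello"},
    {"inputs": "### Instruction: Bye \n\n### Response: Goodbye"},
]

fake = FakeAdapter()
run_epochs(fake, demo_samples, num_epochs=3)
print(fake.calls)  # → [2, 2, 2]
```

A dry run like this makes it easy to check the number of training calls before spending time on a real fine-tuning job.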
Creating Streamlit Application(app.py)
# app.py
import streamlit as st

from lama2FineTune.component.fine_tune import FineTuner
from lama2FineTune.constants import MODEL_ADAPTER_NAME, NUM_EPOCHS


def main():
    st.title("LLAMA2 Fine-Tuning App")

    # Get user input for model name and number of epochs
    model_name = st.text_input("Enter Model Name", value=MODEL_ADAPTER_NAME)
    num_epochs = st.number_input("Enter Number of Epochs", min_value=1, value=NUM_EPOCHS)

    # Display fine-tuning button
    if st.button("Fine-Tune Model"):
        fine_tuner = FineTuner(model_name=model_name, num_epochs=int(num_epochs))

        # Perform fine-tuning
        st.info(
            f"Fine-tuning model {model_name} for {num_epochs} epochs. "
            "This may take some time..."
        )
        fine_tuner.fine_tune()
        st.success("Fine-tuning completed successfully!")

        # Display generated output after fine-tuning
        sample_query = "### Instruction: Who is Prashant Malge? \n\n### Response:"
        completion = fine_tuner.model_adapter.complete(
            query=sample_query, max_generated_token_count=100
        ).generated_output
        st.subheader("Generated Output (after fine-tuning):")
        st.text(completion)


if __name__ == "__main__":
    main()
- Run This script
streamlit run app.py
Fig: UI Of Fine-Tune app
Fig: UI Of Gradient AI (Model Creation)
Note: If you want a step-by-step explanation, please refer to the GitHub repo: Link
The repository also includes a notebook with code that you can run in Google Colab or in a local Jupyter Notebook.
Conclusion
In conclusion, we explored the project structure, developed a personalized model through transfer learning, and constructed a modular coding approach. This project employs a structured and organized process for refining LLAMA2 language models with personalized data. Key elements include a Streamlit application (app.py), a fine-tuning pipeline (fine_tune.py), and further modules for constants, exceptions, logging, and utilities. The design prioritizes clarity, ease of maintenance, and an improved user experience.
Key Takeaways
- Perform iterative testing to evaluate the fine-tuned LLAMA2 model.
- Utilize the Gradient AI cloud for both model training and deployment.
- Integrate the Gradient AI cloud with the Python SDK.
- Understand the concept of transfer learning and its application in LLAMA2.
- Recognize the benefits of modular coding and learn industry-standard code structuring in projects.
- Explore the creation of a project layout and establish an automated pipeline for efficient development.
Frequently Asked Questions
A: Modular coding is crucial for code clarity, maintainability, and scalability, achieved by separating the code into modules based on specific functionality.
A: The Streamlit app provides a user interface for interacting with the LLAMA2 fine-tuning process. Users can input parameters and initiate automated fine-tuning through the interface.
A: LLAMA2 is a large language model designed for natural language processing tasks. It supports transfer learning by allowing fine-tuning on specific domains or tasks using personal datasets.
A: Transfer learning with LLAMA2 involves initializing the model with pre-trained weights and fine-tuning it on domain-specific or task-specific data, adapting its knowledge to the target application.
A: The project emphasizes logging for improved runtime visibility and employs custom exceptions to improve error reporting, contributing to a more robust system.
Resources and Further Learning
- GitHub Repository: Link
- LLAMA2 Documentation: Link
- Gradient AI Platform: Link
- LLAMA2 Research Paper: Link
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.