Introduction
Ever since OpenAI launched ChatGPT, the internet hasn’t stopped speculating about the future of technology and of humanity in general. ChatGPT has emerged as a revolutionary product with the potential to impact almost every sphere of human work. OpenAI has also released APIs for the models underlying ChatGPT, “gpt-3.5-turbo” and “gpt-4” (currently waitlisted). For developers, integrating these APIs represents a new frontier of innovation. In this article, we will build a GPT chatbot with Gradio and the OpenAI chat models.
Learning Objectives
- Understand GPT models and their capability
- Basic building blocks of Gradio
- Integrating OpenAI chat models with Gradio chatbot
What are GPT models?
GPT stands for “Generative Pre-trained Transformer”, an autoregressive deep learning model that generates natural language text. GPT models are probabilistic large language models: each possible output token is assigned a probability, and the probability of each output token is conditioned on the tokens that came before it, which is what makes the models autoregressive. These models have been trained on massive amounts of text data with sophisticated learning techniques, allowing them to generate human-like text and understand context.
OpenAI previously released GPT-1, GPT-2, and GPT-3, but the current large language models, GPT-3.5 and GPT-4, are far superior to their predecessors. The linguistic capabilities of these models have stunned the world. They are capable of a multitude of tasks, including text summarization, language translation, conversation, and sentiment analysis.
Gradio
Gradio is an open-source tool written in Python that gives machine learning developers a convenient way to share their models. It provides a simple, user-friendly web interface for sharing machine learning models with anyone, from anywhere. Gradio’s unique selling point is that it doesn’t require developers to write JavaScript, HTML, or CSS to build web interfaces, while still offering some flexibility to add front-end code for customization. This makes it ideal for developers with little front-end knowledge to share their models with team members or audiences.
To build a web app for your machine learning model, you first need to familiarize yourself with the basic building blocks of Gradio. It lets you design web apps in two ways: Interfaces and Blocks.
Gradio Interface
- The Interface is a high-level class that lets you build components with a few lines of code. You can create input/output components for text, images, audio, and video, though with limited design flexibility. Here is a simple example of a Gradio Interface.
import gradio as gr

def sketch_recognition(img):
    # Implement your sketch recognition model here...
    pass

gr.Interface(fn=sketch_recognition, inputs="sketchpad", outputs="label").launch()
This will create a simple web interface that has a sketchpad as an input component and a label as an output component. The function sketch_recognition is responsible for the outcome.
Gradio Block
Gradio Blocks provide a lower-level way of building interfaces. With the increased flexibility, developers can go much deeper into building complicated web interfaces. Blocks offer advanced features that let you place components anywhere on the screen, give you finer control over data flows, and provide event handlers for an interactive user experience.
import gradio as gr

def greet(name):
    return "Hello " + name + "!"

with gr.Blocks() as demo:
    name = gr.Textbox(label="Name")
    output = gr.Textbox(label="Output Box")
    greet_btn = gr.Button("Greet")
    # Run greet() when the button is clicked
    greet_btn.click(fn=greet, inputs=name, outputs=output, api_name="greet")

demo.launch()
- A “with” clause is required to define a Gradio Block.
- Components defined inside the with clause are added to the app.
- Components render vertically in the order they are defined.
OpenAI API
Before building the chat interface, we need access to the OpenAI API endpoints. So, the first thing we need to do is create an OpenAI account and generate our API key. A new OpenAI account comes with $5 of free credit. If you already use ChatGPT, then you already have an account. Visit the API keys page on the OpenAI platform to generate an API key.
Store the key somewhere safe. The next thing is to create a virtual environment. We will use Poetry; feel free to use any other virtual environment tool you prefer. Follow this article to set up the project environment in different virtual environment tools.
These are our dependencies.
[tool.poetry.dependencies]
python = "^3.10"
gradio = "^3.27.0"
openai = "^0.27.4"
Install dependencies with Poetry
poetry add gradio openai
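If you are not using Poetry, a plain pip install works just as well. The version pins below mirror the pyproject snippet above, and python-dotenv is included because we use it later to load the API key:

pip install gradio==3.27.0 openai==0.27.4 python-dotenv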
Before building the chatbot, let’s have a look at the OpenAI API request and response structure.
Below is an example of a typical request to the ChatGPT API endpoint that fetches a response.
# API_ENDPOINT = "https://api.openai.com/v1/chat/completions"
# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
The messages parameter is a list of dictionaries, each with a role and its content. The system role is configured beforehand to provide context that steers the model to behave in a particular way. The user role stores the user’s prompts, and the assistant role holds the model’s responses. This list of messages is responsible for maintaining the context of the conversation.
The model parameter can be set to “gpt-3.5-turbo” or “gpt-4” if you have API access.
Now let’s see our response format.
{
  'id': 'chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve',
  'object': 'chat.completion',
  'created': 1677649420,
  'model': 'gpt-3.5-turbo',
  'usage': {'prompt_tokens': 56, 'completion_tokens': 31, 'total_tokens': 87},
  'choices': [
    {
      'message': {
        'role': 'assistant',
        'content': 'The 2020 World Series was played in Arlington, Texas at the Globe Life Field.'},
      'finish_reason': 'stop',
      'index': 0
    }
  ]
}
The response is in JSON format. It returns the total tokens used and the model’s text response. We will use this data in our chatbot.
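For reference, this is how the two pieces of data we need can be read off the response object returned by openai.ChatCompletion.create (a minimal sketch using the example request from above):

import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Where was the 2020 World Series played?"}],
)

reply = response.choices[0].message.content    # the model's text response
total_tokens = response.usage['total_tokens']  # used later for cost tracking
print(reply, total_tokens)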
Building GPT Chatbot
Create a main Python file for your app. Import the below libraries.
import gradio as gr
import openai
import os
import json
Create a .env file to store your API key.
OPENAI_API_KEY=your_api_key
Load it into your environment using the python-dotenv and os libraries.
from dotenv import load_dotenv  # pip install python-dotenv
load_dotenv()

# The variable name must match the key used in the .env file
key = os.environ.get('OPENAI_API_KEY')
openai.api_key = key
App Front End
To add more flexibility in designing the web app, we will use Gradio’s Blocks class. Gradio has a pre-built chatbot component that renders a chat interface.
with gr.Blocks() as demo:
    chatbot = gr.Chatbot(value=[], elem_id="chatbot").style(height=650)
Now, run the application with “gradio app.py”. This starts a server at “http://localhost:7860”, and you will see a simple chat interface. You can adjust the styling through the style attributes.
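For reference, a minimal runnable app.py at this stage would look like the sketch below. The trailing demo.launch() is an assumption: the gradio CLI launches the app for you, but it is needed if you run the file directly with python app.py.

import gradio as gr

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(value=[], elem_id="chatbot").style(height=650)

demo.launch()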
Steps
Now, we need a textbox so we can pass prompts. Gradio has Row and Column classes that let you arrange components horizontally and vertically. These are very helpful when you want to customize your web app. We will add a Textbox component that takes text input from end users. This is how we can do it.
with gr.Row():
    with gr.Column(scale=0.85):
        txt = gr.Textbox(
            show_label=False,
            placeholder="Enter text and press enter",
        ).style(container=False)
Save and reload the page. You will see a textbox below the chat interface.
- With the gr.Row() container, we created a layout block. It creates a row inside which other components are placed horizontally, side by side.
- On the second line, we created another layout block inside the previous container with gr.Column(). In contrast to Row, this stacks components or blocks vertically.
- Inside the column container, we defined a Textbox component that takes any text input from users. There are a few parameters we can configure to make it more user-friendly.
- The scale parameter inside the column container scales the column relative to the row. A value of 0.85 means it will occupy 85% of the row’s width.
Add Other Components
If you wish to add any other components, you can use a combination of Row and Column containers. Let’s say we add a radio button to switch between models. This can be done as follows.
with gr.Blocks() as demo:
    radio = gr.Radio(value='gpt-3.5-turbo', choices=['gpt-3.5-turbo', 'gpt-4'], label='models')
    chatbot = gr.Chatbot(value=[], elem_id="chatbot").style(height=650)
    with gr.Row():
        with gr.Column(scale=0.70):
            txt = gr.Textbox(
                show_label=False,
                placeholder="Enter text and press enter",
            ).style(container=False)
We added a Radio component with a default value of ‘gpt-3.5-turbo’ and a choices parameter listing both models.
You can also add a component to show the total usage amount. We can add a textbox component that only displays the usage amount in dollars.
with gr.Blocks() as demo:
    radio = gr.Radio(value='gpt-3.5-turbo', choices=['gpt-3.5-turbo', 'gpt-4'], label='models')
    chatbot = gr.Chatbot(value=[], elem_id="chatbot").style(height=650)
    with gr.Row():
        with gr.Column(scale=0.90):
            txt = gr.Textbox(
                show_label=False,
                placeholder="Enter text and press enter",
            ).style(container=False)
        with gr.Column(scale=0.10):
            cost_view = gr.Textbox(label='usage in $', value=0)
App Backend
With this, we have successfully built the front end of our web application. Now, the remaining part is to make it operational. The first thing we need to do is preprocess the prompts: format the prompts and responses in a way that is appropriate for the API to consume and for the Gradio chat interface to render.
Define a function add_text(). This function will be responsible for formatting messages in the appropriate way.
def add_text(history, text):
    global messages  # messages (a list) is defined globally
    history = history + [(text, '')]
    messages = messages + [{"role": 'user', 'content': text}]
    return history, ""
Here, the history argument is the chat transcript that Gradio renders, a list of lists or tuples, and text is the prompt entered by the user.
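Note that add_text() and the response function below rely on module-level state that the snippets don’t show being created. A minimal initialization sketch, assuming a generic system prompt and a running cost counter, looks like this:

# Module-level state shared by add_text() and generate_response()
messages = [{"role": "system", "content": "You are a helpful assistant."}]
cost = 0.0  # running usage cost in dollars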
Next, define a function that returns a response.
def generate_response(history, model):
    global messages, cost
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0.2,
    )
    response_msg = response.choices[0].message.content
    cost = cost + (response.usage['total_tokens']) * (0.002 / 1000)
    messages = messages + [{"role": 'assistant', 'content': response_msg}]

    # Stream the reply character by character for a typing effect
    for char in response_msg:
        history[-1][1] += char
        # time.sleep(0.05)
        yield history
As you can see above, we call the OpenAI chat completions endpoint with a model name, the accumulated messages, and a temperature value. We receive a response along with token usage stats, which we use to update the running cost (priced here at $0.002 per 1K tokens, gpt-3.5-turbo’s rate at the time of writing). The loop yields the reply character by character so the chatbot appears to type its response, improving the user experience.
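The event wiring in the next snippet also calls a calc_cost function that the article doesn’t show. A minimal sketch, assuming it simply returns the module-level cost counter updated in generate_response(), could be:

def calc_cost():
    # Expose the running cost so Gradio can display it in the cost textbox
    return round(cost, 5)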
Now, add the functionality that triggers these functions when a user submits a prompt. This is how you can do it.
with gr.Blocks() as demo:
    radio = gr.Radio(value='gpt-3.5-turbo', choices=['gpt-3.5-turbo', 'gpt-4'], label='models')
    chatbot = gr.Chatbot(value=[], elem_id="chatbot").style(height=650)
    with gr.Row():
        with gr.Column(scale=0.90):
            txt = gr.Textbox(
                show_label=False,
                placeholder="Enter text and press enter",
            ).style(container=False)
        with gr.Column(scale=0.10):
            cost_view = gr.Textbox(label='usage in $', value=0)

    # Chain the callbacks: format the prompt, stream the response, then update the cost.
    # generate_response takes (history, model), so the radio is passed as the second input.
    txt.submit(add_text, [chatbot, txt], [chatbot, txt], queue=False).then(
        generate_response, inputs=[chatbot, radio], outputs=chatbot).then(
        calc_cost, outputs=cost_view)

demo.queue()
When a user submits text, the add_text function runs first. It takes the chatbot object and the prompt as inputs, and its output updates the chatbot component and clears the textbox. After this, the generate_response function is triggered, which renders the response sequentially in the chatbot. Finally, calc_cost displays the updated cost in the textbox. For sequential rendering to work, enable queuing with demo.queue().
Now, the chat web app is ready.
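One detail worth noting: the snippet above only queues the demo. The gradio CLI launches the app for you, but if you run the file directly with python app.py, you also have to launch it yourself. A common pattern (assumed here, not taken from the original) is:

if __name__ == "__main__":
    demo.launch()  # demo.queue() was already called above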
Possible Improvements
This is great, but the app still leaves room for a few improvements, such as:
- Persistent Storage: The app only keeps messages for a single session. To preserve data across sessions, connect it to a database.
- Deploy to the Cloud: To share your chatbot with others, deploy it on a cloud server such as AWS or GCP.
- Authentication: You can add user authentication if you want multiple users to use your app (see the sketch after this list for Gradio’s built-in option).
- Multimodality: As GPT-4 is multimodal, you can add features to render images as input and output as well.
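For the authentication point, Gradio has a built-in option on launch() itself. A minimal sketch with placeholder credentials:

# Require a username and password before serving the app
demo.launch(auth=("admin", "change-me"))  # replace with real credentials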
Conclusion
We covered a lot of ground, from GPT models to the basic building blocks of Gradio to creating a chatbot. This is just the beginning; with these tools and this knowledge, you can build more exciting apps, such as question-answering bots, multi-modal chatbots, and many more.
So, let’s quickly summarize the article:
- GPT stands for Generative Pre-trained Transformer. The gpt-3.5-turbo and gpt-4 language models power ChatGPT.
- Gradio is an open-source tool that lets us quickly share machine learning models from anywhere with everyone.
- The Interface and Blocks classes allow us to build interactive ML web apps. The Interface is a high-level abstraction of the underlying design elements with limited design flexibility, while the Blocks class allows a lower-level implementation of containers, adding more flexibility.
- Gradio’s pre-built Chatbot component allows for the quick build of a chat interface.
Frequently Asked Questions
Q1. Can GPT-3 be used for chatbots?
A. Yes, GPT-3 can be used for chatbots. It’s a powerful language model that can generate human-like responses based on input.
Q2. Is a GPT-3 chatbot free to use?
A. No, a GPT-3 chatbot is not free. It requires a subscription and usage fees based on the number of tokens processed.
Q3. Can GPT-3 write production-level code?
A. While GPT-3 can generate code snippets, relying solely on it for writing production-level code is not recommended. It’s best suited for assisting with code generation or providing examples.
Q4. Is GPT-3 itself a chatbot?
A. GPT-3 itself is not a chatbot but can be used to create chatbots. It’s a language model designed to understand and generate text, making it a useful tool for building conversational agents.