In this article, we will discuss how to connect pandas to a database and perform database operations using SQLAlchemy.
The first step is to establish a connection with your existing database, using the create_engine() function of SQLAlchemy.
Syntax:
from sqlalchemy import create_engine
engine = create_engine("dialect+driver://username:password@host:port/database")
Explanation:
- dialect – Name of the DBMS
- driver – Name of the DB API that moves information between SQLAlchemy and the database.
- Username, Password – DB User credentials
- host:port – Host address and port number of the database server.
- Database – Database name
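To make the pattern concrete, here is a minimal sketch. The PostgreSQL URL below uses hypothetical credentials purely for illustration; the actual engine is created against an in-memory SQLite database, which needs no server or credentials, so the snippet runs anywhere SQLAlchemy is installed.

```python
from sqlalchemy import create_engine

# Hypothetical PostgreSQL URL following the pattern above
# (username, password, host, and database name are placeholders)
pg_url = "postgresql+psycopg2://myuser:mypassword@localhost:5432/mydb"

# SQLite follows the same URL pattern but needs no server,
# so it is handy for local experiments
engine = create_engine("sqlite:///:memory:")
```

The engine object is created lazily; no connection is opened until it is actually used, e.g. by to_sql() or read_sql().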
Connecting Pandas to a Database with SQLAlchemy
Syntax: pandas.DataFrame.to_sql(name, con, if_exists, index)
Explanation:
- name – Name of the table in which the data is to be stored
- con – The SQLAlchemy engine connected to the database
- if_exists – By default, pandas raises an error if the table already exists. Use ‘replace’ to overwrite the existing table with the new data, or ‘append’ to add the data to the existing table.
- index – (bool) Whether to write the DataFrame index as a column in the table.
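As a quick, self-contained illustration of these parameters, the sketch below writes a tiny hand-made dataframe (hypothetical loan records, not the article's dataset) to an in-memory SQLite database; a PostgreSQL engine would be used in exactly the same way.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite database for illustration;
# swap in a PostgreSQL engine for real use
engine = create_engine("sqlite:///:memory:")

# Hypothetical sample data
df = pd.DataFrame({"Loan_ID": ["LP001", "LP002"],
                   "LoanAmount": [120, 250]})

# if_exists='replace' drops and recreates the table if it exists;
# index=False omits the DataFrame index column
df.to_sql("loan_data", engine, if_exists="replace", index=False)
```

Running to_sql() a second time with if_exists='replace' simply overwrites the table, which makes the snippet safe to re-run.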
For this example, we use a PostgreSQL database, one of the simpler options to set up, but the procedure is the same for any other database supported by SQLAlchemy. You can download the sample dataset here.
Let us first import the necessary packages. Next, establish the connection with the PostgreSQL database and make it accessible to Python using the psycopg2 driver. Finally, load the dataframe and push it to our PostgreSQL database using the to_sql() function as shown.
Python3
# import necessary packages
import pandas
import psycopg2
from sqlalchemy import create_engine

# establish connection with the database
engine = create_engine(
    "dialect+driver://username:password@hostname:portnumber/databasename")

# read the pandas dataframe
data = pandas.read_csv("path to dataset")

# write the pandas dataframe to a postgresql table
data.to_sql('loan_data', engine, if_exists='replace')
Output:
This will create a table named loan_data in the PostgreSQL database.
Converting a PostgreSQL table to pandas dataframe
Just as we wrote a dataframe to a PostgreSQL table above, we can convert a PostgreSQL table back to a pandas dataframe using the read_sql_table() function. Here, let us read back the loan_data table.
Syntax: pandas.read_sql_table(table_name, con, columns)
Explanation:
- table_name – Name of the table to be read
- con – The SQLAlchemy engine connected to the database
- columns – List of columns to be read from the SQL table
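Because read_sql_table() needs a live database to run against, the self-contained sketch below first writes a small hypothetical table to an in-memory SQLite engine and then reads back only a subset of its columns; the call works identically against PostgreSQL.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite engine so the example runs without a server
engine = create_engine("sqlite:///:memory:")

# Write a small hypothetical table first so there is
# something to read back
pd.DataFrame({"Loan_ID": ["LP001", "LP002"],
              "Gender": ["Male", "Female"],
              "LoanAmount": [120, 250]}).to_sql(
    "loan_data", engine, if_exists="replace", index=False)

# Read back only the selected columns
table_df = pd.read_sql_table("loan_data", con=engine,
                             columns=["Loan_ID", "LoanAmount"])
```

Passing columns avoids transferring data you do not need, which matters for wide tables.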
Python3
# import necessary packages
import pandas as pd
import psycopg2
from sqlalchemy import create_engine

# establish connection with the database
engine = create_engine(
    "dialect+driver://username:password@hostname:portnumber/databasename")

# read the postgresql table
table_df = pd.read_sql_table(
    "loan_data",
    con=engine,
    columns=['Loan_ID', 'Gender', 'Married', 'Dependents',
             'Education', 'Self_Employed', 'ApplicantIncome',
             'CoapplicantIncome', 'LoanAmount', 'Loan_Amount_Term',
             'Credit_History', 'Property_Area', 'Loan_Status'])

# print the postgresql table loaded as a pandas dataframe
print(table_df)
Output: The loan_data table is printed as a pandas dataframe.
Passing SQL queries to query table data
We can also pass SQL queries to the read_sql() function to read only specific columns or records from the PostgreSQL database. The procedure is otherwise the same, and the SQL syntax is the conventional syntax for querying data from a SQL table. The example below shows how to get all records of the loan_data table using a SQL query.
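Since any SQL the backend understands can be passed, a WHERE clause is an easy way to fetch only matching records. The self-contained sketch below uses an in-memory SQLite engine and hypothetical data to show the idea; the same call works against PostgreSQL.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite engine so the example runs without a server
engine = create_engine("sqlite:///:memory:")

# Hypothetical sample data
pd.DataFrame({"Loan_ID": ["LP001", "LP002", "LP003"],
              "LoanAmount": [120, 250, 66]}).to_sql(
    "loan_data", engine, if_exists="replace", index=False)

# Pass a filter so only matching rows are transferred
sql_df = pd.read_sql("SELECT * FROM loan_data WHERE LoanAmount > 100",
                     con=engine)
```

Filtering in SQL rather than in pandas lets the database do the work and avoids loading rows you will immediately discard.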
Python3
# import necessary packages
import pandas as pd
import psycopg2
from sqlalchemy import create_engine

# establish connection with the database
engine = create_engine(
    "dialect+driver://username:password@hostname:portnumber/databasename")

# read table data using a sql query
sql_df = pd.read_sql("SELECT * FROM loan_data", con=engine)
print(sql_df)
Output: All records of the loan_data table are printed as a pandas dataframe.