Friday, December 27, 2024
Google search engine
HomeLanguagesSQLAlchemy ORM conversion to Pandas DataFrame

SQLAlchemy ORM conversion to Pandas DataFrame

In this article, we will see how to convert an SQLAlchemy ORM to Pandas DataFrame using Python.

We need to have the sqlalchemy as well as the pandas library installed in the python environment –

$ pip install sqlalchemy
$ pip install pandas

For our example, we will make use of the MySQL database where we have already created a table named students. You are free to use any database but you need to accordingly create its connection string. The raw SQL script for reference for this example is provided below:

CREATE DATABASE Geeks4Geeks;
USE Geeks4Geeks;

CREATE TABLE students (
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    course VARCHAR(50),
    score FLOAT
);

INSERT INTO students VALUES
('Ashish', 'Mysterio', 'Statistics', 96),
('Rahul', 'Kumar', 'Statistics', 83),
('Irfan', 'Malik', 'Statistics', 66),
('Irfan', 'Ahmed', 'Statistics', 81),
('John', 'Wick', 'Statistics', 77),
('Mayon', 'Irani', 'Statistics', 55),
('Ashish', 'Mysterio', 'Sociology', 85),
('Rahul', 'Kumar', 'Sociology', 78),
('Irfan', 'Malik', 'Biology', 92),
('Irfan', 'Ahmed', 'Chemistry', 45),
('John', 'Wick', 'Biology', 78),
('Mayon', 'Irani', 'Physics', 78);

SELECT * FROM students;

The syntax for converting the SQLAlchemy ORM to a pandas dataframe is the same as you would do for a raw SQL query, given below –

Syntax: pandas.read_sql(sql, con, **kwargs)

Where:

  • sql: The SELECT SQL statement to be executed
  • con: SQLAlchemy engine object to establish a connection to the database

Please note that you can also use pandas.read_sql_query() instead of pandas.read_sql()

Example 1:

In the above example, we can see that the sql parameter of the pandas.read_sql() method takes in the SQLAlchemy ORM query as we may have defined it without the pandas dataframe conversion. The db.select() will get converted to raw SQL query when read by the read_sql() method. In the output, we have also printed the type of the response object. The output is a pandas DataFrame object where we have fetched all the records present in the student’s table.

Python3




import pandas
import sqlalchemy as db
from sqlalchemy.ext.declarative import declarative_base
 
Base = declarative_base()
 
# DEFINE THE ENGINE (CONNECTION OBJECT)
engine = db.create_engine("mysql+pymysql:\
//root:password@localhost/Geeks4Geeks")
 
# CREATE THE TABLE MODEL TO USE IT FOR QUERYING
class Students(Base):
 
    __tablename__ = 'students'
 
    first_name = db.Column(db.String(50),
                           primary_key=True)
    last_name  = db.Column(db.String(50),
                           primary_key=True)
    course     = db.Column(db.String(50))
    score      = db.Column(db.Float)
 
# SQLAlCHEMY ORM QUERY TO FETCH ALL RECORDS
df = pandas.read_sql_query(
    sql = db.select([Students.first_name,
                     Students.last_name,
                     Students.course,
                     Students.score]),
    con = engine
)
 
print("Type:", type(df))
print()
print(df)


Output:

Example 2:

In this example, we have used the session object to bind the connection engine. We have also applied a filter() method which is equivalent to the WHERE clause in SQL. It consists of the condition that the picked records should contain the first name and the last name of those students who have a score of greater than 80. One thing worth noting here is that, for the queries build using the session object, we need to use the statement attribute to explicitly convert into a raw SQL query. This is required because without the statement attribute, it will be a sqlalchemy.orm.query.Query object which cannot be executed by the pandas.read_sql() method and you will get a sqlalchemy.exc.ObjectNotExecutableError error. The above output shows the type of object which is a pandas DataFrame along with the response.

Python3




import pandas
import sqlalchemy as db
from sqlalchemy.ext.declarative import declarative_base
 
Base = declarative_base()
 
# DEFINE THE ENGINE (CONNECTION OBJECT)
engine = db.create_engine("mysql+pymysql:\
//root:password@localhost/Geeks4Geeks")
 
# CREATE THE TABLE MODEL TO USE IT FOR QUERYING
class Students(Base):
 
    __tablename__ = 'students'
 
    first_name = db.Column(db.String(50),
                           primary_key=True)
    last_name  = db.Column(db.String(50),
                           primary_key=True)
    course     = db.Column(db.String(50))
    score      = db.Column(db.Float)
 
# CREATE A SESSION OBJECT TO INITIATE QUERY IN DATABASE
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind = engine)
session = Session()
 
# SQLAlCHEMY ORM QUERY TO FETCH ALL RECORDS
df = pandas.read_sql_query(
    sql = session.query(Students.first_name,
                        Students.last_name).filter(
      Students.score > 80).statement,
    con = engine
)
 
print("Type:", type(df))
print()
print(df)


Output:

RELATED ARTICLES

Most Popular

Recent Comments