In this article, we will see how to convert an SQLAlchemy ORM to Pandas DataFrame using Python.
We need to have the sqlalchemy as well as the pandas library installed in the python environment –
$ pip install sqlalchemy $ pip install pandas
For our example, we will make use of the MySQL database where we have already created a table named students. You are free to use any database but you need to accordingly create its connection string. The raw SQL script for reference for this example is provided below:
CREATE DATABASE Geeks4Geeks; USE Geeks4Geeks; CREATE TABLE students ( first_name VARCHAR(50), last_name VARCHAR(50), course VARCHAR(50), score FLOAT ); INSERT INTO students VALUES ('Ashish', 'Mysterio', 'Statistics', 96), ('Rahul', 'Kumar', 'Statistics', 83), ('Irfan', 'Malik', 'Statistics', 66), ('Irfan', 'Ahmed', 'Statistics', 81), ('John', 'Wick', 'Statistics', 77), ('Mayon', 'Irani', 'Statistics', 55), ('Ashish', 'Mysterio', 'Sociology', 85), ('Rahul', 'Kumar', 'Sociology', 78), ('Irfan', 'Malik', 'Biology', 92), ('Irfan', 'Ahmed', 'Chemistry', 45), ('John', 'Wick', 'Biology', 78), ('Mayon', 'Irani', 'Physics', 78); SELECT * FROM students;
The syntax for converting the SQLAlchemy ORM to a pandas dataframe is the same as you would do for a raw SQL query, given below –
Syntax: pandas.read_sql(sql, con, **kwargs)
Where:
- sql: The SELECT SQL statement to be executed
- con: SQLAlchemy engine object to establish a connection to the database
Please note that you can also use pandas.read_sql_query() instead of pandas.read_sql()
Example 1:
In the above example, we can see that the sql parameter of the pandas.read_sql() method takes in the SQLAlchemy ORM query as we may have defined it without the pandas dataframe conversion. The db.select() will get converted to raw SQL query when read by the read_sql() method. In the output, we have also printed the type of the response object. The output is a pandas DataFrame object where we have fetched all the records present in the student’s table.
Python3
import pandas import sqlalchemy as db from sqlalchemy.ext.declarative import declarative_base Base = declarative_base() # DEFINE THE ENGINE (CONNECTION OBJECT) engine = db.create_engine("mysql + pymysql:\ / / root:password@localhost / Geeks4Geeks") # CREATE THE TABLE MODEL TO USE IT FOR QUERYING class Students(Base): __tablename__ = 'students' first_name = db.Column(db.String( 50 ), primary_key = True ) last_name = db.Column(db.String( 50 ), primary_key = True ) course = db.Column(db.String( 50 )) score = db.Column(db. Float ) # SQLAlCHEMY ORM QUERY TO FETCH ALL RECORDS df = pandas.read_sql_query( sql = db.select([Students.first_name, Students.last_name, Students.course, Students.score]), con = engine ) print ( "Type:" , type (df)) print () print (df) |
Output:
Example 2:
In this example, we have used the session object to bind the connection engine. We have also applied a filter() method which is equivalent to the WHERE clause in SQL. It consists of the condition that the picked records should contain the first name and the last name of those students who have a score of greater than 80. One thing worth noting here is that, for the queries build using the session object, we need to use the statement attribute to explicitly convert into a raw SQL query. This is required because without the statement attribute, it will be a sqlalchemy.orm.query.Query object which cannot be executed by the pandas.read_sql() method and you will get a sqlalchemy.exc.ObjectNotExecutableError error. The above output shows the type of object which is a pandas DataFrame along with the response.
Python3
import pandas import sqlalchemy as db from sqlalchemy.ext.declarative import declarative_base Base = declarative_base() # DEFINE THE ENGINE (CONNECTION OBJECT) engine = db.create_engine("mysql + pymysql:\ / / root:password@localhost / Geeks4Geeks") # CREATE THE TABLE MODEL TO USE IT FOR QUERYING class Students(Base): __tablename__ = 'students' first_name = db.Column(db.String( 50 ), primary_key = True ) last_name = db.Column(db.String( 50 ), primary_key = True ) course = db.Column(db.String( 50 )) score = db.Column(db. Float ) # CREATE A SESSION OBJECT TO INITIATE QUERY IN DATABASE from sqlalchemy.orm import sessionmaker Session = sessionmaker(bind = engine) session = Session() # SQLAlCHEMY ORM QUERY TO FETCH ALL RECORDS df = pandas.read_sql_query( sql = session.query(Students.first_name, Students.last_name). filter ( Students.score > 80 ).statement, con = engine ) print ( "Type:" , type (df)) print () print (df) |
Output: