In this article, we are going to see how to use GroupBy and Sum in SQLAlchemy.
Installing SQLAlchemy
SQLAlchemy is available via pip install package.
pip install sqlalchemy
However, if you are using a flask you can make use of its own implementation of SQLAlchemy. It can be installed using –
pip install flask-sqlalchemy
In the examples, we will make use of common syntax.
- sqlalchemy.create_engine(URL): Creates an engine object that can either be used directly to interact with the database or can be passed to a Session object to work with the ORM. The typical form of a database URL is “dialect+driver://username:password@host:port/database”
- sqlalchemy.select(*entities): The primary construct used to generate SELECT statements. Entities is typically a series of Columns to select.
- sqlalchemy.select(*entities).group_by(column_name): The primary construct used to generate GROUP BY statements.
- sqlalchemy.func.sum(column_name): The SQL SUM() aggregate function.
- sqlalchemy.engine.execute(statement): Executes the given statement and returns a result object.
Using GroupBy and Sum in columns
Example 1:
Before we move ahead, we need to have a database and a table to work with. For this example, we are using mySQL database and have created a sales table. The table has 3 columns and 9 records as shown below.
In the above table, we will use the company column for the grouping and aggregation will be done on the no_of_invoices column.
Python
import sqlalchemy as db # Define the Engine (Connection Object) engine = db.create_engine( # Create the Metadata Object meta_data = db.MetaData(bind = engine) db.MetaData.reflect(meta_data) # Get the `sales` table from the Metadata object SALES = meta_data.tables[ 'sales' ] # SQLAlchemy Query to GROUP BY and aggregate SUM query = db.select([SALES.c.company, db.func. sum (SALES.c.no_of_invoices)]) \ .group_by(SALES.c.company) # Fetch all the records result = engine.execute(query).fetchall() # View the records for record in result: print ( "\n" , "Company:" , record[ 0 ], "| Sum of Invoices:" , record[ 1 ]) |
Output:
Explanation:
- First we import the sqlalchemy library as db for simplicity. All the sqlalchemy object, methods, etc will be imported using db prefix for better clarity.
- We then create the engine which will serve as a connection to the database to perform all the database operations.
- Create the metadata object. The metadata object `metadata` contains all the information about our database.
- Use the metadata information to fetch the sales table from database.
- We can now write an SQLAlchemy query to fetch the required records. We first group by on the basis of company name using the `group_by()` method and then find the sum of the number of invoices using the SQLalchemy’s `func.sum()` function.
- Print the output. In the output we can view that we have the distinct company names and their corresponding sum of invoices produced.
Example 2:
In this example, let us consider the following students table
In this example, we will perform a GROUP BY operation taking two columns as reference.
Python
import sqlalchemy as db # Define the Engine (Connection Object) engine = db.create_engine( # Create the Metadata Object meta_data = db.MetaData(bind = engine) db.MetaData.reflect(meta_data) # Get the `students` table from the Metadata object STUDENTS = meta_data.tables[ 'students' ] # SQLAlchemy Query to GROUP BY and aggregate SUM query = db.select([ STUDENTS.c.first_name, STUDENTS.c.last_name, db.func. sum (STUDENTS.c.score) ]).group_by(STUDENTS.c.first_name, STUDENTS.c.last_name) # Fetch all the records result = engine.execute(query).fetchall() # View the records for record in result: print ( "\n" , record[ 0 ], record[ 1 ], "| Total Score:" , record[ 2 ]) |
Output:
Explanation:
The above code is pretty similar to the one discussed in Example 1 except for the fact that in this example we performed GROUP BY operation on multiple columns, namely, the first_name and the last_name fields. Multiple fields can be mentioned inside the group_by() method as separate parameters.
Example 3:
Considering the students table mentioned in Example 2, let us look at how the output changes if we perform GROUP BY operation on the course field.
Python
import sqlalchemy as db # Define the Engine (Connection Object) engine = db.create_engine( # Create the Metadata Object meta_data = db.MetaData(bind = engine) db.MetaData.reflect(meta_data) # Get the `students` table from the Metadata object STUDENTS = meta_data.tables[ 'students' ] # SQLAlchemy Query to GROUP BY and aggregate SUM query = db.select([ STUDENTS.c.course, db.func. sum (STUDENTS.c.score) ]).group_by(STUDENTS.c.course) # Fetch all the records result = engine.execute(query).fetchall() # View the records for record in result: print ( "\n" , "Course:" , record[ 0 ], "| Total Score:" , record[ 1 ]) |
Output: