
How to write Pandas DataFrame to PostgreSQL table?

In this article, we will look at some methods to write a Pandas DataFrame to a PostgreSQL table in Python.

Method 1: Using to_sql() function

The to_sql() function is used to write the records of a DataFrame to a SQL database table.

Syntax:

df.to_sql('data', con=conn, if_exists='replace', index=False)

Parameters:

  • 'data': name of the table to be written to.
  • con: connection to the database (a SQLAlchemy engine or connection).
  • if_exists: what to do if the table already exists: 'fail' (default), 'replace' or 'append'.
  • index: whether to write the DataFrame index as a column (True or False).
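
A minimal sketch of these parameters in use, assuming a local PostgreSQL server and a small throwaway DataFrame (the connection URL, the database test_db and the table demo_table below are placeholders, not part of the article's dataset):

Python3

# minimal, hypothetical to_sql() sketch; the connection URL and the
# table/database names are placeholders
import pandas as pd
from sqlalchemy import create_engine

# connection string format: postgresql://user:password@host:port/database
engine = create_engine("postgresql://postgres:pass@127.0.0.1:5432/test_db")

df = pd.DataFrame({"id": [1, 2], "airline": ["AirA", "AirB"]})

# 'replace' drops and recreates the table, 'append' adds rows,
# 'fail' (the default) raises an error if the table already exists
df.to_sql("demo_table", con=engine, if_exists="replace", index=False)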

Example:

In the example below, we import the required packages and modules, establish a connection to the PostgreSQL database, and write the DataFrame to a PostgreSQL table using the to_sql() method. Finally, all rows are fetched with the cursor's fetchall() method.

Both examples below read the same CSV file, airlines_final.csv.

Python3




# import packages
import psycopg2
import pandas as pd
from sqlalchemy import create_engine

# establish connections
# (the SQLAlchemy connection string uses the same credentials
#  as the psycopg2 connection below)
conn_string = 'postgresql://postgres:pass@127.0.0.1:5432/Airlines_Database'

db = create_engine(conn_string)
conn = db.connect()

conn1 = psycopg2.connect(
    database="Airlines_Database",
    user='postgres',
    password='pass',
    host='127.0.0.1',
    port='5432'
)
  
conn1.autocommit = True
cursor = conn1.cursor()
  
# drop table if it already exists
cursor.execute('drop table if exists airlines_final')
  
sql = '''CREATE TABLE airlines_final(id int,
day int, airline char(20), destination char(20));'''
  
cursor.execute(sql)
  
# import the csv file to create a dataframe
data = pd.read_csv("airlines_final.csv")
  
data = data[["id","day","airline","destination"]]
# Create DataFrame
print(data)
  
# converting data to sql
data.to_sql('airlines_final', conn, if_exists='replace')
  
# fetching all rows
sql1='''select * from airlines_final;'''
cursor.execute(sql1)
for i in cursor.fetchall():
    print(i)
  
conn1.commit()
conn1.close()


Output:

Method 2: Using execute_values() function

The execute_values() function from the psycopg2 library is used to insert all rows of a DataFrame into a PostgreSQL table in one bulk operation.

Syntax:

psycopg2.extras.execute_values(cur, sql, argslist, template=None, page_size=100, fetch=False)

Parameters:

  • cur – the cursor that will be used to run the query.
  • sql – the query to run. It must contain a single %s placeholder, which will be replaced by a VALUES list.
  • argslist – a sequence of sequences or dictionaries containing the query's arguments. Their type and content must be consistent with the template.
  • template – the snippet merged into each item of argslist to compose the query.
  • page_size – the maximum number of argslist items that each generated statement can contain.
  • fetch – if True, the results of the query are returned (similar to fetchall()); defaults to False.
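
Before the full example, here is a minimal standalone sketch of execute_values(), showing the single %s placeholder and the optional page_size argument; the connection details and the table demo_table are placeholders, and the table is assumed to exist already:

Python3

# minimal, hypothetical execute_values() sketch; connection details and
# the table name are placeholders
import psycopg2
import psycopg2.extras as extras

conn = psycopg2.connect(database="test_db", user="postgres",
                        password="pass", host="127.0.0.1", port="5432")
cur = conn.cursor()

# assumes a table demo_table(id int, day char(20)) already exists
rows = [(1, "Monday"), (2, "Tuesday")]

# the single %s is expanded into a VALUES list built from `rows`;
# page_size caps how many rows go into each generated statement
extras.execute_values(cur,
                      "INSERT INTO demo_table (id, day) VALUES %s",
                      rows,
                      page_size=100)
conn.commit()
cur.close()
conn.close()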

Example:

In this example, the same CSV file is used. The code begins by importing the required packages; we then define a custom function, execute_values(), which takes the connection, the DataFrame, and the table name as arguments. The DataFrame's rows are inserted into the PostgreSQL table using psycopg2's execute_values() method. The function contains an exception-handling block; if no exception occurs, "execute_values() done" is printed.

Python3




# import packages
import psycopg2
import psycopg2.extras as extras
import pandas as pd
  
  
def execute_values(conn, df, table):

    # convert each dataframe row into a tuple
    tuples = [tuple(x) for x in df.to_numpy()]

    # comma-separated string of column names
    cols = ','.join(list(df.columns))

    # SQL query to execute
    query = "INSERT INTO %s(%s) VALUES %%s" % (table, cols)
    cursor = conn.cursor()
    try:
        extras.execute_values(cursor, query, tuples)
        conn.commit()
    except (Exception, psycopg2.DatabaseError) as error:
        print("Error: %s" % error)
        conn.rollback()
        cursor.close()
        return 1
    print("execute_values() done")
    cursor.close()
  
  
# establishing connection
conn = psycopg2.connect(
    database="Airlines_Database",
    user='postgres',
    password='sherlockedisi',
    host='127.0.0.1',
    port='5432'
)
sql = '''CREATE TABLE airlines_final1(id int, day
char(20), airline char(20), destination char(20));'''
  
# creating a cursor
cursor = conn.cursor()
cursor.execute(sql)
data = pd.read_csv("airlines_final.csv")
  
data = data[["id", "day", "airline", "destination"]]
  
# using the function defined
execute_values(conn, data, 'airlines_final1')


Output:

execute_values() done
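
To confirm that the rows actually reached airlines_final1, you can reuse the psycopg2 connection from the example above and run a quick SELECT (this verification step is an addition, not part of the original example):

Python3

# optional check: fetch the inserted rows back using the same `conn`
cursor = conn.cursor()
cursor.execute("SELECT * FROM airlines_final1;")
for row in cursor.fetchall():
    print(row)
cursor.close()
conn.close()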
