Wednesday, November 27, 2024
Google search engine
HomeData Modelling & AIOne-stop-shop for Connecting Snowflake to Python!

One-stop-shop for Connecting Snowflake to Python!

This article was published as a part of the Data Science Blogathon

In this article, we will learn to connect the Snowflake database with Python using an external browser authentication and key pair authentication. Also, we will learn how to install the snowflake python connector and snowflake SQLAlchemy package to perform the snowflake data read and write operations.

Introduction to Snowflake

If you are reading this post then you might be aware of the snowflake database. It is a cloud computing-based data warehouse that supports SQL for its operations. It is getting extremely popular due to its scalability, multi-language support, ability to handle massive data and the list goes on. It is compatible with Amazon web services, Microsoft Azure, and Google Cloud Platform.

 Connecting Snowflake to Python introduction

Image is taken from https://www.snowflake.com/

In this article, our main focus is not on the snowflake DB here but the connection of snowflake with python and few errors which we encountered while connecting snowflake to python.

Different ways of connecting Python to Snowflake DB

Here I have given a quick overview of how we can connect snowflake to python using the snowflake connector, SQLAlchemy engine, and private key. I have also provided the references(from snowflake documentation) at the end of the article if you wish to explore more on this topic. Let us check the different ways of connections in detail.

Python to Snowflake DB connection Using Snowflake Connector with external browser authentication

First of all, we need to install pandas supported python connector using the below command via Jupyter notebook

!pip install snowflake-connector-python[pandas]

If you are working on your client’s environment then you need to provide the below details in order to connect your python to snowflake DB. An important point to note here is that password is not required in the case of SSO login when we provide authenticator as ‘externalbrowser‘. Google Chrome is the most compatible browser for this operation.

import snowflake.connector
import pandas as pd
import numpy as np
ctx = snowflake.connector.connect(
    account = '<account name>',
    user = '<username>',
    schema = '<schema name>',
    warehouse='<warehouse name>',
    role = '<As specified by Snowflake Admin>',
    authenticator='externalbrowser',
)
cur = ctx.cursor()

Now we can perform multiple operations like read or write data from snowflake to python and vice versa with the help of this cursor object.

Python to Snowflake DB connection Using Snowflake Sqlalchemy with external browser authentication

Since we have installed the python connector earlier now we will install the Snowflake SQLAlchemy package using the below command via Jupyter notebook

!pip install --upgrade snowflake-sqlalchemy

Assuming you are working in the client’s environment and using SSO external browser authentication below query is handy for connection.

import pandas as pd
import numpy as np
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL
url = URL(
    account = '<account name>',
    user = '<username>',
    database = '<database name> ,
    schema = '<schema name>',
    warehouse= '<warehouse name>',
    role = '<As specified by Snowflake Admin>',
    authenticator='externalbrowser',
)
engine = create_engine(url)
connection = engine.connect()
  • Data Read and Write operation

Now we can fetch the data from the snowflake DB table in python data frame via the below simple commands

query = '''select * from <snowflake DB tablename>'''
data = pd.read_sql(query, connection)

Similarly, we can also write python data to snowflake tables as below. This operation will truncate and load the snowflake table. We can use ‘append’ in place of ‘replace’ if we want delta load.

data.to_sql('<snowflake table>', engine, if_exists='replace', index=False, index_label=None)
  • Snapshot of connection using ‘externalbrowser‘ and data load

In the below snapshot, you can see how to connect python to snowflake using external browser authentication. Connection details are hidden as it was client-specific data.

 

 Connecting Snowflake to Python snapshot

Python to Snowflake DB connection Using Key Pair Authentication and reuse the existing connection to create an engine for SQLAlchemy

In addition to regular details (e.g. account name, user, database, etc) we need two other important details. One of them is of the path where we have stored the private key in snowflake and second the password to access the key

Below is the python code which is required when we need to connect snowflake using a private key generated earlier in the snowflake environment. We will use this connection to create an SQLAlchemy engine as well due to which it would be easy to read or write data using SQL query. 

 

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.primitives.asymmetric import dsa
from cryptography.hazmat.primitives import serialization
with open(path, "rb") as key:
    p_key= serialization.load_pem_private_key(key.read(),password='<password>'.encode(),backend=default_backend())
    pkb = p_key.private_bytes(encoding=serialization.Encoding.DER,
                              format=serialization.PrivateFormat.PKCS8,encryption_algorithm=serialization.NoEncryption())
    ctx = snowflake.connector.connect(user=user,account=account,private_key=pkb,warehouse=warehouse,database=database)
if ctx:
    eng = sqlalchemy.create_engine(url,poolclass=sqlalchemy.pool.StaticPool,creator = lambda:ctx)
else:
    eng=sqlalchemy.create_engine(url,creator=get_connect)
con = eng.connect()
url = URL(account = '<accountname>',user = '<username>',database = '<databasename>',warehouse= '<warehouse>',role = '<as specified by snowflake admin>')

 

Error during key pair connection

In case you encounter the below error message after executing the above code

Could not deserialize key data“.

Without fail, you need to check the path you have provided where your RSA key resides. Sometimes it happens that your key is corrupted or not accessible due to any reason then also you can get the above error. In that case, you may notify your snowflake DBA to generate a new RSA key.

We can fetch data from snowflake tables and store them as pandas data frames using the below code.

cur = ctx.cursor()
query = '''select * from <snowflake DB tablename>'''
cur.execute(query)
data = pd.DataFrame.from_records(iter(cur), columns=[x[0] for x in cur.description])
cur.close()

 

Closing Notes

Always remember to close the cursor when you have executed all the commands related to that cursor and you no longer needed the connection.

In case we need to access two different schema tables then we can also create two different cursors to execute different queries related to those two tables.

Hope you found this article valuable in your quest of connecting Python and the Snowflake database.

 

References

https://docs.snowflake.com/en/user-guide/sqlalchemy.html

https://docs.snowflake.com/en/user-guide/python-connector-pandas.html

https://docs.snowflake.com/en/user-guide/python-connector-example.html

The media shown in this article are not owned by Analytics Vidhya and is used at the Author’s discretion.

Himanshu Kunwar

28 May 2021

RELATED ARTICLES

Most Popular

Recent Comments