In this article, we will see how to import Kaggle Datasets into Google Colab.
Getting Started
Here, we cover three different methods for working with a Kaggle dataset in Colab. In the first two methods, we use the Kaggle API to download the dataset directly into the notebook (through the opendatasets library and through the kaggle command-line tool, respectively). In the third method, we download the dataset manually from the Kaggle website and upload it to Colab for production or analysis work. Before starting, log in to your Google account, then go to https://colab.research.google.com.
Method 1: Downloading Kaggle Dataset in Google Colab Notebook
Step 1: Open your Google Colab Notebook
Step 2: Download and Install the required packages.
! pip install opendatasets
! pip install pandas
Step 3: Visit www.kaggle.com, go to your profile, and click on Account.
Step 4: On the following page you will see an API section with a “Create New API Token” button. Click it to download a kaggle.json file, which contains your username and key. We will use the username and key in the next step.
Step 5: Import the opendatasets library and download your Kaggle dataset by passing its link to od.download(). When prompted, enter the username and key from kaggle.json.
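To make the contents of the token file concrete, here is a minimal sketch that reads kaggle.json and returns the two values it holds. The helper name load_kaggle_credentials is hypothetical (it is not part of any Kaggle library), and the sketch assumes the file is a JSON object with "username" and "key" fields, as described in Step 4.

```python
import json
from pathlib import Path


def load_kaggle_credentials(token_path):
    """Return (username, key) read from a kaggle.json API token file.

    Hypothetical helper for illustration; assumes the file is a JSON
    object with "username" and "key" fields.
    """
    data = json.loads(Path(token_path).read_text(encoding="utf-8"))

    # Fail early with a clear message if either field is absent or empty.
    missing = [field for field in ("username", "key") if not data.get(field)]
    if missing:
        raise ValueError(f"kaggle.json is missing: {', '.join(missing)}")

    return data["username"], data["key"]
```

Calling load_kaggle_credentials("kaggle.json") on the downloaded file gives the two values to type in when the download step asks for them. Treat the key like a password and avoid printing it in a shared notebook.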
Python3
import opendatasets as od
import pandas

od.download(
    "https://www.kaggle.com/datasets/"
    "muratkokludataset/acoustic-extinguisher-fire-dataset")
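Once the download finishes, it helps to check which files actually arrived before trying to read them. The following is a small sketch using only the standard library; list_dataset_files is a hypothetical helper, and the folder name you pass to it is an assumption that depends on where the download was written.

```python
from pathlib import Path


def list_dataset_files(root):
    """Return the relative paths of all files under `root`, sorted.

    Hypothetical helper for illustration; `root` is whatever folder
    the dataset was downloaded into.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"No such dataset folder: {root}")

    # Walk the folder recursively and keep files only, not directories.
    return sorted(
        path.relative_to(root_path).as_posix()
        for path in root_path.rglob("*")
        if path.is_file()
    )
```

For example, list_dataset_files(".") lists everything under the current working directory of the notebook, which is usually enough to spot the dataset folder and the exact file names inside it.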
Step 6: Now we are ready to use our dataset. Depending on the format of the downloaded file, we can:
- Read the data from an Excel file
- Read the data from a CSV file
- Read the data from a text file
Python3
import pandas as pds

# reading the XLSX file
file = ('Acoustic_Extinguisher_Fire_Dataset/'
        'Acoustic_Extinguisher_Fire_Dataset.xlsx')
newData = pds.read_excel(file)

# displaying the contents of the XLSX file
newData.head()
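Since Step 6 mentions Excel, CSV, and text files, one possible way to handle all three is to pick the pandas reader from the file extension. This is only a sketch: reader_name_for and load_dataset are hypothetical helpers, and mapping .txt to read_table assumes a tab-delimited text file, which may not match your dataset.

```python
from pathlib import Path

# File extension -> name of the pandas reader function that handles it.
READERS = {
    ".xlsx": "read_excel",
    ".xls": "read_excel",
    ".csv": "read_csv",
    ".txt": "read_table",  # assumption: tab-delimited text
}


def reader_name_for(path):
    """Return the name of the pandas reader suited to `path`."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix or path}")
    return READERS[suffix]


def load_dataset(path, **kwargs):
    """Load `path` into a DataFrame with the reader chosen above."""
    import pandas as pd  # imported here so reader_name_for works without pandas

    reader = getattr(pd, reader_name_for(path))
    return reader(path, **kwargs)
```

Extra keyword arguments are passed straight to the reader, so a text file with a different delimiter could be loaded with load_dataset("data.txt", sep=";").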
Method 2: By Installing Kaggle In our Colab Notebook
Step 1: Select any dataset from Kaggle
Step 2: Download the Kaggle API Token
We will download the Kaggle API token from the Account section of our Kaggle profile. The token file is named kaggle.json.
Step 3: Setup the Colab Notebook
To download the dataset into the Google Colab notebook, we first install the kaggle package in the Colab environment. We then upload the kaggle.json file to the Colab session, place it where the Kaggle client looks for it, and restrict its permissions so that the client can use it to authenticate the download.
- Install the Kaggle library
! pip install kaggle
- Make a directory named “.kaggle”
! mkdir -p ~/.kaggle
- Copy the “kaggle.json” into this new directory
! cp kaggle.json ~/.kaggle/
- Restrict the permissions on this file so that only you can read and write it.
! chmod 600 ~/.kaggle/kaggle.json
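The directory, copy, and permission steps above can also be done from Python, which is convenient when the setup lives in a single notebook cell. The sketch below mirrors the three shell commands with the standard library; install_kaggle_token is a hypothetical helper, and its home parameter exists only so the function can be tried against a scratch directory.

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(token_file="kaggle.json", home=None):
    """Copy the API token into <home>/.kaggle/ with owner-only permissions.

    Hypothetical helper for illustration; `home` defaults to the
    current user's home directory.
    """
    home_dir = Path(home) if home is not None else Path.home()

    kaggle_dir = home_dir / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)  # mkdir -p ~/.kaggle

    target = kaggle_dir / "kaggle.json"
    shutil.copyfile(token_file, target)            # cp kaggle.json ~/.kaggle/
    os.chmod(target, 0o600)                        # chmod 600 ~/.kaggle/kaggle.json
    return target
```

Calling install_kaggle_token() with no arguments assumes kaggle.json has already been uploaded to the notebook's current working directory.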
Step 4: Download the Dataset into Colab
To download the dataset into Colab, we use the kaggle command followed by the dataset identifier, which is the part of the dataset URL after /datasets/.
- For downloading a dataset
Suppose our dataset web link is
https://www.kaggle.com/datasets/gauravduttakiit/cassava-leaf-disease-classification
then we will type
! kaggle datasets download gauravduttakiit/cassava-leaf-disease-classification
- For downloading competition data
Suppose our competition data link is
https://www.kaggle.com/competitions/playground-series-s3e14
then we will type
! kaggle competitions download playground-series-s3e14
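The mapping from a Kaggle web link to the command typed above follows a simple pattern, which the sketch below spells out. kaggle_download_command is a hypothetical helper that only builds the command string from the two URL shapes shown in this step; it does not run anything or contact Kaggle.

```python
from urllib.parse import urlparse


def kaggle_download_command(url):
    """Build the kaggle CLI download command for a dataset or competition URL.

    Hypothetical helper for illustration; it handles only the two URL
    shapes used in this article.
    """
    # Split the URL path into its non-empty segments,
    # e.g. ["datasets", "<owner>", "<dataset-name>"].
    parts = [segment for segment in urlparse(url).path.split("/") if segment]

    if len(parts) >= 3 and parts[0] == "datasets":
        return f"kaggle datasets download {parts[1]}/{parts[2]}"
    if len(parts) >= 2 and parts[0] == "competitions":
        return f"kaggle competitions download {parts[1]}"
    raise ValueError(f"Not a Kaggle dataset or competition URL: {url}")
```

In a Colab cell the returned string would be run with a leading "!", exactly like the commands shown above.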
Method 3: By manually downloading the Kaggle dataset.
Step 1: Visit the Kaggle website and select the Datasets tab.
Step 2: Select any dataset and click on Download.
Step 3: The downloaded file will be a ZIP archive. Unzip it.
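If you would rather upload the ZIP archive as it is and unpack it inside Colab, the standard zipfile module can do the extraction. This is a minimal sketch; extract_archive is a hypothetical helper, and extracting into a folder named after the archive is just a convenient default, not something Kaggle requires.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir=None):
    """Extract a ZIP archive and return the sorted names of its members.

    Hypothetical helper for illustration; by default the files go into
    a folder next to the archive, named after it without ".zip".
    """
    zip_path = Path(zip_path)
    dest = Path(dest_dir) if dest_dir is not None else zip_path.with_suffix("")
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())
```

The returned list shows the file names inside the archive, which is useful for building the path passed to pandas later.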
Step 4: Upload your dataset file or folder to the Google Colab notebook. Clicking the upload option in the notebook's Files panel lets you choose the file or folder to upload.
Step 5: We have now successfully uploaded our dataset to the Google Colab notebook.
Step 6: Now you are ready to use your Kaggle dataset.
Python3
import pandas as pds

# reading the XLSX file
file = ('Acoustic_Extinguisher_Fire_Dataset/'
        'Acoustic_Extinguisher_Fire_Dataset.xlsx')
newData = pds.read_excel(file)

# displaying the contents of the XLSX file
newData.head()