Monday, November 18, 2024
Google search engine
HomeLanguagesCollecting data with Google forms and Pandas

Collecting data with Google forms and Pandas

In this article, we will discuss how to collect data from Google forms using Pandas. 

Modules Needed

  • pydrive: This is used to simplify many common Google Drive API tasks. To install this type the below command in the terminal.
pip install pydrive
  • xlrd: This module is used to extract data from a spreadsheet. To install this type the below command in the terminal.
pip install xlrd
  • openpyxl: This is a Python library for reading and writing Excel. To install this type the below command in the terminal.
pip install openpyxl

Getting Started

The First important part of the whole process is to create a Google Form. Visit this link and create a new form.

Form Creation in Google Forms

When we are done setting up the questions, click Send button to publish the form. To access this Google form, click here.

The next step is to authenticate our Python Script and Local Environment with Google and access the Google Form Data from Google Drive by enabling the Google Drive API in Google Cloud Platform. Follow these steps:

  1. Goto the Google Form
  2. Responses Tab,
  3. Click the spreadsheet icon
  4. And create a new Destination Spreadsheet for our google form.

Link a new spreadsheet and Google Form

This Spreadsheet will be created and stored in Google Drive. We have to integrate Google Drive and Python Environment. First, we have to authenticate with Google.

Authenticating the Google Drive API

Now, to work with Google Drive API, we have to set up our account, enable Google Drive API and obtain our client_secrets.json key.

1. Go to the Google Cloud Platform Console, click on New Project / Projects → New Project, on the GCP Toolbar.

Click on the Project Name and then select New Project

2. Enter a name for our project and click Create.

Creation of a GCP Project

3. In the home screen of our project, click on the Navigation Bar in the Top-Left Corner and Click APIs and Services, and Select OAuth Consent Screen. 

Navigating to Oauth Consent Screen

4. In the OAuth Consent Screen Homepage, select External and click Create

Oauth Initial Step

5. In the next screen, Enter the mandatory details that include App Name, User Support Email, and Developer Contact Information. Click on “Save and Continue” and proceed to the Summary Page.

App Details

6. In the Summary Screen, click “Back to Dashboard”. We will now have an option to Publish our App. Now We are ready to Push our App to production.

Publish our App

7. Now that We have published our App, click on Search Bar and search for Google Drive API. Select the Google Drive API and enable the Service.

Search for Google Drive API Service

Enable the Google Drive API

8. Now that We have published our App and enabled the Google Drive API, click on Navigation Bar, and in APIs and Services, select Credentials.

Create Credentials

9. In the OAuth Client ID Screen, Select Application Type as Desktop App, Enter the Name and click Create.

Create OAuth Client ID

9. We will be redirected to Credential Homepage where we will be able to find our OAuth Client ID. Click on the Download Key Option and save the .JSON file by the name client_secrets.json

Download the Secret Key

Note: This secret key and the Python File created below should be present in the same directory.

Python Implementation:

Python3




from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
  
# Initializing a GoogleAuth Object
gauth = GoogleAuth()
  
# client_secrets.json file is verified
# and it automatically handles authentication
gauth.LocalWebserverAuth()
  
# GoogleDrive Instance is created using
# authenticated GoogleAuth instance
drive = GoogleDrive(gauth)
  
# Initialize GoogleDriveFile instance with file id
file_obj = drive.CreateFile({'id': 'FILE_ID'})
file_obj.GetContentFile('FILE_NAME.xls',
         mimetype='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
  
dataframe = pd.read_excel('FILE_NAME.xls')
print(dataframe)


Note: The Google Accounts used for Google Forms Creation and Python Environment GDrive Authentication must be the same

When We run the code, a Google Authentication Flow window pops up in our default Web Browser. We have to select our Google Account and Continue with the Flow.

Select our google account

Click Advanced -> Go to GFG App (unsafe)

We have to allow permission and again click Allow in the Next Screen.

Enable Permission for our App

Now we will receive a message in our browser which says “The authentication flow has completed”. And move to our Python environment, we will be able to see the Data in Pandas Dataframe format.

Output:

Pandas dataframe obtained in Real-Time from Google form through Google Drive API

RELATED ARTICLES

Most Popular

Recent Comments