Thursday, August 28, 2025

Pyspark – Converting JSON to DataFrame

This article shows two ways to load JSON data into a PySpark DataFrame.

Method 1: Using pandas.read_json()

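pandas.read_json() reads a JSON file into a pandas DataFrame, which spark.createDataFrame() then converts to a PySpark DataFrame. Because pandas loads the whole file into the driver's memory, this approach suits small files.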

Syntax: pandas.read_json("file_name.json")

The demonstration reads a JSON file named student.json.

Code:

# pandas parses the JSON file into a pandas DataFrame
import pandas as pd

# SparkSession is the entry point for creating PySpark DataFrames
from pyspark.sql import SparkSession

# create (or reuse) a SparkSession and give it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# read student.json with pandas, then convert the result to a PySpark DataFrame
dataframe = spark.createDataFrame(pd.read_json('student.json'))

# display the PySpark DataFrame
dataframe.show()



Method 2: Using spark.read.json()

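spark.read.json() reads JSON data from a file directly into a PySpark DataFrame and infers the schema from the data. By default it expects JSON Lines input (one JSON object per line); for a file that holds a single JSON document spread over several lines, set the multiLine option to true.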

Syntax: spark.read.json('file_name.json')

The demonstration reads a JSON file named college.json.

Code:

# SparkSession is the entry point for creating PySpark DataFrames
from pyspark.sql import SparkSession

# create (or reuse) a SparkSession and give it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# read college.json directly into a PySpark DataFrame
data = spark.read.json('college.json')

# display the PySpark DataFrame
data.show()


