Thursday, September 25, 2025

Pyspark – Converting JSON to DataFrame

In this article, we are going to convert JSON data to a DataFrame in PySpark. Both methods below read the JSON from a file.

Method 1: Using pandas.read_json()

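We can read a JSON file with pandas.read_json(), which returns a pandas DataFrame, and then pass that result to spark.createDataFrame() to get a PySpark DataFrame. Because pandas loads the whole file into driver memory first, this approach is best suited to small files.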

Syntax: pandas.read_json("file_name.json")

Here we are going to use this JSON file for demonstration:
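
The demonstration file itself is not reproduced in this text, so the following is a minimal sketch that creates a stand-in student.json using only the standard library. The field names and values (id, name, age) are hypothetical and chosen purely for illustration. The file is written as a JSON array of records, a layout that pandas.read_json() can parse.

```python
import json
from pathlib import Path

# Hypothetical sample records; the article's real student.json is not shown
students = [
    {"id": 1, "name": "Asha", "age": 21},
    {"id": 2, "name": "Ravi", "age": 22},
]

# Write the records as a single JSON array (one document for the whole file)
student_path = Path("student.json")
student_path.write_text(json.dumps(students, indent=2), encoding="utf-8")

# Read the file back to confirm it round-trips as valid JSON
loaded = json.loads(student_path.read_text(encoding="utf-8"))
print(loaded)
```

With a file like this in the working directory, the code below can be run as written.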

Code:

# import pandas to read the JSON file
import pandas as pd

# import SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# create a SparkSession and give it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# read student.json into a pandas DataFrame,
# then convert it to a PySpark DataFrame
dataframe = spark.createDataFrame(pd.read_json('student.json'))

# display the PySpark DataFrame
dataframe.show()


Output:

Method 2: Using spark.read.json()

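spark.read.json() reads JSON data from a file directly into a PySpark DataFrame, without going through pandas. By default it expects one JSON object per line (the JSON Lines layout); for a file that holds a single multi-line JSON document or array, set the multiLine option to true.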

Syntax: spark.read.json('file_name.json')

JSON file for demonstration:
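
The demonstration file itself is not reproduced in this text, so the following is a minimal sketch that creates a stand-in college.json using only the standard library. The field names and values are hypothetical. The file is written in JSON Lines form (one object per line), which is the layout spark.read.json() expects by default.

```python
import json
from pathlib import Path

# Hypothetical sample records; the article's real college.json is not shown
colleges = [
    {"college": "ABC Institute", "city": "Pune", "students": 1200},
    {"college": "XYZ College", "city": "Delhi", "students": 950},
]

# Serialize each record onto its own line (JSON Lines)
lines = [json.dumps(record) for record in colleges]
college_path = Path("college.json")
college_path.write_text("\n".join(lines) + "\n", encoding="utf-8")

# Parse the file line by line to confirm every line is a complete JSON object
parsed = [
    json.loads(line)
    for line in college_path.read_text(encoding="utf-8").splitlines()
]
print(parsed)
```

If the file were instead a pretty-printed array spanning several lines, the read would need spark.read.option("multiLine", True).json('college.json').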

Code:

# import SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# create a SparkSession and give it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# read the JSON file into a PySpark DataFrame
data = spark.read.json('college.json')

# display the DataFrame
data.show()


Output:
