Monday, October 20, 2025
HomeLanguagesPyspark – Converting JSON to DataFrame

Pyspark – Converting JSON to DataFrame

In this article, we are going to convert JSON String to DataFrame in Pyspark.

Method 1: Using read_json()

We can read JSON files using pandas.read_json. This method is basically used to read JSON files through pandas.

Syntax: pandas.read_json(“file_name.json”)

Here we are going to use this JSON file for demonstration:

Code:

Python3




# import pandas to read json file
import pandas as pd
  
# importing module
import pyspark
  
# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession
  
# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
  
  
# creating a dataframe from the json file named student
dataframe = spark.createDataFrame(pd.read_json('student.json'))
  
# display the dataframe (Pyspark dataframe)
dataframe.show()


Output:

Method 2: Using spark.read.json()

This is used to read a json data from a file and display the data in the form of a dataframe

Syntax: spark.read.json(‘file_name.json’)

JSON file for demonstration:

Code:

Python3




# importing module
import pyspark
  
# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession
  
# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
  
# read json file
data = spark.read.json('college.json')
  
# display json data
data.show()


Output:

RELATED ARTICLES

Most Popular

Dominic
32361 POSTS0 COMMENTS
Milvus
88 POSTS0 COMMENTS
Nango Kala
6728 POSTS0 COMMENTS
Nicole Veronica
11892 POSTS0 COMMENTS
Nokonwaba Nkukhwana
11954 POSTS0 COMMENTS
Shaida Kate Naidoo
6852 POSTS0 COMMENTS
Ted Musemwa
7113 POSTS0 COMMENTS
Thapelo Manthata
6805 POSTS0 COMMENTS
Umr Jansen
6801 POSTS0 COMMENTS