A Series is a one-dimensional labeled array in Pandas that can hold integers, strings, floating-point numbers, and other Python objects. Each value in a Series is paired with an index label; by default the index runs from 0 to n - 1, where n is the number of values in the Series. Later in this article we will discuss DataFrames in Pandas, but we first need to understand the main difference between a Series and a DataFrame.
A Series holds a single column of values with an index, whereas a DataFrame is made of one or more Series that share an index. In other words, a DataFrame is a collection of Series that can be used to analyze data.
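The difference can be checked directly on small objects. The following is a minimal sketch; the values and the column names "a" and "b" are made up for illustration.

```python
import pandas as pd

# A Series is one-dimensional: values plus an index
s = pd.Series([10, 20, 30])

# A DataFrame is two-dimensional: several Series sharing one index
df = pd.DataFrame({"a": s, "b": s * 2})

print(s.ndim, s.shape)     # 1 (3,)
print(df.ndim, df.shape)   # 2 (3, 2)

# The default index runs from 0 to n - 1
print(list(s.index))       # [0, 1, 2]

# Selecting one column of a DataFrame gives back a Series
print(type(df["a"]))
```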
Creating Pandas DataFrames from Series
Python3
# Importing the pandas library
import pandas as pd

# Creating a list
author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']

# Creating a Series by passing the list
# variable to the Series() function
auth_series = pd.Series(author)

# Printing the Series
print(auth_series)
Output:
0    Jitender
1     Purnima
2       Arpit
3       Jyoti
dtype: object
Let’s check the type of Series:
Python3
print(type(auth_series))
Output:
<class 'pandas.core.series.Series'>
Create DataFrame From Multiple Series
We create two lists, ‘author’ and ‘article’, and pass them to pd.Series() to create two Series. We then build a dictionary with the Series objects as its values; the keys of the dictionary become the column names of the DataFrame.
Python3
# Importing the pandas library
import pandas as pd

# Creating two lists
author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']
article = [210, 211, 114, 178]

# Creating two Series by passing the lists
auth_series = pd.Series(author)
article_series = pd.Series(article)

# Creating a dictionary with the Series objects as values
frame = {'Author': auth_series, 'Article': article_series}

# Creating a DataFrame by passing the dictionary
result = pd.DataFrame(frame)

# Printing the DataFrame
print(result)
Output:
     Author  Article
0  Jitender      210
1   Purnima      211
2     Arpit      114
3     Jyoti      178
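A dictionary is not the only way to combine Series. As an alternative sketch (not the approach used above), pd.concat() with axis=1 places the Series side by side, and the keys argument supplies the column names.

```python
import pandas as pd

auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
article_series = pd.Series([210, 211, 114, 178])

# axis=1 joins the Series as columns; keys become the column names
result = pd.concat([auth_series, article_series],
                   axis=1, keys=['Author', 'Article'])

print(result)
```

Both approaches align the Series on their index, so they give the same table when the Series share the same index labels.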
Add a Column in Pandas Dataframe
Here we create one more Series from a separate list holding the authors' ages and assign it directly to the DataFrame as a new column.
Python3
# Importing the pandas library
import pandas as pd

# Creating Series
auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
article_series = pd.Series([210, 211, 114, 178])

# Creating a dictionary
frame = {'Author': auth_series, 'Article': article_series}

# Creating a DataFrame
result = pd.DataFrame(frame)

# Creating another list
age = [21, 21, 24, 23]

# Creating a new column in the DataFrame by
# assigning a Series created from the list
result['Age'] = pd.Series(age)

# Printing the DataFrame
print(result)
Output:
     Author  Article  Age
0  Jitender      210   21
1   Purnima      211   21
2     Arpit      114   24
3     Jyoti      178   23
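Assigning with result['Age'] = ... modifies the DataFrame in place. If the original should stay unchanged, one option is DataFrame.assign(), which returns a new DataFrame with the extra column. This is a small sketch of that variant; the name with_age is chosen here for illustration.

```python
import pandas as pd

result = pd.DataFrame({
    'Author': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
    'Article': pd.Series([210, 211, 114, 178]),
})

age = [21, 21, 24, 23]

# assign() returns a copy with the new column; result is left as it was
with_age = result.assign(Age=pd.Series(age))

print(with_age)
print(list(result.columns))
```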
Missing value in Pandas Dataframe
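Remember that the new Series is aligned with the DataFrame by index. If a row has no matching value, Pandas fills it with NaN (its marker for a missing value) by default. Because NaN is a floating-point value, the integer ages are then displayed as floats.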
Python3
# Importing the pandas library
import pandas as pd

# Creating Series
auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
article_series = pd.Series([210, 211, 114, 178])

# Creating a dictionary
frame = {'Author': auth_series, 'Article': article_series}

# Creating a DataFrame
result = pd.DataFrame(frame)

# Creating another list with one value fewer than the number of rows
age = [21, 21, 24]

# Creating a new column in the DataFrame by
# assigning a Series created from the list
result['Age'] = pd.Series(age)

# Printing the DataFrame
print(result)
Output:
     Author  Article   Age
0  Jitender      210  21.0
1   Purnima      211  21.0
2     Arpit      114  24.0
3     Jyoti      178   NaN
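The missing value can be located with isna(), and the column's dtype shows why the ages print as floats. The sketch below also contrasts this with assigning a plain list of the wrong length, which raises an error instead of filling with NaN because a list has no index to align on. The column name 'Age2' and the flag raised are used only for this illustration.

```python
import pandas as pd

result = pd.DataFrame({
    'Author': ['Jitender', 'Purnima', 'Arpit', 'Jyoti'],
    'Article': [210, 211, 114, 178],
})

# A shorter Series is aligned by index; the unmatched row becomes NaN
result['Age'] = pd.Series([21, 21, 24])

print(result['Age'].dtype)            # float64
print(result['Age'].isna().tolist())  # [False, False, False, True]

# A plain list must match the number of rows exactly
raised = False
try:
    result['Age2'] = [21, 21, 24]
except ValueError as exc:
    raised = True
    print('ValueError:', exc)
```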
Creating a Dataframe using a dictionary of Series
Here we build a dictionary whose values are Series and pass it to pd.DataFrame(). When a DataFrame is created from a Python dictionary, each key becomes a column name and the corresponding Series supplies that column's values, one per row.
Python3
# Importing the pandas library
import pandas as pd

# Creating a dictionary of Series
dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
         'Author_Book_No': pd.Series([210, 211, 114, 178]),
         'Age': pd.Series([21, 21, 24, 23])}

# Creating a DataFrame
df = pd.DataFrame(dict1)

# Printing the DataFrame
print(df)
Output:
  Auth_Name  Author_Book_No  Age
0  Jitender             210   21
1   Purnima             211   21
2     Arpit             114   24
3     Jyoti             178   23
Explicit Indexing in Pandas Dataframe
Here, after passing an explicit index to the DataFrame, every cell is filled with NaN. The DataFrame is built from Series, and each Series has its own default index (0, 1, 2, 3). Because none of those labels match the index requested for the DataFrame, no values align and every cell becomes NaN.
Python3
# Importing the pandas library
import pandas as pd

# Creating a dictionary of Series
dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
         'Author_Book_No': pd.Series([210, 211, 114, 178]),
         'Age': pd.Series([21, 21, 24, 23])}

# Creating a DataFrame with an explicit index
df = pd.DataFrame(dict1, index=['SNo1', 'SNo2', 'SNo3', 'SNo4'])

# Printing the DataFrame
print(df)
Output:
      Auth_Name  Author_Book_No  Age
SNo1        NaN             NaN  NaN
SNo2        NaN             NaN  NaN
SNo3        NaN             NaN  NaN
SNo4        NaN             NaN  NaN
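The same label matching can be seen on a single Series. In the sketch below, Series.reindex() is used as a stand-in to illustrate the alignment: labels that exist in the Series keep their values, and labels that do not exist come back as NaN. The label 5 is an arbitrary example of a label that is absent.

```python
import pandas as pd

age = pd.Series([21, 21, 24, 23])   # default index 0, 1, 2, 3

# None of these labels exist in the Series, so every value is NaN
no_match = age.reindex(['SNo1', 'SNo2', 'SNo3', 'SNo4'])
print(no_match)

# Labels 0 and 1 exist, label 5 does not
partial = age.reindex([0, 1, 5])
print(partial)
```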
We can fix this by giving every Series the same index labels that the DataFrame uses.
Python3
# This code is provided by Sheetal Verma
# Importing the pandas library
import pandas as pd

# Creating a dictionary of Series that all share the same index
dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'],
                                index=['SNo1', 'SNo2', 'SNo3', 'SNo4']),
         'Author_Book_No': pd.Series([210, 211, 114, 178],
                                     index=['SNo1', 'SNo2', 'SNo3', 'SNo4']),
         'Age': pd.Series([21, 21, 24, 23],
                          index=['SNo1', 'SNo2', 'SNo3', 'SNo4'])}

# Creating a DataFrame
df = pd.DataFrame(dict1, index=['SNo1', 'SNo2', 'SNo3', 'SNo4'])

# Printing the DataFrame
print(df)
Output:
      Auth_Name  Author_Book_No  Age
SNo1   Jitender             210   21
SNo2    Purnima             211   21
SNo3      Arpit             114   24
SNo4      Jyoti             178   23
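Repeating the index on every Series is not the only way to get labeled rows. Two other options, sketched here as alternatives rather than as the method above, are to pass plain lists (which carry no index of their own, so nothing has to align) or to build the DataFrame first and relabel its rows afterwards. The names labels, df_from_lists, and df_relabel are chosen for this illustration.

```python
import pandas as pd

labels = ['SNo1', 'SNo2', 'SNo3', 'SNo4']

# Option 1: plain lists take the index passed to DataFrame()
df_from_lists = pd.DataFrame(
    {'Auth_Name': ['Jitender', 'Purnima', 'Arpit', 'Jyoti'],
     'Author_Book_No': [210, 211, 114, 178],
     'Age': [21, 21, 24, 23]},
    index=labels)

# Option 2: build from Series with the default index, then relabel the rows
df_relabel = pd.DataFrame(
    {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
     'Author_Book_No': pd.Series([210, 211, 114, 178]),
     'Age': pd.Series([21, 21, 24, 23])})
df_relabel.index = labels

print(df_from_lists)
print(df_relabel)
```

Option 2 relabels rows by position, so it assumes the labels are listed in the same order as the rows.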