Friday, December 27, 2024
Google search engine
HomeLanguagesSelect Pandas dataframe rows between two dates

Select Pandas dataframe rows between two dates

Prerequisites: pandas

Pandas is an open-source library that is built on top of NumPy library. It is a Python package that offers various data structures and operations for manipulating numerical data and time series. It is mainly popular for importing and analyzing data much easier. Pandas is fast and it has high-performance & productivity for users.

This article focuses on getting selected pandas data frame rows between two dates. We can do this by using a filter.

Dates can be represented initially in several ways :

  • string
  • np.datetime64
  • datetime.datetime

To manipulate dates in pandas, we use the pd.to_datetime() function in pandas to convert different date representations to datetime64[ns] format.

Syntax: pandas.to_datetime(arg, errors=’raise’, dayfirst=False, yearfirst=False, utc=None, box=True, format=None, exact=True, unit=None, infer_datetime_format=False, origin=’unix’, cache=False)

Parameters:

  • arg: An integer, string, float, list or dict object to convert in to Date time object.
  • dayfirst: Boolean value, places day first if True.
  • yearfirst: Boolean value, places year first if True.
  • utc: Boolean value, Returns time in UTC if True.
  • format: String input to tell position of day, month and year.

Approach

  • Import module
  • Create or load data
  • Create dataframe
  • Convert the dates column to datetime64[ns] data type
  • Define a start date and end date.
  • Use a filter to display the updated dataframe and store it.
  • Display dataframe

Example: Original dataframe

Python3




import pandas as pd
data = {'Name': ['Tani', 'Saumya',
                 'Ganesh', 'Kirti'],
  
        'Articles': [5, 3, 4, 3],
  
        'Location': ['Kanpur', 'Kolkata',
                     'Kolkata', 'Bombay'],
        'Dates': ['2020-08-04', '2020-08-07', '2020-08-08', '2020-06-08']}
  
# Create DataFrame
df = pd.DataFrame(data)
display(df)


Output:

Example: Selecting data frame rows between two rows

Python3




import pandas as pd
data = {'Name': ['Tani', 'Saumya',
                 'Ganesh', 'Kirti'],
  
        'Articles': [5, 3, 4, 3],
  
        'Location': ['Kanpur', 'Kolkata',
                     'Kolkata', 'Bombay'],
        'Dates': ['2020-08-04', '2020-08-07', '2020-08-08', '2020-06-08']}
  
# Create DataFrame
df = pd.DataFrame(data)
start_date = '2020-08-05'
end_date = '2020-08-08'
mask = (df['Dates'] > start_date) & (df['Dates'] <= end_date)
  
df = df.loc[mask]
display(df)


Output:

RELATED ARTICLES

Most Popular

Recent Comments