A pandas Timestamp is the pandas equivalent of Python's datetime and is the scalar type behind the time-series data structures in pandas. Dates and times are sometimes supplied as timestamps already, and in other cases it is useful to convert them to timestamps. Many tasks then require comparing timestamps: finding the latest entry, the oldest entry, the entries between two timestamps, and so on. pandas Timestamp objects are compared with the ordinary comparison operators >, <, ==, <= and >=. The difference between two timestamps is calculated with the ‘-’ operator, which returns a Timedelta.
A given time can be converted to a pandas timestamp with the pandas.Timestamp() constructor. It accepts input in several forms, such as a datetime-like string (e.g. ‘2017-01-01T12’) or a Unix epoch value together with its unit (e.g. 1513393355.5 with unit=‘s’ for seconds). The year, month, day, hour, minute, second, etc. can also be passed as separate arguments, either positionally or by keyword. For example, 2018/2/21 11:40:00 can be written as Timestamp(2018, 2, 21, 11, 40) or as Timestamp(year=2018, month=2, day=21, hour=11, minute=40). Year, month and day are required in this form; time components that are not provided default to zero. This approach is used in the following code to create the timestamp column ‘new_time’ from the provided date and time information.
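The input forms described above can be sketched as follows. This is a minimal illustration, not part of the original walkthrough; the variable names are made up for the example, and the epoch value is the one quoted in the text.

```python
import pandas as pd

# Datetime-like string
ts_string = pd.Timestamp("2018-02-21 11:40:00")

# Positional arguments: year, month, day, hour, minute
ts_positional = pd.Timestamp(2018, 2, 21, 11, 40)

# Keyword arguments
ts_keyword = pd.Timestamp(year=2018, month=2, day=21, hour=11, minute=40)

# Unix epoch value; the unit must be stated explicitly for seconds
ts_epoch = pd.Timestamp(1513393355.5, unit="s")

# All three spellings of 2018-02-21 11:40:00 describe the same instant
print(ts_string == ts_positional == ts_keyword)

# Time components that were not provided default to zero
print(ts_positional.second)

print(ts_epoch)
```

Whichever form is used, the result is the same Timestamp type, so values built in different ways can be compared with each other directly.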
Approach:
- Create a dataframe with date and time values
- Convert date and time values to timestamp values using the pandas.Timestamp() constructor
- Compare required timestamps using regular comparison operators.
Create a pandas DataFrame with date and time:
Python3
import pandas as pd

# Create a dataframe
df = pd.DataFrame({
    'year': [2017, 2017, 2017, 2018, 2018],
    'month': [11, 11, 12, 1, 2],
    'day': [29, 30, 31, 1, 2],
    'hour': [10, 10, 10, 11, 11],
    'minute': [10, 20, 25, 30, 40]})


def make_timestamp(row):
    # Build one Timestamp from the date and time fields of a row
    return pd.Timestamp(row.year, row.month, row.day, row.hour, row.minute)


# Create new column with entries of date
# and time provided in timestamp format
df['new_time'] = df.apply(make_timestamp, axis='columns')

# print() works outside notebooks as well
print(df)
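Output:
   year  month  day  hour  minute            new_time
0  2017     11   29    10      10 2017-11-29 10:10:00
1  2017     11   30    10      20 2017-11-30 10:20:00
2  2017     12   31    10      25 2017-12-31 10:25:00
3  2018      1    1    11      30 2018-01-01 11:30:00
4  2018      2    2    11      40 2018-02-02 11:40:00
The DataFrame df above is used in the following examples.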
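As an alternative to the row-wise apply used above, pandas can assemble the timestamps from the component columns in one vectorized call. The sketch below assumes the columns carry the names year, month, day, hour and minute, which is what pd.to_datetime expects when it is given a DataFrame; it is offered as an option, not as the method this article relies on.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2017, 2017, 2017, 2018, 2018],
    "month": [11, 11, 12, 1, 2],
    "day": [29, 30, 31, 1, 2],
    "hour": [10, 10, 10, 11, 11],
    "minute": [10, 20, 25, 30, 40],
})

# pd.to_datetime assembles datetimes from columns named year, month, day, ...
df["new_time"] = pd.to_datetime(df[["year", "month", "day", "hour", "minute"]])

# Each element of the resulting column is a pandas Timestamp
first = df["new_time"].iloc[0]
print(type(first))
print(df["new_time"])
```

The resulting column holds the same values as the apply-based version, so the comparisons in the examples work on it unchanged.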
Example 1: Here, the first and second timestamps in ‘new_time’ are compared to find the older of the two.
Python3
# Compare first and second timestamps
if df['new_time'][0] <= df['new_time'][1]:
    print("First entry is old")
else:
    print("Second entry is old")
Output:
First entry is old
Example 2: Here, all timestamps in ‘new_time’ are compared with Timestamp(2018-01-05 12:00:00), and the entries before this timestamp are printed.
Python3
# Timestamps satisfying given condition
for i in range(len(df['year'])):
    if df['new_time'][i] < pd.Timestamp(2018, 1, 5, 12):
        print(df['new_time'][i])
Output:
2017-11-29 10:10:00
2017-11-30 10:20:00
2017-12-31 10:25:00
2018-01-01 11:30:00
Example 3: Here again, all timestamps are compared with Timestamp(2018-01-05 12:00:00), this time with the > operator, and the comparison result is returned as a boolean value (True/False) for every timestamp.
Python3
# Boolean value output for given condition
print(df['new_time'] > pd.Timestamp(2018, 1, 5, 12))
Output:
0    False
1    False
2    False
3    False
4     True
Name: new_time, dtype: bool
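A boolean Series like the one above can also be used as a mask to select the matching entries without an explicit loop, and two conditions can be combined to pick the entries between two timestamps. The following is a minimal sketch; the names times, cutoff, start and in_window, as well as the start value, are chosen only for illustration.

```python
import pandas as pd

# Same timestamps as the 'new_time' column, rebuilt so the block is self-contained
times = pd.Series(pd.to_datetime([
    "2017-11-29 10:10", "2017-11-30 10:20", "2017-12-31 10:25",
    "2018-01-01 11:30", "2018-02-02 11:40",
]), name="new_time")

cutoff = pd.Timestamp(2018, 1, 5, 12)
start = pd.Timestamp(2017, 11, 30)  # hypothetical lower bound

# Entries strictly before the cutoff
before_cutoff = times[times < cutoff]

# Entries between the two timestamps; each condition needs its own
# parentheses because & binds more tightly than the comparisons
in_window = times[(times >= start) & (times < cutoff)]

print(before_cutoff)
print(in_window)
```

The mask keeps the original index labels, so the selected rows can still be traced back to their positions in the DataFrame.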
Example 4: Here, the max function is used to get the maximum of all timestamps, that is, the most recent entry in the ‘new_time’ column.
In addition, the time difference between the first and second timestamps in the ‘new_time’ column is calculated.
Python3
# Latest timestamp
print("Latest Timestamp:", max(df['new_time']))

# Get difference between 2 timestamps
diff = abs(df['new_time'][0] - df['new_time'][1])
print("Difference:", diff)
Output:
Latest Timestamp: 2018-02-02 11:40:00
Difference: 1 days 00:10:00
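The oldest entry can be found in the same way with the minimum, and the difference between two timestamps is a Timedelta that can be converted to a plain number. The sketch below is illustrative only and uses the Series methods min, max and diff in place of the built-in max; the variable names are assumptions made for the example.

```python
import pandas as pd

# Same timestamps as the 'new_time' column, rebuilt so the block is self-contained
times = pd.Series(pd.to_datetime([
    "2017-11-29 10:10", "2017-11-30 10:20", "2017-12-31 10:25",
    "2018-01-01 11:30", "2018-02-02 11:40",
]), name="new_time")

oldest = times.min()
latest = times.max()
print("Oldest Timestamp:", oldest)
print("Latest Timestamp:", latest)

# Subtracting two Timestamps gives a Timedelta
gap = times[1] - times[0]
print("Difference in seconds:", gap.total_seconds())

# Differences between each entry and the previous one (first value is NaT)
steps = times.diff()
print(steps)
```

Working with total_seconds() is convenient when the difference has to be compared against a numeric threshold rather than displayed.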