Python is a great language for data analysis, primarily because of its ecosystem of data-centric Python packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.
The Pandas combine_first() method is used to combine two series into one. The result covers the union of the two series' indexes: wherever the caller series has a null value, the value from the passed series is taken instead. If both series have a null value at the same index, null is returned at that index.
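The rules above can be checked on a small case. This is a minimal sketch; the series names caller and other and their string labels are made up for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'caller' has nulls at "b" and "c"; 'other' has an extra label "d"
caller = pd.Series([1.0, np.nan, np.nan], index=["a", "b", "c"])
other = pd.Series([10.0, 20.0, np.nan, 40.0], index=["a", "b", "c", "d"])

combined = caller.combine_first(other)

# "a": caller value kept (1.0)
# "b": caller is null, so the value from 'other' is used (20.0)
# "c": null in both series, so the result stays null
# "d": label exists only in 'other', so it is included (40.0)
print(combined)
```

The label "d" appears in the result because the output index is the union of both indexes, not just the caller's.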
Note: This method is different from Series.combine(), which takes a function as a parameter to decide the output value at each index.
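To make the difference concrete, the following sketch runs both methods on the same pair of series. The data and the "keep the larger value" lambda are illustrative assumptions, chosen only to show that Series.combine() applies a user-supplied function to every pair of values, while combine_first() only fills the caller's nulls.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration
s1 = pd.Series([1.0, np.nan, 3.0])
s2 = pd.Series([5.0, 2.0, 1.0])

# combine_first: keep s1 where it is non-null, otherwise take s2
filled = s1.combine_first(s2)

# combine: the function decides every output value; here it keeps the larger one.
# A comparison against NaN is False, so the value from s2 is returned in that case.
larger = s1.combine(s2, lambda a, b: a if a > b else b)

print(filled.tolist())  # [1.0, 2.0, 3.0]
print(larger.tolist())  # [5.0, 2.0, 3.0]
```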
Syntax: Series.combine_first(other)
Parameters:
other: Other series to be combined with caller series.
Return type: Pandas series
Example:
In this example, two series are created from lists using the Pandas Series() constructor. Some null values are included in each list using NumPy's np.nan. Both series are then combined using the .combine_first() method. First, the method is called on series1 and the result is stored in result1; then it is called on series2 and the result is stored in result2. Both returned series are printed to compare the outputs.
# importing pandas module
import pandas as pd

# importing numpy module
import numpy as np

# creating series 1
series1 = pd.Series([70, 5, 0, 225, 1, 16, np.nan, 10, np.nan])

# creating series 2
series2 = pd.Series([27, np.nan, 2, 23, 1, 95, 53, 10, 5])

# combining and returning results to variable
# calling on series1
result1 = series1.combine_first(series2)

# calling on series2
result2 = series2.combine_first(series1)

# printing result
print('Result 1:\n', result1, '\n\nResult 2:\n', result2)
Output:
As shown in the output, the results differ even though the same two series were combined. This is because combine_first() gives priority to the caller series: its values are kept wherever they are non-null, and only where the caller has a null value is the value at the same index taken from the passed series.
Result 1:
 0     70.0
1      5.0
2      0.0
3    225.0
4      1.0
5     16.0
6     53.0
7     10.0
8      5.0
dtype: float64 

Result 2:
 0    27.0
1     5.0
2     2.0
3    23.0
4     1.0
5    95.0
6    53.0
7    10.0
8     5.0
dtype: float64
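One way to see exactly where the two results disagree is to compare them element by element. The sketch below rebuilds the same two series; the variable name differing is introduced here for illustration. The results differ only at positions where both series hold non-null but different values, because that is where the choice of caller matters.

```python
import numpy as np
import pandas as pd

series1 = pd.Series([70, 5, 0, 225, 1, 16, np.nan, 10, np.nan])
series2 = pd.Series([27, np.nan, 2, 23, 1, 95, 53, 10, 5])

result1 = series1.combine_first(series2)
result2 = series2.combine_first(series1)

# Index positions where the caller's own value won over a different value
differing = result1[result1 != result2].index.tolist()
print(differing)  # [0, 2, 3, 5]
```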