Saturday, December 28, 2024

How to Convert String to Integer in Pandas DataFrame?

This article covers two ways to convert a string column to integers in a Pandas DataFrame:

Method 1: Using the Series.astype() method.

Syntax: Series.astype(dtype, copy=True, errors='raise')

Parameters: This method takes the following parameters:

  • dtype: The data type to convert the Series to (for example str, float, int).
  • copy: Whether to return a copy of the data (default True).
  • errors: Controls what happens when a value cannot be converted to the requested data type (for example, the string 'abc' to int). 'raise' raises an exception; 'ignore' suppresses the exception and returns the original object unchanged.

Return: Series with changed data type.
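To illustrate the default errors='raise' behaviour described above, here is a minimal sketch. The Series contents are made up for illustration; the point is only that a value which is not a valid integer literal stops the conversion with a ValueError.

```python
import pandas as pd

# Hypothetical data: the last entry is not a valid integer literal
ids = pd.Series(['900', '450', 'unknown'])

try:
    ids.astype(int)          # errors='raise' is the default
    conversion_failed = False
except ValueError as exc:
    conversion_failed = True
    print("Conversion failed:", exc)
```

If a column may contain such values, pandas.to_numeric() with errors='coerce' (covered under Method 2) is usually the more convenient tool.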

Series.astype() is one of the most direct ways to change a column's data type. When a DataFrame is created from a CSV file, Pandas infers the data type of each column automatically, and the inferred type is often not the one you need. For instance, a salary column may be imported as strings and must be converted to float before any arithmetic can be done on it.
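The salary case mentioned above might look like the following sketch. The column names and values are hypothetical, and the DataFrame is built in memory rather than read from a CSV file to keep the example self-contained.

```python
import pandas as pd

# Hypothetical data: salaries that arrived as strings
df = pd.DataFrame({'Name': ['Asha', 'Ravi'],
                   'Salary': ['52000.50', '61000']})

# Convert the string column to float so arithmetic works
df['Salary'] = df['Salary'].astype(float)

# Numeric operations are now possible
total = df['Salary'].sum()
print(df.dtypes)
print(total)
```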

Example 1: 

Python3

# import the pandas library
import pandas as pd

# dictionary of column data
data = {'Name': ['GeeksForGeeks', 'Python'],
        'Unique ID': ['900', '450']}

# create a DataFrame object
df = pd.DataFrame(data)

# convert the string column to integers
df['Unique ID'] = df['Unique ID'].astype(int)

# show the DataFrame
print(df)
print("-" * 25)

# show the data type of each column
print(df.dtypes)


Output:

[Image: dataframe with datatypes]

Example 2:

Python3

# import the pandas library
import pandas as pd

# dictionary of column data
data = {'Algorithm': ['Graph', 'Dynamic Programming',
                      'Number Theory',
                      'Sorting And Searching'],
        'Problems': ['62', '110', '40', '55']}

# create a DataFrame object
df = pd.DataFrame(data)

# convert the string column to integers
df['Problems'] = df['Problems'].astype(int)

# show the DataFrame
print(df)
print("-" * 25)

# show the data type of each column
print(df.dtypes)


Output:

[Image: dataframe with data types]
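When more than one column needs converting, DataFrame.astype() also accepts a dictionary that maps column names to data types. The sketch below assumes a second string column, 'Solved', which is hypothetical and not part of the examples above.

```python
import pandas as pd

# 'Solved' is a hypothetical extra column added for illustration
df = pd.DataFrame({'Algorithm': ['Graph', 'Number Theory'],
                   'Problems': ['62', '40'],
                   'Solved': ['30', '12']})

# Convert both string columns in a single call
df = df.astype({'Problems': int, 'Solved': int})

# Integer arithmetic now works across the converted columns
remaining = df['Problems'] - df['Solved']
print(remaining.tolist())
```

Columns not named in the dictionary, such as 'Algorithm' here, keep their original data type.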

Method 2: Using the pandas.to_numeric() function.

Syntax: pandas.to_numeric(arg, errors='raise', downcast=None)

Parameters: This function takes the following parameters:

  • arg: scalar, list, tuple, 1-d array, or Series.
  • errors: {'ignore', 'raise', 'coerce'}, default 'raise'
    -> If 'raise', invalid parsing raises an exception
    -> If 'coerce', values that cannot be parsed are set to NaN
    -> If 'ignore', invalid parsing returns the input unchanged (this option is deprecated since pandas 2.2)
  • downcast: [default None] If not None, and the data has been successfully cast to a numerical dtype, the result is downcast to the smallest numerical dtype that can hold it, according to the following rules:
    -> 'integer' or 'signed': smallest signed int dtype (min.: np.int8)
    -> 'unsigned': smallest unsigned int dtype (min.: np.uint8)
    -> 'float': smallest float dtype (min.: np.float32)

Returns: A numeric result if parsing succeeded. The return type depends on the input: a Series for a Series input, a scalar for a scalar input, and otherwise an ndarray.
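The errors='coerce' option described above is useful when a column contains entries that are not numbers. The following is a sketch with made-up data; the final step, converting to the nullable 'Int64' dtype, is an optional addition not covered by the original text, shown because NaN cannot be stored in a plain integer column.

```python
import pandas as pd

# Hypothetical data: 'n/a' cannot be parsed as a number
raw = pd.Series(['62', '110', 'n/a', '55'])

# Unparseable entries become NaN, so the result has a float dtype
as_float = pd.to_numeric(raw, errors='coerce')

# Optional: the nullable integer dtype keeps whole numbers
# and marks the missing entry as <NA>
as_int = as_float.astype('Int64')
print(as_int)
```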

pandas.to_numeric() is one of the most widely used functions for converting an argument to a numeric type in Pandas.
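As a small illustration of the downcast parameter and of how the return type follows the input type, consider the sketch below. The values are taken from the 'Problems' column used elsewhere in this article; the variable names are arbitrary.

```python
import pandas as pd

problems = pd.Series(['62', '110', '40', '55'])

# Without downcast, pandas picks its default integer dtype
default = pd.to_numeric(problems)

# All values fit in 8 bits, so the smallest matching dtype is chosen
small = pd.to_numeric(problems, downcast='integer')
unsigned = pd.to_numeric(problems, downcast='unsigned')

# A plain list comes back as a NumPy array rather than a Series
arr = pd.to_numeric(['1', '2', '3'])

print(default.dtype, small.dtype, unsigned.dtype, type(arr).__name__)
```

Downcasting mainly saves memory on large columns. It is worth skipping if later values might exceed the range of the smaller dtype.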

Example 1:

Python3

# import the pandas library
import pandas as pd

# dictionary of column data
data = {'Name': ['GeeksForGeeks', 'Python'],
        'Unique ID': ['900', '450']}

# create a DataFrame object
df = pd.DataFrame(data)

# convert the string column to integers
df['Unique ID'] = pd.to_numeric(df['Unique ID'])

# show the DataFrame
print(df)
print("-" * 30)

# show the data type of each column
print(df.dtypes)


Output:

[Image: dataframe with datatypes]

Example 2:

Python3

# import the pandas library
import pandas as pd

# dictionary of column data
data = {'Algorithm': ['Graph', 'Dynamic Programming',
                      'Number Theory',
                      'Sorting And Searching'],
        'Problems': ['62', '110', '40', '55']}

# create a DataFrame object
df = pd.DataFrame(data)

# convert the string column to integers
df['Problems'] = pd.to_numeric(df['Problems'])

# show the DataFrame
print(df)
print("-" * 30)

# show the data type of each column
print(df.dtypes)


Output:

[Image: dataframe with datatypes]

 
