A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). It can be converted to a NumPy ndarray with the DataFrame.to_numpy() method. In this article, we will see how to convert a DataFrame to a NumPy array.
Syntax of Pandas DataFrame.to_numpy()
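Syntax: DataFrame.to_numpy(dtype=None, copy=False)
Parameters:
- dtype: [str or numpy.dtype, optional] Data type of the returned array, such as str or 'float32'.
- copy: [bool, default False] Whether to ensure that the returned value is not a view on another array. Note that copy=False does not guarantee that no copy is made; copy=True guarantees that one is.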
Returns: numpy.ndarray
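As a minimal sketch of how the two parameters behave (the column names and values below are made up for illustration), passing dtype casts the result, and copy=True means that later changes to the array do not touch the DataFrame:

```python
import pandas as pd

# Hypothetical data, used only to illustrate the parameters
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# dtype: cast every element to the requested type
arr_float = df.to_numpy(dtype=float)
print(arr_float.dtype)  # float64

# copy=True: the array owns its data, so editing it leaves df unchanged
arr_copy = df.to_numpy(copy=True)
arr_copy[0, 0] = 100
print(df.loc[0, 'a'])  # 1
```

Requesting a copy is the safer choice whenever the array will be modified in place.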
Convert DataFrame to Numpy Array
Here, we convert an entire DataFrame to a NumPy array.
Python3
import pandas as pd

# initialize a dataframe
df = pd.DataFrame(
    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]],
    columns=['a', 'b', 'c'])

# convert dataframe to numpy array
arr = df.to_numpy()

print('\nNumpy Array\n----------\n', arr)
print(type(arr))
Output:
Numpy Array
----------
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
<class 'numpy.ndarray'>
Here, we convert only selected columns (a and c) of the DataFrame into a NumPy array.
Python3
import pandas as pd

# initialize a dataframe
df = pd.DataFrame(
    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]],
    columns=['a', 'b', 'c'])

# convert only columns 'a' and 'c' to a numpy array
arr = df[['a', 'c']].to_numpy()

print('\nNumpy Array\n----------\n', arr)
print(type(arr))
Output:
Numpy Array
----------
 [[ 1  3]
 [ 4  6]
 [ 7  9]
 [10 12]]
<class 'numpy.ndarray'>
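When only one column is needed, the shape of the result depends on how the column is selected. The sketch below reuses the same DataFrame and compares double-bracket selection (a DataFrame) with single-bracket selection (a Series, which has its own to_numpy() method):

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    columns=['a', 'b', 'c'])

# Double brackets select a DataFrame, so the result is 2-D
two_d = df[['a']].to_numpy()
print(two_d.shape)  # (4, 1)

# Single brackets select a Series, so the result is 1-D
one_d = df['a'].to_numpy()
print(one_d.shape)  # (4,)
```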
Here, we convert a DataFrame that mixes integer and float values. NumPy arrays hold a single data type, so the result is upcast to the common type float64.
Python3
import pandas as pd

# initialize a dataframe
df = pd.DataFrame(
    [[1, 2, 3],
     [4, 5, 6.5],
     [7, 8.5, 9],
     [10, 11, 12]],
    columns=['a', 'b', 'c'])

arr = df.to_numpy()

print('Numpy Array', arr)
print('Numpy Array Datatype :', arr.dtype)
Output:
Numpy Array [[ 1.   2.   3. ]
 [ 4.   5.   6.5]
 [ 7.   8.5  9. ]
 [10.  11.  12. ]]
Numpy Array Datatype : float64
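The same rule applies when the columns have no common numeric type. As a small sketch with made-up data, a DataFrame that mixes text and numbers converts to an array of dtype object, while selecting only the numeric columns gives a numeric array again:

```python
import pandas as pd

# Hypothetical data mixing text and numbers
df = pd.DataFrame({'name': ['x', 'y'], 'score': [1.5, 2.0], 'rank': [1, 2]})

# No common numeric type exists, so the array falls back to object
arr = df.to_numpy()
print(arr.dtype)  # object

# Float and integer columns share a common type
numeric = df[['score', 'rank']].to_numpy()
print(numeric.dtype)  # float64
```

Object arrays are usually slower for numerical work, so it can help to select or cast the columns before converting.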
The following examples read their data from a CSV file named nba.csv.
Example 1:
Here, we read a CSV file into a DataFrame and drop the rows with missing values. We then take the first five values of the Weight column with the head() method and convert them to a NumPy array with DataFrame.to_numpy().
Python3
# importing pandas
import pandas as pd

# reading the csv
data = pd.read_csv("nba.csv")
data.dropna(inplace=True)

# creating DataFrame from weight column
df = pd.DataFrame(data['Weight'].head())

# using to_numpy() function
print(df.to_numpy())
Output:
[[180.]
 [235.]
 [185.]
 [235.]
 [238.]]
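The example above drops rows with missing values before converting. As an alternative sketch, assuming pandas 1.1 or later (where to_numpy() accepts an na_value parameter), missing entries can be replaced during the conversion instead. The weights below are made up and only stand in for the Weight column of nba.csv:

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the Weight column, with one missing value
df = pd.DataFrame({'Weight': [180.0, np.nan, 185.0]})

# Replace missing values with 0.0 while converting
arr = df.to_numpy(na_value=0.0)
print(arr)
```

Whether dropping or filling is appropriate depends on the analysis; 0.0 is used here purely as a placeholder value.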
Example 2:
In this example, we use the same code but pass the dtype parameter so that the resulting array has the float32 data type.
Python3
# importing pandas
import pandas as pd

# read csv file
data = pd.read_csv("nba.csv")
data.dropna(inplace=True)

# creating DataFrame from weight column
df = pd.DataFrame(data['Weight'].head())

# providing dtype
print(df.to_numpy(dtype='float32'))
Output:
[[180.]
 [235.]
 [185.]
 [235.]
 [238.]]
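The printed array looks the same as in Example 1 because print() does not show the data type. A minimal way to confirm the effect of dtype is to inspect the dtype attribute of the result. The values below are made up and stand in for the first rows of nba.csv:

```python
import pandas as pd

# Made-up weights standing in for the Weight column
df = pd.DataFrame({'Weight': [180.0, 235.0, 185.0]})

default_arr = df.to_numpy()
float32_arr = df.to_numpy(dtype='float32')

print(default_arr.dtype)  # float64
print(float32_arr.dtype)  # float32
```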
Example 3:
Here, we verify the type of the object returned by to_numpy().
Python3
# importing pandas
import pandas as pd

# reading csv
data = pd.read_csv("nba.csv")
data.dropna(inplace=True)

# creating DataFrame from weight column
df = pd.DataFrame(data['Weight'].head())

# using to_numpy()
print(type(df.to_numpy()))
Output:
<class 'numpy.ndarray'>