Reading binary files is an important skill when working with non-textual data such as images, audio, and video. By opening a file in binary mode and calling its read() method, you can read binary data easily. Whether you are dealing with multimedia files, compressed data, or custom binary formats, Python's ability to handle binary data lets you build applications for a wide range of use cases. In this article, you will learn what binary files are, how to read their data into a byte array, how to read binary data in chunks, and more.
What are Binary files?
Generally, binary means two. In computer science, binary data is represented using only two digits, 0 and 1. For example, the number 9 is written in binary as '1001'. A computer stores every file as a sequence of binary digits; a binary file is one whose bytes are not meant to be read as plain text. The structure and format of a binary file depend on its type: image files, for example, are structured differently from audio files. How hard a binary file is to decode depends on the complexity of its format. In this article, let's understand the reading of binary files.
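As a small illustration of the idea above, the following sketch shows the number 9 written as binary digits and the same value stored as a single byte. The variable names are chosen only for this example.

```python
# The integer 9 written as binary digits
number = 9
binary_digits = format(number, "b")
print(binary_digits)          # 1001

# The same value stored as a single byte
one_byte = number.to_bytes(1, byteorder="big")
print(one_byte)               # b'\t' (byte value 9 is the tab character)
print(list(one_byte))         # [9]

# Text is also stored as bytes; encoding makes that explicit
text_bytes = "Hi".encode("utf-8")
print(list(text_bytes))       # [72, 105]
```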
Python Read A Binary File
To read a binary file, follow these steps:
Step 1: Open the binary file in binary mode
To read a binary file in Python, we first need to open it in binary read mode ("rb"). We can use the open() function to achieve this.
Step 2: Create a binary file
If you do not already have a binary file to read, you can create one by opening a file in binary write mode ("wb") and writing bytes to it.
Step 3: Read the binary data
After opening the file in binary mode, we can use the read() method to read its content into a variable. The read() method returns a bytes object, a sequence of bytes that represents the binary data.
Step 4: Process the binary data
Once we have read the binary data into a variable, we can process it according to our specific requirements. Processing the binary data could involve various tasks such as decoding binary data, analyzing the content, or writing the data to another binary file.
Step 5: Close the file
After reading and processing the binary data, it is essential to close the file using the “close()” method to release system resources and avoid potential issues with file access.
Python3
# Opening the binary file in binary mode as rb (read binary)
f = open("files.zip", mode="rb")

# Reading file data with read() method
data = f.read()

# Knowing the type of our data
print(type(data))

# Printing our byte-sequenced data
print(data)

# Closing the opened file
f.close()
Output:
The output shows the type of the data followed by the byte sequence itself, since bytes are the fundamental unit of binary representation.
<class 'bytes'>
b'PK\x03\x04\x14\x00\x00\x00\x08\x00U\xbd\xebV\xc2=j\x87\x1e\x00\x00\x00!\x00\x00\x00\n\x00\x00\x00TODO11.txt\xe3\xe5JN,\xceH-/\xe6\xe5\x82\xc0\xcc\xbc\x92\xd4\x9c\x9c\xcc\x82\xc4\xc4\x12^.w7w\x00PK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00U\xbd\xebV\xc2=j\x87\x1e\x00\x00\x00!\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x01\x00 \x00\x00\x00\x00\x00\x00\x00TODO11.txtPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x008\x00\x00\x00F\x00\x00\x00\x00\x00'
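The steps above can also be written with a with statement, which closes the file automatically so the explicit close() call is not needed. The sketch below is one possible variant, not the only way to do it: it builds a tiny ZIP archive in a temporary directory (the file names are placeholders) and, as a simple processing step, checks the four-byte signature that ZIP archives start with.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a tiny ZIP archive so the example is self-contained
    zip_path = os.path.join(tmp_dir, "files.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("TODO11.txt", "example content")

    # The with statement closes the file automatically, even if an error occurs
    with open(zip_path, "rb") as f:
        data = f.read()

print(type(data))        # <class 'bytes'>
print(f.closed)          # True

# Process the data: check the 4-byte signature at the start of the archive
signature = data[:4]
is_zip = signature == b"PK\x03\x04"
print(is_zip)            # True
```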
Reading binary data into a byte array
The following code demonstrates how to read binary data from a file and print it three bytes at a time using a while loop. Note that read() returns immutable bytes objects; pass the result to bytearray() if you need a mutable byte array. Let's explain the code step by step:
Open the Binary File
The following line opens the binary file named "string.bin" in binary read mode ("rb"). The file is opened for reading, and the file object is stored in the variable 'file'.
Python3
# Open the binary file
file = open("string.bin", "rb")
Reading the first three bytes
This line reads the first three bytes from the binary file and stores them in the variable "data". The read(3) call reads up to three bytes from the file and advances the file pointer accordingly.
Python3
data = file.read(3)
Print data using a "while" loop
The loop will keep reading and printing three bytes at a time until the end of the file is reached. Once the end of the file is reached, the read() method will return an empty bytes object, which evaluates to False in the while loop condition, and the loop will terminate.
Python3
while data:
    print(data)
    data = file.read(3)
Close the Binary File
Finally, after the loop has finished reading and printing the data, we close the binary file using the ‘close()’ method to release system resources.
Python3
file.close()
Combining the above steps, we get the complete program below. Its output depends on the content of the "string.bin" binary file: the code reads and prints the data in chunks of three bytes until the end of the file is reached, so each iteration of the loop prints up to three bytes.
Python3
# Open the binary file
file = open("string.bin", "rb")

# Reading the first three bytes from the binary file
data = file.read(3)

# Printing data by iterating with while loop
while data:
    print(data)
    data = file.read(3)

# Close the binary file
file.close()
For example, if the content of "string.bin" is b'GeeksForGeeks' (a sequence of 13 bytes), the output will be:
Output:
b'Gee'
b'ksF'
b'orG'
b'eek'
b's'
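The loop above yields immutable bytes objects. If a mutable byte array is needed, one option is to collect the chunks into a bytearray. The sketch below assumes a small sample file that it creates itself in a temporary directory; the file name and contents are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "string.bin")

    # Create a sample file so the example is self-contained
    with open(path, "wb") as f:
        f.write(b"GeeksForGeeks")

    # Collect three-byte chunks into a mutable bytearray
    buffer = bytearray()
    chunks = []
    with open(path, "rb") as f:
        data = f.read(3)
        while data:
            chunks.append(data)
            buffer.extend(data)
            data = f.read(3)

print(chunks)        # [b'Gee', b'ksF', b'orG', b'eek', b's']
print(len(buffer))   # 13

# Unlike bytes, a bytearray can be modified in place
buffer[0] = ord("g")
print(buffer)        # bytearray(b'geeksForGeeks')
```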
Read Binary files in Chunks
To read binary file data in chunks, we use a while loop that reads the binary data from the file in chunks of a specified size (chunk_size). The loop continues until the end of the file is reached, and each chunk of data is processed accordingly.
In the code below, chunk_size = 10 specifies the size of each chunk to read from the binary file. The line file = open("binary_file.bin", "rb") opens the binary file named "binary_file.bin" in binary read mode ("rb"). while True sets up a loop that keeps reading the file in chunks. Inside the loop, chunk = file.read(chunk_size) reads the next chunk of binary data, and the loop exits with break once read() returns an empty bytes object at the end of the file.
Python3
# Specify the size of each chunk to read
chunk_size = 10

file = open("binary_file.bin", "rb")

# Using while loop to iterate the file data
while True:
    chunk = file.read(chunk_size)
    if not chunk:
        break
    # Processing the chunk of binary data
    print(f"Read {len(chunk)} bytes: {chunk}")

# Close the file once all chunks have been read
file.close()
The output of the code depends on the content of the "binary_file.bin" binary file and the specified chunk_size. For example, if the file contains the binary data b'Hello, this is binary data!' and chunk_size is set to 10, the output will be:
Output:
Read 10 bytes: b'Hello, thi'
Read 10 bytes: b's is binar'
Read 7 bytes: b'y data!'
The output varies with the contents of the binary file being read and with the chunk size specified.
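Chunked reading is mainly useful when a file is too large to load at once. As a sketch of one possible use, the helper below (read_in_chunks is a name invented for this example) wraps the loop in a generator and feeds each chunk to a SHA-256 hash object, so the whole file never has to be held in memory. The sample file is created in a temporary directory to keep the example self-contained.

```python
import hashlib
import os
import tempfile


def read_in_chunks(path, chunk_size=1024):
    """Yield successive chunks of at most chunk_size bytes from a file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "binary_file.bin")
    with open(path, "wb") as f:
        f.write(b"Hello, this is binary data!")

    # Feed the file to a hash object one chunk at a time
    digest = hashlib.sha256()
    sizes = []
    for chunk in read_in_chunks(path, chunk_size=10):
        sizes.append(len(chunk))
        digest.update(chunk)

print(sizes)               # [10, 10, 7]
print(digest.hexdigest())  # same digest as hashing the whole file at once
```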
Read Binary file Data into Array
To demonstrate reading a binary file into an array, we first create one. Open a file named "array" in "wb" (binary write) mode, define the list num = [3, 6, 9, 12, 18], and convert it to bytes with bytearray() before writing it to the file.
To write an array to the file we use:
Python3
file = open("array", "wb")
num = [3, 6, 9, 12, 18]
array = bytearray(num)
file.write(array)
file.close()
To read the array back, open the same file in "rb" mode: file = open("array", "rb"). The read() method reads bytes from the file, and list() converts them into a list of integers, so number = list(file.read(3)) holds the first three values.
Python3
file = open("array", "rb")
number = list(file.read(3))
print(number)
file.close()
Output:
[3, 6, 9]
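bytearray() only accepts values from 0 to 255, so the approach above does not work for larger integers. One possible alternative, sketched here with the standard struct module, packs each number into a fixed-size field. The format strings "<5i" and "<3i" (little-endian 4-byte signed integers), the sample numbers, and the temporary file location are choices made for this illustration.

```python
import os
import struct
import tempfile

numbers = [3, 600, 9000, 120000, 18]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "array")

    # Pack five little-endian 4-byte signed integers and write them
    with open(path, "wb") as f:
        f.write(struct.pack("<5i", *numbers))

    # Read the first three integers back (3 values x 4 bytes each)
    with open(path, "rb") as f:
        raw = f.read(3 * 4)

first_three = list(struct.unpack("<3i", raw))
print(first_three)   # [3, 600, 9000]
```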
Read Binary files in Python using NumPy
To read a binary file into a NumPy array, import the NumPy module and use np.fromfile(). Here the dtype is np.uint8, which stands for "unsigned 8-bit integer". This means that each item in the array is an 8-bit (1-byte) integer with a value ranging from 0 to 255.
Python3
import numpy as np

# Open the file in binary mode
with open('myfile.bin', 'rb') as f:
    # Read the data into a NumPy array
    # Change dtype according to your data
    array = np.fromfile(f, dtype=np.uint8)

# Show the array together with its dtype
print(repr(array))
Remember to replace 'myfile.bin' with the path to your own binary file. For a file containing the ten bytes 1 through 10, the output is:
Output:
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], dtype=uint8)