Monday, November 18, 2024

Python – Get list of files in directory with size

In this article, we are going to see how to get the list of files in a directory along with the size of each file. For this, we will use the os module.

The os module in Python provides functions for interacting with the operating system. It is part of Python’s standard library and provides a portable way of using operating system-dependent functionality. os.path is a submodule of os used for common path name manipulation.

Functions Used

  • os.path.isfile() method in Python is used to check whether the specified path is an existing regular file or not.

Syntax: os.path.isfile(path)

Parameter:  

  • path:  path-like object representing a file system path. A path-like object is a string, a bytes object, or an object implementing os.PathLike (such as pathlib.Path).

Return Type: This method returns a Boolean value of class bool. This method returns True if specified path is an existing regular file, otherwise returns False. 
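
As a small illustration of the behaviour described above, the sketch below builds a temporary directory and checks three kinds of path. The names notes.txt, chapters and missing.txt are made up for the example.

```python
import os
import tempfile

# Build a throwaway directory holding one file and one sub-directory.
# All names below are hypothetical and exist only for this example.
with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "notes.txt")
    dir_path = os.path.join(tmp, "chapters")
    with open(file_path, "w") as fh:
        fh.write("hello")
    os.mkdir(dir_path)

    is_file = os.path.isfile(file_path)            # an existing regular file
    is_dir_a_file = os.path.isfile(dir_path)       # a directory, not a file
    is_missing_a_file = os.path.isfile(os.path.join(tmp, "missing.txt"))

print(is_file, is_dir_a_file, is_missing_a_file)   # True False False
```

Note that a path that does not exist simply gives False; no exception is raised.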

  • os.path.join() method in Python joins one or more path components intelligently. It concatenates the components with exactly one directory separator (os.sep, which is ‘/’ on POSIX systems and ‘\’ on Windows) following each non-empty part except the last path component. If the last path component to be joined is empty, the result ends with a directory separator. If a path component represents an absolute path, then all previous components are discarded and joining continues from the absolute path component.

Syntax: os.path.join(path, *paths)  

Parameter:  

  • path:  A path-like object representing a file system path.  
  • *paths: Further path-like objects representing the path components to be joined.
  • A path-like object is a string, a bytes object, or an object implementing os.PathLike (such as pathlib.Path).
  • Note: The special syntax *args (here *paths) in a function definition in Python lets the function accept a variable number of arguments.

Return Type: This method returns a string which represents the concatenated path components.  
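
The joining rules above can be tried out with a short sketch. Because os.path.join() uses the separator of the platform it runs on, this example calls posixpath.join() and ntpath.join() directly (the POSIX and Windows implementations behind os.path) so that the results are the same everywhere. The path components are made up for illustration.

```python
import ntpath      # the Windows implementation of os.path
import posixpath   # the POSIX implementation of os.path

# One separator is placed between non-empty components.
joined = posixpath.join("home", "user", "books")

# An empty last component leaves a trailing separator.
trailing = posixpath.join("home", "user", "")

# An absolute component discards everything before it.
restarted = posixpath.join("home", "/etc", "hosts")

# On Windows the separator is a backslash.
windows = ntpath.join("D:\\Books", "notes.pdf")

print(joined)      # home/user/books
print(trailing)    # home/user/
print(restarted)   # /etc/hosts
print(windows)     # D:\Books\notes.pdf
```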

  • os.listdir(): This method in python is used to get the list of all files and directories in the specified directory. If we don’t specify any directory, then a list of files and directories in the current working directory will be returned.

Syntax: os.listdir(path)

Parameters:  

  • path (optional) : path of the directory

Return Type: This method returns the list of all files and directories in the specified path. The return type of this method is list.
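
A minimal sketch of os.listdir(), using a temporary directory with made-up entry names. The returned list holds names only (not full paths) and comes back in arbitrary order, so the example sorts it before printing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Two hypothetical files and one hypothetical sub-directory.
    for name in ("b.txt", "a.txt"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(name)
    os.mkdir(os.path.join(tmp, "sub"))

    # listdir() mixes files and directories and gives names only.
    entries = sorted(os.listdir(tmp))

print(entries)   # ['a.txt', 'b.txt', 'sub']
```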

  • filter(): This built-in function filters the given iterable with the help of a function that tests whether each element satisfies a condition.

Syntax: filter(function, sequence)

Parameters:

  • function: function that tests whether each element of the iterable is true or not.
  • sequence: the iterable to be filtered; it can be a set, list, tuple, or any other iterable or iterator.

Returns: an iterator over the elements for which the function returned true.
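
The sketch below shows filter() on a made-up list of sizes. It also shows a detail worth keeping in mind for the programs later in this article: the result is an iterator, so it can be consumed only once.

```python
# Hypothetical sizes in bytes; zero stands for an empty file.
sizes = [0, 120, 0, 4096, 17]

# Keep only the sizes greater than zero.
non_empty = filter(lambda n: n > 0, sizes)

result = list(non_empty)     # consumes the iterator
leftover = list(non_empty)   # nothing is left the second time

print(result)     # [120, 4096, 17]
print(leftover)   # []
```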

  • os.stat(): This method in Python performs stat() system call on the specified path. This method is used to get the status of the specified path.

Syntax: os.stat(path)

Parameter:

  • path: A string or bytes object representing a valid path

Returns : an os.stat_result object. Its st_size attribute holds the size of the file in bytes.
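
As a quick check of st_size, the following sketch writes a known number of bytes to a temporary file (the name sample.bin is made up) and reads the size back with os.stat().

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")   # hypothetical file name
    with open(sample, "wb") as fh:
        fh.write(b"\x00" * 2048)               # write exactly 2048 bytes

    info = os.stat(sample)          # an os.stat_result object
    size_in_bytes = info.st_size    # size of the file in bytes

print(size_in_bytes)   # 2048
```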

  • os.walk(): This generates the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at the directory top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames).
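
To make the 3-tuples concrete, here is a small sketch that walks a temporary tree containing one file at the top and one file in a sub-directory. The names root.txt, sub and leaf.txt are made up, and the directory paths are shown relative to the top directory to keep the output short.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Hypothetical layout: top/root.txt and top/sub/leaf.txt
    os.makedirs(os.path.join(top, "sub"))
    for rel in ("root.txt", os.path.join("sub", "leaf.txt")):
        with open(os.path.join(top, rel), "w") as fh:
            fh.write("x")

    visited = []
    # Top-down is the default: the top directory is yielded first.
    for dirpath, dirnames, filenames in os.walk(top):
        visited.append(
            (os.path.relpath(dirpath, top), sorted(dirnames), sorted(filenames))
        )

print(visited)
# [('.', ['sub'], ['root.txt']), ('sub', [], ['leaf.txt'])]
```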

List of files in a directory with size

In this part of the code, we get just the names and sizes of the files. We use the os.stat() function to get the size of each file. The size is reported in bytes, so we divide it by 1024*1024 to get the size in megabytes, which is easier to read.

Python3




# import python modules
import os
 
# directory whose files we want to list with their sizes
# (raw string so the backslash is not treated as an escape character)
path = r"D:\Books"
 
# Keep only the regular files in the given directory
fun = lambda x : os.path.isfile(os.path.join(path,x))
files_list = filter(fun, os.listdir(path))
 
# Create a list of (file name, size in bytes) tuples
size_of_file = [
    (f,os.stat(os.path.join(path, f)).st_size)
    for f in files_list
]
# Iterate over the list and print each file
# with its size converted to megabytes
for f,s in size_of_file:
    print("{} : {}MB".format(f, round(s/(1024*1024),3)))


 
 

Output:

 

2015_Book_LinearAlgebraDoneRight.pdf : 2.199MB
An Introduction to Statistical Learning - Gareth James.pdf : 8.996MB
Hand-on-ML.pdf : 7.201MB
ISLR Seventh Printing.pdf : 11.375MB
The Business of the 21st Century - Robert Kiyosaki.pdf : 8.932MB
The Elements of Statistical Learning - Trevor Hastie.pdf : 12.687MB
the_compound_effect_ebook.pdf : 5.142MB
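
The byte-to-megabyte conversion used in the program above can be pulled out into a small helper. This is only a sketch; the function name to_megabytes and its digits parameter are made up for the example.

```python
def to_megabytes(size_in_bytes, digits=3):
    """Convert a size in bytes to megabytes (1 MB = 1024 * 1024 bytes)."""
    return round(size_in_bytes / (1024 * 1024), digits)

one_and_a_half = to_megabytes(1572864)   # 1.5 * 1024 * 1024 bytes
print("{}MB".format(one_and_a_half))     # 1.5MB
```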

List of file paths in a directory sorted by size

 

The previous code printed only the file names and their sizes. Here we print the full path of each file instead of its name, and we sort the files by size in ascending order. To do this, we use the sorted() function.

 

Syntax: sorted(iterable, key=key, reverse=reverse)

Parameters:

  • iterable : The sequence to sort: list, dictionary, tuple, etc.
  • key : A function to execute to decide the order. Default is None
  • reverse : A Boolean. False will sort ascending, True will sort descending. Default is False
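
A short sketch of sorted() with the key and reverse parameters, applied to a made-up list of (file name, size) tuples of the same shape as the one built in the programs of this article.

```python
# Hypothetical (file name, size in bytes) tuples.
size_of_file = [("b.png", 300), ("a.png", 100), ("c.mp4", 5000)]

# key picks the size (second element) out of each tuple.
ascending = sorted(size_of_file, key=lambda item: item[1])
descending = sorted(size_of_file, key=lambda item: item[1], reverse=True)

print(ascending)    # [('a.png', 100), ('b.png', 300), ('c.mp4', 5000)]
print(descending)   # [('c.mp4', 5000), ('b.png', 300), ('a.png', 100)]
```

sorted() returns a new list and leaves size_of_file unchanged.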

 

Python3




# import python modules
import os
 
# directory whose files we want to list with their sizes
# (raw string so the backslash is not treated as an escape character)
path = r"D:\ABC"
 
# Keep only the regular files in the given directory
fun = lambda x : os.path.isfile(os.path.join(path,x))
 
files_list = filter(fun, os.listdir(path))
 
# Create a list of (file name, size in bytes) tuples
size_of_file = [
    (f,os.stat(os.path.join(path, f)).st_size)
    for f in files_list
]
 
# Key function for sorted(): pick the size
# (second element) out of each tuple
fun = lambda x : x[1]
 
# Print the full path of each file, smallest file first
for f,s in sorted(size_of_file,key = fun):
    print("{} : {}MB".format(os.path.join(path,f),round(s/(1024*1024),3)))


Output: 

D:\ABC\1.png : 0.022MB
D:\ABC\17.png : 0.024MB
D:\ABC\16.png : 0.036MB
D:\ABC\15.png : 0.047MB
D:\ABC\7.png : 0.074MB
D:\ABC\10.png : 0.076MB
D:\ABC\6.png : 0.09MB
D:\ABC\13.png : 0.093MB
D:\ABC\14.png : 0.097MB
D:\ABC\8.png : 0.104MB
D:\ABC\2.png : 0.115MB
D:\ABC\5.png : 0.126MB
D:\ABC\11.mp4 : 5.966MB

List of file names in a directory sorted by size:

This code differs from the previous one in only one respect: it prints each file’s name instead of its path. The rest of the code is the same.

Python3




# import python modules
import os
 
# directory whose files we want to list with their sizes
# (raw string so the backslash is not treated as an escape character)
path = r"D:\ABC"
 
# Keep only the regular files in the given directory
fun = lambda x : os.path.isfile(os.path.join(path,x))
 
files_list = filter(fun, os.listdir(path))
 
# Create a list of (file name, size in bytes) tuples
size_of_file = [
    (f,os.stat(os.path.join(path, f)).st_size)
    for f in files_list
]
 
# Key function for sorted(): pick the size
# (second element) out of each tuple
fun = lambda x : x[1]
 
# Print the name of each file, smallest file first
for f,s in sorted(size_of_file,key = fun):
    print("{} : {}MB".format(f,round(s/(1024*1024),3)))


Output: 

1.png : 0.022MB
17.png : 0.024MB
16.png : 0.036MB
15.png : 0.047MB
7.png : 0.074MB
10.png : 0.076MB
6.png : 0.09MB
13.png : 0.093MB
14.png : 0.097MB
8.png : 0.104MB
2.png : 0.115MB
5.png : 0.126MB
11.mp4 : 5.966MB

List of file paths in a directory and its sub-directories sorted by size:

This code differs from the three programs above: here we show the sizes of the sub-directories as well as the files, each with its path. First, we get the sub-directories and files present in the directory by using the os.walk() function. It returns a generator that yields, for each directory, a 3-tuple of the directory path, the names of its sub-directories, and the names of its files; we take the first tuple, which describes the given directory itself. Then we create a list of the files with their sizes. Next, we compute the size of each sub-directory as the sum of the sizes of the entries directly inside it. Finally, we print the files and sub-directories sorted by size.

Syntax: os.walk(path)

Parameters:

path : The path of the directory at the root of the tree to walk.

Returns : a generator that yields a 3-tuple for every directory in the tree:

  • root : the path of the directory currently being visited.
  • dirs : the names of the sub-directories in root.
  • files : the names of the non-directory files in root.

Python3




# import python modules
import os
 
# directory whose contents we want to list with their sizes
# (raw string so the backslash is not treated as an escape character)
path = r"D:\ABC"
 
# os.walk() returns a generator; its first item describes the
# given directory itself: its path p, its sub-directory names
# sub_dir, and its file names files
walk_method = os.walk(path)
 
# If the directory cannot be read, the generator is empty,
# so fall back to empty lists instead of raising StopIteration
p, sub_dir, files = next(walk_method, (path, [], []))
         
# Create a list of (file name, size in bytes) tuples
size_of_file = [
    (f,os.stat(os.path.join(path, f)).st_size)
    for f in files
]
  
# Add each sub-directory with the combined size of its
# direct entries (nested sub-directories are not descended into)
for sub in sub_dir:
    i = os.path.join(path,sub)
    size = 0
    for k in os.listdir(i):
        size += os.stat(os.path.join(i,k)).st_size
    size_of_file.append((sub,size))
     
# Print the full path of each file and sub-directory,
# smallest first
for f,s in sorted(size_of_file,key = lambda x : x[1]):
    print("{} : {}MB".format(os.path.join(path,f),round(s/(1024*1024),3)))


Output:

D:\ABC\1.png : 0.022MB
D:\ABC\17.png : 0.024MB
D:\ABC\16.png : 0.036MB
D:\ABC\15.png : 0.047MB
D:\ABC\7.png : 0.074MB
D:\ABC\10.png : 0.076MB
D:\ABC\6.png : 0.09MB
D:\ABC\13.png : 0.093MB
D:\ABC\14.png : 0.097MB
D:\ABC\8.png : 0.104MB
D:\ABC\2.png : 0.115MB
D:\ABC\5.png : 0.126MB
D:\ABC\11.mp4 : 5.966MB
D:\ABC\Books : 56.532MB
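
The program above adds up only the entries directly inside each sub-directory. If a sub-directory contains further nested directories, one possible variation is to let os.walk() visit the whole tree and sum every file it finds. The sketch below is an illustration under that assumption, not part of the original program; the helper name directory_size and the file names are made up.

```python
import os
import tempfile

def directory_size(top):
    """Return the total size in bytes of all regular files under top."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            # Skip entries that are not regular files (e.g. broken links).
            if os.path.isfile(file_path):
                total += os.stat(file_path).st_size
    return total

with tempfile.TemporaryDirectory() as top:
    # Hypothetical layout: top/one.bin (100 bytes), top/a/b/two.bin (250 bytes)
    nested = os.path.join(top, "a", "b")
    os.makedirs(nested)
    with open(os.path.join(top, "one.bin"), "wb") as fh:
        fh.write(b"x" * 100)
    with open(os.path.join(nested, "two.bin"), "wb") as fh:
        fh.write(b"x" * 250)

    total_bytes = directory_size(top)

print(total_bytes)   # 350
```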
