Jupyter Notebook is a versatile interactive computing tool widely used by data scientists, researchers, and programmers for data analysis, dashboards, and visualizations. Among its many features are running code in different languages, timing code execution, debugging, and profiling. In this article, we'll discuss the magic commands used for profiling in Jupyter Notebook. Magic commands are special commands that start with either % or %% and perform a variety of tasks. Profiling helps in identifying bottlenecks, optimizing code, and improving overall performance. Before we jump in, we first need to understand two key concepts: what magic commands are, and what profiling means in Jupyter Notebook.
What are Magic Commands?
Magic commands are special commands in Jupyter Notebook, denoted by a % or %% prefix. Commands with % operate on a single line, while commands with %% operate on an entire cell, that is, on multi-line code. These magic commands help us perform a wide range of tasks that go beyond standard Python capabilities. They act as shortcuts that make complex operations simple, which makes our notebook work more productive and efficient.
Types of Magic Commands
Magic commands in Jupyter Notebook are divided into two categories: line magic commands, which start with %, and cell magic commands, which start with %%.
Line Magic Commands: These operate on a single line of code. For example: %timeit, %memit (provided by the memory_profiler extension), %load, %reset, and %who.
Cell Magic Commands: These affect the entire cell. For example: %%time, %%writefile, %%html, %%latex, and %%bash.
How to Use the Magic Commands
Using magic commands in Jupyter Notebook is straightforward: we prefix our code with the appropriate magic command, and Jupyter Notebook takes care of the rest. A line magic goes at the start of the line it applies to, while a cell magic must be the first line of its cell. These commands can be used for many different tasks, such as measuring code execution time, profiling memory usage, debugging, and more.
What Is Profiling and Its Types
Profiling is the process of analyzing code performance and resource utilization. There are two main types of profiling:
Code Profiling: Code profiling involves measuring the execution time of various parts of our code to identify bottlenecks or areas where optimization is needed.
Memory Profiling: Memory profiling focuses on tracking how our code uses system memory. It also helps in identifying memory leaks and inefficient memory usage patterns.
In short, profiling measures the performance of code: how long it takes to run and how much memory it uses. To profile a single statement, we can use the %prun magic command; to profile a whole cell, we can use %%prun. Another magic command, %lprun, profiles code line by line. It comes from the line_profiler extension and is mostly used to pinpoint the specific lines that cause performance issues.
Profiling Jupyter Notebook
Now let's walk through the steps needed to profile code in Jupyter Notebook.
1. Install the Required Libraries
First, we need to make sure the necessary libraries are installed on our system. %lprun requires the line_profiler package, while %prun relies on cProfile, which ships with Python's standard library and needs no installation. If line_profiler is not installed yet, we can install it from the notebook with the following command:
!pip install line_profiler
2. Load the Magic Commands
Next, we load the line_profiler extension, which registers the %lprun magic command. %prun is built into IPython, so it needs no loading step (cProfile is not an IPython extension, and trying to load it with %load_ext fails):
Python3
%load_ext line_profiler
3. Profiling the Code
Now we'll use %lprun to profile code line by line and %prun to profile at the function level. Let's start with %lprun, whose -f option names the function to be profiled line by line, in a small example:
Python3
%load_ext line_profiler

def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)

%lprun -f fib fib(20)
Output:
Timer unit: 1e-09 s
Total time: 0.00622773 s
File: /tmp/ipykernel_601603/4220250964.py
Function: fib at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 def fib(n):
4 21891 3171786.0 144.9 50.9 if n < 2:
5 10946 1101316.0 100.6 17.7 return n
6 else:
7 10945 1954627.0 178.6 31.4 return fib(n-1) + fib(n-2)
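When we run the above code, the report lists the following for each line of the profiled function: Line # (the line number in the cell), Hits (how many times the line ran), Time (total time spent on the line, in timer units), Per Hit (average time per execution), % Time (the line's share of the function's total time), and Line Contents (the source code).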
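The derived columns can be checked by hand. The following plain-Python sketch recomputes Per Hit and % Time from the Hits and Time values of the fib report above, assuming the stated timer unit of 1e-09 s. The names TIMER_UNIT and rows are made up for this illustration and are not part of line_profiler.

```python
# Hits and Time (in timer units of 1e-09 s) copied from the %lprun report above.
TIMER_UNIT = 1e-09
rows = {
    4: (21891, 3171786.0),  # if n < 2:
    5: (10946, 1101316.0),  # return n
    7: (10945, 1954627.0),  # return fib(n-1) + fib(n-2)
}

# The function's total time is the sum of the per-line times.
total_time = sum(time for _, time in rows.values())
print(f"Total time: {total_time * TIMER_UNIT:.6g} s")

for line_no, (hits, time) in rows.items():
    per_hit = time / hits               # average timer units per execution
    pct_time = 100 * time / total_time  # share of the function's total time
    print(f"line {line_no}: per hit {per_hit:.1f}, % time {pct_time:.1f}")
```

The recomputed values agree with the Per Hit and % Time columns of the report, which shows that both columns are derived from Hits and Time.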
Examples
There are many magic commands that are useful for different scenarios. Here we will explore some of the commands that are dedicated to profiling.
Line-by-Line Profiling (%lprun)
Suppose we have a function calculate_sum that adds all the elements of a list. We suspect a performance issue in it and want to profile it. Let's see how we can use %lprun (the line_profiler extension must already be loaded with %load_ext line_profiler):
Python3
def calculate_sum(my_list):
    return sum(my_list)

my_list = [1, 2, 3, 4]
%lprun -f calculate_sum calculate_sum(my_list)
Output:
Timer unit: 1e-07 s
Total time: 2.4e-06 s
File: C:\Users\GFG0265\AppData\Local\Temp\ipykernel_32000\3964288041.py
Function: calculate_sum at line 1
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1 def calculate_sum(my_list):
2 1 24.0 24.0 100.0 return sum(my_list)
Function-Level Profiling (%prun)
Imagine we have a complex function process_data that we want to profile. We can use %prun like this (the function body here is only a placeholder):
Python3
def process_data():
    # Code
    pass

%prun process_data()
Output:
4 function calls in 0.000 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 2018323565.py:1(process_data)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Profiling a Complex Script
Suppose we have a script that involves multiple functions, and we want to identify which functions consume the most time during execution. Let's look at a complete code example and its output:
Python3
def slow_function():
    total = 0
    for i in range(1000000):
        total += i
    return total

def fast_function():
    return sum(range(1000000))

def main():
    result1 = slow_function()
    result2 = fast_function()
    print("Result 1:", result1)
    print("Result 2:", result2)

if __name__ == "__main__":
    main()
Output:
Result 1: 499999500000
Result 2: 499999500000
To profile the main() function, we use the %%prun cell magic with the -s cumulative option, which sorts the report by cumulative time. The report shows which functions account for the most cumulative time (the time spent in a function together with everything it calls), pointing us to the parts of the code worth optimizing.
Python3
%%prun -s cumulative
main()
Output:
81 function calls in 0.138 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.138 0.138 {built-in method builtins.exec}
1 0.000 0.000 0.138 0.138 <string>:1(<module>)
1 0.000 0.000 0.138 0.138 2450259892.py:10(main)
1 0.098 0.098 0.098 0.098 2450259892.py:1(slow_function)
1 0.000 0.000 0.039 0.039 2450259892.py:7(fast_function)
1 0.038 0.038 0.038 0.038 {built-in method builtins.sum}
2 0.000 0.000 0.001 0.001 {built-in method builtins.print}
8 0.000 0.000 0.001 0.000 iostream.py:610(write)
8 0.001 0.000 0.001 0.000 iostream.py:532(_schedule_flush)
1 0.000 0.000 0.000 0.000 iostream.py:243(schedule)
1 0.000 0.000 0.000 0.000 socket.py:543(send)
8 0.000 0.000 0.000 0.000 iostream.py:505(_is_master_process)
1 0.000 0.000 0.000 0.000 threading.py:1185(is_alive)
1 0.000 0.000 0.000 0.000 threading.py:1118(_wait_for_tstate_lock)
8 0.000 0.000 0.000 0.000 {method 'write' of '_io.StringIO' objects}
8 0.000 0.000 0.000 0.000 {built-in method nt.getpid}
1 0.000 0.000 0.000 0.000 {method 'acquire' of '_thread.lock' objects}
8 0.000 0.000 0.000 0.000 {method '__exit__' of '_thread.RLock' objects}
1 0.000 0.000 0.000 0.000 iostream.py:127(_event_pipe)
8 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
8 0.000 0.000 0.000 0.000 {built-in method builtins.len}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {method 'append' of 'collections.deque' objects}
1 0.000 0.000 0.000 0.000 threading.py:568(is_set)
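Most rows in the report above come from print and the notebook's output machinery rather than from our own functions. As a rough sketch of how the same kind of report can be produced and filtered outside a notebook, the following uses the standard-library cProfile and pstats modules. The functions mirror the example above, except that main() returns its results instead of printing them; the regular expression used as a filter is an assumption chosen for this example.

```python
import cProfile
import io
import pstats


def slow_function():
    total = 0
    for i in range(1_000_000):
        total += i
    return total


def fast_function():
    return sum(range(1_000_000))


def main():
    # Returns the results instead of printing them (a change from the example above).
    return slow_function(), fast_function()


# Collect statistics with the standard-library profiler.
profiler = cProfile.Profile()
profiler.enable()
results = main()
profiler.disable()

# Sort by cumulative time and print only the entries whose name matches the
# regular expression, which keeps the report focused on our own functions.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(r"slow_function|fast_function|main")
report = buffer.getvalue()
print(report)
```

Passing an integer to print_stats instead of a pattern limits the report to that many rows, which is another way to cut down the noise.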
Code Timing (%timeit)
The %timeit magic command measures the execution time of a single line of code (its cell-level counterpart is %%timeit). It runs the code many times and reports the mean time per loop along with the standard deviation across runs.
Python3
%timeit sum(range(1, 1000))
Output:
8.52 µs ± 21.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
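For comparison, a similar measurement can be sketched in plain Python with the standard-library timeit module. The run count of 7 mirrors the output above; the loop count is fixed by hand here as an assumption, whereas %timeit picks it automatically, and the population standard deviation is used as one reasonable choice. The constants RUNS and LOOPS are names invented for this sketch.

```python
import statistics
import timeit

# Assumed settings: 7 runs as in the output above, with a fixed loop count
# chosen by hand to keep the example quick.
RUNS = 7
LOOPS = 10_000

# Each entry is the total time, in seconds, for LOOPS executions of the statement.
totals = timeit.repeat("sum(range(1, 1000))", repeat=RUNS, number=LOOPS)
per_loop = [t / LOOPS for t in totals]

mean = statistics.mean(per_loop)
stdev = statistics.pstdev(per_loop)
print(f"{mean * 1e6:.2f} µs ± {stdev * 1e9:.1f} ns per loop "
      f"(mean ± std. dev. of {RUNS} runs, {LOOPS:,} loops each)")
```

The absolute numbers depend on the machine and Python version, so they will differ from the output shown above.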
%memit for Memory Profiling
The %memit magic command lets us profile the memory usage of a single line of code. It's particularly helpful in identifying memory-intensive operations.
Step 1: Install the memory_profiler package if it isn't installed already.
!pip install memory-profiler
This installs memory-profiler, but one more step is needed before we can use %memit.
Step 2: Load the extension in our Jupyter Notebook.
Python3
%load_ext memory_profiler
Now that the extension is loaded, let's use it.
Python3
def some_memory_intensive_function():
    my_list = [0] * 1000000

%memit some_memory_intensive_function()
Output:
peak memory: 70.90 MiB, increment: 0.32 MiB
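If installing memory-profiler is not an option, the standard-library tracemalloc module offers a rough alternative. The sketch below is not how %memit works: tracemalloc counts only memory blocks allocated by Python while tracing is on, whereas %memit reports the memory of the whole process, so the two sets of numbers are not directly comparable. The function returns the list length only to give the example a visible result.

```python
import tracemalloc


def some_memory_intensive_function():
    my_list = [0] * 1_000_000
    return len(my_list)


# Trace Python-level allocations while the function runs.
tracemalloc.start()
some_memory_intensive_function()
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()

print(f"current: {current / 2**20:.2f} MiB, peak: {peak / 2**20:.2f} MiB")
```

Here peak reflects the list that existed while the function was running, while current is much smaller because the list is freed once the function returns.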
Table of Magic Commands
Magic command | Short description |
---|---|
%timeit | Measure execution time of a single line of code |
%memit | Profile memory usage of a single line of code |
%prun | Profile code execution using Python's profiler |
%%time | Measure execution time of an entire cell |
%%writefile | Write the contents of a cell to a file |
%%html | Render the cell as HTML |
%%latex | Render the cell as LaTeX |
%%bash | Run cell contents as a Bash script |
Conclusion
Magic commands in Jupyter Notebook make profiling Python code much easier. Whether we need to analyze individual lines or entire functions, %lprun, %prun, %timeit, and %memit provide valuable insights that help us optimize our code for better performance, while commands such as %%time, %%writefile, %%html, %%latex, and %%bash cover other common notebook tasks. Profiling is an essential skill for any programmer or data scientist, and with these magic commands we can streamline the process and write more efficient code.