Speed up your Python code with NumPy

Image by storyset on Freepik

NumPy is a Python package widely used for mathematical and statistical applications. However, not everyone knows that NumPy can also speed up the execution of Python code.

There are several reasons why NumPy can speed up the execution of Python code, including:

  • NumPy runs its loops in compiled C code instead of the Python interpreter
  • Better use of the CPU cache, because array elements are stored contiguously in memory
  • Efficient algorithms for mathematical operations
  • The ability to use parallel operations, such as SIMD instructions
  • Memory efficiency with large datasets and complex calculations
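
The memory and caching points above can be illustrated with a small sketch. It is only an approximation: the size `n`, the variable names, and the explicit `int64` dtype are assumptions made for this illustration, and `sys.getsizeof` reports CPython-specific object sizes.

```python
import sys
import numpy as np

n = 100_000  # illustrative size, not taken from the article

# A Python list stores pointers to separately allocated int objects.
py_list = list(range(n))
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# A NumPy array stores the raw values in one block of memory.
# The dtype is fixed explicitly so the element size does not depend on the platform.
np_array = np.arange(n, dtype=np.int64)
array_bytes = np_array.nbytes  # n elements * 8 bytes each

print("Approximate list size in bytes:", list_bytes)
print("NumPy array data size in bytes:", array_bytes)
print("Array is contiguous in memory:", np_array.flags["C_CONTIGUOUS"])
```

The list total counts both the pointer table and the individual integer objects, which is why it comes out larger than the array's data buffer.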

This tutorial walks through examples that show how NumPy speeds up code execution. Let’s dive in.

Accelerating Python Code Execution with NumPy

The first example compares numeric operations on Python lists and NumPy arrays that produce the same values.

Suppose we want to add two sequences of numbers element by element, which is a vector operation. We can try the experiment with the following code:

import numpy as np
import time

sample = 1000000

# Element-wise addition with plain Python lists
list_1 = list(range(sample))
list_2 = list(range(sample))
start_time = time.time()
result = [x + y for x, y in zip(list_1, list_2)]
print("Time taken using Python lists:", time.time() - start_time)

# The same addition as a single vectorized NumPy operation
array_1 = np.arange(sample)
array_2 = np.arange(sample)
start_time = time.time()
result = array_1 + array_2
print("Time taken using NumPy arrays:", time.time() - start_time)
Output>>
Time taken using Python lists: 0.18960118293762207
Time taken using NumPy arrays: 0.02495265007019043

As the output above shows, the NumPy array addition is faster than the Python list version while producing the same values, roughly seven to eight times faster in this run.
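
A single `time.time()` measurement can be noisy, so the comparison can also be sketched with the standard library's `timeit` module. The helper names `add_lists` and `add_arrays` and the choice of five repeats are assumptions for this illustration, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

sample = 1_000_000
list_1 = list(range(sample))
list_2 = list(range(sample))
array_1 = np.arange(sample, dtype=np.int64)
array_2 = np.arange(sample, dtype=np.int64)

def add_lists():
    # Pure Python: one interpreted addition per element
    return [x + y for x, y in zip(list_1, list_2)]

def add_arrays():
    # NumPy: one vectorized addition over the whole array
    return array_1 + array_2

# Run each function five times and keep the fastest run,
# which is the one least disturbed by other activity on the machine.
list_time = min(timeit.repeat(add_lists, number=1, repeat=5))
array_time = min(timeit.repeat(add_arrays, number=1, repeat=5))

print(f"Python lists: {list_time:.5f} s")
print(f"NumPy arrays: {array_time:.5f} s")
print(f"Speedup: {list_time / array_time:.1f}x")
```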

Next, let’s see what happens when we perform aggregate statistical analysis.

array = np.arange(1000000)

start_time = time.time()
sum_rst = np.sum(array)
mean_rst = np.mean(array)
print("Time taken for aggregation functions:", time.time() - start_time)
Output>> 
Time taken for aggregation functions: 0.0029935836791992188

NumPy computes the aggregate functions quickly. Running the same aggregation in pure Python shows the difference in execution time.

list_1 = list(range(1000000))

start_time = time.time()
sum_rst = sum(list_1)
mean_rst = sum(list_1) / len(list_1)
print("Time taken for aggregation functions (Python):", time.time() - start_time)
Output>>
Time taken for aggregation functions (Python): 0.09979510307312012

With the same result, Python’s built-in functions take much longer than NumPy, about 33 times longer in this run. With a much larger dataset, Python would take much longer to complete than NumPy.
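
The claim that both approaches give the same result can be checked with a short sketch. The variable names and the explicit `int64` dtype are assumptions for this illustration; the dtype is pinned so the sum cannot overflow on platforms where the default integer is 32 bits wide. The sketch also reuses the Python sum for the mean instead of computing it twice.

```python
import numpy as np

n = 1_000_000
values = list(range(n))
array = np.arange(n, dtype=np.int64)  # explicit dtype avoids 32-bit overflow

# Pure Python aggregation
py_sum = sum(values)
py_mean = py_sum / len(values)  # reuse the sum rather than summing again

# NumPy aggregation
np_sum = np.sum(array)
np_mean = np.mean(array)

print("Sums match:", py_sum == np_sum)
print("Means match:", np.isclose(py_mean, np_mean))
```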

Another example is in-place operations, where NumPy is again much faster than the pure Python equivalent.

# In-place increment of every element in a NumPy array
array = np.arange(1000000)
start_time = time.time()
array += 1
print("Time taken for in-place operation:", time.time() - start_time)

# The same update on a Python list, one element at a time
list_1 = list(range(1000000))
start_time = time.time()
for i in range(len(list_1)):
    list_1[i] += 1
print("Time taken for in-place list operation:", time.time() - start_time)
Output>>
Time taken for in-place operation: 0.0010089874267578125
Time taken for in-place list operation: 0.1937870979309082

The takeaway is that if an operation can be done with NumPy, it is usually the better choice because it runs much faster.
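
To make clear what "in-place" means for a NumPy array, here is a minimal sketch on a tiny array. The names `array` and `original` are chosen for this illustration. It contrasts `+=`, which modifies the existing buffer, with `array = array + 1`, which allocates a new array, and shows the `out=` argument of `np.add` as another way to write into an existing array.

```python
import numpy as np

array = np.arange(5, dtype=np.int64)
original = array  # second name for the same array object

# In-place: the existing buffer is modified, so both names see the change.
array += 1
print("Same object after +=:", original is array)

# Out-of-place: a new array is allocated and the name `array` is rebound.
array = array + 1
print("Same object after array = array + 1:", original is array)

# np.add with out= writes the result into an existing array.
np.add(original, 10, out=original)
print(original)
```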

We can try a more complex implementation using matrix multiplication to see how fast NumPy is compared to Python.

def python_matrix_multiply(A, B):
    # Triple nested loop over rows of A, columns of B, and the shared dimension
    result = [[0 for _ in range(len(B[0]))] for _ in range(len(A))]
    for i in range(len(A)):
        for j in range(len(B[0])):
            for k in range(len(B)):
                result[i][j] += A[i][k] * B[k][j]
    return result

def numpy_matrix_multiply(A, B):
    return np.dot(A, B)

# Two random n x n matrices stored as nested Python lists
n = 200
A = [[np.random.rand() for _ in range(n)] for _ in range(n)]
B = [[np.random.rand() for _ in range(n)] for _ in range(n)]

# NumPy copies of the same matrices
A_np = np.array(A)
B_np = np.array(B)

start_time = time.time()
python_result = python_matrix_multiply(A, B)
print("Time taken for Python matrix multiplication:", time.time() - start_time)

start_time = time.time()
numpy_result = numpy_matrix_multiply(A_np, B_np)
print("Time taken for NumPy matrix multiplication:", time.time() - start_time)
Output>>
Time taken for Python matrix multiplication: 1.8010151386260986
Time taken for NumPy matrix multiplication: 0.008051872253417969

As you can see, NumPy's advantage over standard Python code is even larger for more complex operations such as matrix multiplication, where it is more than 200 times faster in this run.
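
A speed comparison only means something if both versions compute the same product, so the following sketch checks that on small matrices. The matrix shapes, the seed, and the use of `np.random.default_rng` are assumptions for this illustration, not part of the benchmark above.

```python
import numpy as np

def python_matrix_multiply(A, B):
    # Pure Python triple loop over nested lists
    rows, inner, cols = len(A), len(B), len(B[0])
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                result[i][j] += A[i][k] * B[k][j]
    return result

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
A_np = rng.random((20, 30))
B_np = rng.random((30, 10))

python_result = python_matrix_multiply(A_np.tolist(), B_np.tolist())
numpy_result = A_np @ B_np  # for 2-D arrays, @ gives the same product as np.dot

# allclose is used because floating-point sums may differ in the last digits.
print("Python and NumPy agree:", np.allclose(python_result, numpy_result))
print("@ and np.dot agree:", np.allclose(numpy_result, np.dot(A_np, B_np)))
```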

We could try many more examples, but in general NumPy should be faster than Python’s built-in functions for numerical work on large arrays like these.

Conclusion

NumPy is a powerful package for mathematical and numerical processing. Compared with Python’s built-in functions, NumPy executes these operations faster. So use NumPy where applicable to speed up your Python code.

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he enjoys sharing Python and data tips via social media and writing media. Cornellius writes on various topics in AI and machine learning.