Python 3.11 is up to 60% faster than 3.10: a comparison test using bubble sort and a recursive function

A pre-release of Python 3.11 is now available. The changelog says:

"Python 3.11 is between 10-60% faster than Python 3.10. On average, we measured a 1.25x speedup on the standard benchmark suite. See Faster CPython for details." (Python 3.11 Changelog)

Python's speed in production systems has long been criticized, and not without reason: it really is not fast. To work around performance problems, we have always had to convert critical code with tools such as Cython or Tuplex.

Python 3.11 focuses specifically on this kind of optimization. Can we actually verify the official claim of a 1.25x average speedup?

As a data scientist, I am most interested in whether it improves Pandas DataFrame processing.

First, let's try some Fibonacci sequences.

Install the Python 3.11 pre-release

On Windows, you can download the installer from the official website. On Ubuntu, you can install it with apt:

sudo apt install python3.11

We can't use 3.11 for everyday work yet, so create separate virtual environments to keep the two Python versions side by side.

$ virtualenv env10 --python=3.10
$ virtualenv env11 --python=3.11

# To activate the 3.11 environment, run:
$ source env11/bin/activate

How fast is Python 3.11 compared to Python 3.10?

I created a small function to generate some Fibonacci numbers.

def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

Use timeit to measure the execution time of the Fibonacci function above. The following command runs the statement ten times per measurement and reports the best time.

# To generate the n-th Fibonacci number: replace n with an integer (fib is assumed to be saved in fib.py)
python -m timeit -n 10 -s "from fib import fib" "fib(n)"
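
The same measurement can also be made from inside Python with the timeit module, which makes it easier to keep the raw numbers. This is only a sketch: best_time is a hypothetical helper, and n = 20 is an arbitrary choice, not the value behind the results in this post.

```python
import timeit


def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def best_time(n: int, loops: int = 10, repeats: int = 5) -> float:
    """Best average time in seconds of one fib(n) call, over several repeats."""
    # timeit.repeat returns one total per repeat; each total covers `loops` calls
    totals = timeit.repeat(lambda: fib(n), number=loops, repeat=repeats)
    return min(totals) / loops


if __name__ == "__main__":
    print(f"fib(20): {best_time(20):.6f} s per call")
```

Taking the minimum rather than the mean follows the command-line tool, which also reports the best of its repeats.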

Here are the results on Python 3.10 and Python 3.11

Python 3.11 outperforms Python 3.10 in every run. Its execution time is about half that of version 3.10.

What I really want to check is its performance on Pandas tasks. Unfortunately, at the time of writing, NumPy and Pandas do not yet support Python 3.11.

Bubble sort

Since Pandas cannot be benchmarked yet, let's compare the two versions on a common computation instead: the time taken to sort one million numbers. Sorting is one of the most frequently used operations in everyday work, so the result should be a useful reference.

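import random
from timeit import timeit
from typing import List

def bubble_sort(items: List[int]) -> List[int]:
    """Sort items in place with bubble sort and return the same list."""
    n = len(items)

    # After pass i, the largest i + 1 values sit in their final positions at the end
    for i in range(n - 1):
        for j in range(0, n - i - 1):
            # Swap neighbours that are out of order
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]

    return items

numbers = [random.randint(1, 10000) for i in range(1000000)]

# The list is sorted in place, so only the first of the 5 runs sees unsorted input
print(timeit(lambda: bubble_sort(numbers), number=5))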

The above code generates a million random numbers. timeit is set up to measure only the calls to bubble_sort, not the generation of the list.
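
One caveat: bubble_sort sorts its argument in place, so when timeit calls it five times on the same list, only the first call sees unsorted data. A possible variant, sketched here with a much smaller list and a hypothetical helper time_sort, times each run on its own fresh copy. The copy is included in the measurement, but it is cheap next to the sort.

```python
import random
import timeit
from typing import List


def bubble_sort(items: List[int]) -> List[int]:
    n = len(items)
    for i in range(n - 1):
        for j in range(n - i - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items


def time_sort(size: int = 1000, runs: int = 5) -> List[float]:
    """Time `runs` bubble sorts, each on a fresh unsorted copy of the same data."""
    numbers = [random.randint(1, 10000) for _ in range(size)]
    # list(numbers) builds a new copy on every call, so the original stays unsorted
    return timeit.repeat(lambda: bubble_sort(list(numbers)), number=1, repeat=runs)


if __name__ == "__main__":
    print(time_sort())
```

The size of 1000 is an assumption chosen so the sketch finishes quickly; it is not the size used for the results in this post.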

The results are as follows:

Python 3.11 takes only 21 seconds to sort, while 3.10 takes 39 seconds.
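
To put those two timings on the same scale as the changelog's claim, they can be converted into a speedup factor and a percentage reduction. The helper names below are hypothetical; the inputs are the 39 and 21 seconds reported above.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new run is (old time divided by new time)."""
    return old_seconds / new_seconds


def time_saved_percent(old_seconds: float, new_seconds: float) -> float:
    """Reduction in run time, as a percentage of the old time."""
    return (old_seconds - new_seconds) / old_seconds * 100


if __name__ == "__main__":
    # 39 / 21 is about 1.86
    print(f"speedup: {speedup(39, 21):.2f}x")
    # (39 - 21) / 39 is about 46%
    print(f"time saved: {time_saved_percent(39, 21):.0f}%")
```

By this measure the bubble sort run is roughly 1.86x faster, which is above the 1.25x average quoted in the changelog.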

Are there performance differences in I/O operations?

Is there a difference between the two versions in how fast they read from and write to disk? I/O performance matters a lot when Pandas loads a DataFrame or a deep learning job reads its data.

Two programs are prepared here. The first one writes 100,000 files to disk and repeats this ten times, for one million writes in total.

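import os
from timeit import timeit

# Make sure the target directory exists before writing into it
os.makedirs("./data", exist_ok=True)

statement = """
for i in range(100000):
    with open(f"./data/a{i}.txt", "w") as f:
        f.write('a')
"""

# number=10 runs the whole loop ten times; timeit returns the total time in seconds
print(timeit(statement, number=10))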

We use timeit to print the duration. The number parameter sets how many times the task is repeated; timeit returns the total time, so divide it by number to get the average.
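
A scaled-down sketch of that averaging step, under some assumptions: it writes to a temporary directory rather than ./data, uses far fewer files, and write_files and average_write_time are hypothetical names.

```python
import os
import tempfile
import timeit


def write_files(directory: str, count: int) -> None:
    """Write `count` one-character text files into `directory`."""
    for i in range(count):
        with open(os.path.join(directory, f"a{i}.txt"), "w") as f:
            f.write("a")


def average_write_time(count: int = 100, number: int = 10) -> float:
    """Average seconds per run: timeit's total divided by the number of runs."""
    with tempfile.TemporaryDirectory() as directory:
        total = timeit.timeit(lambda: write_files(directory, count), number=number)
    return total / number


if __name__ == "__main__":
    print(f"{average_write_time():.6f} s per run")
```

The temporary directory is removed automatically when the measurement finishes, so the sketch leaves no files behind.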

The second program also uses timeit, but it only reads the files back: again a million reads in total (100,000 files, ten times).

from glob import glob
from timeit import timeit

file_paths = glob("./data/*.txt")

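statement = """
for path in file_paths:
    with open(path, "r") as f:
        f.read()
"""

# Pass the path list in through globals instead of pasting it into the source string
print(timeit(statement, number=10, globals={"file_paths": file_paths}))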

Here is the output from the two versions.

Python 3.10 appears to have a slight edge over Python 3.11 here, but the difference is not meaningful: repeated runs of this experiment give different results. What is clear is that I/O has not become faster.
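
One way to judge whether such a gap is real or just noise is to repeat the measurement several times and look at the spread. A rough sketch, with a hypothetical summarize helper and a placeholder in-memory workload rather than the file benchmark itself:

```python
import statistics
import timeit
from typing import Callable, Tuple


def summarize(task: Callable[[], object], repeats: int = 5, number: int = 10) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the total times over `repeats` measurements."""
    # repeats must be at least 2, because statistics.stdev needs two data points
    totals = timeit.repeat(task, number=number, repeat=repeats)
    return statistics.mean(totals), statistics.stdev(totals)


if __name__ == "__main__":
    # Placeholder workload; swap in the read or write loop to compare interpreters
    mean, spread = summarize(lambda: sum(range(10_000)))
    print(f"mean {mean:.6f} s, stdev {spread:.6f} s")
```

If the difference between the two versions is smaller than the spread within a single version, it is probably noise.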

Summary

Python 3.11 is still a pre-release, but it looks like a landmark version in Python's history. It is up to 60% faster than the previous version, which is quite good. The experiments above also suggest that Python 3.11 is indeed faster.

Translator's note: I upgraded an older project to 3.6 only a few days ago, and our new projects are developed on 3.9. Now 3.11 is about to be released with a big performance improvement. What are you up to, Uncle Tortoise? 😂

https://avoid.overfit.cn/post/8592a93acd9441a8aacc0623bdd35e96

Author: Thuwarakesh Murallie

Tags: Python Machine Learning

Posted by netcoord99 on Wed, 18 May 2022 09:01:12 +0300