Introduction

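The previous article covered how to handle I/O-bound work efficiently, but the real pain point in Python development is CPU-bound work, and the cause is the Global Interpreter Lock (GIL). This limitation makes CPU-bound work slow and has pushed many developers to abandon Python and build in other languages instead.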

But that is about to become a thing of the past!

Starting with Python 3.12, and especially in 3.13 and 3.14, Python has made substantial changes that make multithreaded development effective:

  • Multiple interpreters running in parallel (PEP 684, PEP 734)
  • Free-threaded mode (PEP 703)
  • New executors like InterpreterPoolExecutor
  • Less performance penalty compared to past experimental builds

This article briefly discusses why the GIL was a bottleneck, what the situation used to be, and what has changed.

What the GIL Really Does

The Global Interpreter Lock ensures that only one thread executes Python bytecode at a time — even on a multicore machine.

This design has both upsides and downsides.

✅ When the GIL is NOT a Problem (I/O-bound)

If your code does:

  • HTTP requests
  • File I/O
  • Database queries
  • Remote APIs

Python threads still work well here, because the interpreter releases the GIL while a thread is blocked on I/O.

import threading
import requests

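def get_url(url):
    # The GIL is released while this call waits on the network.
    # A timeout keeps a stalled server from hanging the thread forever.
    requests.get(url, timeout=10)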

threads = [
    threading.Thread(target=get_url, args=("https://httpbin.org/delay/3",))
    for _ in range(5)
]

for t in threads:
    t.start()

for t in threads:
    t.join()
  • ✅ All five requests overlap, so the batch takes roughly as long as one request rather than five
  • ❌ Only because the threads spend their time waiting on I/O, not computing
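The overlap can be observed without touching the network. The sketch below is a minimal illustration, assuming that time.sleep is an acceptable stand-in for any blocking I/O call that releases the GIL; fake_io and timed_run are hypothetical helper names, not part of the original example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL."""
    time.sleep(delay)
    return delay


def timed_run(delays):
    """Run every fake I/O call in its own thread; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run([0.2] * 5)
    # Run one after another, these calls would need about 1.0 s in total.
    print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

Because every thread is asleep at the same time, the elapsed time should be close to the longest single delay rather than the sum of all delays.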

❌ When the GIL IS a Problem (CPU-bound)

If your code does:

  • Image manipulation
  • Data parsing
  • Compression
  • Numerical work
  • Hashing / encryption

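CPU-bound work, however, collides head-on with the GIL. Pure-Python computation never blocks on I/O, so the GIL is only passed from thread to thread at periodic intervals, and only one thread makes progress at a time, no matter how many threads or cores you have.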

import threading

def cpu_heavy(n):
    count = 0
    for i in range(n):
        count += i * i
    return count

threads = [
    threading.Thread(target=cpu_heavy, args=(50_000_000,))
    for _ in range(4)
]

for t in threads:
    t.start()

for t in threads:
    t.join()
  • ✅ This uses 4 threads
  • ❌ But only one thread executes Python bytecode at any moment, so the work amounts to ONE CPU core
  • ❌ All threads take turns under the GIL
  • ❌ No performance gain
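One way to check the "no performance gain" claim on your own machine is to time the same workload sequentially and in threads. This is a rough sketch rather than a rigorous benchmark; run_sequential and run_threaded are hypothetical helpers, and the workload size is an arbitrary assumption chosen to finish quickly.

```python
import threading
import time


def cpu_heavy(n):
    """Pure-Python loop that never releases the GIL voluntarily."""
    count = 0
    for i in range(n):
        count += i * i
    return count


def run_sequential(n, repeats):
    """Run the workload `repeats` times in the calling thread."""
    start = time.perf_counter()
    results = [cpu_heavy(n) for _ in range(repeats)]
    return results, time.perf_counter() - start


def run_threaded(n, repeats):
    """Run the workload once in each of `repeats` threads."""
    results = [None] * repeats

    def worker(slot):
        results[slot] = cpu_heavy(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    n, repeats = 2_000_000, 4
    _, sequential_time = run_sequential(n, repeats)
    _, threaded_time = run_threaded(n, repeats)
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

On a standard GIL-enabled build the two timings are expected to be similar; on a free-threaded build the threaded run may be noticeably faster.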

The classic workaround? multiprocessing, C extensions, or offloading to NumPy or Rust.
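As an illustration of the C-extension route, the standard library's hashlib is documented to release the GIL while hashing data larger than 2047 bytes, so plain threads can overlap that work even on a GIL-enabled build. The sketch below assumes large in-memory buffers; sha256_hex and hash_all are hypothetical helper names.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_hex(data):
    """Hash one buffer; the C implementation drops the GIL for large inputs."""
    return hashlib.sha256(data).hexdigest()


def hash_all(buffers, workers=4):
    """Hash every buffer in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sha256_hex, buffers))


if __name__ == "__main__":
    # Four 8 MB buffers, each filled with a different byte value.
    buffers = [bytes([i]) * 8_000_000 for i in range(4)]
    for digest in hash_all(buffers):
        print(digest[:16])
```

The same pattern applies to other libraries that release the GIL inside native code, but whether a given call does so depends on that library.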

What Changed in Python 3.12–3.14?

Python 3.12 — Interpreters Become Isolated (PEP 684)

Traditionally, all threads shared:

  • The same interpreter
  • The same GIL
  • Shared state

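Python 3.12 allows multiple interpreters to coexist in the same process, each with its own state and its own GIL.

In 3.12, however, this was exposed only through the C API, so ordinary Python code could not yet use it to run in true parallel.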

Python 3.13 — Free-Threaded Mode Arrives (PEP 703)

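Python 3.13 introduced an experimental no-GIL mode, also known as free-threaded mode:

  • The legacy GIL is disabled
  • Objects use new thread-safety mechanisms (such as biased reference counting and per-object locks)
  • Performance penalty for single-thread code: substantial in 3.13 (the documentation cites roughly 40% on the pyperformance suite), reduced to roughly 5–10% in 3.14
  • Requires a separate free-threaded build (configured with --disable-gil, typically installed as python3.13t)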

It’s not the default yet, but it's real and usable.
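To find out which kind of build a script is running on, the following sketch checks both the build configuration and the runtime state. free_threading_status is a hypothetical helper name; note that sys._is_gil_enabled() only exists on Python 3.13 and later, so the code assumes the GIL is on when the function is missing.

```python
import sys
import sysconfig


def free_threading_status():
    """Return (build_supports_free_threading, gil_currently_enabled)."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    build_flag = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Older interpreters lack the probe; they always run with the GIL.
    probe = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(probe()) if probe is not None else True
    return build_flag, gil_enabled


if __name__ == "__main__":
    build_flag, gil_enabled = free_threading_status()
    print(f"free-threaded build: {build_flag}, GIL enabled now: {gil_enabled}")
```

Both values matter because a free-threaded build can still turn the GIL back on at runtime, for example when it imports an extension module that has not declared free-threading support.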

Python 3.14 — It Gets Practical

Python 3.14 makes free-threading officially supported (PEP 779), although it remains an optional build. It brings:

  • A finished implementation of free-threading
  • Better performance with adaptive interpreter (PEP 659)
  • Improvements in C API compatibility
  • concurrent.futures.InterpreterPoolExecutor

This means:

  • ✅ CPU-bound threads can run in true parallel
  • ✅ No hacky multiprocessing needed
  • ✅ No shared GIL between interpreters
  • ✅ Extension modules can gradually support it

InterpreterPoolExecutor Example (Python 3.14)

Here’s a real CPU-bound example that benefits from the new model:

from concurrent.futures import InterpreterPoolExecutor
import math

def cpu_task(n):
    total = 0
    for i in range(n):
        total += math.sqrt(i)
    return total

if __name__ == "__main__":
    with InterpreterPoolExecutor() as executor:
        futures = [
            executor.submit(cpu_task, 50_000_000)
            for _ in range(4)
        ]
        results = [f.result() for f in futures]
    print(results)
  • ✅ Runs across multiple cores
  • ✅ No shared GIL
  • ✅ Easier than multiprocessing
  • ✅ Clean API

So… Is the GIL Gone?

Not entirely. Whether the GIL still applies depends on the version and the build:

Version   Status of GIL                                        Parallel CPU threads?
≤ 3.11    Always on                                            ❌ No
3.12      Still on (per-interpreter GIL via the C API only)    ❌ No (not from Python code)
3.13      Optional (experimental build without GIL)            ✅ Yes
3.14      Optional (more stable, better tooling)               ✅ Yes

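Getting all the way there will still take time, especially for support across the many C libraries, but things are already moving forward.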

Key Takeaways

  • Threads were always useful for I/O-bound, not CPU-bound tasks.

  • The GIL prevented multicore execution for CPU work.

  • Python 3.12 started isolating interpreters.

  • Python 3.13 introduced free-threaded builds.

  • Python 3.14 makes it practical with InterpreterPoolExecutor.

  • Extension ecosystems (Cython, pybind11, PyO3, etc.) are adapting.