Introduction
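In the previous post we covered how to handle I/O-bound work efficiently. The problem that actually plagues Python development, though, is CPU-bound work, and the cause is the Global Interpreter Lock (GIL).
This limitation stops CPU-bound threads from using more than one core, which has pushed many developers to abandon Python for other languages.
But that is about to become a thing of the past!
Starting with Python 3.12, and especially in 3.13 and 3.14, Python has changed substantially so that multithreaded code can finally pay off: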
- Multiple interpreters running in parallel (PEP 684, PEP 734)
- Free-threaded mode (PEP 703)
- New executors like InterpreterPoolExecutor
- A smaller performance penalty than in earlier experimental builds
This post briefly covers why the GIL was a bottleneck, how things used to work, and what has changed.
What the GIL Really Does
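The Global Interpreter Lock ensures that only one thread executes Python bytecode at a time, even on a multicore machine.
This design has both upsides and downsides: it keeps the interpreter's internals (such as reference counting) simple and safe, but it stops threads from using more than one core for Python code.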
✅ When the GIL is NOT a Problem (I/O-bound)
If your code does:
- HTTP requests
- File I/O
- Database queries
- Remote APIs
Python threads still work well here, because the interpreter releases the GIL while a thread is blocked on I/O.
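import threading
import requests

def get_url(url):
    # Blocking network call; the GIL is released while waiting on the socket.
    requests.get(url, timeout=10)

# Five threads, each hitting an endpoint that takes about 3 seconds to respond.
threads = [
    threading.Thread(target=get_url, args=("https://httpbin.org/delay/3",))
    for _ in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()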
- ✅ All five requests overlap, so the total is roughly 3 seconds rather than 15
- ❌ Only because the threads spend their time waiting on I/O, not executing bytecode
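If you want to see the overlap without touching the network, here is a minimal sketch that swaps the HTTP call for time.sleep, which also releases the GIL while it waits. The names fake_io and timed_threads are made up for this illustration, and the 0.2 second delay is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Hypothetical stand-in for a blocking I/O call; time.sleep releases the GIL."""
    time.sleep(delay)
    return delay


def timed_threads(n_tasks=5, delay=0.2):
    """Run n_tasks blocking calls in threads; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(fake_io, [delay] * n_tasks))
    return results, time.perf_counter() - start


results, elapsed = timed_threads()
print(f"{len(results)} tasks took {elapsed:.2f}s; "
      f"one after another they would need about {sum(results):.2f}s")
```

The elapsed time should land close to a single delay rather than the sum of all of them, which is the same effect the requests example relies on.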
❌ When the GIL IS a Problem (CPU-bound)
If your code does:
- Image manipulation
- Data parsing
- Compression
- Numerical work
- Hashing / encryption
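CPU-bound work, by contrast, collides with the GIL head-on. Pure-Python bytecode has to hold the GIL to run, so the threads take turns and only one of them makes progress at any instant, no matter how many threads or cores you use.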
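import threading

def cpu_heavy(n):
    # Pure-Python loop: the thread must hold the GIL for every iteration.
    count = 0
    for i in range(n):
        count += i * i
    return count

# Four threads, each doing the same CPU-bound work.
threads = [
    threading.Thread(target=cpu_heavy, args=(50_000_000,))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()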
- ✅ This uses 4 threads
- ❌ But only one thread executes Python bytecode at any moment, so it behaves like ONE CPU core
- ❌ All threads take turns under the GIL
- ❌ No speedup over running the four calls one after another
The classic workarounds? multiprocessing, C extensions, or offloading the hot loop to NumPy or Rust.
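To check the "no speedup" claim on your own machine, a rough timing harness like the one below can help. It is only a sketch: run_sequential and run_threaded are names invented here, and the loop size is kept small so it finishes quickly (raise n to get more stable numbers). On a regular GIL build the two timings should come out similar; on a free-threaded build the threaded run is expected to be faster.

```python
import threading
import time


def cpu_heavy(n):
    """Pure-Python CPU-bound loop (same shape as the example above)."""
    count = 0
    for i in range(n):
        count += i * i
    return count


def run_sequential(n, workers):
    """Call cpu_heavy `workers` times in a row; return (results, seconds)."""
    start = time.perf_counter()
    results = [cpu_heavy(n) for _ in range(workers)]
    return results, time.perf_counter() - start


def run_threaded(n, workers):
    """Call cpu_heavy once per thread; return (results, seconds)."""
    results = [None] * workers

    def worker(slot):
        # Each thread writes only to its own slot, so no lock is needed.
        results[slot] = cpu_heavy(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


seq_results, seq_time = run_sequential(200_000, 4)
thr_results, thr_time = run_threaded(200_000, 4)
print(f"sequential: {seq_time:.3f}s, threaded: {thr_time:.3f}s")
```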
What Changed in Python 3.12–3.14?
Python 3.12 — Interpreters Become Isolated (PEP 684)
Traditionally, all threads shared:
- The same interpreter
- The same GIL
- Shared state
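Python 3.12 allows multiple interpreters to coexist in the same process, each with its own state and its own GIL.
In 3.12, though, this was only reachable through the C API, so ordinary Python code could not yet use it to run in parallel.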
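What "isolated" means is easier to see in code. The sketch below uses the concurrent.interpreters module, which is the Python-level API added in Python 3.14 (PEP 734), so on older versions it simply reports that the module is unavailable. The helper name subinterpreter_is_isolated and the demo_marker attribute are made up for this illustration.

```python
import sys

try:
    from concurrent import interpreters  # Python 3.14+ (PEP 734)
except ImportError:
    interpreters = None  # older Python: no Python-level API for subinterpreters


def subinterpreter_is_isolated():
    """Return True if state set in a subinterpreter stays out of the main one.

    Returns None when concurrent.interpreters is not available.
    """
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        # This runs in the subinterpreter, which has its own copy of sys.
        interp.exec("import sys; sys.demo_marker = 'set in subinterpreter'")
    finally:
        interp.close()
    # The main interpreter's sys module should not have picked up the attribute.
    return not hasattr(sys, "demo_marker")


isolated = subinterpreter_is_isolated()
print("isolated:", isolated)
```

Each interpreter gets its own modules and globals, which is why the attribute set inside the subinterpreter never shows up in the main one.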
Python 3.13 — Free-Threaded Mode Arrives (PEP 703)
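Python 3.13 introduced a no-GIL mode, also called free-threaded mode, as an experimental option.
- The legacy GIL is disabled
- Objects gain new thread-safety mechanisms (such as per-object locks and biased reference counting)
- Performance penalty for single-thread code: substantial in 3.13 (about 40% on the pyperformance suite, according to the free-threading HOWTO); the ~5–10% figure applies to 3.14
- Requires a separate free-threaded build (configured with --disable-gil, or installed as the optional free-threaded binary from the official installers)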
It’s not the default yet, but it's real and usable.
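Since free-threading depends on which build you are running, it helps to check at runtime. Here is a small sketch: gil_status is a name invented for this post, while sysconfig.get_config_var("Py_GIL_DISABLED") and sys._is_gil_enabled() are the checks available from Python 3.13. On older versions the function falls back to reporting that the GIL is on.

```python
import sys
import sysconfig


def gil_status():
    """Report whether this is a free-threaded build and whether the GIL is active."""
    # Py_GIL_DISABLED is 1 on free-threaded builds; 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists from Python 3.13; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


status = gil_status()
print(status)
```

The two values can differ: even on a free-threaded build the GIL can be switched back on, for example with the PYTHON_GIL=1 environment variable or -X gil=1, or when an extension module that has not declared free-threading support is imported.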
Python 3.14 — It Gets Practical
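Python 3.14 makes free-threading officially supported (PEP 779), although it is still an optional build:
- A finished implementation of free-threading (PEP 703)
- Better performance now that the specializing adaptive interpreter (PEP 659) is enabled in free-threaded mode, with a single-thread penalty of roughly 5–10%
- Improvements in C API compatibility
- A new executor, concurrent.futures.InterpreterPoolExecutor, built on the new concurrent.interpreters module (PEP 734)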
This means:
- ✅ CPU-bound threads can run in true parallel
- ✅ No hacky multiprocessing needed
- ✅ No shared GIL between interpreters
- ✅ Extension modules can gradually support it
InterpreterPoolExecutor Example (Python 3.14)
Here’s a real CPU-bound example that benefits from the new model:
from concurrent.futures import InterpreterPoolExecutor
import math

def cpu_task(n):
    # Pure-Python CPU-bound loop.
    total = 0
    for i in range(n):
        total += math.sqrt(i)
    return total

if __name__ == "__main__":
    # Each worker runs in its own interpreter, with its own GIL.
    with InterpreterPoolExecutor() as executor:
        futures = [
            executor.submit(cpu_task, 50_000_000)
            for _ in range(4)
        ]
        results = [f.result() for f in futures]
    print(results)
- ✅ Runs across multiple cores
- ✅ No shared GIL
- ✅ Easier than multiprocessing
- ✅ Clean API
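If the same script also has to run on Python older than 3.14, one option is to fall back to another executor when the import fails. The sketch below is an assumption about how you might structure that, not a recommended pattern: PoolExecutor and parallel_factorials are names made up here, and the ThreadPoolExecutor fallback keeps the code working but gives no CPU parallelism on a GIL build. It uses math.factorial as the task because arguments and results are passed between interpreters by pickling, and a standard-library function pickles by name.

```python
import math

try:
    # Python 3.14+: each worker thread runs tasks in its own interpreter.
    from concurrent.futures import InterpreterPoolExecutor as PoolExecutor
except ImportError:
    # Older Python: same API, but the threads share one GIL.
    from concurrent.futures import ThreadPoolExecutor as PoolExecutor


def parallel_factorials(numbers, max_workers=4):
    """Compute math.factorial for each number using the available executor."""
    with PoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(math.factorial, numbers))


values = parallel_factorials([5, 10, 15, 20])
print(values)
```

Because both executors share the concurrent.futures interface, the calling code stays the same whichever one was imported.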
So… Is the GIL Gone?
Not entirely. Here is where each version stands:
| Version | Status of GIL | Parallel CPU Threads? |
|---|---|---|
| ≤ 3.11 | Always on | ❌ No |
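| 3.12 | Still on, but one per interpreter (PEP 684) | ❌ Not from Python code (parallel interpreters via the C API only) |
| 3.13 | Optional (experimental free-threaded build) | ✅ Yes, on the free-threaded build |
| 3.14 | Optional (officially supported free-threaded build, better tooling) | ✅ Yes, on the free-threaded build or across interpreters |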
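Getting all the way there will still take time, especially for support across the many C libraries, but things are already moving forward.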
Key Takeaways
- Threads were always useful for I/O-bound, not CPU-bound tasks.
- The GIL prevented multicore execution for CPU work.
- Python 3.12 started isolating interpreters.
- Python 3.13 introduced free-threaded builds.
- Python 3.14 makes it practical with InterpreterPoolExecutor.
- Extension ecosystems (Cython, pybind11, PyO3, etc.) are adapting.