
gh-100239: Specialize concatenation of lists and tuples#128956

Draft
eendebakpt wants to merge 5 commits into python:main from eendebakpt:binary_op_list_list

Conversation

Contributor

@eendebakpt eendebakpt commented Jan 17, 2025

  • We add list-list and tuple-tuple concatenation to BINARY_OP_EXTEND.
  • We pass type information through BINARY_OP_EXTEND in tier 2. This allows the JIT to perform better optimizations.

Benchmark results:

| Benchmark | main (mean) | branch (mean) | Δ |
|---|---|---|---|
| list_list_concat | ~30.0 ns | ~30.5 ns | +0.5 ns (~+1.7%) |
| tuple_tuple_concat | ~29.6 ns | ~31.0 ns | +1.3 ns (~+4.5%) |
| float_mix_mul `(2 + x) * y` | 16.0 ns | 9.92 ns | −6.1 ns (−38%) |

The performance of `(2 + x) * y` improves significantly because the multiplication is now performed in place. List-list and tuple-tuple concatenation are slightly slower, but the operand types are now propagated.
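A minimal sketch of the expression shape that benefits (the function name `mix` is illustrative, not from the PR): an int constant added to a float produces a fresh float whose only consumer is the multiply, so with result-type propagation the tier-2 optimizer can, in principle, reuse that intermediate for the product instead of allocating a new float.

```python
def mix(x: float, y: float) -> float:
    # 2 + x promotes the int constant to float and yields a fresh float
    # object; the intermediate is immediately consumed by the multiply,
    # which is the situation an in-place float multiply can exploit.
    return (2 + x) * y

print(mix(3.5, 2.0))  # (2 + 3.5) * 2.0 == 11.0
```

Note that the observable result is identical either way; the optimization only avoids an allocation.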

Originally this PR covered only the list-list and tuple-tuple specializations, which is why they are still included. I can factor out the type-propagation changes (mainly for float-int arithmetic) into a separate PR if needed.

Benchmark script
"""pyperf benchmarks for BINARY_OP_EXTEND optimizations.

Covers:
  - list + list concatenation (specialized via BINARY_OP_EXTEND)
  - tuple + tuple concatenation (specialized via BINARY_OP_EXTEND)
  - (2 + x) * y with floats (tier 2 should use _BINARY_OP_MULTIPLY_FLOAT_INPLACE
    thanks to result_type/result_unique propagation through BINARY_OP_EXTEND)
"""
import pyperf


# --- list-list concat ---------------------------------------------------
def bench_list_list(loops):
    l1 = [1, 2, 3, 4]
    l2 = [5, 6, 7, 8]
    range_it = range(loops)
    t0 = pyperf.perf_counter()
    for _ in range_it:
        # 20 concatenations per loop iteration
        x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2
        x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2
        x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2
        x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2; x = l1 + l2
    return pyperf.perf_counter() - t0


# --- tuple-tuple concat -------------------------------------------------
def bench_tuple_tuple(loops):
    t1 = (1, 2, 3, 4)
    t2 = (5, 6, 7, 8)
    range_it = range(loops)
    t0 = pyperf.perf_counter()
    for _ in range_it:
        x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2
        x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2
        x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2
        x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2; x = t1 + t2
    return pyperf.perf_counter() - t0


# --- (2 + x) * y float inplace multiply --------------------------------
def bench_float_mix_mul(loops):
    x = 3.5
    y = 2.0
    range_it = range(loops)
    t0 = pyperf.perf_counter()
    for _ in range_it:
        r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y
        r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y
        r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y
        r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y
        r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y; r = (2 + x) * y
    return pyperf.perf_counter() - t0

if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.metadata["description"] = "BINARY_OP_EXTEND microbenchmarks"
    runner.bench_time_func("list_list_concat", bench_list_list, inner_loops=20)
    runner.bench_time_func("tuple_tuple_concat", bench_tuple_tuple, inner_loops=20)
    runner.bench_time_func("float_mix_mul", bench_float_mix_mul, inner_loops=20)

@eendebakpt eendebakpt force-pushed the binary_op_list_list branch from 03b3922 to 51d1b11 Compare April 5, 2026 20:00
# Conflicts:
#	Lib/test/test_capi/test_opt.py
#	Python/specialize.c
@eendebakpt eendebakpt marked this pull request as draft April 6, 2026 19:56