⚡️ Speed up function _ensure_maven_central_repo by 26% in PR #2015 (fix/gradle-maven-central-dependency) #2016

Closed
codeflash-ai[bot] wants to merge 1 commit into fix/gradle-maven-central-dependency from codeflash/optimize-pr2015-2026-04-07T11.20.29

@codeflash-ai codeflash-ai bot commented Apr 7, 2026

⚡️ This pull request contains optimizations for PR #2015

If you approve this dependent PR, these changes will be merged into the original PR branch fix/gradle-maven-central-dependency.

This PR will be automatically closed if the original PR is merged.


📄 26% (0.26x) speedup for _ensure_maven_central_repo in codeflash/languages/java/gradle_strategy.py

⏱️ Runtime: 370 microseconds → 294 microseconds (best of 250 runs)

📝 Explanation and details

The optimization pre-compiles the regex pattern `r"repositories\s*\{"` into a module-level constant (`_REPO_BLOCK_PATTERN`), eliminating the 49.4% overhead from re-compiling it on every function call. It also removes the redundant `is_kts` variable and merges the two identical append branches into a single line, cutting string operations and conditional checks. Line profiler confirms the regex search dropped from 525 µs to 87 µs per hit (~6× faster), yielding a 26% overall speedup with no correctness trade-offs.
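The source of `_ensure_maven_central_repo` is not shown in this PR, so the following is only a sketch of the change described above, reconstructed from the explanation and the generated tests. The function names `ensure_maven_central_before` and `ensure_maven_central_after` are hypothetical, and the exact strings inserted are assumptions chosen to satisfy the tests quoted below.

```python
import re
from pathlib import Path

# Module-level constant named in the PR description; compiled once at import time.
_REPO_BLOCK_PATTERN = re.compile(r"repositories\s*\{")

# Assumed text of the appended block (not confirmed by the PR).
_NEW_BLOCK = "\nrepositories {\n    mavenCentral()\n}\n"


def ensure_maven_central_before(build_file: Path, content: str) -> str:
    """Hypothetical shape of the code before the optimization."""
    if "mavenCentral()" in content:
        return content
    is_kts = build_file.name.endswith(".kts")  # computed, but both branches below agree
    match = re.search(r"repositories\s*\{", content)  # pattern resolved on every call
    if match:
        pos = match.end()
        return content[:pos] + "\n    mavenCentral()" + content[pos:]
    if is_kts:
        return content + _NEW_BLOCK
    return content + _NEW_BLOCK


def ensure_maven_central_after(build_file: Path, content: str) -> str:
    """Hypothetical shape after the optimization; build_file is kept for signature parity."""
    if "mavenCentral()" in content:
        return content
    match = _REPO_BLOCK_PATTERN.search(content)
    if match:
        pos = match.end()
        return content[:pos] + "\n    mavenCentral()" + content[pos:]
    return content + _NEW_BLOCK
```

Under these assumptions the two versions return identical strings for every input, which is the property the regression tests are meant to check.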

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 1048 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from pathlib import Path  # used to create Path objects for build files

# imports
from codeflash.languages.java.gradle_strategy import _ensure_maven_central_repo  # function under test


def test_returns_unchanged_when_maven_central_already_present():
    # Prepare a build file Path and content that already contains the exact substring "mavenCentral()"
    build_file = Path("build.gradle")
    content = "repositories {\n    mavenCentral()\n    jcenter()\n}\n"
    # Call the function and expect the same content returned (no modification)
    result = _ensure_maven_central_repo(build_file, content)  # 541ns -> 561ns (3.57% slower)
    # Assert equality: the content should be unchanged
    assert result == content  # identical textual content


def test_inserts_into_existing_repositories_block_with_space_before_brace():
    # When repositories has a space before the brace, insertion should occur immediately after the opening brace
    build_file = Path("build.gradle")
    original = "repositories {\n    jcenter()\n}\n"
    result = _ensure_maven_central_repo(build_file, original)  # 4.69μs -> 2.63μs (77.9% faster)
    # The new content must contain mavenCentral()
    assert "mavenCentral()" in result
    # mavenCentral() should appear before the existing jcenter() entry
    assert result.index("mavenCentral()") < result.index("jcenter()")
    # The rest of the original content (like closing brace) should still be present
    assert result.rstrip().endswith("}")  # still ends with a closing brace


def test_inserts_into_existing_repositories_block_without_space_before_brace():
    # When repositories is written without space before the brace (repositories{),
    # the function should still find it and insert mavenCentral() after the brace
    build_file = Path("build.gradle")
    original = "someHeader()\nrepositories{\n    google()\n}\n"
    result = _ensure_maven_central_repo(build_file, original)  # 4.53μs -> 2.56μs (76.6% faster)
    # Ensure the insertion happened exactly after the first repositories opening brace
    assert "mavenCentral()" in result
    # Ensure google() still exists and comes after the inserted mavenCentral()
    assert result.index("mavenCentral()") < result.index("google()")


def test_appends_repositories_block_when_missing_for_gradle_and_kts():
    # Test both .gradle and .kts file types: both branches append a repositories block when none exists
    for filename in ("build.gradle", "build.gradle.kts"):
        build_file = Path(filename)
        original = "plugins {\n    id 'com.example'\n}\n"  # no repositories block present
        result = _ensure_maven_central_repo(build_file, original)  # 4.68μs -> 1.80μs (160% faster)
        # The result should contain exactly one mavenCentral() (the appended block)
        assert result.count("mavenCentral()") == 1
        # The appended block should be at the end of the file (ends with newline as implementation does)
        assert result.endswith("\n")  # implementation appends trailing newline
        # The appended repositories block should appear after the original content
        assert original in result
        assert result.index("repositories {") >= result.index(original)


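def test_inserts_exactly_once_when_maven_central_absent():
    # The input contains neither mavenCentral() nor a comment, so exactly one call is inserted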
    build_file = Path("build.gradle")
    original = "repositories {\n    google()\n}\n"
    result = _ensure_maven_central_repo(build_file, original)  # 4.42μs -> 2.31μs (91.0% faster)
    assert "mavenCentral()" in result
    assert "google()" in result
    assert result.count("mavenCentral()") == 1


def test_similar_names_do_not_count_as_maven_central_call():
    build_file = Path("build.gradle")
    original = "repositories {\n    mavenCentralized()\n    google()\n}\n"
    result = _ensure_maven_central_repo(build_file, original)  # 4.48μs -> 2.50μs (79.5% faster)
    assert "mavenCentral()" in result
    assert result.count("mavenCentral()") == 1
    mavencentral_idx = result.index("mavenCentral()")
    mavenCentralized_idx = result.index("mavenCentralized()")
    assert mavencentral_idx < mavenCentralized_idx


def test_handles_empty_content_by_appending_repositories_block():
    # Empty content should get a repositories block appended
    build_file = Path("build.gradle")
    original = ""
    result = _ensure_maven_central_repo(build_file, original)  # 2.82μs -> 962ns (193% faster)
    # Ensure the appended repositories block exists
    assert "repositories {" in result
    assert "mavenCentral()" in result
    # Ensure the result is not empty anymore
    assert len(result) > 0


def test_large_scale_many_lines_appends_single_maven_central_quickly():
    # Create a large file-like content with 1000 lines and no repositories block
    lines = [f"line_{i}" for i in range(1000)]
    original = "\n".join(lines) + "\n"
    build_file = Path("build.gradle")
    result = _ensure_maven_central_repo(build_file, original)  # 11.0μs -> 8.99μs (22.9% faster)
    # Ensure exactly one mavenCentral() was appended
    assert result.count("mavenCentral()") == 1
    # Ensure the appended block occurs at the very end (after the large content)
    assert result.endswith("\n")  # implementation appends trailing newline
    assert result.index("mavenCentral()") > result.index("line_999")  # appended after existing content


def test_many_repositories_occurrences_only_first_is_modified():
    build_file = Path("build.gradle")
    original = "repositories {\n    google()\n}\nrepositories {\n    jcenter()\n}\n"
    result = _ensure_maven_central_repo(build_file, original)  # 4.53μs -> 2.50μs (81.5% faster)
    assert result.count("mavenCentral()") == 1
    first_repo_pos = result.index("repositories {")
    mavencentral_pos = result.index("mavenCentral()")
    second_repo_pos = result.index("repositories {", first_repo_pos + 1)
    assert first_repo_pos < mavencentral_pos < second_repo_pos


def test_repeated_invocations_are_idempotent():
    # Calling the function repeatedly should be idempotent: after first insertion, subsequent calls shouldn't add more
    build_file = Path("build.gradle")
    original = "repositories {\n    google()\n}\n"
    # First call inserts mavenCentral()
    first = _ensure_maven_central_repo(build_file, original)  # 4.34μs -> 2.41μs (79.7% faster)
    assert first.count("mavenCentral()") == 1
    # Second call should detect the exact string and not change anything
    second = _ensure_maven_central_repo(build_file, first)  # 401ns -> 380ns (5.53% faster)
    assert second == first  # no further modifications
    # Do a bunch of repeated calls (1000 iterations) to validate stability at scale
    current = first
    for _ in range(1000):
        current = _ensure_maven_central_repo(build_file, current)  # 180μs -> 178μs (0.931% faster)
    # After many repetitions, still only one mavenCentral()
    assert current.count("mavenCentral()") == 1
    assert current == first


# Second generated test module
from pathlib import Path

# imports
from codeflash.languages.java.gradle_strategy import _ensure_maven_central_repo


def test_already_has_maven_central():
    """Test that content with mavenCentral() already present is returned unchanged."""
    build_file = Path("build.gradle")
    content = """
    repositories {
        mavenCentral()
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 581ns -> 631ns (7.92% slower)
    assert result == content


def test_add_maven_central_to_existing_repositories_block_gradle():
    """Test adding mavenCentral() to an existing repositories block in build.gradle."""
    build_file = Path("build.gradle")
    content = """
    repositories {
        google()
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 4.86μs -> 2.75μs (76.3% faster)
    assert "mavenCentral()" in result
    assert "repositories {" in result
    assert "google()" in result


def test_add_maven_central_to_existing_repositories_block_kts():
    """Test adding mavenCentral() to an existing repositories block in build.gradle.kts."""
    build_file = Path("build.gradle.kts")
    content = """
    repositories {
        google()
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 4.61μs -> 2.56μs (80.4% faster)
    assert "mavenCentral()" in result
    assert "repositories {" in result
    assert "google()" in result


def test_create_new_repositories_block_gradle():
    """Test creating a new repositories block when none exists in build.gradle."""
    build_file = Path("build.gradle")
    content = """
    plugins {
        id 'java'
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 3.27μs -> 1.33μs (145% faster)
    assert "repositories {" in result
    assert "mavenCentral()" in result
    assert result.endswith("}\n")


def test_create_new_repositories_block_kts():
    """Test creating a new repositories block when none exists in build.gradle.kts."""
    build_file = Path("build.gradle.kts")
    content = """
    plugins {
        id("java")
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 3.30μs -> 1.24μs (165% faster)
    assert "repositories {" in result
    assert "mavenCentral()" in result
    assert result.endswith("}\n")


def test_maven_central_added_with_proper_indentation():
    """Test that mavenCentral() is added with correct indentation (4 spaces)."""
    build_file = Path("build.gradle")
    content = "repositories {\n}"
    result = _ensure_maven_central_repo(build_file, content)  # 4.48μs -> 2.54μs (76.0% faster)
    assert "\n    mavenCentral()" in result


def test_repositories_block_with_multiple_repos():
    """Test adding mavenCentral() when repositories block has multiple existing repos."""
    build_file = Path("build.gradle")
    content = """
    repositories {
        google()
        jcenter()
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 4.32μs -> 2.40μs (80.3% faster)
    assert "mavenCentral()" in result
    assert "google()" in result
    assert "jcenter()" in result


def test_empty_content_gradle():
    """Test with empty content string in build.gradle."""
    build_file = Path("build.gradle")
    content = ""
    result = _ensure_maven_central_repo(build_file, content)  # 2.79μs -> 1.15μs (143% faster)
    assert "repositories {" in result
    assert "mavenCentral()" in result


def test_empty_content_kts():
    """Test with empty content string in build.gradle.kts."""
    build_file = Path("build.gradle.kts")
    content = ""
    result = _ensure_maven_central_repo(build_file, content)  # 2.96μs -> 1.10μs (169% faster)
    assert "repositories {" in result
    assert "mavenCentral()" in result


def test_maven_central_case_sensitive():
    """Test that mavenCentral() check is case-sensitive (different case should not match)."""
    build_file = Path("build.gradle")
    content = "repositories {\n    MavenCentral()\n}"
    result = _ensure_maven_central_repo(build_file, content)  # 4.56μs -> 2.60μs (75.7% faster)
    # MavenCentral() with capital M should not be recognized as mavenCentral()
    assert result.count("mavenCentral()") == 1


def test_maven_central_substring_not_matching():
    """Test that partial matches of mavenCentral() are not treated as present."""
    build_file = Path("build.gradle")
    content = "repositories {\n    mavenCentral\n}"
    result = _ensure_maven_central_repo(build_file, content)  # 4.38μs -> 2.42μs (80.5% faster)
    # mavenCentral without parentheses should not be recognized
    assert result.count("mavenCentral()") == 1


def test_repositories_with_whitespace_variations():
    """Test repositories block detection with various whitespace patterns."""
    build_file = Path("build.gradle")
    content = "repositories  {  google()  }"
    result = _ensure_maven_central_repo(build_file, content)  # 4.43μs -> 2.43μs (81.9% faster)
    assert "mavenCentral()" in result


def test_repositories_with_newline_after_brace():
    """Test repositories block with newline immediately after opening brace."""
    build_file = Path("build.gradle")
    content = "repositories {\ngoogle()\n}"
    result = _ensure_maven_central_repo(build_file, content)  # 4.16μs -> 2.45μs (69.4% faster)
    assert "mavenCentral()" in result


def test_nested_repositories_block():
    """Test a repositories block nested inside subprojects {}; the regex matches it regardless of nesting depth."""
    build_file = Path("build.gradle")
    content = """
    subprojects {
        repositories {
            google()
        }
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 4.34μs -> 2.56μs (69.8% faster)
    assert "mavenCentral()" in result


def test_multiple_repositories_blocks_only_first_modified():
    """Test that only the first repositories block is modified."""
    build_file = Path("build.gradle")
    content = """
    repositories {
        google()
    }
    other {
        repositories {
            jcenter()
        }
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 4.42μs -> 2.54μs (74.3% faster)
    # mavenCentral should appear exactly once, added to the first repositories block
    assert result.count("mavenCentral()") == 1


def test_repositories_keyword_in_comment():
    """Test a file whose only `repositories { }` text sits inside a block comment."""
    build_file = Path("build.gradle")
    content = """
    /* this is a block comment about repositories { } */
    plugins {
        id 'java'
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 4.45μs -> 2.54μs (74.9% faster)
    # The regex is not comment-aware, so it presumably matches `repositories {` inside the
    # block comment and inserts mavenCentral() there rather than appending a new block.
    # Both assertions below pass either way, so this test does not distinguish the two paths.
    assert "repositories {" in result
    assert "mavenCentral()" in result


def test_file_name_with_different_extensions():
    """Test that both .gradle and .kts files are handled the same way."""
    content = "plugins { id 'java' }"

    result_gradle = _ensure_maven_central_repo(Path("build.gradle"), content)  # 3.23μs -> 1.26μs (155% faster)
    result_kts = _ensure_maven_central_repo(Path("build.gradle.kts"), content)  # 1.67μs -> 682ns (145% faster)

    # Both should produce the same result (same structure appended)
    assert "repositories {" in result_gradle
    assert "repositories {" in result_kts
    assert "mavenCentral()" in result_gradle
    assert "mavenCentral()" in result_kts


def test_very_long_file_content():
    """Test with very long file content to ensure performance."""
    build_file = Path("build.gradle")
    # Create content with many lines
    content = "\n".join(["// comment line"] * 500) + "\nrepositories {\n    google()\n}"
    result = _ensure_maven_central_repo(build_file, content)  # 10.6μs -> 8.32μs (27.2% faster)
    assert "mavenCentral()" in result
    assert result.count("repositories {") == 1


def test_repositories_block_at_end_of_file():
    """Test when repositories block is at the end of the file."""
    build_file = Path("build.gradle")
    content = "plugins { id 'java' }\nrepositories {\n    google()\n}"
    result = _ensure_maven_central_repo(build_file, content)  # 4.39μs -> 2.56μs (71.7% faster)
    assert "mavenCentral()" in result


def test_maven_central_appears_immediately_after_brace():
    """Test when mavenCentral() appears right after repositories { on same logical position."""
    build_file = Path("build.gradle")
    content = "repositories { mavenCentral() }"
    result = _ensure_maven_central_repo(build_file, content)  # 521ns -> 501ns (3.99% faster)
    # Should return unchanged since mavenCentral() is already present
    assert result == content


def test_append_block_preserves_original_content():
    """Test that appending a new repositories block doesn't corrupt original content."""
    build_file = Path("build.gradle")
    content = "plugins { id 'java' }\ndependencies { }"
    result = _ensure_maven_central_repo(build_file, content)  # 3.11μs -> 1.19μs (161% faster)
    assert content in result or (content.replace("\n", "") in result.replace("\n", ""))
    assert "repositories {" in result
    assert "mavenCentral()" in result


def test_very_large_repositories_block():
    """Test with a repositories block containing many repository declarations."""
    build_file = Path("build.gradle")
    repo_lines = "\n    ".join([f"maven {{ url '{i}' }}" for i in range(100)])
    content = f"repositories {{\n    {repo_lines}\n}}"
    result = _ensure_maven_central_repo(build_file, content)  # 5.39μs -> 3.21μs (67.7% faster)
    assert "mavenCentral()" in result
    assert result.count("maven {") == 100


def test_large_file_with_many_braces():
    """Test large file with many brace pairs to ensure regex doesn't get confused."""
    build_file = Path("build.gradle")
    content_parts = ["plugins { id 'java' }", "dependencies { }", "tasks { }", "configurations { }"] * 50 + [
        "repositories { google() }"
    ]
    content = "\n".join(content_parts)
    result = _ensure_maven_central_repo(build_file, content)  # 7.15μs -> 5.22μs (37.0% faster)
    assert "mavenCentral()" in result
    # Only one repositories block should have mavenCentral added
    assert result.count("mavenCentral()") == 1


def test_performance_with_very_long_single_line():
    """Test performance with very long content on a single line."""
    build_file = Path("build.gradle")
    content = "plugins { id 'java' } " + "x " * 1000 + " repositories { google() }"
    result = _ensure_maven_central_repo(build_file, content)  # 5.95μs -> 4.09μs (45.6% faster)
    assert "mavenCentral()" in result


def test_multiple_invocations_same_content():
    """Test that calling the function multiple times on result is idempotent."""
    build_file = Path("build.gradle")
    content = "repositories { google() }"

    result1 = _ensure_maven_central_repo(build_file, content)  # 4.45μs -> 2.44μs (82.0% faster)
    result2 = _ensure_maven_central_repo(build_file, result1)  # 401ns -> 351ns (14.2% faster)
    result3 = _ensure_maven_central_repo(build_file, result2)  # 251ns -> 240ns (4.58% faster)

    # All subsequent calls should return the same result
    assert result1 == result2
    assert result2 == result3
    assert result1.count("mavenCentral()") == 1


def test_file_with_many_line_comments():
    """Test large file with many line comments containing keywords."""
    build_file = Path("build.gradle")
    comments = "\n".join([f"// Comment {i} about repositories and mavenCentral" for i in range(200)])
    content = comments + "\nrepositories { google() }"
    result = _ensure_maven_central_repo(build_file, content)  # 18.5μs -> 16.0μs (16.1% faster)
    assert "mavenCentral()" in result
    assert result.count("mavenCentral()") == 1


def test_content_with_special_characters():
    """Test content with special regex characters doesn't break the regex matching."""
    build_file = Path("build.gradle")
    content = """
    repositories {
        maven { url 'https://example.com/[.*]+' }
        maven { url 'https://example.com/.*' }
    }
    """
    result = _ensure_maven_central_repo(build_file, content)  # 4.50μs -> 2.52μs (78.1% faster)
    assert "mavenCentral()" in result


def test_path_object_with_different_names():
    """Test various build file path names to ensure .kts detection works."""
    content = "repositories { google() }"

    paths = [
        Path("build.gradle"),
        Path("subfolder/build.gradle"),
        Path("build.gradle.kts"),
        Path("subfolder/build.gradle.kts"),
        Path("project/build.gradle"),
        Path("project/build.gradle.kts"),
    ]

    for path in paths:
        result = _ensure_maven_central_repo(path, content)  # 11.5μs -> 6.35μs (81.6% faster)
        assert "mavenCentral()" in result

To edit these changes, run `git checkout codeflash/optimize-pr2015-2026-04-07T11.20.29` and push.

codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels on Apr 7, 2026
claude bot commented Apr 7, 2026

Claude finished @codeflash-ai[bot]'s task in 1m 2s.


PR Review Summary

Prek Checks

All checks passed (ruff check, ruff format). No issues.

Code Review

This is a SMALL optimization PR from codeflash-ai[bot]. The changes are correct:

  1. Pre-compiled regex (_REPO_BLOCK_PATTERN) — standard Python optimization; eliminates re-compilation on every call. ✅
  2. Removed is_kts variable — it was genuinely dead code; both branches of the if/else were identical. ✅
  3. Merged duplicate if/else into single return — no behavioral change, cleaner code. ✅

The 26% speedup claim is credible — pre-compiling a regex used in a hot path is a well-known Python optimization.
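The claim can be checked locally with a small micro-benchmark. This is a sketch, not the profiling setup Codeflash used: the sample content and the helper names (`search_inline`, `search_precompiled`, `compare`) are assumptions. Note that the `re` module caches compiled patterns internally, so after the first call `re.search` with a string mostly pays for the cache lookup rather than a full recompile; a pre-compiled pattern skips that lookup as well.

```python
import re
import timeit

PATTERN_TEXT = r"repositories\s*\{"
COMPILED = re.compile(PATTERN_TEXT)

# Illustrative build file content (assumption, not taken from the project).
CONTENT = "plugins {\n    id 'java'\n}\nrepositories {\n    google()\n}\n"


def search_inline():
    # Passes the pattern string on every call; re resolves it via its internal cache.
    return re.search(PATTERN_TEXT, CONTENT)


def search_precompiled():
    # Uses the module-level compiled pattern directly.
    return COMPILED.search(CONTENT)


def compare(number=100_000):
    """Return total seconds for (inline, precompiled) over `number` calls each."""
    inline = timeit.timeit(search_inline, number=number)
    precompiled = timeit.timeit(search_precompiled, number=number)
    return inline, precompiled


if __name__ == "__main__":
    inline, precompiled = compare()
    print(f"inline: {inline:.4f}s  precompiled: {precompiled:.4f}s")
```

Absolute numbers depend on the interpreter version and machine, so the ratio between the two timings is the more useful figure.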

No bugs, security issues, or breaking changes detected.

Duplicate Detection

No duplicates detected. The _REPO_BLOCK_PATTERN constant is new and local to gradle_strategy.py.

Test Coverage

1048 generated regression tests pass with 100% coverage. No unit tests previously existed for this function; coverage is adequate given the generated test suite.


Last updated: 2026-04-07T11:21Z

claude bot commented Apr 7, 2026

CI failures are pre-existing on the base branch (not caused by this PR): futurehouse-structure, unit-tests (windows-latest, 3.13). Leaving open for merge once base branch CI is fixed.

claude bot commented Apr 7, 2026

Closing: this optimization PR targets the old regex-based _ensure_maven_central_repo implementation. The current branch replaced it with a tree-sitter based approach that correctly scopes to top-level repositories {} blocks (avoiding false matches inside buildscript {}). Merging #2016 as-is would regress that fix. Additionally, the PR has merge conflicts.
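The false match mentioned above is easy to reproduce. The sketch below shows the regex landing inside `buildscript {}` and contrasts it with a simple brace-depth scan. The scan is a simplified stand-in used only for illustration, not the tree-sitter implementation on the branch (which is not shown here), and it ignores braces inside strings and comments. The names `first_regex_match_line`, `top_level_repositories_offsets`, and `line_of` are hypothetical.

```python
import re

_REPO_BLOCK_PATTERN = re.compile(r"repositories\s*\{")

BUILD = """buildscript {
    repositories {
        google()
    }
}

repositories {
    google()
}
"""


def line_of(content, offset):
    """1-based line number of a character offset."""
    return content.count("\n", 0, offset) + 1


def first_regex_match_line(content):
    """Line where the plain regex first matches, or None."""
    m = _REPO_BLOCK_PATTERN.search(content)
    return line_of(content, m.start()) if m else None


def top_level_repositories_offsets(content):
    """Offsets of `repositories {` openings at brace depth 0 (naive scan)."""
    depth = 0
    offsets = []
    for m in re.finditer(r"repositories\s*\{|[{}]", content):
        token = m.group()
        if token == "}":
            depth -= 1
        else:
            # Either a bare "{" or a "repositories {" opening; both open a block.
            if token != "{" and depth == 0:
                offsets.append(m.start())
            depth += 1
    return offsets


if __name__ == "__main__":
    print(first_regex_match_line(BUILD))  # line of the block nested in buildscript {}
    print([line_of(BUILD, o) for o in top_level_repositories_offsets(BUILD)])
```

In this sample the regex picks the nested block on line 2, while the depth-aware scan reports only the top-level block on line 7, which is where `mavenCentral()` belongs.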

@claude claude bot closed this Apr 7, 2026
@claude claude bot deleted the codeflash/optimize-pr2015-2026-04-07T11.20.29 branch April 7, 2026 15:45