perf: Lazy imports for scipy/matplotlib to reduce import time #426

@Jammy2211

Overview

Import time is now the single largest cost in smoke tests: 2.6s (49%) of a 5.3s modeling script. matplotlib and scipy are the main culprits, loaded eagerly via plot submodules and utility modules across the stack. Deferring these to first use could cut import time by 40-60%, bringing modeling scripts to roughly 4s and simulator scripts to roughly 2.5s.

Plan

  • Defer matplotlib imports by lazy-loading plot submodules in autoarray, autogalaxy, autolens, and autofit
  • Defer scipy imports in autoarray's convolver, delaunay interpolator, and utility modules
  • Defer scipy imports in autofit's interpolator and BFGS search modules
  • Profile before/after to validate improvement

Detailed implementation plan

Affected Repositories

  • PyAutoLens (primary — issue tracker)
  • PyAutoArray
  • PyAutoGalaxy
  • PyAutoFit

Work Classification

Library

Branch Survey

| Repository   | Current Branch | Dirty?      |
|--------------|----------------|-------------|
| PyAutoFit    | main_build     | clean       |
| PyAutoArray  | main           | 4 untracked |
| PyAutoGalaxy | main           | 2 untracked |
| PyAutoLens   | main           | clean       |

Suggested branch: feature/lazy-imports

Implementation Steps

  1. PyAutoArray — Defer plot submodule

    • In autoarray/__init__.py, convert from . import plot to a lazy import pattern
    • Move matplotlib backend setup in autoarray/plot/__init__.py from module-level to first-use
  2. PyAutoArray — Defer scipy imports

    • autoarray/operators/convolver.py: Move import scipy to inside methods that use it
    • autoarray/inversion/mesh/interpolator/delaunay.py: Move scipy.spatial imports to inside class methods
    • autoarray/mask/mask_2d_util.py: Move scipy.ndimage import to inside functions
    • autoarray/util/cholesky_funcs.py: Move scipy.linalg to inside functions
    • autoarray/util/fnnls.py: Move scipy.linalg to inside functions
  3. PyAutoFit — Defer scipy and matplotlib

    • Lazy-load the interpolator submodule (uses scipy)
    • Defer scipy.optimize in BFGS/LBFGS search modules
    • Lazy-load non_linear.plot submodule
  4. PyAutoGalaxy — Defer plot submodule

    • In autogalaxy/__init__.py, convert from . import plot to lazy import
  5. PyAutoLens — Defer plot submodule

    • In autolens/__init__.py, convert from . import plot to lazy import
  6. Validation

    • Profile with python -X importtime -c "import autolens" before and after the change
    • Run existing test suites to confirm no regressions
    • Run smoke tests to measure end-to-end improvement
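
The plot-submodule deferral in steps 1, 4 and 5 could be sketched with a module-level __getattr__ (PEP 562, Python 3.7+). This is a minimal sketch under assumptions, not the packages' actual code: make_lazy_getattr is a hypothetical helper, and the usage shown in the trailing comment assumes each package currently exposes plot through from . import plot.

```python
import importlib


def make_lazy_getattr(package_name, lazy_submodules):
    """Build a PEP 562 module-level __getattr__ that imports submodules on first access.

    Hypothetical helper for illustration; not part of any PyAuto package.
    """
    lazy = frozenset(lazy_submodules)

    def __getattr__(name):
        if name in lazy:
            # import_module also binds the submodule as an attribute of the
            # parent package, so later lookups bypass __getattr__ entirely.
            return importlib.import_module(f"{package_name}.{name}")
        raise AttributeError(f"module {package_name!r} has no attribute {name!r}")

    return __getattr__


# Assumed usage inside a package __init__.py, replacing "from . import plot":
#
#     __getattr__ = make_lazy_getattr(__name__, {"plot"})
```

With this pattern, attribute access (package.plot), from package import plot, and import package.plot as an alias should all keep working, while a bare import of the package no longer pulls in matplotlib. Any code that relies on import side effects of the plot submodule (such as backend setup) would need those side effects moved to first use, as step 1 already proposes.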

Key Files

  • autoarray/__init__.py — remove eager plot import
  • autoarray/plot/__init__.py — defer backend setup
  • autoarray/operators/convolver.py — defer scipy
  • autoarray/inversion/mesh/interpolator/delaunay.py — defer scipy.spatial
  • autofit/__init__.py — remove eager plot/interpolator imports
  • autogalaxy/__init__.py — remove eager plot import
  • autolens/__init__.py — remove eager plot import
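
For the scipy entries above, the simplest deferral is to move the import statement into the function that needs it. The sketch below illustrates the pattern only: solve_positive_definite is a hypothetical function and does not mirror the actual contents of cholesky_funcs.py or any other listed file.

```python
def solve_positive_definite(matrix, vector):
    """Solve matrix @ x = vector for a symmetric positive-definite matrix.

    Hypothetical function for illustration only.
    """
    # Deferred import: scipy.linalg is loaded on the first call rather than
    # when the enclosing module is imported. Later calls find it in
    # sys.modules, so the statement costs little after the first use.
    from scipy.linalg import cho_factor, cho_solve

    factor = cho_factor(matrix)
    return cho_solve(factor, vector)
```

The trade-off is that a missing or broken scipy install now surfaces at the first call instead of at package import, and the import statement is re-executed (cheaply) on every call, which may matter inside very hot loops.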

Original Prompt

Optimize Python Import Times

Motivation

Optimizations driven by smoke test profiling (issue PyAutoLabs/PyAutoFit#1183) have reduced per-script runtime from ~100s to 3.6-5.3s. The remaining breakdown for imaging/modeling.py (5.3s total):

| Component                      | Time | %   |
|--------------------------------|------|-----|
| Python imports                 | 2.6s | 49% |
| Simple model composition       | 0.6s | 11% |
| search.fit (1 likelihood)      | 0.6s | 11% |
| Subplot rendering (matplotlib) | 1.0s | 19% |
| Everything else                | 0.5s | 10% |

Import time is now the single largest cost — nearly half the total. For imaging/simulator.py (3.6s total), imports are 2.3s (64%).
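
As a rough check on what these figures imply, the projected runtime for a given cut in import time can be worked out directly. The sketch assumes the saved import time is removed outright; a script that later plots would still pay matplotlib's import cost at first use, so treat the results as a best case. The function name is illustrative.

```python
def projected_runtime(total_s, import_s, import_reduction):
    """Runtime if import time shrinks by the given fraction and nothing else changes."""
    return total_s - import_s * import_reduction


# Figures quoted in this issue: (total seconds, import seconds).
SCRIPTS = {
    "imaging/modeling.py": (5.3, 2.6),
    "imaging/simulator.py": (3.6, 2.3),
}

for name, (total_s, import_s) in SCRIPTS.items():
    for reduction in (0.4, 0.5, 0.6):
        runtime = projected_runtime(total_s, import_s, reduction)
        print(f"{name}: {reduction:.0%} import reduction gives {runtime:.2f}s")
```

For a 40%, 50% and 60% reduction this gives about 4.3s, 4.0s and 3.7s for the modeling script, and about 2.7s, 2.5s and 2.2s for the simulator script.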

The import breakdown (from earlier profiling):

| Package              | Time  |
|----------------------|-------|
| autofit              | 0.8s  |
| autoarray            | 0.3s  |
| autogalaxy           | 1.0s  |
| autolens             | 0.02s |
| autoconf.jax_wrapper | 0.08s |

autogalaxy (1.0s) and autofit (0.8s) dominate. These likely pull in heavy dependencies at module level (scipy, numba, astropy, matplotlib).
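
To confirm which dependencies actually dominate, the output of python -X importtime can be parsed and sorted by cumulative time. The following is a sketch: the function names are illustrative, and the parser assumes CPython's current stderr layout (self microseconds, cumulative microseconds, then the module name indented by two spaces per nesting level).

```python
import re
import subprocess
import sys

# Each data line of "python -X importtime" output on stderr looks like:
#   import time:       123 |        456 |   package.sub
_LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|(\s+)(\S+)\s*$")


def parse_importtime(stderr_text):
    """Return (name, self_us, cumulative_us, depth) tuples from -X importtime output."""
    rows = []
    for line in stderr_text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # header line or unrelated stderr output
        self_us, cumulative_us, indent, name = match.groups()
        # One separator space, then two extra spaces per nesting level.
        depth = (len(indent) - 1) // 2
        rows.append((name, int(self_us), int(cumulative_us), depth))
    return rows


def slowest_imports(module, limit=10):
    """Import `module` in a fresh interpreter and list its costliest imports."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = parse_importtime(result.stderr)
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]
```

Calling slowest_imports("autolens") before and after the change would give a like-for-like comparison, since each call starts a fresh interpreter with an empty module cache.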

Approach

Investigate lazy imports for heavy dependencies that aren't needed at module load time. Candidates:

  • scipy — only needed for specific operations (convolution, interpolation)
  • numba — only needed when JIT functions are first called
  • astropy — only needed for FITS I/O and cosmology
  • matplotlib — only needed when plotting functions are called

The goal is to defer these imports until first use, reducing the ~2.6s import floor. Even a 50% reduction would bring modeling scripts to about 4.0s (5.3s minus 1.3s) and simulator scripts to just under 2.5s.
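
One way to defer a top-level dependency without touching every call site is importlib.util.LazyLoader from the standard library. The sketch below follows the lazy-import recipe in the importlib documentation. Whether it suits these packages is an assumption to verify, and lazy_import is an illustrative name.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access.

    Follows the lazy-import recipe in the importlib documentation.
    """
    if name in sys.modules:
        return sys.modules[name]  # already loaded elsewhere; nothing to defer
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # arms the lazy module; no module code runs yet
    return module


# Assumed usage at module level, in place of "import scipy":
#
#     scipy = lazy_import("scipy")
```

Two caveats apply. For a dotted name such as scipy.linalg, find_spec imports the parent package eagerly, so the saving is limited to the submodule. Errors raised while executing the module also surface at the first attribute access rather than at the import line.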

Context

All other smoke test optimizations have been implemented:

  • PYAUTO_WORKSPACE_SMALL_DATASETS=1 — 15x15 grids, over_sample_size=2, 2 MGE gaussians
  • PYAUTO_FAST_PLOTS=1 — skip tight_layout + savefig
  • PYAUTO_DISABLE_CRITICAL_CAUSTICS=1 — skip curve overlays
  • PYAUTO_DISABLE_JAX=1 — skip JAX JIT compilation
  • PYAUTOFIT_TEST_MODE=2 — skip VRAM, model.info, result.info, pre/post-fit I/O
  • Cosmology distance caching, FitImaging cached_property

Import optimization is the last remaining lever for further speedup.
