Summary
Diagram.cascade() with part_integrity="cascade" may not correctly propagate restrictions when a Part table references another Part table instead of its Master directly, especially with renamed (projected) foreign keys.
Context
Discussion: #1232 (Spyglass Merge table pattern)
Related: #1423 (Diagram.trace() — same issue applies in the upstream direction)
Problem
The Part→Master upward propagation in _propagate_restrictions() (diagram.py lines 514-537) does two things:
- Identifies the Master via
extract_master(target) — uses the __ naming convention to derive the master table name from the part table name
- Restricts the Master via
master_ft.proj() & child_ft.proj() — joins on shared primary key attributes
This assumes the Part table shares primary key attributes directly with its Master.
Case 1: Part references another Part with renamed FK (no direct Master reference)
@schema
class Master(dj.Manual):
definition = """
master_id : smallint
"""
class PartA(dj.Part):
definition = """
-> master
part_a_id : smallint
"""
class PartB(dj.Part):
definition = """
-> Master.PartA.proj(src_master='master_id', src_part='part_a_id')
part_b_id : smallint
"""
PartB has no direct FK to Master — it references PartA with renamed keys (src_master ← master_id, src_part ← part_a_id). When cascade reaches PartB:
extract_master correctly identifies Master by naming convention
- But
master_ft.proj() & child_ft.proj() joins on shared attribute names — master_id is not in PartB (it was renamed to src_master), so the join finds no common attributes
- The intermediate PartA restriction is also skipped
Case 2: Part references another Part only (no Master reference at all)
@schema
class Master(dj.Manual):
definition = """
master_id : smallint
"""
class PartA(dj.Part):
definition = """
-> master
part_a_id : smallint
"""
class PartB(dj.Part):
definition = """
-> Master.PartA
part_b_id : smallint
"""
PartB references PartA directly (inheriting master_id and part_a_id). The restriction path should be PartB → PartA → Master. The current code jumps from PartB to Master via proj() & proj(). If master_id is inherited through PartA, the join may work by coincidence — but PartA's restriction is skipped entirely.
Case 3: Merge table with renamed reference from a table with different PK
@schema
class Merge(dj.Manual):
definition = """
merge_id : uuid
"""
class Source1(dj.Part):
definition = """
-> master
---
-> UpstreamPipeline1
"""
class Source2(dj.Part):
definition = """
-> master
---
-> UpstreamPipeline2
"""
@schema
class Downstream(dj.Computed):
definition = """
downstream_id : smallint
---
-> Merge
-> Merge.proj(comparison_src='merge_id')
"""
Downstream has its own primary key and two FKs into Merge — one direct, one renamed via .proj(). Cascade from Downstream needs to reach Merge's Parts through both FK paths. The Part→Master propagation doesn't account for the renamed FK (comparison_src → merge_id).
Expected behavior
Cascade should correctly propagate restrictions through Part→Part→Master chains, including cases with renamed foreign keys (via .proj()), by following the actual FK path rather than relying on shared attribute names.
Proposed fix
When propagating from a Part to its Master:
- If the Part doesn't share PK attributes directly with Master, follow the FK chain through intermediate Part tables
- Apply the standard restriction propagation rules (including
attr_map alias handling from _apply_propagation_rule) at each step, rather than a direct proj() & proj() join
Verification
Needs test cases covering:
- PartB references PartA with renamed FK (no direct Master reference) — verify Master is correctly restricted
- PartB references PartA directly (no Master reference) — verify PartA and Master are both restricted
- Renamed FK back to same Master via
.proj() from a table with different PK — verify both paths are traced
Summary
Diagram.cascade()withpart_integrity="cascade"may not correctly propagate restrictions when a Part table references another Part table instead of its Master directly, especially with renamed (projected) foreign keys.Context
Discussion: #1232 (Spyglass Merge table pattern)
Related: #1423 (
Diagram.trace()— same issue applies in the upstream direction)Problem
The Part→Master upward propagation in
_propagate_restrictions()(diagram.py lines 514-537) does two things:extract_master(target)— uses the__naming convention to derive the master table name from the part table namemaster_ft.proj() & child_ft.proj()— joins on shared primary key attributesThis assumes the Part table shares primary key attributes directly with its Master.
Case 1: Part references another Part with renamed FK (no direct Master reference)
PartB has no direct FK to Master — it references PartA with renamed keys (
src_master←master_id,src_part←part_a_id). When cascade reaches PartB:extract_mastercorrectly identifies Master by naming conventionmaster_ft.proj() & child_ft.proj()joins on shared attribute names —master_idis not in PartB (it was renamed tosrc_master), so the join finds no common attributesCase 2: Part references another Part only (no Master reference at all)
PartB references PartA directly (inheriting
master_idandpart_a_id). The restriction path should be PartB → PartA → Master. The current code jumps from PartB to Master viaproj() & proj(). Ifmaster_idis inherited through PartA, the join may work by coincidence — but PartA's restriction is skipped entirely.Case 3: Merge table with renamed reference from a table with different PK
Downstream has its own primary key and two FKs into Merge — one direct, one renamed via
.proj(). Cascade from Downstream needs to reach Merge's Parts through both FK paths. The Part→Master propagation doesn't account for the renamed FK (comparison_src→merge_id).Expected behavior
Cascade should correctly propagate restrictions through Part→Part→Master chains, including cases with renamed foreign keys (via
.proj()), by following the actual FK path rather than relying on shared attribute names.Proposed fix
When propagating from a Part to its Master:
attr_mapalias handling from_apply_propagation_rule) at each step, rather than a directproj() & proj()joinVerification
Needs test cases covering:
.proj()from a table with different PK — verify both paths are traced