feat(client): add create_label/create_edge_type with SchemaMode#36
Conversation
…te proto
- Update proto submodule to 381dbcd: adds SchemaMode enum (STRICT/VALIDATED/FLEXIBLE), ComputedPropertyDefinition, and a schema_mode field on CreateLabelRequest and Label messages
- Add AsyncCoordinodeClient.create_label(name, properties, *, schema_mode) → LabelInfo with full _type_map / _mode_map conversion; schema_mode defaults to 'strict'
- Add AsyncCoordinodeClient.create_edge_type(name, properties) → EdgeTypeInfo (the proto does not expose schema_mode for edge types)
- Add sync wrappers CoordinodeClient.create_label / create_edge_type
- Expose schema_mode: int on LabelInfo and EdgeTypeInfo (getattr-safe: defaults to 0 for older server images that omit the field)
- Revert the refresh_schema() labels(a)[0] shortcut to labels(a) AS src_labels plus a Python _first_label() helper; the published Docker image does not yet support subscript-on-function syntax (TODO comment added)
- Add 11 unit tests (test_schema_crud.py): schema_mode propagation on LabelInfo and EdgeTypeInfo, missing-field default, repr coverage
- Add 6 integration tests (test_sdk.py): create_label / create_edge_type round-trips, schema_mode=flexible acceptance, invalid schema_mode ValueError, appears-in-list verification (with documented workaround for G009)
- Mark test_traverse_inbound as xfail: INBOUND direction not supported by the server

Known gaps recorded in GAPS.md:
- G008 - SchemaMode enforcement not active in the published image
- G009 - ListLabels/ListEdgeTypes omits schema-only objects (no nodes yet)
- G010 - labels(n)[0] subscript not in the published image (workaround: _first_label)
- G011 - traverse INBOUND direction unimplemented on the server
📝 Walkthrough

Adds full-text/hybrid search and schema-registration APIs and types, Cypher identifier validation and property-definition validation, injectable client support for LangChain/LlamaIndex adapters, tests and demo notebooks/docker, dependency pins, and updated proto/submodule refs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant Client as AsyncCoordinodeClient
    participant SchemaStub as SchemaServiceStub
    participant TextStub as TextServiceStub
    participant Server as CoordiNode Server
    rect rgba(200,150,100,0.5)
        Note over App,Server: Schema Registration Flow
        App->>Client: create_label(name, properties, schema_mode)
        Client->>Client: _validate_property_dict / _build_property_definitions
        Client->>SchemaStub: CreateLabel(proto)
        SchemaStub->>Server: RPC CreateLabel
        Server-->>SchemaStub: LabelInfo proto
        SchemaStub-->>Client: proto.LabelInfo
        Client-->>App: LabelInfo(name, schema_mode)
    end
    rect rgba(100,200,150,0.5)
        Note over App,Server: Text Search Flow
        App->>Client: text_search(query, label)
        Client->>TextStub: TextSearch(proto)
        TextStub->>Server: RPC TextSearch
        Server-->>TextStub: TextSearch response
        TextStub-->>Client: proto results
        Client-->>App: list[TextResult]
    end
    rect rgba(150,100,200,0.5)
        Note over App,Server: Hybrid Search Flow
        App->>Client: hybrid_text_vector_search(text, vector, label)
        Client->>Client: normalize vector -> floats
        Client->>TextStub: HybridTextVectorSearch(proto)
        TextStub->>Server: RPC HybridTextVectorSearch
        Server-->>TextStub: Hybrid response
        TextStub-->>Client: proto results
        Client-->>App: list[HybridResult]
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 3 failed (3 warnings)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@coordinode/coordinode/client.py`:
- Around line 418-425: The current loop building PropertyDefinition uses
_type_map.get(..., PropertyType.PROPERTY_TYPE_STRING) which silently coerces
unknown types to STRING; change this to validate the resolved type by checking
if type_str exists in _type_map and if not raise a ValueError (include the
unknown type_str and p["name"] in the message) before constructing
PropertyDefinition, using _type_map[type_str] when present; apply the same
validation and error behavior to the other occurrence that builds
PropertyDefinition (the block around the second instance noted in the comment).
- Around line 398-408: Extract the duplicated _type_map into a single shared
constant or helper (e.g., PROPERTY_TYPE_MAP or get_property_type) so both create
methods reference that single source; remove the local _type_map dicts and
import/use the shared symbol instead; ensure the mapping keys and values remain
the same (mapping to PropertyType.PROPERTY_TYPE_*) and update both places where
_type_map was used so behavior is unchanged.
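The two comments above combine into one pattern: a single shared type map plus loud validation instead of a silent STRING fallback. A hedged sketch, with a stand-in enum since the real generated proto enum is not shown in the thread (PROPERTY_TYPE_MAP and resolve_property_type are illustrative names; the actual commit routes this through _build_property_definitions):

```python
from enum import IntEnum


class PropertyType(IntEnum):
    # Stand-in for the generated proto enum; real values may differ.
    PROPERTY_TYPE_STRING = 1
    PROPERTY_TYPE_INT64 = 2
    PROPERTY_TYPE_FLOAT64 = 3
    PROPERTY_TYPE_BOOL = 4
    PROPERTY_TYPE_BYTES = 5
    PROPERTY_TYPE_TIMESTAMP = 6
    PROPERTY_TYPE_VECTOR = 7
    PROPERTY_TYPE_LIST = 8
    PROPERTY_TYPE_MAP = 9


# Single shared map so create_label and create_edge_type stay in sync.
PROPERTY_TYPE_MAP = {
    "string": PropertyType.PROPERTY_TYPE_STRING,
    "int64": PropertyType.PROPERTY_TYPE_INT64,
    "float64": PropertyType.PROPERTY_TYPE_FLOAT64,
    "bool": PropertyType.PROPERTY_TYPE_BOOL,
    "bytes": PropertyType.PROPERTY_TYPE_BYTES,
    "timestamp": PropertyType.PROPERTY_TYPE_TIMESTAMP,
    "vector": PropertyType.PROPERTY_TYPE_VECTOR,
    "list": PropertyType.PROPERTY_TYPE_LIST,
    "map": PropertyType.PROPERTY_TYPE_MAP,
}


def resolve_property_type(prop_name: str, type_str: str) -> PropertyType:
    """Fail loudly on unknown types instead of coercing to STRING."""
    try:
        return PROPERTY_TYPE_MAP[type_str]
    except KeyError:
        raise ValueError(
            f"Unknown property type {type_str!r} for property {prop_name!r}; "
            f"valid options: {sorted(PROPERTY_TYPE_MAP)}"
        ) from None
```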
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 270-273: The docstring/comment claims CoordiNode nodes have
exactly one label but the API (create_node(labels: list[str]) and
NodeResult.labels: list[str]) allows multiple labels; clarify intent and enforce
or document accordingly: either enforce the single-label constraint by
validating in create_node (raise on len(labels) != 1 and store/return a
single-label invariant) and update uses that call labels()[0] (or keep min()
deterministic) OR change the comment to state "nodes typically have one label in
this application; code uses min() to deterministically pick one when multiple
labels exist" and leave create_node/NodeResult.labels as list[str]; update the
docstring in graph.py near labels() and any comment referencing min() to match
the chosen approach.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8f002175-5aaf-4cab-8d74-37209a075a63
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
- coordinode/coordinode/client.py
- langchain-coordinode/langchain_coordinode/graph.py
- proto
- tests/integration/test_sdk.py
- tests/unit/test_schema_crud.py
…upport
- Add client= parameter to CoordinodeGraph.__init__ and
CoordinodePropertyGraphStore.__init__: accepts any object with .cypher()
instead of creating a gRPC CoordinodeClient from addr. Enables passing
LocalClient from coordinode-embedded.
- Rewrite all 4 demo notebooks for Google Colab:
* 00_seed_data.ipynb — seeds tech-industry knowledge graph; Colab uses
LocalClient(":memory:") directly (same .cypher() API)
* 01_llama_index_property_graph.ipynb — PropertyGraphStore via
_EmbeddedAdapter wrapping LocalClient; get_schema_text() via Cypher fallback
* 02_langchain_graph_chain.ipynb — CoordinodeGraph with client= adapter;
OpenAI GraphCypherQAChain section is optional (auto-skips without key)
* 03_langgraph_agent.ipynb — LocalClient direct; LangGraph agent with
mock demo first (no LLM key needed), LLM section optional
- Each notebook: IN_COLAB detection, installs coordinode-embedded from GitHub
source (Rust build, ~5 min first run), Open in Colab badge in first cell
- Add demo/README.md with Colab badge table for all 4 notebooks
Colab links:
https://colab.research.google.com/github/structured-world/coordinode-python/blob/main/demo/notebooks/00_seed_data.ipynb
https://colab.research.google.com/github/structured-world/coordinode-python/blob/main/demo/notebooks/01_llama_index_property_graph.ipynb
https://colab.research.google.com/github/structured-world/coordinode-python/blob/main/demo/notebooks/02_langchain_graph_chain.ipynb
https://colab.research.google.com/github/structured-world/coordinode-python/blob/main/demo/notebooks/03_langgraph_agent.ipynb
…pe validation
- Extract a shared _build_property_definitions() static method: both create_label and create_edge_type now call it, eliminating the duplicated _type_map and property-building loop (addresses CodeRabbit #1)
- Raise ValueError for an unknown property type in both methods instead of silently coercing to STRING; the error includes the property name and valid options (addresses CodeRabbit #2, already in the previous commit)
- Clarify the _first_label() docstring: nodes "typically have one label in this application" rather than asserting a system-level single-label constraint (addresses CodeRabbit #3)
CodeRabbit review addressed (3 threads):
- #1 (shared _type_map): extracted into _build_property_definitions()
- #2 (unknown type coercion): now raises ValueError instead of coercing to STRING
- #3 (_first_label docstring): updated to say "nodes typically have one label in this application" rather than asserting a system-level constraint
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Line 128: The adapter's no-op close() prevents LocalClient resources from
being released; update the adapter's def close(self): pass to delegate to the
embedded client by calling the LocalClient.close() (or the stored client
instance's close method) so that store.close() triggers proper cleanup of
temporary dirs and other resources; locate the adapter class with the close
method and replace the no-op with a call to the underlying client's close()
(e.g., self._client.close() or whichever attribute holds the LocalClient
instance).
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Line 129: The adapter's close() method currently no-ops and must forward to
the embedded LocalClient to avoid leaking resources; update the adapter class
(the method named close) to call the wrapped LocalClient instance's close() (the
same object created in Colab mode and stored on the adapter) and propagate any
exceptions or suppress them consistently with existing shutdown semantics so
that graph.close() triggers LocalClient.close() and cleans up DB resources and
temp dirs.
In `@demo/notebooks/03_langgraph_agent.ipynb`:
- Around line 192-198: In find_related(entity_name: str, depth: int = 1) -> str,
depth is only clamped with min(depth, 3) which allows 0 or negative values and
leads to an invalid '*1..N' range; sanitize depth by computing a bounded_depth =
max(1, min(depth, 3)) (or equivalent) before building the Cypher fragment and
use bounded_depth in the string passed to client.cypher so the relationship
range is always between 1 and 3.
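The clamping suggested above is a one-liner; pulled out as a standalone helper (bounded_depth is an illustrative name):

```python
def bounded_depth(depth: int, max_hops: int = 3) -> int:
    """Clamp a user-supplied traversal depth into [1, max_hops] so the
    Cypher variable-length range '*1..N' is always valid."""
    return max(1, min(int(depth), max_hops))
```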
- Around line 173-181: The save_fact function currently interpolates the
tool-provided relation into the Cypher identifier (in client.cypher), allowing
injection; before building the query in save_fact, validate and sanitize
relation: normalize to upper(), replace spaces with underscores, then enforce a
strict whitelist regex (e.g., only [A-Z0-9_]+) and reject or map any value that
fails (raise an exception or use a safe fallback like DEFAULT_RELATION). Use the
sanitized token when concatenating into the Cypher relationship name and keep
all other values as params (subject, obj, SESSION) passed to client.cypher to
avoid further injection risks.
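The sanitization steps above (normalize, then enforce a strict whitelist) can be sketched as:

```python
import re

# Whitelist: uppercase identifier only, as the review suggests.
_REL_TYPE_RE = re.compile(r"[A-Z_][A-Z0-9_]*")


def sanitize_relation(relation: str) -> str:
    """Normalize a free-form relation name and reject anything unsafe to
    interpolate as a Cypher relationship type."""
    rel_type = relation.strip().upper().replace(" ", "_")
    if not _REL_TYPE_RE.fullmatch(rel_type):
        raise ValueError(f"Invalid relation type: {relation!r}")
    return rel_type
```

Values like subject and object still travel as query parameters; only the validated relationship token is ever interpolated.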
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 48-54: The graph wrapper currently always closes self._client in
close(), which will close caller-owned clients passed via the client parameter;
change the constructor (where client: Any = None and self._client = client if
client is not None else CoordinodeClient(...)) to record ownership (e.g.
self._owns_client = client is None) and update the close() method to only call
close on self._client when self._owns_client is True, leaving externally
provided clients open.
In `@llama-index-coordinode/llama_index/graph_stores/coordinode/base.py`:
- Around line 71-75: The constructor currently takes a `client` parameter and
always assigns it to `self._client`, but `close()` unconditionally calls
`self._client.close()`, which can close externally managed clients; modify the
initializer (the function that sets `self._client`) to record ownership (e.g.,
set `self._owns_client = True` when you create `CoordinodeClient(addr,
timeout=timeout)` and `False` when a client is injected), and update the
`close()` method to only call `self._client.close()` if `self._owns_client` is
True; reference the initializer where `self._client` is set and the `close()`
method to implement this ownership check.
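Both ownership comments describe the same pattern; a minimal sketch (GraphWrapper and client_factory are illustrative stand-ins for the real CoordinodeGraph / CoordinodeClient(addr, timeout=...) constructor):

```python
class GraphWrapper:
    """Records whether it created the client and only closes clients it
    owns; externally injected clients are left open for the caller."""

    def __init__(self, client=None, client_factory=None):
        self._owns_client = client is None
        if client is None:
            client = client_factory()  # e.g. CoordinodeClient(addr)
        self._client = client

    def close(self):
        # Skip shutdown for caller-owned clients (client= kwarg).
        if self._owns_client:
            self._client.close()
```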
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ca1cd5d3-f7f2-4ed9-994a-740d5bd812a8
📒 Files selected for processing (8)
- coordinode/coordinode/client.py
- demo/README.md
- demo/notebooks/00_seed_data.ipynb
- demo/notebooks/01_llama_index_property_graph.ipynb
- demo/notebooks/02_langchain_graph_chain.ipynb
- demo/notebooks/03_langgraph_agent.ipynb
- langchain-coordinode/langchain_coordinode/graph.py
- llama-index-coordinode/llama_index/graph_stores/coordinode/base.py
- ruff.toml: exclude coordinode-rs/ (submodule) and **/.ipynb_checkpoints/ from lint+format; add per-file-ignores for *.ipynb (E401/E402/I001, normal notebook patterns: multi-import on one line, import after a pip-install cell)
- Auto-format demo notebooks and the coordinode-embedded .pyi stub
- Fix UP037 in _coordinode_embedded.pyi: remove quotes from a type annotation
- Add .ipynb_checkpoints/ to .gitignore
Actionable comments posted: 1
♻️ Duplicate comments (4)
demo/notebooks/02_langchain_graph_chain.ipynb (1)
132-133: ⚠️ Potential issue | 🟡 Minor

Delegate close() to the underlying LocalClient to release resources. Same issue as in notebook 01: the adapter's close() is a no-op while LocalClient.close() should be called to release database resources.

Suggested fix:

```diff
 def close(self):
-    pass
+    self._lc.close()
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/02_langchain_graph_chain.ipynb` around lines 132-133, the adapter's close() method is currently a no-op; change it to delegate to the underlying LocalClient to release resources by calling the client's close method (e.g., invoke LocalClient.close() on the adapter's stored client instance from within close()). Locate the adapter class with `def close(self): pass` and replace the pass with a call to the underlying client (the attribute holding the LocalClient instance) so the adapter's close() forwards cleanup to LocalClient.close().

demo/notebooks/01_llama_index_property_graph.ipynb (1)
131-132: ⚠️ Potential issue | 🟡 Minor

Delegate close() to the underlying LocalClient to release resources. The adapter's close() is a no-op, but LocalClient.close() releases database resources; when store.close() is called at the end of the notebook, resources remain uncleaned.

Suggested fix:

```diff
 def close(self):
-    pass
+    self._lc.close()
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/01_llama_index_property_graph.ipynb` around lines 131-132, the adapter's close() currently does nothing; update its close() implementation to call and await (if async) the underlying LocalClient.close() so resources are released when store.close() is invoked; locate the adapter class that defines `def close(self): pass` and replace it to delegate to the LocalClient instance (e.g., self._local_client.close() or await self._local_client.close()) ensuring correct sync/async semantics to mirror LocalClient.close().

demo/notebooks/03_langgraph_agent.ipynb (2)
203-212: ⚠️ Potential issue | 🟡 Minor

Clamp depth to a valid lower bound. min(depth, 3) allows 0 or negative values, producing an invalid *1..0 or *1..-1 range that will fail.

Suggested fix:

```diff
 @tool
 def find_related(entity_name: str, depth: int = 1) -> str:
     """Find all entities reachable from entity_name within the given number of hops (max 3)."""
+    safe_depth = max(1, min(depth, 3))
     rows = client.cypher(
-        "MATCH (n:Entity {name: $name, session: $sess})-[r*1.." + str(min(depth, 3)) + "]->(m) "
+        "MATCH (n:Entity {name: $name, session: $sess})-[r*1.." + str(safe_depth) + "]->(m) "
         "RETURN m.name AS related, type(last(r)) AS via LIMIT 20",
         params={"name": entity_name, "sess": SESSION},
     )
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/03_langgraph_agent.ipynb` around lines 203-212, the function find_related constructs a Cypher range using min(depth, 3), which can be zero or negative; clamp depth to a valid lower bound by computing hops = max(1, min(depth, 3)) (or equivalent) and use hops when building the pattern (e.g., "[r*1..{hops}]") so the range is always at least 1 and no larger than 3; update the client.cypher call in find_related to use this hops variable when concatenating the query string.
182-191: ⚠️ Potential issue | 🟠 Major

Sanitize the relationship type before interpolating it into Cypher. The relation parameter from tool input is concatenated directly into the Cypher query; a crafted value could inject malicious Cypher (e.g., `RELATED]->(x) DELETE (a) //`).

Suggested fix:

```diff
+import re
+
 @tool
 def save_fact(subject: str, relation: str, obj: str) -> str:
     """Save a fact (subject → relation → object) into the knowledge graph.
     Example: save_fact('Alice', 'WORKS_AT', 'Acme Corp')"""
+    rel_type = relation.upper().replace(" ", "_")
+    if not re.fullmatch(r'[A-Z_][A-Z0-9_]*', rel_type):
+        return f"Invalid relation type: {relation}"
     client.cypher(
         "MERGE (a:Entity {name: $s, session: $sess}) "
         "MERGE (b:Entity {name: $o, session: $sess}) "
-        "MERGE (a)-[r:" + relation.upper().replace(" ", "_") + "]->(b)",
+        f"MERGE (a)-[r:{rel_type}]->(b)",
         params={"s": subject, "o": obj, "sess": SESSION},
     )
     return f"Saved: {subject} -[{relation}]-> {obj}"
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/03_langgraph_agent.ipynb` around lines 182-191, the save_fact function concatenates the tool-provided relation directly into the Cypher string (see save_fact, relation, client.cypher, SESSION), which allows Cypher injection; sanitize and validate relation before interpolation by normalizing (e.g., relation.upper().replace(" ", "_")) and enforcing a tight whitelist regex such as only A-Z, 0-9 and underscores (e.g., r'^[A-Z0-9_]+$'); if validation fails, raise an error and do not call client.cypher; only then interpolate the validated relation into the MERGE relationship token.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Around line 108-139: Update the _EmbeddedAdapter class to document the stubbed
methods: add a brief note in the class docstring and/or inline comments inside
vector_search, get_labels, and get_edge_types stating these return empty results
because embedded mode does not implement vector search or label/edge
introspection; mention expected behavior (e.g., returns empty list) and point
users to the full CoordinodeClient implementation for production features so
it's clear these are intentional stubs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 62214ae0-ea5f-4ac7-bc46-36478c863db6
📒 Files selected for processing (7)
- .gitignore
- coordinode-embedded/python/coordinode_embedded/_coordinode_embedded.pyi
- demo/notebooks/00_seed_data.ipynb
- demo/notebooks/01_llama_index_property_graph.ipynb
- demo/notebooks/02_langchain_graph_chain.ipynb
- demo/notebooks/03_langgraph_agent.ipynb
- ruff.toml
…ebooks
- graph.py, base.py: add an _owns_client flag; close() skips shutdown for externally-injected clients (client= kwarg) and only closes clients it created
- notebooks 01/02: _EmbeddedAdapter.close() delegates to self._lc.close() instead of being a no-op
- notebook 03: validate rel_type against _REL_TYPE_RE before Cypher interpolation; clamp find_related depth to max(1, min(int(depth), 3))
vector_search(), get_labels(), get_edge_types() return empty results in embedded mode — add inline comments explaining this limitation so users understand the behavior without reading server-side code
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Line 228: The notebook creates a CoordinodePropertyGraphStore with an injected
client (CoordinodePropertyGraphStore(client=client)), but store.close() won’t
close that injected client under the current ownership model; update the cleanup
to explicitly call client.close() after store.close() (or instead of relying on
store.close()) so the underlying connection is properly closed—ensure you modify
the cleanup block that calls store.close() (around the store creation/cleanup
code) to add client.close() as well.
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Line 231: Because the notebook injects an external client into CoordinodeGraph
(graph = CoordinodeGraph(client=client)), graph.close() will not close that
injected client; update the cleanup to explicitly call client.close() in
addition to graph.close() (e.g., after using graph, call graph.close() then
client.close()) so the injected client's resources are released; reference
CoordinodeGraph, graph, client, graph.close(), and client.close() when making
the change.
In `@demo/notebooks/03_langgraph_agent.ipynb`:
- Around line 134-150: The notebook starts a detached Docker container (proc ->
container_id, COORDINODE_ADDR) but never guarantees it is stopped on errors or
when re-running; wrap the docker run + port-wait logic in a try/finally (or
register an atexit handler) and in the finally call
subprocess.run(["docker","stop", container_id]) (checking container_id is set
and not empty) so the container is always stopped on exit or error; also update
the existing cleanup at the later block that attempted to stop the container
(lines referencing container_id/COORDINODE_ADDR) to reuse the same container_id
variable or guard it and call docker stop there if the try/finally path didn’t
run.
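The try/finally shape requested above, reduced to a testable skeleton (start, stop, and body stand in for the `docker run -d`, `docker stop`, and port-wait logic in the notebook):

```python
def run_with_container(start, stop, body):
    """Skeleton of the suggested cleanup guarantee: start() launches the
    detached container and returns its id, body() does the port-wait and
    demo work, and stop() is called in finally so the container is stopped
    even when body() raises or the notebook cell errors out."""
    container_id = None
    try:
        container_id = start()
        return body(container_id)
    finally:
        if container_id:
            stop(container_id)
```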
- Around line 201-205: query_facts currently executes arbitrary Cypher
(client.cypher) which allows destructive or cross-session operations; change it
to reject or sanitize input by enforcing read-only queries and session scoping:
validate the cypher string in query_facts to disallow dangerous
keywords/constructs (e.g., CREATE, DELETE, MERGE, CALL, LOAD, DROP, SET, ';')
and require that the query includes or is transformed to include a session
filter (params={"sess": SESSION} and WHERE n.session = $sess) or only allow a
small whitelist of safe patterns (e.g., STARTS WITH "MATCH" and no mutation
tokens); if validation fails, return an error instead of executing the query.
Ensure you reference and update the query_facts function and its use of
client.cypher to implement this validation/sanitization and fail-safe behavior.
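A sketch of the read-only guard described above; the keyword list and the $sess requirement follow the comment, while the exact token set is a judgment call:

```python
import re

# Mutation/procedure keywords plus the statement separator, per the review.
_WRITE_TOKENS = re.compile(
    r"\b(CREATE|DELETE|DETACH|MERGE|SET|REMOVE|DROP|CALL|LOAD)\b|;",
    re.IGNORECASE,
)


def check_readonly_cypher(query: str) -> None:
    """Reject write queries and require session scoping before the agent
    tool forwards a query to client.cypher."""
    if _WRITE_TOKENS.search(query):
        raise ValueError("Only read-only Cypher is allowed")
    if "$sess" not in query:
        raise ValueError("Query must filter on the $sess session parameter")
```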
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 80-83: Replace the inline TODO at the comment above rows =
self._client.cypher with a non-TODO note or a tracked-issue reference: update
the comment that currently reads "TODO: simplify to labels(a)[0] once
subscript-on-function is in published image." to either reference the issue/PR
number (e.g., "see ISSUE-1234") or rewrite as a descriptive note (e.g., "Note:
can simplify to labels(a)[0] after subscript-on-function support is available")
so static analysis no longer flags a TODO; locate this comment adjacent to the
rows variable and self._client.cypher call in the Graph code to make the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6ffce079-c347-4845-a1ce-6a32f536ec9a
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
- demo/notebooks/01_llama_index_property_graph.ipynb
- demo/notebooks/02_langchain_graph_chain.ipynb
- demo/notebooks/03_langgraph_agent.ipynb
- langchain-coordinode/langchain_coordinode/graph.py
- llama-index-coordinode/llama_index/graph_stores/coordinode/base.py
Pull request overview
Adds schema-creation APIs to the CoordiNode Python client (including SchemaMode support) and updates downstream integrations/demos to support injecting a pre-built client (e.g., embedded LocalClient) for easier Colab usage.
Changes:
- Add create_label(..., schema_mode=...) and create_edge_type(...) to async + sync clients, plus schema_mode surfaced on LabelInfo/EdgeTypeInfo.
- Allow LangChain/LlamaIndex graph stores to accept an externally-provided client= and only close internally-owned clients.
- Refresh demo notebooks/README for Colab + embedded mode; update lint config and expand unit/integration tests.
Reviewed changes
Copilot reviewed 13 out of 15 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| coordinode/coordinode/client.py | New schema CRUD APIs + schema_mode plumbing on info objects. |
| langchain-coordinode/langchain_coordinode/graph.py | Accept injected client and adjust schema refresh notes/docs. |
| llama-index-coordinode/llama_index/graph_stores/coordinode/base.py | Accept injected client and avoid closing caller-owned clients. |
| tests/unit/test_schema_crud.py | Unit coverage for schema_mode defaults/propagation and repr. |
| tests/integration/test_sdk.py | Integration coverage for create_label/create_edge_type and schema_mode validation. |
| demo/notebooks/00_seed_data.ipynb | Colab-ready seeding notebook using embedded mode when available. |
| demo/notebooks/01_llama_index_property_graph.ipynb | Colab-ready LlamaIndex demo using injected client adapter. |
| demo/notebooks/02_langchain_graph_chain.ipynb | Colab-ready LangChain demo using injected client adapter. |
| demo/notebooks/03_langgraph_agent.ipynb | Colab-ready LangGraph agent demo with embedded/local modes. |
| demo/README.md | Colab badge index and local Docker Compose instructions. |
| ruff.toml | Exclusions for submodules/checkpoints and notebook per-file ignores. |
| coordinode-embedded/python/coordinode_embedded/_coordinode_embedded.pyi | Minor stub typing tweak for `__enter__`. |
| .gitignore | Add ignore entry for .ipynb_checkpoints. |
| uv.lock | Workspace version bump in lockfile. |
…fix CONTAINER_ID scope
- notebooks 01/02: add an explicit client.close() after store/graph.close() in cleanup cells; _owns_client prevents the store from closing injected clients
- notebook 03: promote container_id to a module-level CONTAINER_ID = None so the cleanup cell can stop auto-started Docker containers on re-runs
- notebook 03: add a write-keyword guard and $sess requirement to query_facts to prevent destructive or cross-session Cypher from the agent-facing tool
- graph.py:82: replace the TODO comment with a descriptive note referencing G010/GAPS.md to silence the SonarCloud warning
…ync docstring types
- _build_property_definitions: raise ValueError with the index when a property entry is not a dict with a non-empty 'name' key (was KeyError)
- LabelInfo.schema_mode inline annotation: 0=unspecified (not 0=unspecified/strict)
- create_label docstring: add "list" and "map" to the type strings list
- _first_label docstring: clarify that min() is intentional for determinism; labels(n)[0] is unstable due to non-guaranteed label ordering
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@coordinode/coordinode/client.py`:
- Around line 371-411: Rename the parameters in _build_property_definitions from
PropertyType and PropertyDefinition to snake_case names (e.g., property_type_cls
and property_definition_cls) to satisfy linters and clarify they are class
references; update all uses inside the function (type_map values and the call to
PropertyDefinition(...)) to refer to the new parameter names and adjust any call
sites that pass the proto classes into _build_property_definitions to use the
same positional or keyword names if necessary.
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Around line 397-402: The cleanup cell contains a duplicated/concatenated token
"callerclient.close()" — fix the typo by replacing the malformed trailing text
with a single call to client.close() and its comment; specifically update the
lines around store.delete(...), store.close(), and client.close() so the final
line reads a single client.close() and retains the comment "injected client —
owned by caller" (remove the extra "caller" concatenation and duplicate
client.close()).
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Around line 395-400: The cleanup cell contains duplicated text where the
client.close() call and its comment were concatenated; update the final lines in
the cell so the Neo4j calls are correct by leaving graph.query(...),
print("Cleaned up"), graph.close(), then a single client.close() with the
correct comment ("# injected client — owned by caller")—remove the duplicated
"client.close() # injected client — owned by callerclient.close() # injected
client — owned by caller" fragment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3c8ff839-a7b7-41aa-88c0-c9831cf19dc9
📒 Files selected for processing (5)
coordinode/coordinode/client.py, demo/notebooks/01_llama_index_property_graph.ipynb, demo/notebooks/02_langchain_graph_chain.ipynb, demo/notebooks/03_langgraph_agent.ipynb, langchain-coordinode/langchain_coordinode/graph.py
…bel visibility test Server now enforces strict schema for labels registered via create_label(). The workaround node must include only declared properties, so add tag:string to the schema alongside x:int64.
…case; fix duplicate client.close() in notebooks - _build_property_definitions: PropertyType → property_type_cls, PropertyDefinition → property_definition_cls (SonarCloud naming rule) - notebooks 01/02: remove duplicated client.close() line in cleanup cells
@coderabbitai full review
✅ Actions performed: Full review triggered.
Pull request overview
Copilot reviewed 13 out of 15 changed files in this pull request and generated 5 comments.
… dep to >=0.6.0 - CoordinodeGraph.refresh_schema() now uses get_labels() + get_edge_types() when available (coordinode>=0.6.0), bypassing the regex text parser; injected clients without these methods fall back to _parse_schema() - Add _PROPERTY_TYPE_NAME dict mapping PropertyType proto ints to strings (INT64, FLOAT64, STRING, BOOL, BYTES, TIMESTAMP, VECTOR, LIST, MAP) - Bump coordinode requirement to >=0.6.0 in both langchain-coordinode and llama-index-graph-stores-coordinode
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.gitignore:
- Line 25: Replace the single-root .ipynb_checkpoints/ ignore entry with a
recursive pattern so checkpoint directories are ignored at any depth; update the
.gitignore entry `.ipynb_checkpoints/` to use a recursive glob (e.g.,
`**/.ipynb_checkpoints/`) so Jupyter checkpoints alongside notebooks in
subdirectories are also excluded.
In `@coordinode/coordinode/client.py`:
- Around line 394-409: The loop that builds property_definition_cls currently
coerces 'required' and 'unique' with bool(...), allowing non-boolean values like
"false" or 1 to be treated as True; change the validation so that for each
property dict (in the properties loop inside the code that constructs
property_definition_cls) you explicitly check if 'required' or 'unique' keys are
present and, if so, assert they are instances of bool (raise ValueError with a
clear message referencing the property name/index), otherwise default them to
False; then pass the validated boolean values into
property_definition_cls(name=..., type=..., required=..., unique=...).
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Around line 332-337: The query passed to store.structured_query uses a
truncated tag in param_map ("prefix": f"Alice-{tag[:4]}) which can cause
cross-run collisions; update the f-string in the param_map for
store.structured_query to use the full tag (e.g., f"Alice-{tag}") so the WHERE
p.name STARTS WITH parameter is unique per run and preserves isolation.
- Around line 163-207: The notebook may leave a detached Docker container
running when you start CoordiNode (container_id created after subprocess.run)
because there is no teardown; modify the startup logic around the branch that
runs docker (where IMAGE, proc, container_id, and COORDINODE_ADDR are set) to
register a cleanup that stops the container (e.g., a try/finally or
atexit.register that calls subprocess.run(["docker","stop", container_id]))
and/or ensure any exception path (including the timeout else) always stops the
created container; ensure the cleanup uses the captured container_id and only
attempts to stop it if proc.returncode == 0 and container_id is non-empty so you
don’t try to stop a non-created container.
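A sketch of that teardown registration (the image name, port, and injectable `runner` parameter are illustrative assumptions added for testability):

```python
import atexit
import subprocess


def start_coordinode(image: str, runner=subprocess.run) -> str:
    """Start a detached CoordiNode container and register teardown.

    `runner` defaults to subprocess.run but is injectable so the logic can
    be exercised without Docker installed.
    """
    proc = runner(
        ["docker", "run", "-d", "-p", "37051:37051", image],
        capture_output=True,
        text=True,
    )
    container_id = (proc.stdout or "").strip()
    # Only register a stop for a container this process actually created.
    if proc.returncode == 0 and container_id:
        atexit.register(
            runner, ["docker", "stop", container_id], capture_output=True
        )
    return container_id
```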
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Around line 288-293: The demo query uses a truncated prefix when calling
graph.query (params={"prefix": f"Turing-{tag[:4]}"}), which can collide with
other runs; change the params value to use the full tag (e.g., params={"prefix":
f"Turing-{tag}"} or otherwise include the entire unique tag variable) so the
MATCH filter on s.name is fully namespaced by the run's tag and prevents
cross-run matches.
- Around line 164-208: The cell can leave a started Docker container running
because when falling back to docker run you set container_id =
proc.stdout.strip() but never record or stop it on teardown; ensure you record
the started container id (e.g., in a module/global name like
STARTED_CONTAINER_ID) and update your cleanup logic to stop it
(subprocess.run(["docker","stop", STARTED_CONTAINER_ID])) only if that variable
is set and the container was created by this notebook; alternatively register an
atexit handler or try/finally around the startup sequence to call
subprocess.run(["docker","stop", container_id]) so
COORDINODE_ADDR/GRPC_PORT/_port_open logic still works and you avoid leaked
containers.
In `@demo/notebooks/03_langgraph_agent.ipynb`:
- Around line 207-217: The current check in query_facts only looks for the
literal "$sess" token but doesn't ensure it's used to restrict node access;
update query_facts to validate that $sess is used in a WHERE clause that
constrains a node property (e.g. pattern matching "WHERE <alias>.session =
$sess" or equivalent) rather than merely present in the query or returned;
implement a robust check (regex or simple parsing) on q to confirm presence of a
WHERE condition binding a node/alias session property to $sess and reject
queries that only return $sess or embed it outside a WHERE predicate; keep the
existing read-only keyword check (_WRITE_KEYWORDS) and still call
client.cypher(q, params={"sess": SESSION}) only after this stricter
session-scoping validation.
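A hedged sketch of such a check (the regexes below are illustrative assumptions, not the notebook's exact patterns):

```python
import re

# Reject any mutating clause outright.
_WRITE_KEYWORDS = re.compile(
    r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", re.IGNORECASE
)
# Require $sess to appear in a WHERE predicate that constrains a node
# alias's session property, not merely anywhere in the query text.
_SESSION_WHERE_RE = re.compile(
    r"\bWHERE\b[^;]*\b\w+\.session\s*=\s*\$sess\b", re.IGNORECASE
)


def is_safe_read_query(q: str) -> bool:
    """Accept only read-only queries scoped to the current session."""
    if _WRITE_KEYWORDS.search(q):
        return False
    return bool(_SESSION_WHERE_RE.search(q))
```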
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 50-55: The constructor currently accepts a prebuilt client but
assumes it has get_schema_text() and vector_search(), causing refresh_schema()
and similarity_search() to fail when passed the simple LocalClient (which only
exposes cypher() and close()); update __init__ of the Graph class to either
validate the passed client implements the required protocol (cypher,
get_schema_text, vector_search) and raise a clear error, or automatically wrap
the passed LocalClient in a small adapter that implements get_schema_text()
(calling cypher under the hood) and vector_search() (delegating to an embedded
vector search implementation), and update usages of
get_triplets()/relation_names handling to accept relation_names=None and extract
relationship type via type(r) instead of r.__type__ so callers that omit
relation_names still work with CoordiNode v0.3.2; reference Graph.__init__,
refresh_schema, similarity_search, LocalClient, CoordinodeClient, and
get_triplets when making the changes.
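One way to implement the validation branch of that suggestion (method names taken from the comment above; the adapter alternative is omitted):

```python
_REQUIRED_CLIENT_METHODS = ("cypher", "get_schema_text", "vector_search")


def validate_client(client):
    """Fail fast with a clear TypeError instead of a late AttributeError.

    Checks that an injected client exposes the richer protocol the graph
    wrapper relies on; a bare LocalClient (cypher/close only) is rejected.
    """
    missing = [
        m for m in _REQUIRED_CLIENT_METHODS
        if not callable(getattr(client, m, None))
    ]
    if missing:
        raise TypeError(
            f"injected client {type(client).__name__} is missing required "
            f"methods: {', '.join(missing)}"
        )
    return client
```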
In `@llama-index-coordinode/llama_index/graph_stores/coordinode/base.py`:
- Around line 73-76: The constructor currently accepts an injected client but
doesn't validate its API, causing later calls in get_schema() and vector_query()
to raise AttributeError when a bare coordinode-embedded LocalClient (which only
implements cypher() and close()) is passed; update the constructor that sets
self._client / self._owns_client to validate that the injected client implements
the richer protocol expected by get_schema() and vector_query() (e.g., has
get_schema_text and vector_search methods) and raise a clear TypeError if not,
or alternatively accept an adapter parameter and wrap a LocalClient into an
adapter implementing get_schema_text and vector_search before assigning
self._client (referencing CoordinodeClient, LocalClient, get_schema(),
vector_query(), get_schema_text, and vector_search to locate the code).
In `@proto`:
- Line 1: Verify the proto submodule update points to commit 381dbcd7 and
contains the expected schema-mode additions by updating/fetching the submodule
(git submodule update --init --recursive / git fetch && git checkout 381dbcd7),
then inspect the .proto files for the new SchemaMode enum, the
ComputedPropertyDefinition message, and schema_mode fields in relevant messages,
and confirm the create_label and create_edge_type RPCs reference schema_mode; if
any piece is missing, update the submodule to the correct commit or add the
missing proto definitions and re-run the proto compilation to ensure generated
clients reflect the changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a5821a64-2c69-4542-80fc-5c813932a1a1
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (14)
.gitignore, coordinode-embedded/python/coordinode_embedded/_coordinode_embedded.pyi, coordinode/coordinode/client.py, demo/README.md, demo/notebooks/00_seed_data.ipynb, demo/notebooks/01_llama_index_property_graph.ipynb, demo/notebooks/02_langchain_graph_chain.ipynb, demo/notebooks/03_langgraph_agent.ipynb, langchain-coordinode/langchain_coordinode/graph.py, llama-index-coordinode/llama_index/graph_stores/coordinode/base.py, proto, ruff.toml, tests/integration/test_sdk.py, tests/unit/test_schema_crud.py
- nb03: add client.health() verification before 'Connected' print in both gRPC branches (COORDINODE_ADDR env and _port_open fallback) - nb03: scope find_related intermediate hops to session — use path variable p and WHERE ALL(x IN nodes(p) WHERE x.session = $sess); switch type(last(r)) to type(last(relationships(p))) - tests: add test_create_label_schema_mode_validated asserting schema_mode == 2 (VALIDATED) to complement flexible/invalid tests
client.health() returns a bool rather than raising, so the bare call silently ignored the result. Now raise RuntimeError with a clear message if the health check fails.
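The corrected pattern might look like this (a minimal sketch; `addr` is an assumed parameter used only for the error message):

```python
def ensure_healthy(client, addr: str) -> None:
    """client.health() returns a bool instead of raising, so check it explicitly."""
    if not client.health():
        raise RuntimeError(
            f"CoordiNode health check failed at {addr}; is the server running?"
        )
```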
@coderabbitai re: coordinode-rs submodule SHA — The SHA is intentional: it is pinned to the commit immediately following the v0.3.13 release commit in coordinode-rs (the changelog/release-notes commit). Once the coordinode-rs team tags the release, the submodule will be updated to that tag. This is tracked.
Pull request overview
Copilot reviewed 19 out of 21 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
tests/integration/test_sdk.py:712
- Cleanup here uses `DETACH DELETE` even though the PR description says DETACH DELETE is a server no-op. If that’s still true on the target image, this test will leak nodes across runs. Prefer `DELETE` (no detach needed for single nodes) or the two-step relationship+node delete pattern used elsewhere in the demos/tests.
assert r.score > 0
finally:
client.cypher(f"MATCH (n:{label} {{tag: $tag}}) DETACH DELETE n", params={"tag": tag})
…assumptions FTS indexing is NOT automatic — a CREATE TEXT INDEX DDL is required before text_search() can return results. The previous code and tests incorrectly assumed automatic indexing. SDK changes: - Add TextIndexInfo result class (name, label, properties, documents_indexed) - Add AsyncCoordinodeClient.create_text_index() and drop_text_index() - Add CoordinodeClient sync wrappers for both methods - Export TextIndexInfo from coordinode package - Fix text_search() docstring: document DDL requirement, remove false claim about automatic indexing Test changes: - Fix test_text_search_returns_results: create/drop TEXT INDEX around the search call; drop empty-results xfail (now a hard assertion) - Fix test_text_search_fuzzy: create/drop TEXT INDEX; assert results are non-empty (was: just verify no exception) - Fix test_hybrid_text_vector_search_returns_results: add TEXT INDEX - Update header comment: remove 'automatic indexing' statement
DETACH DELETE may not reliably remove connected edges when nodes have relationships. Use explicit MATCH-DELETE for edges then nodes to ensure clean test isolation.
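The explicit two-step pattern can be sketched as follows (the `client.cypher` signature is assumed from the surrounding tests, and the label/tag filter is illustrative):

```python
def cleanup_tagged_nodes(client, label: str, tag: str) -> None:
    """Delete edges first, then the now-isolated nodes, for clean test teardown."""
    client.cypher(
        f"MATCH (n:{label} {{tag: $tag}})-[r]-() DELETE r",
        params={"tag": tag},
    )
    client.cypher(
        f"MATCH (n:{label} {{tag: $tag}}) DELETE n",
        params={"tag": tag},
    )
```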
Colab notebooks use coordinode-embedded bundling coordinode-rs v0.3.13. Docker Compose stack uses the CoordiNode server image v0.3.14. These are separate version lines — make the distinction explicit.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@coordinode/coordinode/client.py`:
- Around line 583-593: The drop_text_index method currently interpolates the
index name directly into the Cypher string (await self.cypher(f"DROP TEXT INDEX
{name}")), which repeats the unsafe pattern from create_text_index; update
drop_text_index to apply the same identifier validation/sanitization used by
create_text_index (e.g., call the existing validate/sanitize function or inline
the same safe-check logic) before interpolating, and then pass the validated
identifier into self.cypher to avoid injection and ensure consistent behavior
with create_text_index.
- Around line 414-471: The _build_property_definitions function exceeds
cognitive complexity — extract small validation helpers to simplify the loop:
implement a helper like _validate_property_dict(p, idx) that verifies p is a
dict, extracts and validates the non-empty name and normalizes/validates the
type string, and another helper like _validate_property_flags(name, required,
unique) that validates boolean flags (or combine into one helper that returns
name, type_str, required, unique); then update _build_property_definitions to
call these helpers inside the for loop and use their returned values when
constructing property_definition_cls(name=..., type=..., required=...,
unique=...), keeping the same error messages and the existing type_map and
result building logic.
- Around line 571-581: The create_text_index block directly interpolates user
inputs (name, label, properties/prop_list) into the cypher string; add
validation or safe-escaping for Cypher identifiers before building props_expr
and cypher to avoid syntax errors or injection. Implement or call a helper like
_validate_cypher_identifier(value, param_name) to validate name and label, and
validate each property in prop_list (or alternatively apply backtick-escaping
for each identifier) and raise ValueError on invalid identifiers; then construct
props_expr and lang_clause and run self.cypher(cypher) as before (references:
create_text_index function, variables name, label, properties, prop_list,
props_expr, cypher).
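A sketch of that identifier guard (the regex and the DDL shape below are assumptions mirroring the commit description, not the library's exact code):

```python
import re

# Typical unquoted Cypher identifier: letter/underscore, then word chars.
_CYPHER_IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _validate_cypher_identifier(value, param_name: str) -> str:
    """Reject values that cannot be safely interpolated into DDL Cypher."""
    if not isinstance(value, str) or not _CYPHER_IDENT_RE.match(value):
        raise ValueError(f"{param_name} is not a valid Cypher identifier: {value!r}")
    return value


def build_create_text_index(name, label, properties):
    """Build the DDL string with every interpolated identifier validated."""
    _validate_cypher_identifier(name, "name")
    _validate_cypher_identifier(label, "label")
    props = ", ".join(
        f"n.{_validate_cypher_identifier(p, 'property')}" for p in properties
    )
    return f"CREATE TEXT INDEX {name} FOR (n:{label}) ON ({props})"
```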
In `@demo/README.md`:
- Around line 30-34: Add the missing metrics/health port entry to the ports
table in demo/README.md: insert a row for port 37084 and label the Service as
"Metrics / Health" (to match demo/docker-compose.yml which exposes 37084 for
/metrics and /health) so the README stays consistent with the compose file and
aids local ops/troubleshooting.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ae099c38-bd20-429b-815c-b5721a5ecd3c
📒 Files selected for processing (4)
coordinode/coordinode/__init__.py, coordinode/coordinode/client.py, demo/README.md, tests/integration/test_sdk.py
Updates embedded engine to include all fixes after v0.3.13: - percentileCont/percentileDisc functions - Documentation and API reference improvements Aligns embedded engine with CoordiNode server image v0.3.14.
…identifier validation - Extract per-property validation into _validate_property_dict static method, reducing _build_property_definitions cognitive complexity from 16 → ~10 (fixes SonarCloud threshold of 15) - Add _CYPHER_IDENT_RE regex and _validate_cypher_identifier() module helper; validate name, label, and each property in create_text_index(), and name in drop_text_index() before interpolating into DDL Cypher strings - Add missing port 37084 (metrics/health) to demo/README.md port table to match docker-compose.yml
Update @b23c833 install reference in all 4 Colab notebooks to @e3ff279, which includes the coordinode-rs v0.3.14 submodule; update README note from v0.3.13 to v0.3.14.
Previous pointer (5fc543b) was a local-only commit not pushed to structured-world/coordinode, causing CI submodule fetch to fail. Switch to 4c80fbe — the official release commit tagged v0.3.14 that exists on GitHub.
Actionable comments posted: 8
♻️ Duplicate comments (1)
coordinode/coordinode/client.py (1)
685-710: ⚠️ Potential issue | 🟡 Minor: The `text_search()` docs still contradict the schema-free behavior exercised by tests.
The `Args` section says schema-free text is auto-indexed on current servers, but the `Note` still says indexing is "not automatic" and requires a `CREATE TEXT INDEX` first. That will mislead callers into an unnecessary setup step.
✏️ Suggested wording

```diff
-    Note:
-        Text indexing is **not** automatic. Before calling this method,
-        create a full-text index with the Cypher DDL statement::
+    Note:
+        Some deployments require an explicit full-text index. On current
+        schema-free CoordiNode servers, text properties are indexed
+        automatically; create a text index only when your server requires
+        explicit index registration::
```

Based on learnings: In `tests/integration/test_sdk.py`, FTS indexing in CoordiNode is automatic for schema-free graphs, so `create_label()` is not required before `text_search()` or `hybrid_text_vector_search()`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@coordinode/coordinode/client.py` around lines 685 - 710, Update the docstring for text_search() to remove the misleading "Text indexing is not automatic" Note and instead state that on current CoordiNode servers schema-free graphs are auto-indexed (so create_label() or explicit CREATE TEXT INDEX is not required for schema-free nodes), while retaining guidance that users can create an explicit text index via create_text_index or the Cypher DDL (CREATE TEXT INDEX ...) when they need a named/indexed property for schema-based graphs or explicit control; also mirror this clarification for hybrid_text_vector_search() if present and remove any contradictory phrasing about requiring CREATE TEXT INDEX before using text_search().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@coordinode-rs`:
- Line 1: The submodule is pinned to a non-existent SHA
(5fc543b04f0a732ec394eae84b130f4542754378); update the submodule pointer to the
v0.3.14 release (commit 4c80fbef57eeee16efa471dc544704d592972b4a). Replace the
bad gitlink/commit reference for the coordinode submodule with the tag v0.3.14
(or the commit 4c80fbef57ee...) in the repository’s submodule entry and commit
the updated submodule reference so CI and fresh clones fetch the correct commit.
In `@coordinode/coordinode/client.py`:
- Around line 594-603: The code builds a Cypher DDL that interpolates language
verbatim and allows empty properties; update the index-creation logic to first
reject empty property lists (raise ValueError if properties is empty after
normalizing) and validate each property as already done with
_validate_cypher_identifier(prop, "property"); additionally validate the
language parameter before interpolation—either require language to be None or a
non-empty string that passes safe identifier validation (e.g., call
_validate_cypher_identifier(language, "language") or check against an explicit
whitelist) and only then append the language clause, so malformed or injectable
DDL cannot be produced when composing cypher (variables to touch:
properties/prop_list, language, _validate_cypher_identifier, name, label,
cypher).
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Line 42: The Colab-only block that downloads and builds coordinode-embedded
(guarded by IN_COLAB) should skip the Rust download/build when a remote
CoordiNode address is provided; add a guard that checks
os.environ.get("COORDINODE_ADDR") (or similar env var used elsewhere) and only
run the rustup/maturin/git+coordinode-embedded install when IN_COLAB is true AND
COORDINODE_ADDR is not set, leaving the subsequent pip installs and
nest_asyncio.apply() unchanged.
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Around line 280-283: The cleanup step currently uses a single DETACH DELETE
via graph.query(...) which fails on servers that require explicit relationship
removal; replace that single query with the two-step cleanup used elsewhere:
first run a graph.query that MATCHes and DELETEs relationships for nodes with
n.name ENDS WITH $tag, then run a second graph.query that MATCHes and DELETEs
the nodes themselves (using the same params={"tag": tag}); keep the surrounding
print("Cleaned up"), graph.close(), and client.close() calls intact so ownership
semantics remain unchanged.
- Line 39: When IN_COLAB is true, the cell currently always bootstraps Rust and
builds/install coordinode-embedded; change the guard to skip that entire
embedded-build block if an external server is specified by COORDINODE_ADDR
(check os.environ.get("COORDINODE_ADDR")). Concretely, update the condition
using IN_COLAB and the env var (e.g., if IN_COLAB and not
os.environ.get("COORDINODE_ADDR")) so the rustup/maturin/pip install of
coordinode-embedded and the cargo PATH modification are only executed when Colab
is used standalone; leave the subsequent pip installs (coordinode, langchain,
etc.) untouched.
In `@demo/notebooks/03_langgraph_agent.ipynb`:
- Line 127: SESSION is currently set to uuid.uuid4().hex[:8], which truncates
the UUID to 8 hex chars and reduces entropy; change this to use the full UUID
(e.g., uuid.uuid4().hex without slicing) so SESSION retains full uniqueness and
avoids cross-run collisions and accidental data deletion—update the assignment
where SESSION is defined (referencing SESSION and uuid.uuid4().hex[:8]) to use
the full uuid.uuid4().hex value.
- Around line 168-180: The function query_facts currently slices rows[:20] after
executing client.cypher, which can materialize huge results; before calling
client.cypher(q, ...), ensure the cypher string q includes a LIMIT 20 (or a
configurable safe max) if it doesn't already—e.g., detect an existing LIMIT via
a simple regex and only append " LIMIT 20" when absent—then execute
client.cypher with the modified q (preserving the existing read-only and
session-scoping checks using _WRITE_CLAUSE_RE, _SESSION_WHERE_SCOPE_RE, and
_SESSION_NODE_SCOPE_RE) so the database enforces the result bound instead of
materializing everything in query_facts.
In `@demo/README.md`:
- Around line 9-17: The Colab badge links currently point to blob/main so the
notebooks float with the branch while only the embedded install is actually
pinned; update the README note and/or the links in the table: either change each
Colab URL to reference a specific commit SHA (replace blob/main with
blob/<commit-sha>) so notebook content is truly pinned, or modify the Note text
to explicitly state that only the embedded install inside the notebooks
(coordinode-embedded) is commit-pinned and that the notebook links reference the
main branch; update the phrasing where it currently says “Notebooks are pinned
to a specific commit” to reflect which part is pinned.
---
Duplicate comments:
In `@coordinode/coordinode/client.py`:
- Around line 685-710: Update the docstring for text_search() to remove the
misleading "Text indexing is not automatic" Note and instead state that on
current CoordiNode servers schema-free graphs are auto-indexed (so
create_label() or explicit CREATE TEXT INDEX is not required for schema-free
nodes), while retaining guidance that users can create an explicit text index
via create_text_index or the Cypher DDL (CREATE TEXT INDEX ...) when they need a
named/indexed property for schema-based graphs or explicit control; also mirror
this clarification for hybrid_text_vector_search() if present and remove any
contradictory phrasing about requiring CREATE TEXT INDEX before using
text_search().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 884059ed-6081-42b6-89f3-322bcf8f4bae
📒 Files selected for processing (7)
coordinode-rs, coordinode/coordinode/client.py, demo/README.md, demo/notebooks/00_seed_data.ipynb, demo/notebooks/01_llama_index_property_graph.ipynb, demo/notebooks/02_langchain_graph_chain.ipynb, demo/notebooks/03_langgraph_agent.ipynb
♻️ Duplicate comments (5)
demo/notebooks/02_langchain_graph_chain.ipynb (2)
280-283: ⚠️ Potential issue | 🟠 Major: Use two-step cleanup instead of `DETACH DELETE` for server compatibility.
Line 280 reintroduces `DETACH DELETE`, while this PR already uses explicit relationship+node cleanup in `demo/notebooks/00_seed_data.ipynb` for broader compatibility.
Suggested fix

```diff
-graph.query("MATCH (n) WHERE n.name ENDS WITH $tag DETACH DELETE n", params={"tag": tag})
+graph.query(
+    "MATCH (n) WHERE n.name ENDS WITH $tag OPTIONAL MATCH (n)-[r]-() DELETE r",
+    params={"tag": tag},
+)
+graph.query("MATCH (n) WHERE n.name ENDS WITH $tag DELETE n", params={"tag": tag})
 print("Cleaned up")
 graph.close()
 client.close()  # injected client — owned by caller
```

Run this read-only check to confirm cleanup-query inconsistency across notebooks:

```bash
#!/bin/bash
python3 <<'PY'
import json
for path in [
    "demo/notebooks/00_seed_data.ipynb",
    "demo/notebooks/02_langchain_graph_chain.ipynb",
]:
    nb = json.load(open(path))
    print(f"\n== {path} ==")
    for i, cell in enumerate(nb["cells"]):
        if cell.get("cell_type") != "code":
            continue
        src = "".join(cell.get("source", []))
        if "DELETE" in src and ("DETACH DELETE" in src or "demo_tag" in src or "ENDS WITH $tag" in src):
            print(f"[cell {i}]")
            print(src.strip())
PY
```

Expected: `00_seed_data.ipynb` shows explicit two-step deletes, while `02_langchain_graph_chain.ipynb` still shows `DETACH DELETE`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/02_langchain_graph_chain.ipynb` around lines 280 - 283, Replace the one-step DETACH DELETE used in the graph.query call with the explicit two-step cleanup used elsewhere: first delete relationships that involve the target nodes, then delete the nodes themselves; update the call that currently uses graph.query("MATCH (n) WHERE n.name ENDS WITH $tag DETACH DELETE n", params={"tag": tag}) to run two queries via graph.query—one removing relationships for nodes matching the ENDS WITH $tag predicate (DELETE relationships), and a second deleting those nodes (DELETE nodes); keep the same params={"tag": tag} and ensure graph.close() and client.close() behavior remains unchanged.
39-39: ⚠️ Potential issue | 🟠 Major: Skip embedded Rust build when `COORDINODE_ADDR` is already set.
Line 39 still compiles `coordinode-embedded` for all Colab runs, even when the notebook will use an external server. This adds avoidable latency/failure risk.
Suggested fix

```diff
-# Install coordinode-embedded in Colab only (requires Rust build).
-if IN_COLAB:
+# Install coordinode-embedded in Colab only when embedded fallback is needed.
+if IN_COLAB and not os.environ.get("COORDINODE_ADDR"):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/02_langchain_graph_chain.ipynb` at line 39, The Colab-only Rust install and coordinode-embedded pip build should be skipped when an external server is configured via COORDINODE_ADDR; wrap the existing IN_COLAB installation block (the code that creates _ctx, downloads sh.rustup.rs, runs subprocess.run to install rustup, sets _cargo_bin into PATH, and pip-installs maturin and the git+...coordinode-embedded package) with an extra guard like: if IN_COLAB and not os.environ.get("COORDINODE_ADDR"): ...; ensure any variables and subprocess.run calls inside that block (e.g., _rustup_path handling, _cargo_bin PATH update, pip install of maturin and coordinode-embedded) are only executed when COORDINODE_ADDR is unset so Colab runs that point to an external coordinode server skip the embedded Rust build.
demo/notebooks/01_llama_index_property_graph.ipynb (1)
42-42: ⚠️ Potential issue | 🟠 Major: Avoid forcing embedded build in Colab when using a remote server.
Line 42 still runs the Rust/embedded install path whenever `IN_COLAB` is true. If `COORDINODE_ADDR` is provided, this is unnecessary and can fail before the gRPC path is used.
Suggested fix

```diff
-# Install coordinode-embedded in Colab only (requires Rust build).
-if IN_COLAB:
+# Install coordinode-embedded in Colab only when embedded fallback is needed.
+if IN_COLAB and not os.environ.get("COORDINODE_ADDR"):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/01_llama_index_property_graph.ipynb` at line 42, The Colab-specific Rust/embedded install block guarded by IN_COLAB should be skipped when a remote coordinode server is intended; check os.getenv("COORDINODE_ADDR") (or os.environ) before running the rustup/maturin/coordinode-embedded install and only run that block if IN_COLAB is true AND COORDINODE_ADDR is not set/empty. Update the conditional around the rustup/maturin/coordinode-embedded code (the block using IN_COLAB, _rustup_path, subprocess.run for rustup and pip install git+https://...coordinode-embedded) so it falls back to the normal pip installs when COORDINODE_ADDR is provided.
demo/notebooks/03_langgraph_agent.ipynb (2)
168-180: ⚠️ Potential issue | 🟠 Major: Enforce LIMIT before executing `query_facts()` to avoid large materialization.
Line 180 truncates after `client.cypher(...)` already fetched rows. Add/validate `LIMIT` in the query string first so the database enforces bounds.
🔧 Proposed fix

```diff
+_LIMIT_RE = re.compile(r"\bLIMIT\s+(\d+)\b", re.IGNORECASE)
+
 @tool
 def query_facts(cypher: str) -> str:
 @@
-    q = cypher.strip()
+    q = cypher.strip().rstrip(";")
 @@
     if not (_SESSION_WHERE_SCOPE_RE.search(q) or _SESSION_NODE_SCOPE_RE.search(q)):
         return "Query must scope reads to the current session with either WHERE <alias>.session = $sess or {session: $sess}"
-    rows = client.cypher(q, params={"sess": SESSION})
-    return str(rows[:20]) if rows else "No results"
+    m = _LIMIT_RE.search(q)
+    if not m:
+        q = f"{q} LIMIT 20"
+    elif int(m.group(1)) > 20:
+        return "Query must include LIMIT <= 20."
+    rows = client.cypher(q, params={"sess": SESSION})
+    return str(rows) if rows else "No results"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/03_langgraph_agent.ipynb` around lines 168 - 180, The query_facts function currently executes client.cypher(q, ...) after validation but before bounding results; ensure the DB enforces a LIMIT by inspecting q for an existing LIMIT and, if absent, appending a safe LIMIT (e.g., LIMIT 100) to the query string before calling client.cypher; keep the existing read-only check (_WRITE_CLAUSE_RE) and session-scoping checks (_SESSION_WHERE_SCOPE_RE or _SESSION_NODE_SCOPE_RE) intact, and call client.cypher with the modified q and params={"sess": SESSION} so large materializations are prevented.
127: ⚠️ Potential issue | 🟠 Major
Use the full UUID for SESSION to avoid cross-run collisions.
Line 127 truncates to 8 hex chars (32 bits), which makes collisions realistic across repeated runs and can mix up or clean up another run's data.
🔧 Proposed fix
-SESSION = uuid.uuid4().hex[:8]  # isolates this demo's data from other sessions
+SESSION = uuid.uuid4().hex  # isolates this demo's data from other sessions
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/03_langgraph_agent.ipynb` at line 127, Replace the truncated session identifier with a full UUID: change the SESSION assignment (currently "SESSION = uuid.uuid4().hex[:8]") to use the full UUID (e.g., uuid.uuid4().hex or str(uuid.uuid4())) so runs are isolated; also scan for any code that assumes an 8-char SESSION and update those assumptions (cleanup, naming, or lookup logic) to accept the full UUID value.
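The "collisions realistic" claim can be checked with the standard birthday-bound approximation, P(collision) ≈ 1 − exp(−n²/2N). The run count of 100,000 below is an illustrative assumption, not a figure from the PR.

```python
import math

def collision_probability(num_runs: int, hex_chars: int) -> float:
    """Birthday-bound approximation for n random ids drawn from 16**hex_chars values."""
    space = 16 ** hex_chars
    return 1.0 - math.exp(-num_runs * num_runs / (2.0 * space))

# 100k runs against an 8-char (32-bit) id vs. the full 32-char uuid4 hex.
print(f"8 hex chars : {collision_probability(100_000, 8):.3f}")
print(f"32 hex chars: {collision_probability(100_000, 32):.2e}")
```

At 100k runs the 8-char id collides with probability well above one half, while the full 32-char hex is effectively collision-free, which is why the review asks for the untruncated UUID.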
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Line 42: The Colab-specific Rust/embedded install block guarded by IN_COLAB
should be skipped when a remote coordinode server is intended; check
os.getenv("COORDINODE_ADDR") (or os.environ) before running the
rustup/maturin/coordinode-embedded install and only run that block if IN_COLAB
is true AND COORDINODE_ADDR is not set/empty. Update the conditional around the
rustup/maturin/coordinode-embedded code (the block using IN_COLAB, _rustup_path,
subprocess.run for rustup and pip install git+https://...coordinode-embedded) so
it falls back to the normal pip installs when COORDINODE_ADDR is provided.
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Around line 280-283: Replace the one-step DETACH DELETE used in the
graph.query call with the explicit two-step cleanup used elsewhere: first delete
relationships that involve the target nodes, then delete the nodes themselves;
update the call that currently uses graph.query("MATCH (n) WHERE n.name ENDS
WITH $tag DETACH DELETE n", params={"tag": tag}) to run two queries via
graph.query—one removing relationships for nodes matching the ENDS WITH $tag
predicate (DELETE relationships), and a second deleting those nodes (DELETE
nodes); keep the same params={"tag": tag} and ensure graph.close() and
client.close() behavior remains unchanged.
- Line 39: The Colab-only Rust install and coordinode-embedded pip build should
be skipped when an external server is configured via COORDINODE_ADDR; wrap the
existing IN_COLAB installation block (the code that creates _ctx, downloads
sh.rustup.rs, runs subprocess.run to install rustup, sets _cargo_bin into PATH,
and pip-installs maturin and the git+...coordinode-embedded package) with an
extra guard like: if IN_COLAB and not os.environ.get("COORDINODE_ADDR"): ...;
ensure any variables and subprocess.run calls inside that block (e.g.,
_rustup_path handling, _cargo_bin PATH update, pip install of maturin and
coordinode-embedded) are only executed when COORDINODE_ADDR is unset so Colab
runs that point to an external coordinode server skip the embedded Rust build.
In `@demo/notebooks/03_langgraph_agent.ipynb`:
- Around line 168-180: The query_facts function currently executes
client.cypher(q, ...) after validation but before bounding results; ensure the
DB enforces a LIMIT by inspecting q for an existing LIMIT and, if absent,
appending a safe LIMIT (e.g., LIMIT 100) to the query string before calling
client.cypher; keep the existing read-only check (_WRITE_CLAUSE_RE) and
session-scoping checks (_SESSION_WHERE_SCOPE_RE or _SESSION_NODE_SCOPE_RE)
intact, and call client.cypher with the modified q and params={"sess": SESSION}
so large materializations are prevented.
- Line 127: Replace the truncated session identifier with a full UUID: change
the SESSION assignment (currently "SESSION = uuid.uuid4().hex[:8]") to use the
full UUID (e.g., uuid.uuid4().hex or str(uuid.uuid4())) so runs are isolated;
also scan for any code that assumes an 8-char SESSION and update those
assumptions (cleanup, naming, or lookup logic) to accept the full UUID value.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8dc482ed-0817-4ff8-9dfe-58d924e99e51
📒 Files selected for processing (5)
- coordinode-rs
- demo/notebooks/00_seed_data.ipynb
- demo/notebooks/01_llama_index_property_graph.ipynb
- demo/notebooks/02_langchain_graph_chain.ipynb
- demo/notebooks/03_langgraph_agent.ipynb
…n create_text_index
- raise ValueError when properties list is empty
- validate language as Cypher identifier before DDL interpolation
- replace two-step edge+node DELETE with DETACH DELETE in integration test (undirected MATCH returns each edge twice, causing duplicate DELETE failure)

fix(notebooks): skip embedded install when COORDINODE_ADDR is set
- 01_llama, 02_langchain: guard Rust build with COORDINODE_ADDR check so Docker/server stacks skip the redundant ~5 min Colab build step
- 03_langgraph: use full uuid4().hex (32 chars) for session ID to avoid collision in concurrent demo sessions
- 03_langgraph: auto-append LIMIT 20 in query_facts when query has no LIMIT clause, bounding LLM-generated query result size
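The identifier validation mentioned in the commit message can be sketched as follows. The regex and helper names (`validate_identifier`, `create_text_index_args`) are assumptions for illustration; the SDK's actual validation rules may differ.

```python
import re

# Conservative identifier rule (assumed): leading letter or underscore,
# then word characters only - anything else is unsafe to interpolate into DDL.
_IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def validate_identifier(name: str, what: str = "identifier") -> str:
    """Reject names that cannot be safely interpolated into Cypher DDL."""
    if not _IDENT_RE.match(name):
        raise ValueError(f"invalid {what}: {name!r}")
    return name

def create_text_index_args(index, label, properties, language="english"):
    """Validate the inputs create_text_index would interpolate into DDL."""
    if not properties:
        raise ValueError("properties must be non-empty")
    return (
        validate_identifier(index, "index name"),
        validate_identifier(label, "label"),
        [validate_identifier(p, "property") for p in properties],
        validate_identifier(language, "language"),
    )
```

Validating every interpolated token up front turns a potential Cypher-injection vector into an early, descriptive ValueError.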
Actionable comments posted: 4
♻️ Duplicate comments (2)
coordinode/coordinode/client.py (1)
689-715: ⚠️ Potential issue | 🟡 Minor
Align the text_search() docstring with actual FTS behavior.
The Args section now says schema-free graphs are indexed automatically, but Returns and Note still say callers must create a text index first. That contradiction will send users down an unnecessary setup path.
Based on learnings: in `tests/integration/test_sdk.py`, FTS indexing in CoordiNode is automatic for schema-free graphs. Calling `create_label()` before `text_search()` or `hybrid_text_vector_search()` is NOT required.
Suggested wording
-    Returns:
-        List of :class:`TextResult` ordered by BM25 score descending.
-        Returns ``[]`` if no text index exists for *label*.
-
-    Note:
-        Text indexing is **not** automatic. Before calling this method,
-        create a full-text index with the Cypher DDL statement::
+    Returns:
+        List of :class:`TextResult` ordered by BM25 score descending.
+        Labels with no searchable text content simply return ``[]``.
+
+    Note:
+        On current CoordiNode servers, schema-free graphs index text
+        automatically. Use :meth:`create_text_index` when you want explicit
+        DDL-managed index lifecycle/control::
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@coordinode/coordinode/client.py` around lines 689 - 715: the docstring for text_search() is inconsistent about whether full-text indexing is automatic; update the Args/Returns/Note to reflect actual behavior: for schema-free graphs CoordiNode automatically indexes text properties so callers do not need to call create_label() or create_text_index() before text_search() or hybrid_text_vector_search(); keep a short note explaining that explicit CREATE TEXT INDEX or create_text_index() is only required for non-schema-free deployments or when custom index configuration is desired, and ensure references to create_label(), create_text_index(), text_search(), and hybrid_text_vector_search() are used so readers can find related APIs.

demo/notebooks/02_langchain_graph_chain.ipynb (1)
40: 🧹 Nitpick | 🔵 Trivial
Pin the LangChain versions behind this demo.
This notebook installs whatever the latest `langchain*` packages are, then imports `GraphCypherQAChain` from a specific module path. That surface moves often enough that a clean rerun can break without any repo change.
In the official LangChain Python docs, what is the current import path and constructor usage for `GraphCypherQAChain` (including whether `allow_dangerous_requests=True` is required), and which `langchain` package/version documents that API?
Also applies to: 265-265
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@demo/notebooks/02_langchain_graph_chain.ipynb` at line 40, The notebook installs floating langchain packages which can break imports like GraphCypherQAChain; update the pip install subprocess.run call that lists "langchain", "langchain-coordinode", "langchain-community", and "langchain-openai" to use explicit version pins (e.g. "langchain==X.Y.Z" etc.) and lock any transitive langchain-related packages used by the demo, and then verify and update the demo import/constructor for GraphCypherQAChain to match the official LangChain docs (adjust the import path and whether allow_dangerous_requests=True is required) before choosing the versions; specifically modify the subprocess.run install arguments and the demo cells that import/construct GraphCypherQAChain so they are consistent with the pinned langchain package versions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Line 160: The demo truncates the UUID when creating the run tag ("tag =
uuid.uuid4().hex[:6]"), which increases collision risk; change the assignment to
keep the full UUID (e.g., use uuid.uuid4().hex or str(uuid.uuid4())) so the
"tag" variable retains full uniqueness across runs and shared environments.
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Line 158: The tag currently uses a truncated value uuid.uuid4().hex[:6] which
reduces entropy and risks collisions; change the assignment of tag (where
uuid.uuid4().hex[:6] appears) to use the full UUID hex (e.g., uuid.uuid4().hex)
so run-scoped names remain unique across demo environments while leaving the
rest of the per-run namespacing logic unchanged.
- Around line 87-94: CoordinodeGraph's embedded adapter should not implement
get_labels() or get_edge_types() (or any schema RPC-like methods) because
returning [] falsely signals schema support; remove these two methods from the
embedded adapter (or change them so they are not discoverable via hasattr, e.g.,
raise AttributeError or NotImplementedError instead of returning an empty list)
so callers using hasattr/callable checks will fall back to
get_schema_text()/refresh_schema() as intended; ensure other optional methods
(get_schema_text, vector_search) remain untouched and continue to be checked via
hasattr where used.
In `@demo/notebooks/03_langgraph_agent.ipynb`:
- Line 124: The query_facts function currently only appends LIMIT 20 when no
LIMIT is present, leaving queries like "LIMIT 100000" to execute and rely on
rows[:20] after materialization; fix by capping any explicit LIMIT before
execution: update query_facts to replace any match of _LIMIT_RE
(r"\bLIMIT\s+\d+\b") with "LIMIT 20" (e.g., via re.sub) on q prior to calling
client.cypher, also ensure you strip trailing semicolons as already done; this
guarantees client.cypher never receives a larger LIMIT and avoids relying on
rows[:20] after large results are materialized.
---
Duplicate comments:
In `@coordinode/coordinode/client.py`:
- Around line 689-715: The docstring for text_search() is inconsistent about
whether full-text indexing is automatic; update the Args/Returns/Note to reflect
actual behavior: for schema-free graphs CoordiNode automatically indexes text
properties so callers do not need to call create_label() or create_text_index()
before text_search() or hybrid_text_vector_search(); keep a short note
explaining that explicit CREATE TEXT INDEX or create_text_index() is only
required for non-schema-free deployments or when custom index configuration is
desired, and ensure references to create_label(), create_text_index(),
text_search(), and hybrid_text_vector_search() are used so readers can find
related APIs.
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Line 40: The notebook installs floating langchain packages which can break
imports like GraphCypherQAChain; update the pip install subprocess.run call that
lists "langchain", "langchain-coordinode", "langchain-community", and
"langchain-openai" to use explicit version pins (e.g. "langchain==X.Y.Z" etc.)
and lock any transitive langchain-related packages used by the demo, and then
verify and update the demo import/constructor for GraphCypherQAChain to match
the official LangChain docs (adjust the import path and whether
allow_dangerous_requests=True is required) before choosing the versions;
specifically modify the subprocess.run install arguments and the demo cells that
import/construct GraphCypherQAChain so they are consistent with the pinned
langchain package versions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e49edf9c-9103-42f5-95dd-98e2f10335d1
📒 Files selected for processing (5)
- coordinode/coordinode/client.py
- demo/notebooks/01_llama_index_property_graph.ipynb
- demo/notebooks/02_langchain_graph_chain.ipynb
- demo/notebooks/03_langgraph_agent.ipynb
- tests/integration/test_sdk.py
…from embedded adapter
- 01_llama, 02_langchain: use full uuid4().hex for `tag` variable (was [:6] = 24 bits entropy, risked cross-run collisions on shared envs)
- 01_llama, 02_langchain: remove get_labels()/get_edge_types() from _EmbeddedAdapter — returning [] made adapter look schema-RPC-capable, silently short-circuiting the get_schema_text() path in CoordinodeGraph
- 03_langgraph: replace explicit LIMIT N > 20 in query_facts via re.sub instead of only appending when absent; server no longer materializes oversized result sets from LLM-generated queries

fix(client): remove contradictory "indexed automatically" claim from text_search() Args.label docstring; Note already says indexing requires explicit CREATE TEXT INDEX DDL
@coderabbitai Re: Re: LangChain version pinning
Docstring fix — that's the right resolution. The LangChain version pinning — understood; tracking current releases is the right call for Colab-oriented demo notebooks where pinning quickly becomes a maintenance liability. ✏️ Learnings added
🧠 Learnings used
Pull request overview
Copilot reviewed 19 out of 21 changed files in this pull request and generated 4 comments.
| "i", | ||
| "m", | ||
| "p", | ||
| "o", | ||
| "r", | ||
| "t", | ||
| " ", | ||
| "o", | ||
| "s", | ||
| ",", | ||
| " ", | ||
| "s", | ||
| "o", | ||
| "c", | ||
| "k", | ||
| "e", | ||
| "t", | ||
| "\n", | ||
| "\n", | ||
| "\n", |
This code cell’s source is serialized as one character per list element (e.g., "i", "m", "p", ...), which will run as invalid Python in Jupyter. Re-save/re-export the notebook so source is a single string or a list of full lines (each typically ending with \n).
| "i", | |
| "m", | |
| "p", | |
| "o", | |
| "r", | |
| "t", | |
| " ", | |
| "o", | |
| "s", | |
| ",", | |
| " ", | |
| "s", | |
| "o", | |
| "c", | |
| "k", | |
| "e", | |
| "t", | |
| "\n", | |
| "\n", | |
| "\n", | |
| "import os, socket\n", | |
| "\n", | |
| "\n", |
| "source": [ | ||
| "i", | ||
| "f", | ||
| " ", | ||
| "n", | ||
| "o", | ||
| "t", | ||
| " ", | ||
| "o", | ||
| "s", | ||
| ".", | ||
| "e", | ||
| "n", | ||
| "v", | ||
| "i", | ||
| "r", | ||
| "o", | ||
| "n", | ||
| ".", | ||
| "g", | ||
| "e", | ||
| "t", |
This code cell’s source is also serialized character-by-character, which will fail to execute as Python. Re-export the notebook so the cell source is stored as a single string or a list of full code lines.
| "source": [ | |
| "i", | |
| "f", | |
| " ", | |
| "n", | |
| "o", | |
| "t", | |
| " ", | |
| "o", | |
| "s", | |
| ".", | |
| "e", | |
| "n", | |
| "v", | |
| "i", | |
| "r", | |
| "o", | |
| "n", | |
| ".", | |
| "g", | |
| "e", | |
| "t", | |
| "source": "if not os.environ.get(\"OPENAI_API_KEY\"):\n print(\"Skipping GraphCypherQAChain example because OPENAI_API_KEY is not set.\")\nelse:\n from langchain.chains import GraphCypherQAChain\n from langchain_openai import ChatOpenAI\n\n chain = GraphCypherQAChain.from_llm(\n ChatOpenAI(temperature=0),\n graph=graph,\n verbose=False,\n allow_dangerous_requests=True,\n )\n\n result = chain.invoke({\"query\": f\"Which nodes are connected to {tag}?\"})\n print(result)\n" | |
| "t", |
| "i", | ||
| "m", | ||
| "p", | ||
| "o", | ||
| "r", | ||
| "t", | ||
| " ", | ||
| "o", | ||
| "s", | ||
| ",", | ||
| " ", | ||
| "s", | ||
| "o", | ||
| "c", | ||
| "k", | ||
| "e", | ||
| "t", | ||
| "\n", | ||
| "\n", | ||
| "\n", |
This code cell’s source is serialized one character per list element, which will execute as invalid Python in Jupyter. Please re-save/re-export the notebook so the source is a single string or a list of full lines (with \n).
| "i", | |
| "m", | |
| "p", | |
| "o", | |
| "r", | |
| "t", | |
| " ", | |
| "o", | |
| "s", | |
| ",", | |
| " ", | |
| "s", | |
| "o", | |
| "c", | |
| "k", | |
| "e", | |
| "t", | |
| "\n", | |
| "\n", | |
| "\n", | |
| "import os, socket\n", | |
| "\n", | |
| "\n", |
| "id": "a1b2c3d4-0000-0000-0000-000000000007", | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": "import uuid\n\nDEMO_TAG = os.environ.get(\"COORDINODE_DEMO_TAG\") or f\"seed_data_{uuid.uuid4().hex[:8]}\"\nprint(\"Using DEMO_TAG:\", DEMO_TAG)\n# Delete relationships first, then nodes — two-step ensures correct cleanup across\n# all CoordiNode versions (embedded and server).\nclient.cypher(\n \"MATCH (n {demo: true, demo_tag: $tag})-[r]-() DELETE r\",\n params={\"tag\": DEMO_TAG},\n)\nclient.cypher(\n \"MATCH (n {demo: true, demo_tag: $tag}) DELETE n\",\n params={\"tag\": DEMO_TAG},\n)\nprint(\"Previous demo data removed\")" |
The cleanup query uses an undirected pattern (-[r]-()) to delete relationships before deleting nodes. In CoordiNode this can match the same relationship from both endpoints and cause duplicate-delete/cleanup failures (this PR already switched other cleanups to DETACH DELETE for that reason). Consider replacing both statements with a single MATCH (n {demo: true, demo_tag: $tag}) DETACH DELETE n, or at least WITH DISTINCT r DELETE r to de-duplicate relationships.
| "source": "import uuid\n\nDEMO_TAG = os.environ.get(\"COORDINODE_DEMO_TAG\") or f\"seed_data_{uuid.uuid4().hex[:8]}\"\nprint(\"Using DEMO_TAG:\", DEMO_TAG)\n# Delete relationships first, then nodes — two-step ensures correct cleanup across\n# all CoordiNode versions (embedded and server).\nclient.cypher(\n \"MATCH (n {demo: true, demo_tag: $tag})-[r]-() DELETE r\",\n params={\"tag\": DEMO_TAG},\n)\nclient.cypher(\n \"MATCH (n {demo: true, demo_tag: $tag}) DELETE n\",\n params={\"tag\": DEMO_TAG},\n)\nprint(\"Previous demo data removed\")" | |
| "source": "import uuid\n\nDEMO_TAG = os.environ.get(\"COORDINODE_DEMO_TAG\") or f\"seed_data_{uuid.uuid4().hex[:8]}\"\nprint(\"Using DEMO_TAG:\", DEMO_TAG)\n# Remove prior demo nodes and any attached relationships in one step to avoid\n# duplicate relationship matches during cleanup.\nclient.cypher(\n \"MATCH (n {demo: true, demo_tag: $tag}) DETACH DELETE n\",\n params={\"tag\": DEMO_TAG},\n)\nprint(\"Previous demo data removed\")" |
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@demo/notebooks/01_llama_index_property_graph.ipynb`:
- Around line 272-585: The startup eagerly converts COORDINODE_PORT into
GRPC_PORT using int(...) before choosing the connection path, so a malformed
port will crash even when COORDINODE_ADDR is present; change the logic in the
module that defines GRPC_PORT (the int(os.environ.get("COORDINODE_PORT",
"7080")) expression) to delay or guard parsing: read the raw env string (e.g.,
coord_port_str = os.environ.get("COORDINODE_PORT", "7080")) and only call
int(...) when needed for the branch that probes the port (_port_open or
similar), or wrap the int(...) in a try/except and fall back to the default 7080
so a bad COORDINODE_PORT does not abort startup when COORDINODE_ADDR is used.
Ensure references to GRPC_PORT/_port_open and COORDINODE_ADDR remain consistent
after the change.
In `@demo/notebooks/02_langchain_graph_chain.ipynb`:
- Around line 269-582: The code eagerly parses COORDINODE_PORT into GRPC_PORT
with int(...) before checking COORDINODE_ADDR, causing startup to fail on
malformed COORDINODE_PORT; change GRPC_PORT to keep the raw string (e.g.,
GRPC_PORT = os.environ.get("COORDINODE_PORT", "7080")) or set it to None and
only convert to int when actually needed (wrap conversion in try/except at the
point _port_open or the branch that uses it), and ensure the branch that uses
COORDINODE_ADDR (and the CoordinodeClient(...) construction) does not trigger
int(...) on a bad value.
In `@demo/notebooks/03_langgraph_agent.ipynb`:
- Line 124: query_facts currently unconditionally replaces any explicit "LIMIT
<n>" with "LIMIT 20", breaking caller intent (e.g. LIMIT 1 -> 20); update the
logic in query_facts that uses _LIMIT_RE so it only increases a numeric limit
when the captured n > 20 and otherwise preserves the original LIMIT; implement
by matching the LIMIT number via _LIMIT_RE (capture the digits), parse to int,
and if >20 substitute "LIMIT 20" (or leave the original match untouched), and if
no LIMIT exists append " LIMIT 20" as before; reference symbols: function
query_facts and regex _LIMIT_RE.
In `@tests/integration/test_sdk.py`:
- Around line 613-637: Wrap the index creation and test actions in a try/finally
so cleanup always runs even if create_text_index or later steps raise;
specifically, move the call to client.create_text_index(idx_name, label, "body")
inside the try, and in the finally first attempt to drop_text_index(idx_name)
inside its own try/except (so a failing drop doesn't prevent node deletion),
then always run client.cypher(... DELETE ...) to remove the seeded node; apply
the same pattern to the fuzzy and hybrid tests that call create_text_index /
drop_text_index.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 489df211-a20a-46f0-b061-6e6becd47698
📒 Files selected for processing (5)
- coordinode/coordinode/client.py
- demo/notebooks/01_llama_index_property_graph.ipynb
- demo/notebooks/02_langchain_graph_chain.ipynb
- demo/notebooks/03_langgraph_agent.ipynb
- tests/integration/test_sdk.py
| "G", | ||
| "R", | ||
| "P", | ||
| "C", | ||
| "_", | ||
| "P", | ||
| "O", | ||
| "R", | ||
| "T", | ||
| " ", | ||
| "=", | ||
| " ", | ||
| "i", | ||
| "n", | ||
| "t", | ||
| "(", | ||
| "o", | ||
| "s", | ||
| ".", | ||
| "e", | ||
| "n", | ||
| "v", | ||
| "i", | ||
| "r", | ||
| "o", | ||
| "n", | ||
| ".", | ||
| "g", | ||
| "e", | ||
| "t", | ||
| "(", | ||
| "\"", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "P", | ||
| "O", | ||
| "R", | ||
| "T", | ||
| "\"", | ||
| ",", | ||
| " ", | ||
| "\"", | ||
| "7", | ||
| "0", | ||
| "8", | ||
| "0", | ||
| "\"", | ||
| ")", | ||
| ")", | ||
| "\n", | ||
| "\n", | ||
| "i", | ||
| "f", | ||
| " ", | ||
| "o", | ||
| "s", | ||
| ".", | ||
| "e", | ||
| "n", | ||
| "v", | ||
| "i", | ||
| "r", | ||
| "o", | ||
| "n", | ||
| ".", | ||
| "g", | ||
| "e", | ||
| "t", | ||
| "(", | ||
| "\"", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "A", | ||
| "D", | ||
| "D", | ||
| "R", | ||
| "\"", | ||
| ")", | ||
| ":", | ||
| "\n", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "A", | ||
| "D", | ||
| "D", | ||
| "R", | ||
| " ", | ||
| "=", | ||
| " ", | ||
| "o", | ||
| "s", | ||
| ".", | ||
| "e", | ||
| "n", | ||
| "v", | ||
| "i", | ||
| "r", | ||
| "o", | ||
| "n", | ||
| "[", | ||
| "\"", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "A", | ||
| "D", | ||
| "D", | ||
| "R", | ||
| "\"", | ||
| "]", | ||
| "\n", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| "f", | ||
| "r", | ||
| "o", | ||
| "m", | ||
| " ", | ||
| "c", | ||
| "o", | ||
| "o", | ||
| "r", | ||
| "d", | ||
| "i", | ||
| "n", | ||
| "o", | ||
| "d", | ||
| "e", | ||
| " ", | ||
| "i", | ||
| "m", | ||
| "p", | ||
| "o", | ||
| "r", | ||
| "t", | ||
| " ", | ||
| "C", | ||
| "o", | ||
| "o", | ||
| "r", | ||
| "d", | ||
| "i", | ||
| "n", | ||
| "o", | ||
| "d", | ||
| "e", | ||
| "C", | ||
| "l", | ||
| "i", | ||
| "e", | ||
| "n", | ||
| "t", | ||
| "\n", | ||
| "\n", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| "c", | ||
| "l", | ||
| "i", | ||
| "e", | ||
| "n", | ||
| "t", | ||
| " ", | ||
| "=", | ||
| " ", | ||
| "C", | ||
| "o", | ||
| "o", | ||
| "r", | ||
| "d", | ||
| "i", | ||
| "n", | ||
| "o", | ||
| "d", | ||
| "e", | ||
| "C", | ||
| "l", | ||
| "i", | ||
| "e", | ||
| "n", | ||
| "t", | ||
| "(", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "A", | ||
| "D", | ||
| "D", | ||
| "R", | ||
| ")", | ||
| "\n", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| " ", | ||
| "p", | ||
| "r", | ||
| "i", | ||
| "n", | ||
| "t", | ||
| "(", | ||
| "f", | ||
| "\"", | ||
| "C", | ||
| "o", | ||
| "n", | ||
| "n", | ||
| "e", | ||
| "c", | ||
| "t", | ||
| "e", | ||
| "d", | ||
| " ", | ||
| "t", | ||
| "o", | ||
| " ", | ||
| "{", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "A", | ||
| "D", | ||
| "D", | ||
| "R", | ||
| "}", | ||
| "\"", | ||
| ")", | ||
| "\n", | ||
| "e", | ||
| "l", | ||
| "i", | ||
| "f", | ||
| " ", | ||
| "_", | ||
| "p", | ||
| "o", | ||
| "r", | ||
| "t", | ||
| "_", | ||
| "o", | ||
| "p", | ||
| "e", | ||
| "n", | ||
| "(", | ||
| "G", | ||
| "R", | ||
| "P", | ||
| "C", | ||
| "_", | ||
| "P", | ||
| "O", | ||
| "R", | ||
| "T", | ||
| ")", | ||
| ":", |
Avoid eager COORDINODE_PORT parsing before connection-path selection.
int(os.environ.get("COORDINODE_PORT", "7080")) is evaluated before the COORDINODE_ADDR branch. A malformed port value will crash startup even when Line 332 has a valid COORDINODE_ADDR.
Suggested fix
-GRPC_PORT = int(os.environ.get("COORDINODE_PORT", "7080"))
-
-if os.environ.get("COORDINODE_ADDR"):
+if os.environ.get("COORDINODE_ADDR"):
COORDINODE_ADDR = os.environ["COORDINODE_ADDR"]
from coordinode import CoordinodeClient
client = CoordinodeClient(COORDINODE_ADDR)
print(f"Connected to {COORDINODE_ADDR}")
-elif _port_open(GRPC_PORT):
- COORDINODE_ADDR = f"localhost:{GRPC_PORT}"
+else:
+ try:
+ grpc_port = int(os.environ.get("COORDINODE_PORT", "7080"))
+ except ValueError as exc:
+ raise RuntimeError("COORDINODE_PORT must be an integer") from exc
+
+ if _port_open(grpc_port):
+ COORDINODE_ADDR = f"localhost:{grpc_port}"
+ from coordinode import CoordinodeClient
+ client = CoordinodeClient(COORDINODE_ADDR)
+ print(f"Connected to {COORDINODE_ADDR}")
+ else:
+ # embedded fallback...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@demo/notebooks/01_llama_index_property_graph.ipynb` around lines 272 - 585,
The startup eagerly converts COORDINODE_PORT into GRPC_PORT using int(...)
before choosing the connection path, so a malformed port will crash even when
COORDINODE_ADDR is present; change the logic in the module that defines
GRPC_PORT (the int(os.environ.get("COORDINODE_PORT", "7080")) expression) to
delay or guard parsing: read the raw env string (e.g., coord_port_str =
os.environ.get("COORDINODE_PORT", "7080")) and only call int(...) when needed
for the branch that probes the port (_port_open or similar), or wrap the
int(...) in a try/except and fall back to the default 7080 so a bad
COORDINODE_PORT does not abort startup when COORDINODE_ADDR is used. Ensure
references to GRPC_PORT/_port_open and COORDINODE_ADDR remain consistent after
the change.
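The guarded, deferred port parsing this prompt describes can be sketched as a small helper. The function name `resolve_grpc_port` is an assumption; only the `COORDINODE_PORT` env var and the 7080 default come from the review.

```python
import os

def resolve_grpc_port(default: int = 7080) -> int:
    """Parse COORDINODE_PORT only when the port is actually needed,
    falling back to the default on malformed values instead of crashing."""
    raw = os.environ.get("COORDINODE_PORT", str(default))
    try:
        return int(raw)
    except ValueError:
        return default

# A malformed value no longer aborts startup when COORDINODE_ADDR is in use:
os.environ["COORDINODE_PORT"] = "not-a-port"
print(resolve_grpc_port())
os.environ["COORDINODE_PORT"] = "9090"
print(resolve_grpc_port())
```

Calling this inside the `elif _port_open(...)` branch (rather than at module top level) means the COORDINODE_ADDR path never touches the port value at all.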
| "G", | ||
| "R", | ||
| "P", | ||
| "C", | ||
| "_", | ||
| "P", | ||
| "O", | ||
| "R", | ||
| "T", | ||
| " ", | ||
| "=", | ||
| " ", | ||
| "i", | ||
| "n", | ||
| "t", | ||
| "(", | ||
| "o", | ||
| "s", | ||
| ".", | ||
| "e", | ||
| "n", | ||
| "v", | ||
| "i", | ||
| "r", | ||
| "o", | ||
| "n", | ||
| ".", | ||
| "g", | ||
| "e", | ||
| "t", | ||
| "(", | ||
| "\"", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "P", | ||
| "O", | ||
| "R", | ||
| "T", | ||
| "\"", | ||
| ",", | ||
| " ", | ||
| "\"", | ||
| "7", | ||
| "0", | ||
| "8", | ||
| "0", | ||
| "\"", | ||
| ")", | ||
| ")", | ||
| "\n", | ||
| "\n", | ||
| "i", | ||
| "f", | ||
| " ", | ||
| "o", | ||
| "s", | ||
| ".", | ||
| "e", | ||
| "n", | ||
| "v", | ||
| "i", | ||
| "r", | ||
| "o", | ||
| "n", | ||
| ".", | ||
| "g", | ||
| "e", | ||
| "t", | ||
| "(", | ||
| "\"", | ||
| "C", | ||
| "O", | ||
| "O", | ||
| "R", | ||
| "D", | ||
| "I", | ||
| "N", | ||
| "O", | ||
| "D", | ||
| "E", | ||
| "_", | ||
| "A", | ||
| "D", | ||
| "D", | ||
| "R", | ||
| "\"", | ||
| ")", | ||
| ":", | ||
| "\n", | ||
|     COORDINODE_ADDR = os.environ["COORDINODE_ADDR"] | ||
|     from coordinode import CoordinodeClient | ||
|  | ||
|     client = CoordinodeClient(COORDINODE_ADDR) | ||
|     print(f"Connected to {COORDINODE_ADDR}") | ||
| elif _port_open(GRPC_PORT): |
Avoid eagerly parsing COORDINODE_PORT before branching on COORDINODE_ADDR.
Line 269 is evaluated before Line 329, so a malformed COORDINODE_PORT makes notebook startup fail even when COORDINODE_ADDR is valid and should be used directly.
Suggested fix
-GRPC_PORT = int(os.environ.get("COORDINODE_PORT", "7080"))
-
-if os.environ.get("COORDINODE_ADDR"):
+if os.environ.get("COORDINODE_ADDR"):
COORDINODE_ADDR = os.environ["COORDINODE_ADDR"]
from coordinode import CoordinodeClient
client = CoordinodeClient(COORDINODE_ADDR)
print(f"Connected to {COORDINODE_ADDR}")
-elif _port_open(GRPC_PORT):
- COORDINODE_ADDR = f"localhost:{GRPC_PORT}"
+else:
+ try:
+ grpc_port = int(os.environ.get("COORDINODE_PORT", "7080"))
+ except ValueError as exc:
+ raise RuntimeError("COORDINODE_PORT must be an integer") from exc
+
+ if _port_open(grpc_port):
+ COORDINODE_ADDR = f"localhost:{grpc_port}"
+ from coordinode import CoordinodeClient
+ client = CoordinodeClient(COORDINODE_ADDR)
+ print(f"Connected to {COORDINODE_ADDR}")
+ else:
+    # embedded fallback...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@demo/notebooks/02_langchain_graph_chain.ipynb` around lines 269 - 582, The
code eagerly parses COORDINODE_PORT into GRPC_PORT with int(...) before checking
COORDINODE_ADDR, causing startup to fail on malformed COORDINODE_PORT; change
GRPC_PORT to keep the raw string (e.g., GRPC_PORT =
os.environ.get("COORDINODE_PORT", "7080")) or set it to None and only convert to
int when actually needed (wrap conversion in try/except at the point _port_open
or the branch that uses it), and ensure the branch that uses COORDINODE_ADDR
(and the CoordinodeClient(...) construction) does not trigger int(...) on a bad
value.
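The suggested ordering can be sketched as a standalone helper. This is illustrative only: the `resolve_addr` name and the env-dict parameter are ours, not part of the notebook, and the embedded fallback branch is omitted.

```python
import socket

def _port_open(port: int, host: str = "localhost") -> bool:
    # Demo helper: True if something is listening on host:port.
    with socket.socket() as s:
        s.settimeout(0.2)
        return s.connect_ex((host, port)) == 0

def resolve_addr(env: dict) -> str:
    """Honor COORDINODE_ADDR first; parse COORDINODE_PORT only when needed."""
    addr = env.get("COORDINODE_ADDR")
    if addr:
        # A malformed COORDINODE_PORT is never even looked at on this path.
        return addr
    try:
        port = int(env.get("COORDINODE_PORT", "7080"))
    except ValueError as exc:
        raise RuntimeError("COORDINODE_PORT must be an integer") from exc
    if _port_open(port):
        return f"localhost:{port}"
    raise RuntimeError("no Coordinode endpoint found")

# A bad port no longer breaks the explicit-address path:
print(resolve_addr({"COORDINODE_ADDR": "db:7080", "COORDINODE_PORT": "oops"}))  # → db:7080
```

The key property is that `int(...)` moves inside the fallback branch, so the failure mode only exists on the path that actually uses the port.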
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "import os, re, uuid\nfrom langchain_core.tools import tool\n\nSESSION = uuid.uuid4().hex  # isolates this demo's data from other sessions\n\n_REL_TYPE_RE = re.compile(r\"[A-Z_][A-Z0-9_]*\")\n# Regex guards for query_facts (demo safety guard).\n_WRITE_CLAUSE_RE = re.compile(\n    r\"\\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|CALL|LOAD)\\b\",\n    re.IGNORECASE | re.DOTALL,\n)\n# NOTE: this guard checks that AT LEAST ONE node pattern carries session scope.\n# A Cartesian-product query such as `MATCH (n), (m {session: $sess}) RETURN n`\n# would pass yet return unscoped rows for `n`. A complete per-alias check would\n# require parsing the Cypher AST, which is out of scope for a demo safety guard.\n# In production code, use server-side row-level security instead of client regex.\n_SESSION_WHERE_SCOPE_RE = re.compile(\n    r\"WHERE\\b[^;{}]*\\.session\\s*=\\s*\\$sess\",\n    re.IGNORECASE | re.DOTALL,\n)\n_SESSION_NODE_SCOPE_RE = re.compile(\n    r\"\\([^)]*\\{[^}]*session\\s*:\\s*\\$sess[^}]*\\}[^)]*\\)\",\n    re.IGNORECASE | re.DOTALL,\n)\n\n\n@tool\ndef save_fact(subject: str, relation: str, obj: str) -> str:\n    \"\"\"Save a fact (subject → relation → object) into the knowledge graph.\n    Example: save_fact('Alice', 'WORKS_AT', 'Acme Corp')\"\"\"\n    rel_type = relation.upper().replace(\" \", \"_\")\n    # Validate rel_type before interpolating into Cypher to prevent injection.\n    if not _REL_TYPE_RE.fullmatch(rel_type):\n        return f\"Invalid relation type {relation!r}: only letters, digits, and underscores allowed\"\n    client.cypher(\n        f\"MERGE (a:Entity {{name: $s, session: $sess}}) \"\n        f\"MERGE (b:Entity {{name: $o, session: $sess}}) \"\n        f\"MERGE (a)-[r:{rel_type}]->(b)\",\n        params={\"s\": subject, \"o\": obj, \"sess\": SESSION},\n    )\n    return f\"Saved: {subject} -[{rel_type}]-> {obj}\"\n\n\n@tool\ndef query_facts(cypher: str) -> str:\n    \"\"\"Run a read-only Cypher MATCH query against the knowledge graph.\n    Must scope reads via either WHERE <alias>.session = $sess\n    or a node pattern {session: $sess}.\"\"\"\n    q = cypher.strip()\n    if _WRITE_CLAUSE_RE.search(q):\n        return \"Only read-only Cypher is allowed in query_facts.\"\n    # Require $sess in a WHERE clause or node pattern, not just anywhere.\n    # Accepts both: WHERE n.session = $sess and MATCH (n {session: $sess})\n    if not (_SESSION_WHERE_SCOPE_RE.search(q) or _SESSION_NODE_SCOPE_RE.search(q)):\n        return \"Query must scope reads to the current session with either WHERE <alias>.session = $sess or {session: $sess}\"\n    # Enforce a result cap so agents cannot dump the entire graph.\n    # Cap or append LIMIT 20 before execution so the server never materializes\n    # more than 20 rows, even when the LLM writes an explicit large LIMIT.\n    _LIMIT_RE = re.compile(r\"\\bLIMIT\\s+\\d+\\b\", re.IGNORECASE)\n    if _LIMIT_RE.search(q):\n        q = _LIMIT_RE.sub(\"LIMIT 20\", q)\n    else:\n        q = q.rstrip().rstrip(\";\") + \" LIMIT 20\"\n    rows = client.cypher(q, params={\"sess\": SESSION})\n    return str(rows) if rows else \"No results\"\n\n\n@tool\ndef find_related(entity_name: str, depth: int = 1) -> str:\n    \"\"\"Find all entities reachable from entity_name within the given number of hops (max 3).\"\"\"\n    safe_depth = max(1, min(int(depth), 3))\n    rows = client.cypher(\n        f\"MATCH p=(n:Entity {{name: $name, session: $sess}})-[*1..{safe_depth}]->(m:Entity {{session: $sess}}) \"\n        \"WHERE ALL(x IN nodes(p) WHERE x.session = $sess) \"\n        \"RETURN m.name AS related, type(last(relationships(p))) AS via LIMIT 20\",\n        params={\"name\": entity_name, \"sess\": SESSION},\n    )\n    if not rows:\n        return f\"No related entities found for {entity_name}\"\n    return \"\\n\".join(f\"{r['via']} -> {r['related']}\" for r in rows)\n\n\n@tool\ndef list_all_facts() -> str:\n    \"\"\"List every fact stored in the current session's knowledge graph.\"\"\"\n    rows = client.cypher(\n        \"MATCH (a:Entity {session: $sess})-[r]->(b:Entity {session: $sess}) \"\n        \"RETURN a.name AS subject, type(r) AS relation, b.name AS object\",\n        params={\"sess\": SESSION},\n    )\n    if not rows:\n        return \"No facts stored yet\"\n    return \"\\n\".join(f\"{r['subject']} -[{r['relation']}]-> {r['object']}\" for r in rows)\n\n\ntools = [save_fact, query_facts, find_related, list_all_facts]\nprint(f\"Session: {SESSION}\")\nprint(\"Tools:\", [t.name for t in tools])" |
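The relation-type guard in the cell above is worth isolating: relationship types cannot be parameterized in Cypher, so they are interpolated into the query and must be validated first. A minimal standalone sketch (the `normalize_rel_type` helper name is ours, not the notebook's):

```python
import re

# Same identifier pattern as the notebook's _REL_TYPE_RE guard:
# uppercase letters, digits, and underscores, not starting with a digit.
_REL_TYPE_RE = re.compile(r"[A-Z_][A-Z0-9_]*")

def normalize_rel_type(relation: str) -> str:
    """Uppercase, underscore-join, and validate a relation name
    before it may be interpolated into a Cypher pattern."""
    rel_type = relation.upper().replace(" ", "_")
    if not _REL_TYPE_RE.fullmatch(rel_type):
        raise ValueError(f"invalid relation type {relation!r}")
    return rel_type

print(normalize_rel_type("works at"))  # → WORKS_AT
# normalize_rel_type("x]->(m) DELETE m")  # would raise ValueError
```

`fullmatch` (rather than `search`) is the load-bearing choice: it rejects any input containing characters outside the allow-list, so a crafted relation name cannot smuggle `]->(m) DELETE m` into the query text.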
query_facts() currently expands smaller explicit LIMITs to 20.
At Line 124, replacing all LIMIT <n> with LIMIT 20 changes caller intent (e.g., LIMIT 1 becomes 20). Cap only when n > 20, and preserve smaller limits.
🔧 Proposed fix
- _LIMIT_RE = re.compile(r"\bLIMIT\s+\d+\b", re.IGNORECASE)
- if _LIMIT_RE.search(q):
- q = _LIMIT_RE.sub("LIMIT 20", q)
+ _LIMIT_RE = re.compile(r"\bLIMIT\s+(\d+)\b", re.IGNORECASE)
+ def _cap_limit(m):
+ return f"LIMIT {min(int(m.group(1)), 20)}"
+ if _LIMIT_RE.search(q):
+ q = _LIMIT_RE.sub(_cap_limit, q)
else:
        q = q.rstrip().rstrip(";") + " LIMIT 20"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@demo/notebooks/03_langgraph_agent.ipynb` at line 124, query_facts currently
unconditionally replaces any explicit "LIMIT <n>" with "LIMIT 20", breaking
caller intent (e.g. LIMIT 1 -> 20); update the logic in query_facts that uses
_LIMIT_RE so it only increases a numeric limit when the captured n > 20 and
otherwise preserves the original LIMIT; implement by matching the LIMIT number
via _LIMIT_RE (capture the digits), parse to int, and if >20 substitute "LIMIT
20" (or leave the original match untouched), and if no LIMIT exists append "
LIMIT 20" as before; reference symbols: function query_facts and regex
_LIMIT_RE.
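A minimal sketch of the capped substitution (the `cap_limit` helper is hypothetical; the notebook inlines this logic in query_facts):

```python
import re

_LIMIT_RE = re.compile(r"\bLIMIT\s+(\d+)\b", re.IGNORECASE)
MAX_ROWS = 20

def cap_limit(query: str, max_rows: int = MAX_ROWS) -> str:
    """Return `query` with any explicit LIMIT capped at `max_rows`,
    preserving caller-supplied limits that are already within the cap."""
    if _LIMIT_RE.search(query):
        # re.sub with a callable lets us compare each matched number.
        return _LIMIT_RE.sub(
            lambda m: f"LIMIT {min(int(m.group(1)), max_rows)}", query
        )
    # No explicit LIMIT: append one so the server never over-materializes.
    return query.rstrip().rstrip(";") + f" LIMIT {max_rows}"

print(cap_limit("MATCH (n) RETURN n LIMIT 1"))    # → MATCH (n) RETURN n LIMIT 1
print(cap_limit("MATCH (n) RETURN n LIMIT 500"))  # → MATCH (n) RETURN n LIMIT 20
print(cap_limit("MATCH (n) RETURN n"))            # → MATCH (n) RETURN n LIMIT 20
```

With this shape, `LIMIT 1` stays `LIMIT 1`, only limits above the cap are reduced.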
| rows = client.cypher( | ||
| f"CREATE (n:{label} {{tag: $tag, body: 'machine learning and neural networks'}}) RETURN n AS node_id", | ||
| params={"tag": tag}, | ||
| ) | ||
| seed_id = rows[0]["node_id"] | ||
| # Text index must be created explicitly; nodes written before index creation | ||
| # are indexed immediately at DDL time. | ||
| idx_info = client.create_text_index(idx_name, label, "body") | ||
| assert isinstance(idx_info, TextIndexInfo) | ||
| try: | ||
| results = client.text_search(label, "machine learning", limit=5) | ||
| assert isinstance(results, list) | ||
| assert results, "text_search returned no results after index creation" | ||
| assert any(r.node_id == seed_id for r in results), ( | ||
| f"seeded node {seed_id} not found in text_search results: {results}" | ||
| ) | ||
| r = results[0] | ||
| assert isinstance(r, TextResult) | ||
| assert isinstance(r.node_id, int) | ||
| assert isinstance(r.score, float) | ||
| assert r.score > 0 | ||
| assert isinstance(r.snippet, str) | ||
| finally: | ||
| client.drop_text_index(idx_name) | ||
| client.cypher(f"MATCH (n:{label} {{tag: $tag}}) DELETE n", params={"tag": tag}) |
Make the FTS cleanup unconditional.
create_text_index() runs before the try, so any exception there skips cleanup entirely. drop_text_index() also runs before node deletion, so a failed drop leaves the seeded node behind. On shared/reused test servers, this leaks data across runs.
🧹 Safer cleanup pattern
- client.create_text_index(idx_name, label, "body")
- try:
+ idx_created = False
+ try:
+ client.create_text_index(idx_name, label, "body")
+ idx_created = True
results = client.text_search(label, "machine learning", limit=5)
...
finally:
- client.drop_text_index(idx_name)
- client.cypher(f"MATCH (n:{label} {{tag: $tag}}) DELETE n", params={"tag": tag})
+ try:
+ if idx_created:
+ client.drop_text_index(idx_name)
+ finally:
+        client.cypher(f"MATCH (n:{label} {{tag: $tag}}) DELETE n", params={"tag": tag})
Apply the same pattern to the fuzzy and hybrid tests.
Also applies to: 673-685, 696-722
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integration/test_sdk.py` around lines 613 - 637, Wrap the index
creation and test actions in a try/finally so cleanup always runs even if
create_text_index or later steps raise; specifically, move the call to
client.create_text_index(idx_name, label, "body") inside the try, and in the
finally first attempt to drop_text_index(idx_name) inside its own try/except (so
a failing drop doesn't prevent node deletion), then always run client.cypher(...
DELETE ...) to remove the seeded node; apply the same pattern to the fuzzy and
hybrid tests that call create_text_index / drop_text_index.
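The nested-finally shape can be exercised without a live server; a sketch with a stand-in client (`FakeClient` and `run_fts_check` are ours, purely for illustration):

```python
class FakeClient:
    # Hypothetical stand-in for the SDK client; records the call order.
    def __init__(self):
        self.calls = []
    def create_text_index(self, name, label, field):
        self.calls.append("create")
    def drop_text_index(self, name):
        self.calls.append("drop")
        raise RuntimeError("drop failed")  # simulate a flaky drop
    def cypher(self, q, params=None):
        self.calls.append("cypher")

def run_fts_check(client, idx_name, label, tag):
    idx_created = False
    try:
        client.create_text_index(idx_name, label, "body")
        idx_created = True
        # ... search assertions would go here ...
    finally:
        try:
            if idx_created:  # only drop what was actually created
                client.drop_text_index(idx_name)
        finally:
            # Node deletion runs even when the drop raises.
            client.cypher(f"MATCH (n:{label} {{tag: $tag}}) DELETE n",
                          params={"tag": tag})

client = FakeClient()
try:
    run_fts_check(client, "idx_demo", "Doc", "t1")
except RuntimeError:
    pass  # the simulated drop failure propagates only after cleanup
print(client.calls)  # → ['create', 'drop', 'cypher']
```

Note that the original drop error still propagates to the test runner, so a failing drop is reported rather than silently swallowed.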




Summary
Changes
- DETACH DELETE in cleanup (replaces the two-step delete to avoid a duplicate-edge error from the undirected MATCH)
Test plan
Closes #29