fix(claude_agent_sdk): log ResultMessage.is_error and AssistantMessage.error#269

Open
Mahhheshh (Mahhheshh) wants to merge 1 commit into braintrustdata:main from Mahhheshh:log_claude_agent_sdk

@Mahhheshh
Contributor

  • If a ResultMessage has the is_error field set, it is logged to the root span.
  • If an AssistantMessage has an error field, it is logged to the associated span.

Ref: items 4 and 6 of #149.
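
The behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeSpan`, the two message dataclasses, and `log_message_errors` are hypothetical stand-ins, not the real SDK message classes or the wrapper code in this PR, and logging the raw `is_error` value is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakeSpan:
    """Hypothetical stand-in for a span; records whatever is logged."""

    logged: dict = field(default_factory=dict)

    def log(self, **fields: Any) -> None:
        self.logged.update(fields)


@dataclass
class ResultMessage:
    """Stand-in with the same class name as the SDK message type."""

    is_error: bool = False


@dataclass
class AssistantMessage:
    """Stand-in with the same class name as the SDK message type."""

    error: Optional[str] = None


def log_message_errors(message: Any, root_span: FakeSpan, llm_span: Optional[FakeSpan]) -> None:
    """Route error information to the span each message belongs to."""
    name = type(message).__name__
    if name == "ResultMessage" and getattr(message, "is_error", False):
        # Assumption: the root span receives the is_error flag itself.
        root_span.log(error=message.is_error)
    elif name == "AssistantMessage":
        error_field = getattr(message, "error", None)
        if error_field is not None and llm_span is not None:
            llm_span.log(error=error_field)
```

Messages without an error leave both spans untouched, so the happy path is unchanged.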

Member

I think we can also update the test to assert on this, right?

@Mahhheshh
Contributor Author

I was not sure how this should be tested.

```diff
--- a/py/src/braintrust/integrations/claude_agent_sdk/test_claude_agent_sdk.py
+++ b/py/src/braintrust/integrations/claude_agent_sdk/test_claude_agent_sdk.py
@@ -2083,6 +2083,8 @@ async def test_setup_claude_agent_sdk_repro_import_before_setup(memory_logger, m
     loop_errors = []
     received_types = []
 
+    result_message = None
+    assistant_message = None
     with _patched_claude_sdk():
         assert setup_claude_agent_sdk(project=PROJECT_NAME, api_key=logger.TEST_API_KEY)
         assert getattr(consumer_module, "ClaudeSDKClient") is not original_client
@@ -2107,7 +2109,12 @@ async def test_setup_claude_agent_sdk_repro_import_before_setup(memory_logger, m
             async with getattr(consumer_module, "ClaudeSDKClient")(options=options, transport=transport) as client:
                 await client.query("Say hi")
                 async for message in client.receive_response():
-                    received_types.append(type(message).__name__)
+                    message_type = type(message).__name__
+                    if message_type == "ResultMessage":
+                        result_message = message
+                    if message_type == "AssistantMessage":
+                        assistant_message = message
+                    received_types.append(message_type)
 
         await main()
 
@@ -2117,10 +2124,17 @@ async def test_setup_claude_agent_sdk_repro_import_before_setup(memory_logger, m
 
     spans = memory_logger.pop()
     task_spans = [s for s in spans if s["span_attributes"]["type"] == SpanTypeAttribute.TASK]
+    llm_spans = [s for s in spans if s["span_attributes"]["type"] == SpanTypeAttribute.LLM]
     assert len(task_spans) == 1
     assert task_spans[0]["span_attributes"]["name"] == "Claude Agent"
     assert task_spans[0]["input"] == "Say hi"
 
+    if result_message is not None and getattr(result_message, "is_error", False):
+        assert task_spans[0]["error"] == result_message.is_error
+
+    if assistant_message is not None and getattr(assistant_message, "error", False):
+        assert llm_spans[0]["error"] == assistant_message.error
```

llm_export = ctx.llm_span.export() if ctx.llm_span else None
error_field = getattr(message, "error", None)
if error_field is not None and ctx.llm_span is not None:
    ctx.llm_span.log(error=type(error_field).__name__)
Contributor Author

This should just be (error=error_field). I will fix it once I have an idea about the tests.
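
One possible way to test this is to feed a canned message stream that always contains errors, so the asserts run unconditionally instead of sitting behind `if` guards. The sketch below is not the project's test harness: `RecordingSpan`, `make_message`, `canned_response`, and `run_and_log` are hypothetical helpers, the `"rate_limit"` value is an arbitrary placeholder, and `run_and_log` only mirrors the behavior described in this PR.

```python
import asyncio


class RecordingSpan:
    """Hypothetical span double that keeps every logged field."""

    def __init__(self):
        self.logged = {}

    def log(self, **fields):
        self.logged.update(fields)


def make_message(type_name, **attrs):
    """Build a throwaway object whose class name matches an SDK message type."""
    return type(type_name, (), attrs)()


async def canned_response():
    """Hypothetical replacement for client.receive_response() that always errors."""
    yield make_message("AssistantMessage", error="rate_limit")
    yield make_message("ResultMessage", is_error=True)


async def run_and_log(stream, root_span, llm_span):
    # Mirrors the behavior described in this PR, not the actual wrapper code.
    async for message in stream:
        name = type(message).__name__
        if name == "AssistantMessage" and getattr(message, "error", None) is not None:
            llm_span.log(error=message.error)
        if name == "ResultMessage" and getattr(message, "is_error", False):
            root_span.log(error=message.is_error)


def test_errors_are_logged():
    root_span, llm_span = RecordingSpan(), RecordingSpan()
    asyncio.run(run_and_log(canned_response(), root_span, llm_span))
    # Unconditional asserts: the canned stream guarantees both error paths run.
    assert llm_span.logged["error"] == "rate_limit"
    assert root_span.logged["error"] is True
```

In the real test the canned stream would presumably come from the patched transport, with the same unconditional asserts applied to the spans popped from `memory_logger`.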

2 participants