feat: restore per-field confidence scoring in write interpretation #81

@MaxLinCode

Description

Background

rawWriteInterpretationSchema previously included a confidence: z.record(z.string(), z.number()) field intended to let the LLM express per-field confidence (e.g. {"scheduleFields.time": 0.6}). This was removed in #80 because z.record() generates additionalProperties in JSON Schema, which OpenAI's structured outputs API rejects.
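To make the rejection concrete, here is a minimal sketch of the two JSON Schema shapes involved. The `JsonSchema` type, the two schema literals, and `findOpenObjects` are all hypothetical names written for illustration; the record shape is what I would expect a Zod-to-JSON-Schema conversion to produce, not output captured from this repo. The checker looks only at the `additionalProperties` rule and ignores every other structured-outputs constraint.

```typescript
// Minimal JSON Schema shape, just enough for this illustration.
type JsonSchema = {
  type?: string;
  properties?: Record<string, JsonSchema>;
  additionalProperties?: boolean | JsonSchema;
};

// Assumed shape of z.record(z.string(), z.number()) once converted:
// an object with an open key set whose values are numbers.
const recordConfidence: JsonSchema = {
  type: "object",
  additionalProperties: { type: "number" },
};

// A fixed-key object names its keys and closes the key set.
const fixedKeyConfidence: JsonSchema = {
  type: "object",
  properties: { "scheduleFields.time": { type: "number" } },
  additionalProperties: false,
};

// Returns the path of every object node that does not set
// additionalProperties to false. Only this one rule is checked.
function findOpenObjects(schema: JsonSchema, path = "$"): string[] {
  const open: string[] = [];
  if (schema.type === "object" && schema.additionalProperties !== false) {
    open.push(path);
  }
  for (const [key, child] of Object.entries(schema.properties ?? {})) {
    open.push(...findOpenObjects(child, `${path}.${key}`));
  }
  return open;
}
```

Running `findOpenObjects` over the record shape flags the root node, while the fixed-key shape comes back clean.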

write-commit.ts uses confidence thresholds (CONFIDENCE_THRESHOLD = 0.75, CORRECTION_THRESHOLD = 0.9) to decide whether to commit a field or push it to needsClarification. With confidence removed, all fields now default to 1.0 (fully trusted), so this signal is lost.
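For reference, a rough sketch of the threshold behavior described above. This is not the code in write-commit.ts: `decideCommit`, `CommitDecision`, and the `isCorrection` flag are hypothetical, and both the `>=` comparison and the idea that CORRECTION_THRESHOLD applies when a write corrects an existing value are assumptions. Only the two constants and the `?? 1` default come from this issue.

```typescript
const CONFIDENCE_THRESHOLD = 0.75;
const CORRECTION_THRESHOLD = 0.9;

// Scores keyed by field path; a missing or null score means "not reported".
type Confidence = Record<string, number | null | undefined>;

interface CommitDecision {
  commit: string[];
  needsClarification: string[];
}

// Hypothetical helper. `isCorrection` is assumed to mean the write
// overwrites a previously committed value, which demands a higher bar.
function decideCommit(
  fieldPaths: string[],
  confidence: Confidence | null | undefined,
  isCorrection: boolean,
): CommitDecision {
  const threshold = isCorrection ? CORRECTION_THRESHOLD : CONFIDENCE_THRESHOLD;
  const decision: CommitDecision = { commit: [], needsClarification: [] };
  for (const path of fieldPaths) {
    // Absent scores fall back to 1, i.e. fully trusted.
    const score = confidence?.[path] ?? 1;
    if (score >= threshold) {
      decision.commit.push(path);
    } else {
      decision.needsClarification.push(path);
    }
  }
  return decision;
}
```

With no confidence object at all, every field takes the `?? 1` fallback and is committed, which is the lost-signal situation this issue describes.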

Goal

Restore confidence scoring in a way that's compatible with OpenAI structured outputs.

Approach

Replace z.record(z.string(), z.number()) with an explicit fixed-key object covering all possible field paths:

z.object({
  "scheduleFields.time": z.number().min(0).max(1).nullable(),
  "scheduleFields.day": z.number().min(0).max(1).nullable(),
  "scheduleFields.duration": z.number().min(0).max(1).nullable(),
  "taskFields.priority": z.number().min(0).max(1).nullable(),
  "taskFields.label": z.number().min(0).max(1).nullable(),
  "taskFields.sourceText": z.number().min(0).max(1).nullable(),
}).nullable()

OpenAI structured outputs requires every object property to be listed as required and does not accept .optional() on its own, so each score is declared nullable instead: the model returns null for any field it has no score for. Because the key set is fixed, the generated schema can set additionalProperties: false, which avoids the rejection.

Also update the write interpretation prompt to instruct the LLM to populate these scores.
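One possible shape for that prompt change, sketched under assumptions: the instruction wording, `CONFIDENCE_FIELD_PATHS`, and `buildConfidenceInstruction` are hypothetical and not taken from the existing prompt. Generating the key list from a single constant is meant to keep the prompt and the schema keys from drifting apart.

```typescript
// Field paths listed in the proposed confidence object above.
const CONFIDENCE_FIELD_PATHS = [
  "scheduleFields.time",
  "scheduleFields.day",
  "scheduleFields.duration",
  "taskFields.priority",
  "taskFields.label",
  "taskFields.sourceText",
] as const;

// Hypothetical helper: builds the prompt fragment that asks the model
// for per-field scores. The wording is illustrative only.
function buildConfidenceInstruction(
  paths: readonly string[] = CONFIDENCE_FIELD_PATHS,
): string {
  const lines = [
    "For each field you extract, also report how confident you are in it.",
    "Use a number from 0 (a guess) to 1 (stated explicitly by the user).",
    "Report the scores under `confidence` using exactly these keys:",
    ...paths.map((p) => `- ${p}`),
    "Only score fields you actually extracted.",
  ];
  return lines.join("\n");
}
```

The returned string can be appended to the write interpretation prompt wherever the output format is described.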

Acceptance Criteria

  • rawWriteInterpretationSchema includes a structured confidence object accepted by OpenAI structured outputs
  • LLM prompt instructs the model to populate per-field confidence scores
  • The ?? 1 defaults in write-commit.ts remain, so absent scores are still treated as fully trusted
  • Existing commit-policy tests cover the confidence threshold behavior

Metadata

Assignees

No one assigned

    Labels

    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests