Workshop: auditable-llm-editing

Goal: test whether explicit, sparse writing state can make LLM-assisted editing more convergent by preventing accidental claim drift while still allowing useful prose improvements.

Status: active. Initial phase is protocol design plus small manual trials on real drafts.

Starting claim

Many LLM editing operations are locally improving but not convergent. Repeated prompts such as "make it clearer", "make it more concise", or "make it more professional" can improve a draft for a few passes, then start smoothing away meaning, changing emphasis, weakening voice, or replacing specific claims with generic prose.

The proposed control surface is a structured writing state:

S = (text, claim_ledger, presentation_spec, rubric, gap_policy)
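
For concreteness, the state can be held in a small record. This is a minimal sketch assuming a Python representation; the field types and defaults below are placeholders, not a committed schema.

  from dataclasses import dataclass, field

  @dataclass
  class WritingState:
      text: str                      # the draft itself
      claim_ledger: list = field(default_factory=list)  # sparse, anchored claims
      presentation_spec: str = ""    # how the piece should sound and frame its claims
      rubric: str = ""               # what counts as improvement for this piece
      gap_policy: str = ""           # how unsupported theory bridges are handled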

Edits are typed:

  • Text edit — changes wording, organization, transitions, or local emphasis while preserving the claim ledger.
  • Claim edit — explicitly adds, removes, merges, splits, weakens, strengthens, or reanchors a claim.
  • Presentation-spec edit — changes how the piece should sound, how it frames its claims, or which rhetorical constraints it must preserve.
  • Rubric edit — changes what counts as improvement for this piece.
  • Gap-policy edit — changes how unsupported theory bridges are marked, closed in text, or extracted as note candidates.
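
The typing can be made explicit by tagging each proposed change with its kind. A minimal sketch, again assuming Python; the kind names simply mirror the list above, and the payload shape is deliberately left open.

  from dataclasses import dataclass
  from enum import Enum

  class EditKind(Enum):
      TEXT = "text"                            # wording, organization, transitions, emphasis
      CLAIM = "claim"                          # add, remove, merge, split, weaken, strengthen, reanchor
      PRESENTATION_SPEC = "presentation_spec"
      RUBRIC = "rubric"
      GAP_POLICY = "gap_policy"

  @dataclass
  class Edit:
      kind: EditKind
      description: str   # what the edit does, stated by the editor
      payload: dict      # the actual change; representation left open for the trials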

The first invariant to test: claims should not change accidentally. They may change, but only through an explicit claim-revision step.
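
Stated as a check rather than a convention, the invariant compares claim ledgers before and after an edit and only accepts a difference when the edit was explicitly a claim edit. A minimal sketch; the assumption that ledger entries carry claim_id and statement fields belongs to the sketch, not the protocol.

  def check_claim_invariant(ledger_before: list[dict],
                            ledger_after: list[dict],
                            edit_kind: str) -> None:
      # Claims may only change through an explicit claim edit.
      def key(ledger):
          return sorted((c["claim_id"], c["statement"]) for c in ledger)
      if key(ledger_before) != key(ledger_after) and edit_kind != "claim":
          raise ValueError(
              f"claim ledger changed under a {edit_kind!r} edit; "
              "claims may only change through an explicit claim edit"
          )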

Why this needs a workshop

The obvious expansion is a richer state object: evidence, assumptions, definitions, caveats, examples, discourse roles, target reader, argumentative dependencies, and so on. That may help, but it can also create a new failure mode: the model over-extracts structure, invents obligations, or turns tentative interpretations into false invariants.

This workshop therefore starts with lazy extraction:

  1. Extract only a minimal claim ledger.
  2. Anchor each claim to text spans.
  3. Mark claims by status.
  4. Add new structure only when a real editing failure shows the existing state was insufficient.
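
Under this lazy scheme, a claim entry can stay very small. A minimal sketch; the span encoding and status labels are illustrative placeholders, not part of the protocol.

  from dataclasses import dataclass

  @dataclass
  class AnchoredClaim:
      claim_id: str
      statement: str                   # the claim, restated compactly
      anchors: list[tuple[int, int]]   # character spans in the draft the claim is tied to
      status: str                      # e.g. "asserted" or "tentative"; labels are placeholders

  def anchors_valid(claim: AnchoredClaim, text: str) -> bool:
      # Every anchor must still point inside the current draft.
      return all(0 <= start < end <= len(text) for start, end in claim.anchors)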

The experiment is not an attempt to build a complete, handcrafted ontology of writing. The structure is a human-facing control surface for auditability and intervention. The scalable substrate should be search, comparison, learned evaluators, preference feedback, and accumulated edit history.

Initial artifacts

  • protocol — trial design, prompts, measures, and stop conditions.
  • v0-two-loop-algorithm — first operational loop: select one tuple part, write a complete candidate state, verify against the frozen rest (a sketch follows this list).
  • editor-prompt — sub-agent prompt for producing complete candidate states.
  • verifier-prompt — sub-agent prompt for independently checking candidates.
  • state-template — minimal writing-state schema for trials.
  • failure-log — place to record drift, over-extraction, rubric failure, and presentation damage.
  • experiment-conclusion — current takeaway from the first two trials.
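
The v0 two-loop can be read as a propose/verify cycle over one tuple part at a time. The sketch below assumes hypothetical editor and verifier callables standing in for the two sub-agent prompts; their signatures and the retry limit are illustrative only.

  from typing import Callable

  TUPLE_PARTS = ("text", "claim_ledger", "presentation_spec", "rubric", "gap_policy")

  def two_loop_pass(state: dict,
                    part: str,
                    editor: Callable[[dict, str], object],
                    verifier: Callable[[dict, str, object], bool],
                    max_attempts: int = 3) -> dict:
      # Select one tuple part, ask the editor for a complete candidate value,
      # and accept it only if the verifier approves it against the frozen rest.
      assert part in TUPLE_PARTS
      frozen = {k: v for k, v in state.items() if k != part}
      for _ in range(max_attempts):
          candidate = editor(state, part)
          if verifier(frozen, part, candidate):
              return {**frozen, part: candidate}
      return state   # no accepted candidate; the state is left unchanged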

Experiments

Experimental question

Does a sparse, anchored claim ledger reduce semantic drift across repeated LLM editing passes compared with ordinary edit prompts, without freezing useful revision?
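
One possible way to operationalize drift for this comparison, offered as an assumption rather than the measure defined in the protocol artifact: count claim-ledger changes between passes that were not accompanied by an explicit claim edit.

  def count_drift_events(ledgers: list[list[dict]], claim_edit_passes: set[int]) -> int:
      # ledgers[i] is the claim ledger after pass i; claim_edit_passes holds the
      # indices of passes that included an explicit claim edit. Entries are assumed
      # to carry claim_id and statement fields (an assumption of this sketch).
      drift = 0
      for i in range(1, len(ledgers)):
          before = {(c["claim_id"], c["statement"]) for c in ledgers[i - 1]}
          after = {(c["claim_id"], c["statement"]) for c in ledgers[i]}
          if before != after and i not in claim_edit_passes:
              drift += len(before ^ after)
      return drift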

Secondary questions:

  • Which failures remain after claim anchoring?
  • Which extra structures are justified by observed failures?
  • When does the structure become too expensive or too brittle?
  • What parts should become reusable review gates, instructions, or learned evaluators?

What would close this workshop

The workshop closes when it produces one of:

  • A negative result: claim-ledger editing does not reduce drift enough to justify the overhead.
  • A narrow procedure: a reusable instruction for stateful editing with a minimal claim ledger.
  • A broader methodology note: when sparse writing-state control surfaces are useful, where they fail, and how they fit the bitter-lesson boundary.
  • Candidate review gates or validator checks derived from repeated failure patterns.

Relevant local context