Inspiration
I have been using AI for my development work for the past few months, and I have always been intrigued by multi-agent systems, where multiple agents take on different tasks and edit code in the same repository simultaneously.
One thing irked me each time. When two agents tried to edit two unrelated things in the same file, there would often be silent errors where one agent would fail and retry later. As I was reading up on database concurrency, specifically row-level locking and MVCC, I realised that databases had been solving this shape of problem for decades. The concepts were there, but they had not really been applied to code agents.
So I built SemLock: a small coordination layer that brings region-level locking to multi-agent Python editing.
Problem
The first obvious solution is to create a worktree for each task. That is very safe because the agents are isolated. I largely agree with this. Worktrees are great; they keep things clean and separated.
But conflicts are not eliminated. They are pushed back to merge time. At the end, someone or something has to reconcile all the changes before merging. With agents spinning out many PRs in parallel, we are not just dealing with text conflicts, but semantic ones too: Agent A restructures the module, Agent B adds a feature against the old structure, and now reconciliation requires understanding what both agents intended.
The second obvious solution is file-level locking. When Agent A is writing, the whole file is locked. Once it is done, Agent B can edit the file. The issue is that we lose the parallelism we wanted. If Agent A is editing process_payment() and Agent B wants to edit send_notification(), why should B wait if they are unrelated functions in the same file?
| Approach | Parallelism | Conflict handling |
|---|---|---|
| Worktrees | Full | Deferred to merge time |
| File-level lock | None | No conflicts, but no parallelism |
| SemLock | Within-file | Caught at write time |
What we actually want is this: agents work in parallel when their edits do not overlap, and when they do overlap, they get a clean structured signal that explains what went wrong and what to do next.
Region Discovery
Before locking, the most important question is what is being locked. We do not want it too coarse, because that loses parallelism. We also do not want it too granular, because then we lose meaningful semantic boundaries.
My first idea was locking by line numbers. Parse the whole file, store functions and their start and end lines. But this is brittle. The moment another agent commits a change above your target function, all line numbers shift.
The second contender was locking by function names. This seems better, but Python allows duplicate top-level names, so a name alone can be ambiguous, and it says nothing about where the region sits in the file. We need something that identifies a region in the current file and also tells us which bytes it occupies. So byte offsets are rediscovered fresh every time by parsing the current file. We never store them and then trust that they will still be valid later.
A RegionDescriptor has four fields:
- A stable string ID, encoded as `kind::file_path::qualified_name`.
- Byte offsets for where the region starts and ends in the current file.
- A region kind: `top_level_function`, `top_level_class`, `shared_header`, or `file`.
- A content hash of the bytes in this region right now.
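A minimal sketch of what such a descriptor could look like. The field names, the half-open byte range, and the use of SHA-256 are assumptions for illustration; the post does not say which hash SemLock uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionDescriptor:
    region_id: str     # "kind::file_path::qualified_name"
    start_byte: int    # inclusive offset into the current file bytes
    end_byte: int      # exclusive offset into the current file bytes
    kind: str          # top_level_function, top_level_class, shared_header, or file
    content_hash: str  # hash of source[start_byte:end_byte] at discovery time


def describe(source: bytes, file_path: str, kind: str, name: str,
             start: int, end: int) -> RegionDescriptor:
    """Build a descriptor for source[start:end] (hypothetical helper)."""
    region_id = f"{kind}::{file_path}::{name}"
    digest = hashlib.sha256(source[start:end]).hexdigest()  # SHA-256 is an assumption
    return RegionDescriptor(region_id, start, end, kind, digest)


source = b"TAX_RATE = 0.08\n\ndef charge(amount):\n    return amount * (1 + TAX_RATE)\n"
start = source.index(b"def charge")
region = describe(source, "payments.py", "top_level_function", "charge", start, len(source))
print(region.region_id)
```

The ID stays the same across edits, while the offsets and hash are only meaningful for the exact bytes they were computed from.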
    # shared_header
    import stripe
    from decimal import Decimal

    TAX_RATE = 0.08

    # top_level_function: charge()
    def charge(amount: Decimal, customer_id: str) -> dict:
        total = amount * (1 + TAX_RATE)
        return stripe.charge(amount=total, customer=customer_id)

    # top_level_class: PaymentProcessor
    class PaymentProcessor:
        def __init__(self):
            self.history = []

    # top_level_function: apply_discount()
    def apply_discount(amount: Decimal, pct: float) -> Decimal:
        return amount * (1 - pct)
SemLock uses tree-sitter to parse Python files into a syntax tree and walk the tree to identify regions. Since tree-sitter operates on raw bytes, it gives exact byte offsets for every node, which is what we need when splicing content in and out of a file. It also handles syntactically broken Python better than a normal Python parser, which matters because agents can leave code mid-edit.
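To make the idea of rediscovering byte offsets concrete, here is a rough sketch that uses the standard-library ast module in place of tree-sitter, purely so the example runs without extra dependencies. Unlike tree-sitter, ast.parse rejects syntactically broken files, so this is not a substitute for the approach described above. The function name discover_regions is hypothetical.

```python
import ast


def discover_regions(source: bytes) -> list:
    """Return (kind, name, start_byte, end_byte) for top-level defs and classes."""
    # Byte offset at which each line starts, so (line, column) maps to a byte offset.
    line_starts = [0]
    for line in source.splitlines(keepends=True):
        line_starts.append(line_starts[-1] + len(line))

    def to_byte(lineno: int, col: int) -> int:
        # ast line numbers are 1-based; column offsets are UTF-8 byte offsets.
        return line_starts[lineno - 1] + col

    regions = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "top_level_function"
        elif isinstance(node, ast.ClassDef):
            kind = "top_level_class"
        else:
            continue
        # Start at the first decorator line, if any, so decorators stay in the region.
        first_line = min([node.lineno] + [d.lineno for d in node.decorator_list])
        start = to_byte(first_line, 0)
        end = to_byte(node.end_lineno, node.end_col_offset)
        regions.append((kind, node.name, start, end))
    return regions


source = b"import stripe\n\ndef charge(amount):\n    return amount\n\nclass PaymentProcessor:\n    pass\n"
for kind, name, start, end in discover_regions(source):
    print(kind, name, source[start:end])
```

Because the offsets are recomputed from the current bytes on every call, an edit above charge() changes the numbers but not the region that gets found.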
In v1, SemLock does not support methods inside classes, nested functions, or arbitrary statement blocks. The boundaries get harder to enforce cleanly at that granularity, and the added complexity was not worth it for a first version.
Leases
Knowing what regions we have solves the first part of the problem. But two agents can still both call list_regions on the same file at the same time, both see that charge() is available, and both start building a replacement. Neither knows the other is doing the same thing.
This is a time-of-check/time-of-use problem. The window between “I saw it was free” and “I am writing to it” is where things fall apart. Databases deal with this with row locks. Operating systems deal with this with mutexes. SemLock deals with it with leases.
flowchart LR
A1["Agent A reads charge()"]
B1["Agent B reads charge()"]
D1["Both build edits against same baseline"]
A2["Agent A writes"]
B2["Agent B overwrites A"]
A1 --> D1
B1 --> D1
D1 --> A2
A2 --> B2
classDef read fill:#fffaf2,stroke:#c9bba8,color:#211f1b;
classDef danger fill:#f3dfd6,stroke:#b66a55,color:#5f2b20;
classDef write fill:#dfe9df,stroke:#285f4d,color:#214c3e;
class A1,B1 read;
class D1,B2 danger;
class A2 write;
flowchart LR
A1["Agent A acquire_lock(charge())"]
A2["Lease granted"]
B1["Agent B acquire_lock(charge())"]
B2["LOCK_CONFLICT"]
A3["Agent A commits and releases"]
B3["Agent B retries on fresh state"]
A1 --> A2 --> A3 --> B3
B1 --> B2 --> B3
classDef step fill:#fffaf2,stroke:#c9bba8,color:#211f1b;
classDef ok fill:#dfe9df,stroke:#285f4d,color:#214c3e;
classDef retry fill:#f5e6bf,stroke:#b58a2a,color:#5d4510;
class A1,B1 step;
class A2,A3,B3 ok;
class B2 retry;
A lease is server-issued proof that you have exclusive write permission for a specific region right now. The server has checked, granted you the lock, and will reject any other agent that tries to acquire the same region until your lease is released or expires.
Each lease comes with:
- A `lease_token`, used in subsequent calls.
- An `agent_id`, which identifies who holds the lease.
- An `expires_at` timestamp, so abandoned locks eventually clear.
- An `acquisition_id`, linking together all regions acquired in the same atomic request.
Atomicity is important. If you need to edit both charge() and apply_discount(), you acquire both in one request. You either get both locks or neither. Partial lock state is worse than no lock state because it is hard to reason about and easy to deadlock on.
Narrow region locks only conflict if they are the exact same region. Two agents can hold locks on different functions in the same file at the same time. But shared_header and file locks are file-wide. They conflict with every other lock in the same file.
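The lease rules above can be sketched as a small in-memory manager. This is an illustration under assumptions, not SemLock's actual server: the class and method names, the TTL default, and the exception type are all hypothetical, and leases held by the requesting agent itself are simply ignored when checking conflicts.

```python
import threading
import time
import uuid
from dataclasses import dataclass

FILE_WIDE_KINDS = {"shared_header", "file"}


@dataclass(frozen=True)
class Lease:
    lease_token: str
    agent_id: str
    region_id: str       # "kind::file_path::qualified_name"
    expires_at: float
    acquisition_id: str


class LockConflict(Exception):
    """Raised when any requested region conflicts with a live lease."""


def conflicts(a: str, b: str) -> bool:
    kind_a, file_a, _ = a.split("::", 2)
    kind_b, file_b, _ = b.split("::", 2)
    if file_a != file_b:
        return False
    if kind_a in FILE_WIDE_KINDS or kind_b in FILE_WIDE_KINDS:
        return True      # file-wide locks conflict with everything in the file
    return a == b        # narrow locks conflict only on the exact same region


class LeaseManager:
    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._guard = threading.Lock()
        self._leases = {}  # region_id -> Lease, in memory only

    def acquire(self, agent_id: str, region_ids: list) -> list:
        with self._guard:
            now = self._clock()
            # Expired leases are dropped so abandoned locks eventually clear.
            self._leases = {r: l for r, l in self._leases.items() if l.expires_at > now}
            # Check every requested region before granting anything: all or nothing.
            for wanted in region_ids:
                for held in self._leases.values():
                    if held.agent_id != agent_id and conflicts(wanted, held.region_id):
                        raise LockConflict(f"{wanted} conflicts with {held.region_id}")
            acquisition_id = uuid.uuid4().hex
            granted = [
                Lease(uuid.uuid4().hex, agent_id, r, now + self._ttl, acquisition_id)
                for r in region_ids
            ]
            for lease in granted:
                self._leases[lease.region_id] = lease
            return granted

    def release(self, lease_token: str) -> None:
        with self._guard:
            self._leases = {r: l for r, l in self._leases.items()
                            if l.lease_token != lease_token}
```

Checking all requested regions before granting any of them is what keeps the request atomic: a conflict on one region leaves no partial lock state behind.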
The downside is that leases are in memory only. If the SemLock server goes down, lock state is gone. This was a conscious design choice: in-memory locks keep the implementation simple and fast, but persisted lock recovery is left to another layer.
Commit Validation
So we have a lease and we have a candidate change. What happens when we commit?
flowchart LR
A([commit])
B["Lock + reread"]
C{"Hash ok?"}
D["Splice"]
E{"Python ok?"}
F{"In bounds?"}
G{"Semantic"}
H([Write])
X["Changed"]
Y["Parse error"]
Z["Scope error"]
L["More locks"]
M["File lock"]
A --> B --> C
C -->|yes| D --> E
C -->|no| X
E -->|yes| F
E -->|no| Y
F -->|yes| G
F -->|no| Z
G -->|allow| H
G -->|locks| L
G -->|file| M
classDef ok fill:#dfe9df,stroke:#285f4d,color:#214c3e;
classDef fail fill:#f3dfd6,stroke:#b66a55,color:#5f2b20;
classDef retry fill:#f5e6bf,stroke:#b58a2a,color:#5d4510;
classDef step fill:#fffaf2,stroke:#c9bba8,color:#211f1b;
class A,B,D,C,E,F,G step;
class H ok;
class X,L,M retry;
class Y,Z fail;
The hash check matters because we do not want to write to a region that has changed. The server rereads the file from disk, resolves the locked region from the current file state, computes its hash, and compares it with the expected_region_hash sent in the commit request. If they do not match, the write is rejected with REGION_CHANGED.
Even if two agents are editing different functions in the same file, their commits are serialised through a per-file mutex. The expensive part, generating the actual code change, still happens concurrently. The serialisation only happens at write time, and it ensures each commit sees the result of the previous commit.
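A rough sketch of the hash check and the per-file mutex together. To stay self-contained it uses a dict as a stand-in for the disk and a toy string-based resolver instead of a real parser; resolve_function, region_hash, and the commit_region signature are hypothetical, and SHA-256 is an assumption.

```python
import hashlib
import threading
from collections import defaultdict

# One mutex per file path; the registry itself is guarded by its own lock.
_registry_guard = threading.Lock()
_file_mutexes = defaultdict(threading.Lock)


def mutex_for(path: str) -> threading.Lock:
    with _registry_guard:
        return _file_mutexes[path]


def resolve_function(source: bytes, name: str) -> tuple:
    """Toy resolver: a function runs from its 'def' to the next top-level 'def' or EOF."""
    start = source.index(b"def " + name.encode() + b"(")
    next_def = source.find(b"\ndef ", start)
    end = len(source) if next_def == -1 else next_def + 1
    return start, end


def region_hash(source: bytes, name: str) -> str:
    start, end = resolve_function(source, name)
    return hashlib.sha256(source[start:end]).hexdigest()


def commit_region(disk: dict, path: str, name: str,
                  expected_region_hash: str, replacement: bytes) -> str:
    with mutex_for(path):                  # commits to one file are serialised
        current = disk[path]               # reread inside the mutex
        start, end = resolve_function(current, name)  # offsets from current state
        if hashlib.sha256(current[start:end]).hexdigest() != expected_region_hash:
            return "REGION_CHANGED"
        disk[path] = current[:start] + replacement + current[end:]
        return "OK"


baseline = b"def charge(x):\n    return x\n\ndef notify(y):\n    return y\n"
disk = {"pay.py": baseline}
charge_hash = region_hash(baseline, "charge")
notify_hash = region_hash(baseline, "notify")

first = commit_region(disk, "pay.py", "charge", charge_hash,
                      b"def charge(x, tax):\n    return x * (1 + tax)\n\n")
second = commit_region(disk, "pay.py", "notify", notify_hash,
                       b"def notify(y):\n    return [y]\n")
stale = commit_region(disk, "pay.py", "charge", charge_hash,
                      b"def charge(x):\n    return 0\n\n")
print(first, second, stale)
```

The second commit succeeds even though the first one shifted notify() further down the file, because the region is resolved from the current bytes. The third commit carries a hash of the old charge() and is rejected.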
Before anything is written to disk, SemLock also runs ast.parse and compile() on the full candidate file. If the replacement produces invalid Python, it is rejected with PARSE_INVALID.
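A small sketch of that check, assuming a hypothetical helper named validate_python. Running compile() after ast.parse is not redundant: some errors, such as a return statement outside a function, pass parsing and are only reported at compile time.

```python
import ast
from typing import Optional


def validate_python(candidate: bytes, filename: str = "<candidate>") -> Optional[str]:
    """Return None if the candidate file is valid Python, else an error string."""
    try:
        tree = ast.parse(candidate, filename=filename)
        compile(tree, filename, "exec")   # compiles only; nothing is executed
    except (SyntaxError, ValueError) as exc:
        return f"PARSE_INVALID: {exc}"
    return None


print(validate_python(b"def f():\n    return 1\n"))
print(validate_python(b"def f(:\n    return 1\n"))
print(validate_python(b"return 1\n"))
```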
After splicing the replacement in, the system reparses the full candidate file with tree-sitter and checks that the replacement did not escape the boundary of the locked region. If it did, the system returns OUT_OF_SCOPE_EDIT.
If everything passes, the write happens atomically: temp file, fsync, then os.replace.
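The temp file, fsync, os.replace sequence can be sketched as follows. The function name and the temp-file prefix are illustrative. The temp file is created in the target's own directory so that the final rename stays on one filesystem.

```python
import contextlib
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".semlock-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())        # push the bytes to disk before renaming
        os.replace(tmp_path, path)        # atomically swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the replace.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "pay.py")
    atomic_write(target, b"def charge():\n    return 1\n")
    atomic_write(target, b"def charge():\n    return 2\n")
    with open(target, "rb") as handle:
        print(handle.read())
```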
Semantic Admission
The hardest issue is that non-overlapping regions can still be semantically related.
    def a():
        b(3)

    def b(value: int):
        return value * 1 + 1
Agent A can lock b() and change its signature from b(value: int) to b(value: int, scale: int). Agent B can lock a(), which calls b(3). Both agents hold different regions and both pass lexical validation. But now a() is broken because it still calls b() with one argument.
Semantic admission tries to catch this. After lexical checks pass, SemLock builds a semantic index of the current file and candidate file using Python’s ast module and symtable.
The index captures:
- Callable interfaces: parameter names, types, defaults, and required arguments.
- Class interfaces: base classes and metaclasses.
- Outgoing same-file dependency edges: function calls, decorators, and inheritance.
- Dynamic markers: `getattr`, `setattr`, `eval`, `exec`, wildcard imports, `__import__`, and starred calls.
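As an illustration of the last item, dynamic markers can be collected with a single walk over the ast. This is a sketch, not SemLock's index: the marker labels are made up, and "starred calls" is interpreted here as calls that unpack *args or **kwargs.

```python
import ast

DYNAMIC_CALLS = {"getattr", "setattr", "eval", "exec", "__import__"}


def dynamic_markers(source: str) -> set:
    """Return labels for constructs that make static dependency edges unreliable."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id in DYNAMIC_CALLS:
                found.add(node.func.id)
            unpacks_args = any(isinstance(arg, ast.Starred) for arg in node.args)
            unpacks_kwargs = any(kw.arg is None for kw in node.keywords)
            if unpacks_args or unpacks_kwargs:
                found.add("starred_call")
        elif isinstance(node, ast.ImportFrom):
            if any(alias.name == "*" for alias in node.names):
                found.add("wildcard_import")
    return found


example = "from os import *\n\ndef f(obj, args):\n    getattr(obj, 'x')\n    g(*args)\n"
print(sorted(dynamic_markers(example)))
```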
It then classifies the change:
| Outcome | Meaning |
|---|---|
| `ALLOW` | The change is local or compatible. |
| `REQUIRE_ADDITIONAL_LOCKS` | The change affects a precise same-file dependency. Acquire those regions and retry. |
| `ESCALATION_REQUIRED` | The change is too broad or uncertain. Escalate to a file lock. |
What SemLock guarantees is narrow but real: same-file, one-hop, direct name references are caught. It does not try to solve attribute dispatch, cross-file imported calls, or transitive dependency chains.
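A simplified sketch of the classification step, applied to the a() and b() example. It uses only the ast module (no symtable), treats any change to a function's parameter list as an interface change, and uses a made-up escalation rule (a top-level function disappearing from the candidate). All function names here are hypothetical.

```python
import ast

DEFS = (ast.FunctionDef, ast.AsyncFunctionDef)


def interfaces(tree: ast.Module) -> dict:
    """Map each top-level function name to a dump of its parameter list."""
    return {node.name: ast.dump(node.args) for node in tree.body if isinstance(node, DEFS)}


def direct_callers(tree: ast.Module, target: str) -> set:
    """Top-level regions that call `target` directly by name (one hop, same file)."""
    callers = set()
    for node in tree.body:
        if not isinstance(node, DEFS + (ast.ClassDef,)) or node.name == target:
            continue
        for sub in ast.walk(node):
            if (isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name)
                    and sub.func.id == target):
                callers.add(node.name)
                break
    return callers


def classify(current: str, candidate: str, held: set) -> tuple:
    old_tree, new_tree = ast.parse(current), ast.parse(candidate)
    old, new = interfaces(old_tree), interfaces(new_tree)
    if set(old) - set(new):
        # Assumed rule for this sketch: a removed or renamed function is too broad.
        return ("ESCALATION_REQUIRED", [])
    needed = set()
    for name, signature in old.items():
        if new[name] != signature:
            needed |= direct_callers(new_tree, name)
    missing = sorted(needed - held)
    if missing:
        return ("REQUIRE_ADDITIONAL_LOCKS", missing)
    return ("ALLOW", [])


current = "def a():\n    b(3)\n\ndef b(value: int):\n    return value * 1 + 1\n"
candidate = "def a():\n    b(3)\n\ndef b(value: int, scale: int):\n    return value * scale + 1\n"
print(classify(current, candidate, held={"b"}))
print(classify(current, candidate, held={"a", "b"}))
```

With only b() locked, the signature change is sent back with a request to also lock a(). Once both regions are held, the same change is admitted, and it is then up to the agent to update the call site in the same commit.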
Blackboard Layer
SemLock answers one question: is this write safe right now? It does not answer what to do when the write is not safe. It does not remember the goal. It does not retry. It does not coordinate across agents.
That is where the blackboard layer comes in.
A candidate is the replacement text generated for a specific region at a specific moment. It is tied to the file state when it was read. If another agent commits to the same file, the candidate is stale.
Intent is what the agent was trying to do: for example, “add a discount parameter to charge()”. Intent survives file changes. When REGION_CHANGED comes back, the agent should regenerate from intent against fresh region state rather than retrying the stale candidate.
The blackboard is a persistent shared state store for a run. All agents read from it and write to it. No agent talks directly to another.
There are four roles:
- Planner: decomposes the goal into work items.
- Editor: acquires locks, reads regions, generates candidates, and calls `commit_region`.
- Recovery Coordinator: watches SemLock outcomes and updates the lock plan.
- Validator: checks the final result.
sequenceDiagram
participant P as Planner
participant BB as Blackboard
participant E as Editor
participant SL as SemLock
participant RC as Recovery Coordinator
P->>BB: Create work item: add discount parameter to charge()
Note over E,SL: Attempt 1
E->>SL: acquire_lock(charge())
SL-->>E: lease granted
E->>SL: commit_region(candidate)
SL-->>E: REQUIRE_ADDITIONAL_LOCKS: apply_discount()
E->>BB: Record outcome
BB->>RC: Request recovery decision
RC->>BB: Update lock plan: charge() + apply_discount()
Note over E,SL: Attempt 2
E->>SL: acquire_lock(charge(), apply_discount())
SL-->>E: both leases granted
E->>SL: commit_region(regenerated candidate)
SL-->>E: write succeeded
E->>BB: Mark work item done
No direct agent-to-agent messaging happens in that flow. Every decision goes through the blackboard. This makes the run replayable: you can inspect the event log and reconstruct what each agent did, what SemLock returned, and why the run ended the way it did.
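A toy version of that flow, assuming the blackboard is an append-only event log. It is held in memory here for brevity, whereas the blackboard described above is persistent. The event kinds, field names, and the choice to have the planner post the initial lock plan are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Blackboard:
    events: list = field(default_factory=list)

    def post(self, role: str, kind: str, work_item: str, **data) -> dict:
        """Append an event; agents never message each other directly."""
        event = {"seq": len(self.events), "role": role, "kind": kind,
                 "work_item": work_item, **data}
        self.events.append(event)
        return event

    def latest(self, kind: str, work_item: str):
        for event in reversed(self.events):
            if event["kind"] == kind and event["work_item"] == work_item:
                return event
        return None


def recover(board: Blackboard, work_item: str) -> None:
    """Recovery Coordinator: widen the lock plan when SemLock asks for more locks."""
    outcome = board.latest("semlock_outcome", work_item)
    plan = board.latest("lock_plan", work_item)
    if outcome is None or plan is None:
        return
    if outcome["code"] != "REQUIRE_ADDITIONAL_LOCKS":
        return
    regions = sorted(set(plan["regions"]) | set(outcome["regions"]))
    board.post("recovery_coordinator", "lock_plan", work_item, regions=regions)


board = Blackboard()
item = "add discount parameter to charge()"
board.post("planner", "work_item_created", item)
board.post("planner", "lock_plan", item, regions=["charge"])
board.post("editor", "semlock_outcome", item,
           code="REQUIRE_ADDITIONAL_LOCKS", regions=["apply_discount"])
recover(board, item)

for event in board.events:
    print(event["seq"], event["role"], event["kind"])
```

Because every decision is an appended event, reading the list from top to bottom reconstructs the run.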
Conclusion
SemLock is one narrow answer to one narrow version of multi-agent code editing. I am not sure I got most of it right, but it was one of the more interesting problems I have worked on.
For v2, I would want to tackle durable leases, method-level locking, cross-file semantic edges, better transitive enforcement, and language support beyond Python.
Check out the code here.