Reply Matching — PendingMatcher¶
The host owns the "is this inbound frame the reply I'm waiting for?" decision. The Gateway firmware sits in Continuous RX and forwards every matching frame over USB; the host's matcher layer routes each inbound event to whichever caller registered an expectation that accepts it.
This page documents the unified mechanism shipped under Option D
(2026-05). The earlier split — a PendingRequestRegistry for unicast
1-reply waits next to an add_listener-backed collector for N-reply
broadcasts — has been retired.
Why one mechanism¶
Both legacy paths were doing the same thing in different words: wait for matching RX events until N collected or timeout. They differed only in their lookup strategy and their wait loop, not in what information they needed from the caller. Folding them together:
- Removes the "choose the right primitive" burden when adding new request/response paths.
- Eliminates a class of misleading log lines (MISS opc=ACK ... pending_keys=0) that appeared on every legitimate stream / discovery / status ACK because the registry never knew the collector had a parallel listener for it.
- Halves the conceptual surface of racelink/services/pending_requests.py (one dataclass + one registry, one wait loop).
Components¶
| Type | Lives in | Responsibility |
|---|---|---|
| PendingMatcher | racelink/services/pending_requests.py | Per-call expectation: filters (sender / opcode / ack_of / discriminator), collection semantics (count / idle / max), and a threading.Condition for the waiter. |
| PendingMatcherRegistry | same | Thread-safe routing of inbound events to matching matchers. Unicast bucket (O(1)) plus a small broadcast list (linear scan). |
| GatewayService.send_and_match(send_fn, matcher) | racelink/services/gateway_service.py | Public primitive: register the matcher, invoke send_fn, block on the matcher's condition until done / idle / ceiling, always cancel in the finally. |
| GatewayService.send_and_wait_with_retries(...) | same | Thin retry wrapper on top of send_and_match for unicast paths that need bounded retransmits. |
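The register / send / wait / cancel sequence of the public primitive can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in gateway_service.py: ToyMatcher, ToyRegistry, send_and_match_sketch, the offer() method and the returned reason string are all hypothetical names chosen for this example.

```python
import threading


class ToyMatcher:
    """Hypothetical stand-in for PendingMatcher: collects events, wakes a waiter."""

    def __init__(self, expected_count, max_timeout_s):
        self.expected_count = expected_count
        self.max_timeout_s = max_timeout_s
        self.collected = []
        self.cond = threading.Condition()

    def offer(self, ev):
        # RX side: record the event and wake whoever is blocked in wait().
        with self.cond:
            self.collected.append(ev)
            self.cond.notify_all()

    def wait(self):
        # Block until enough events arrived or the hard ceiling expires.
        with self.cond:
            done = self.cond.wait_for(
                lambda: len(self.collected) >= self.expected_count,
                timeout=self.max_timeout_s,
            )
            if done:
                return "count"
            return "max_timeout" if self.collected else "no_reply"


class ToyRegistry:
    """Hypothetical registry that only tracks outstanding matchers."""

    def __init__(self):
        self.active = []
        self.lock = threading.Lock()

    def register(self, matcher):
        with self.lock:
            self.active.append(matcher)

    def cancel(self, matcher):
        with self.lock:
            if matcher in self.active:
                self.active.remove(matcher)


def send_and_match_sketch(registry, send_fn, matcher):
    registry.register(matcher)        # 1. register before anything is sent
    try:
        send_fn()                     # 2. transmit the request
        return matcher.wait()         # 3. block until done or ceiling
    finally:
        registry.cancel(matcher)      # 4. always deregister, even on error
```

The try/finally is the part worth copying: a matcher that outlives its caller would keep consuming events meant for a later request.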
Matcher fields¶
A PendingMatcher carries three kinds of state:
Filters (what counts as a match)¶
| Field | Meaning | Wildcard value |
|---|---|---|
| sender_filter | Set of accepted sender-last3 addresses (3-byte tail of the device MAC). Singleton sets enable the fast-bucket lookup. | None (any sender) |
| expected_opcode | For specific-reply paths (GET_CONFIG_REPLY, STATUS_REPLY, IDENTIFY_REPLY). | None (any opcode) |
| expected_ack_of | For ACK paths: the original request's opcode7 that an OPC_ACK echoes back via the ack_of field. Implies opc == OPC_ACK. | None (not an ACK waiter) |
| discriminator_field / discriminator_value | Final equality gate against an arbitrary parsed-event field. Used today for option (GET_CONFIG_REPLY per-option routing) and reply (IDENTIFY_REPLY / STATUS_REPLY string markers). | None (no discriminator) |
Filters are AND-combined: every set field must hold. Unset fields are wildcard. A fully wildcard matcher accepts every event — useful for a "catch the first thing that comes in" test, less so in production.
Production matchers should set exactly one of expected_opcode and
expected_ack_of, never both; the registry's fast-bucket key uses
whichever the matcher advertises.
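The AND-combination of filters can be illustrated with a small dataclass. This is a sketch, not the real PendingMatcher: the class name FilterSketch, the event keys "sender", "opc" and "ack_of", and the numeric value given to OPC_ACK are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Optional

OPC_ACK = 0x7E  # placeholder value, assumed for illustration only


@dataclass
class FilterSketch:
    sender_filter: Optional[FrozenSet[bytes]] = None
    expected_opcode: Optional[int] = None
    expected_ack_of: Optional[int] = None
    discriminator_field: Optional[str] = None
    discriminator_value: Any = None

    def matches(self, ev: dict) -> bool:
        # Each set field is a gate; an unset (None) field lets everything pass.
        if self.sender_filter is not None and ev.get("sender") not in self.sender_filter:
            return False
        if self.expected_opcode is not None and ev.get("opc") != self.expected_opcode:
            return False
        if self.expected_ack_of is not None:
            # An ACK waiter implies opc == OPC_ACK plus a matching ack_of echo.
            if ev.get("opc") != OPC_ACK or ev.get("ack_of") != self.expected_ack_of:
                return False
        if self.discriminator_field is not None:
            if ev.get(self.discriminator_field) != self.discriminator_value:
                return False
        return True


dev = bytes.fromhex("a1b2c3")
ack_waiter = FilterSketch(sender_filter=frozenset({dev}), expected_ack_of=0x05)
accepted = ack_waiter.matches({"sender": dev, "opc": OPC_ACK, "ack_of": 0x05})
rejected = ack_waiter.matches({"sender": dev, "opc": OPC_ACK, "ack_of": 0x06})
```

Here accepted is True and rejected is False: the second event has the right sender and is an ACK, but it acknowledges a different request.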
Collection semantics (when to stop)¶
| Field | Meaning |
|---|---|
| expected_count | Stop after collecting this many matching events (reason="count"). For unicast 1-reply this is 1; for an N-device group ACK it equals len(target_last3); for open-ended discovery sweeps a large sentinel (e.g. 2**31) disables count-based exit and relies on idle/max only. |
| idle_timeout_s | Once the first match has arrived, exit if no further match shows up for this many seconds (reason="idle"). 0.0 disables it: the matcher uses max_timeout_s only (legacy unicast semantics). |
| max_timeout_s | Hard ceiling from register(). Exits as "max_timeout" if at least one match arrived, "no_reply" if none. |
Mutable state (registry-managed)¶
| Field | Notes |
|---|---|
| collected | List of matched events in arrival order. Read it after wait() returns. |
| last_match_ts | time.monotonic() of the most recent match; drives the idle window. |
| _cond / _done | Internal threading.Condition and flag. The registry signals on each match; the wait loop re-checks idle/max-timeout each time it wakes. |
Lookup strategy¶
PendingMatcherRegistry.try_match(ev) walks two lookup paths per event:
- Fast unicast bucket: for matchers with exactly one sender in sender_filter and a concrete expected_opcode or expected_ack_of. Keyed by (sender_last3, opcode_or_ack_of) and queried in O(1).
- Broadcast list: every other matcher (wildcard sender, multi-sender set, wildcard opcode). Scanned linearly. Production has at most a handful of outstanding ops at any time, so the scan is negligible.
The first matcher whose full matches(ev) returns True consumes the
event (appends, signals its condition) and try_match returns it. We
do not fan out to multiple matchers on the same event — this preserves
the historical "ACK wakes exactly one waiter" semantics.
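A toy version of the two lookup paths might look like the sketch below. It is deliberately simplified and makes several assumptions: MiniMatcher and MiniRegistry are hypothetical names, events carry a single "key" field standing in for opcode-or-ack_of, and the bucket is consulted before the broadcast list.

```python
import threading
from collections import defaultdict


class MiniMatcher:
    """Hypothetical matcher: sender set, one opcode-like key, optional option gate."""

    def __init__(self, senders=None, key=None, option=None):
        self.senders = senders      # frozenset of sender ids, or None for any
        self.key = key              # expected opcode / ack_of, or None for any
        self.option = option        # discriminator on ev["option"], or None
        self.collected = []

    def matches(self, ev):
        if self.senders is not None and ev["sender"] not in self.senders:
            return False
        if self.key is not None and ev["key"] != self.key:
            return False
        if self.option is not None and ev.get("option") != self.option:
            return False
        return True


class MiniRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._buckets = defaultdict(list)   # (sender, key) -> [matcher, ...]
        self._broadcast = []                # everything that is not bucketable

    def register(self, m):
        with self._lock:
            if m.senders is not None and len(m.senders) == 1 and m.key is not None:
                (sender,) = m.senders
                self._buckets[(sender, m.key)].append(m)
            else:
                self._broadcast.append(m)

    def try_match(self, ev):
        with self._lock:
            # Snapshot candidates: O(1) bucket hit first, then the broadcast list.
            candidates = list(self._buckets.get((ev["sender"], ev["key"]), ()))
            candidates += self._broadcast
        for m in candidates:
            if m.matches(ev):
                m.collected.append(ev)   # the first match consumes the event
                return m
        return None                      # nobody was waiting for this event
```

Returning on the first hit is what keeps the event from fanning out: a broadcast matcher that would also accept the event never sees it once a bucketed matcher has consumed it.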
Wait loop¶
PendingMatcher.wait() (called by send_and_match) implements a
three-phase timeout:
+----------------+
register() → send_fn() → | wait loop |
+----------------+
│
┌────────────────────┴────────────────────┐
│ │
first match? hard ceiling reached?
│ │
┌─────┴─────┐ ┌──────┴──────┐
no yes 0 collected ≥1 collected
│ │ │ │
hard ceiling start idle timer no_reply max_timeout
│ (last_match_ts + idle_s)
no_reply │
┌─────┴─────┐
count reached? another match arrives → reset idle window
│
count
reason="count" is the success path the unicast caller expects.
reason="idle" and reason="max_timeout" both mean "we got some
replies but the operation ended on a timeout"; the caller decides
whether to treat the partial result as success.
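The exit reasons can be reproduced with a compact wait loop built on threading.Condition. The sketch below is an approximation for illustration: WaitSketch and offer() are hypothetical names, and the hard ceiling starts at construction time here rather than at register().

```python
import threading
import time


class WaitSketch:
    def __init__(self, expected_count, idle_timeout_s, max_timeout_s):
        self.expected_count = expected_count
        self.idle_timeout_s = idle_timeout_s
        self.collected = []
        self.last_match_ts = None
        self._cond = threading.Condition()
        # Assumption: the ceiling is anchored here instead of at register().
        self._deadline = time.monotonic() + max_timeout_s

    def offer(self, ev):
        # Registry side: store the match, stamp it, wake the waiter.
        with self._cond:
            self.collected.append(ev)
            self.last_match_ts = time.monotonic()
            self._cond.notify_all()

    def wait(self):
        with self._cond:
            while True:
                now = time.monotonic()
                if len(self.collected) >= self.expected_count:
                    return "count"
                if now >= self._deadline:
                    return "max_timeout" if self.collected else "no_reply"
                wake_at = self._deadline
                # The idle window only exists after the first match.
                if self.collected and self.idle_timeout_s > 0.0:
                    idle_at = self.last_match_ts + self.idle_timeout_s
                    if now >= idle_at:
                        return "idle"
                    wake_at = min(wake_at, idle_at)
                # Sleep until the next relevant deadline or the next match.
                self._cond.wait(timeout=wake_at - now)
```

Because every wake-up recomputes the idle deadline from last_match_ts, a new match implicitly resets the idle window without any extra bookkeeping.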
Typical matcher constructions¶
Unicast OPC_CONFIG → ACK¶
matcher = PendingMatcher(
    sender_filter=frozenset({recv3_bytes}),
    expected_ack_of=int(LP.OPC_CONFIG) & 0x7F,
    expected_count=1,
    idle_timeout_s=0.0,
    max_timeout_s=rf_timing.UNICAST_ATTEMPT_TIMEOUT_S,
)
Used internally by send_and_wait_with_retries; callers normally use
that helper instead of constructing the matcher by hand.
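As a rough picture of what a bounded retry loop over the primitive involves, consider the sketch below. It is not the real send_and_wait_with_retries: the function name, its parameters, the assumption that each attempt gets a fresh matcher, and the assumption that one attempt returns its exit reason are all hypothetical.

```python
def send_and_wait_with_retries_sketch(make_matcher, send_fn, send_and_match, max_attempts=3):
    """Hypothetical retry loop: a fresh matcher per attempt, stop on the first reply.

    make_matcher() builds a new single-reply matcher; send_and_match(send_fn, m)
    runs one attempt and returns its exit reason. Both are supplied by the caller.
    """
    for attempt in range(1, max_attempts + 1):
        matcher = make_matcher()                  # never reuse a finished matcher
        reason = send_and_match(send_fn, matcher)
        if reason == "count":
            return matcher, attempt               # reply arrived on this attempt
    return None, max_attempts                     # every attempt timed out
```

Passing send_and_match in as a parameter keeps the sketch testable without a gateway; the real helper presumably calls the service method directly.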
Unicast OPC_GET_CONFIG → GET_CONFIG_REPLY with option discriminator¶
matcher = PendingMatcher(
    sender_filter=frozenset({recv3_bytes}),
    expected_opcode=int(LP.OPC_GET_CONFIG) & 0x7F,
    discriminator_field="option",
    discriminator_value=opt_byte,
    expected_count=1,
    max_timeout_s=rf_timing.UNICAST_ATTEMPT_TIMEOUT_S,
)
Two concurrent read_config() calls on the same device but for
different options route their replies to the right caller because the
codec sets ev["option"] on the parsed reply and the matcher checks
it.
Group OPC_STREAM → N ACKs¶
matcher = PendingMatcher(
    sender_filter=frozenset(target_last3_set),
    expected_ack_of=int(LP.OPC_STREAM) & 0x7F,
    expected_count=len(target_last3_set),
    idle_timeout_s=rf_timing.COLLECT_IDLE_TIMEOUT_S,
    max_timeout_s=...,  # scaled by target count
)
The startblock service uses this per retry attempt, shrinking
sender_filter to the still-unacked subset on each retry so a flaky
device cannot waste budget on already-acked targets.
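The shrinking-target idea can be sketched independently of the radio layer. The helper below is hypothetical (it is not the startblock service's code): collect_group_acks and attempt_fn are names invented for the example, and attempt_fn stands in for one send-and-collect cycle that reports which senders acknowledged.

```python
def collect_group_acks(targets, attempt_fn, max_attempts=3):
    """Retry a group send, narrowing the target set to devices that have not acked.

    attempt_fn(pending) is a hypothetical callable that performs one
    send-and-collect cycle for the given senders and returns the set of
    senders that acknowledged during that attempt.
    """
    pending = set(targets)
    acked = set()
    for _ in range(max_attempts):
        if not pending:
            break                                        # everyone has answered
        got = attempt_fn(frozenset(pending)) & pending   # ignore stray senders
        acked |= got
        pending -= got                                   # next attempt waits only for these
    return acked, pending
```

Since expected_count would be derived from the pending set on each attempt, a retry aimed at one silent device can finish on "count" as soon as that device answers instead of waiting out the idle window.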
Wildcard OPC_DEVICES discovery¶
matcher = PendingMatcher(
    sender_filter=None,  # any device may answer
    expected_opcode=int(LP.OPC_DEVICES) & 0x7F,
    discriminator_field="reply",
    discriminator_value="IDENTIFY_REPLY",
    expected_count=2**31,  # idle/max only
    idle_timeout_s=rf_timing.COLLECT_IDLE_TIMEOUT_S,
    max_timeout_s=rf_timing.COLLECT_MAX_CEILING_S,
)
Discovery cannot know how many devices will reply, so the count gate is effectively disabled and the matcher terminates when the last late-comer goes quiet for the idle window.
Diagnostic logging¶
The registry emits debug-level messages on three events:
- matcher.register: sender filter, opcode/ack_of, expected count, idle/max timeouts, total outstanding matchers.
- matcher.cancel: collected count, done flag, elapsed time.
- matcher.try_match HIT: when an event is consumed.
It also emits a matcher.try_match NO_MATCH line — but only when at
least one matcher was a candidate (same bucket key) yet its full
matches() check rejected the event. That is the genuinely
diagnostic case: an ACK arrived with the right sender + opcode but the
discriminator disagreed (e.g. wrong option byte), suggesting a race
or a codec bug. The plain "no waiter cared about this event" path
no longer logs — that fires on every unsolicited STATUS_REPLY or
untracked broadcast and would only flood the log.
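The "log only near misses" rule can be expressed in a few lines. The sketch below is illustrative only: the route function, its arguments, the logger name and the exact message text are assumptions and do not reproduce the registry's real log format.

```python
import logging

log = logging.getLogger("matcher_sketch")


def route(ev, bucket_candidates, broadcast):
    """Return the consuming matcher, logging NO_MATCH only for near misses.

    bucket_candidates: matchers registered under this event's (sender, opcode) key.
    broadcast: matchers that must be scanned linearly.
    Both hold objects with a matches(ev) method (hypothetical interface).
    """
    for m in list(bucket_candidates) + list(broadcast):
        if m.matches(ev):
            log.debug("matcher.try_match HIT ev=%r", ev)
            return m
    if bucket_candidates:
        # Same sender + opcode was awaited, yet the full check rejected the event.
        log.debug("matcher.try_match NO_MATCH ev=%r candidates=%d",
                  ev, len(bucket_candidates))
    return None  # unsolicited traffic stays silent
```

With no bucket candidates the function returns None without logging, which is the path taken by unsolicited traffic.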
Threading¶
- The registry's internal _lock is held only for bucket lookups and membership changes, never across matcher.matches(ev) or cond.notify_all().
- Each matcher owns its own Condition, so two matchers can be signalled in parallel without contending on a single registry-level condition.
- send_and_match registers the matcher before invoking send_fn. The Gateway's RX reader thread cannot deliver a reply before send_fn returns (the reply is causally posterior to the request), so the registration-then-send ordering is sufficient; there is no separate "drain stale events" race window to defend against.
Locking interaction with the state repository¶
The rule on
Architecture → "Never hold state_repository.lock across RF I/O"
applies unchanged. send_and_match is RF I/O; callers must release
the state lock before invoking it.
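One way to honour that rule is the snapshot / release / re-acquire pattern sketched below. Everything here is a stand-in: state_lock plays the role of state_repository.lock (assumed to be a plain threading.Lock for the example), devices is toy state, and push_config and send_and_match_fn are hypothetical names.

```python
import threading

state_lock = threading.Lock()           # stands in for state_repository.lock
devices = {"a1b2c3": {"group": 2}}      # toy state guarded by state_lock


def push_config(addr, send_and_match_fn):
    # 1. Copy what the request needs while holding the state lock.
    with state_lock:
        snapshot = dict(devices[addr])
    # 2. The lock is released here; the blocking RF round trip runs unlocked.
    result = send_and_match_fn(addr, snapshot)
    # 3. Re-acquire briefly to record the outcome.
    with state_lock:
        devices[addr]["last_result"] = result
    return result
```

The RF call may block for the full timeout ceiling, so holding the state lock across it would stall every other reader of device state for that long.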
Migration history (for context)¶
The unified matcher landed in three reviewable phases under the plan
file aktuell-teste-ich-den-lazy-willow.md:
- Phase 1 introduced PendingMatcher / PendingMatcherRegistry / send_and_match and turned the old send_and_wait_for_reply and send_and_collect into thin adapters. Every test stayed green without modification.
- Phase 2 migrated each call site (config_service, gateway_service.send_stream, discovery_service, status_service) to build a structured matcher and call send_and_match directly. Adapters remained in place for any code path not yet touched.
- Phase 3 removed the adapters, retired the custom_pred closure hook from PendingMatcher, and updated the test suite to construct matchers directly. send_and_wait_with_retries was kept because it adds genuine retry semantics on top of the primitive; it is now a short retry loop over send_and_match, not an alias.
There are no compatibility shims left. New code should build a
PendingMatcher and call send_and_match (or use
send_and_wait_with_retries for unicast with retries).