MTU Clamping (Bug 8a): Clamp link MTU signalling in LINKREQUEST packets
when forwarding through a transport node, matching the Python reference
implementation.
Without this, TCP endpoints negotiate 8192-byte segments that exceed the
V3's 1064-byte HDLC buffer, causing silent truncation and permanent
resource transfer stalls at ~70%.
Fixed MTU declaration (Bug 8b): Set FIXED_MTU=true on TcpInterface so
Transport uses HW_MTU for clamping decisions.
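The clamping rule in 8a/8b can be sketched as below. This is a minimal
illustration, not the code in Transport: IfaceInfo and clamp_signalled_mtu
are hypothetical names, and the sketch assumes that only interfaces which
declare a fixed MTU constrain the signalled value.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for an interface descriptor; not the microReticulum type.
struct IfaceInfo {
    uint32_t hw_mtu;     // largest frame the interface can carry
    bool     fixed_mtu;  // true when hw_mtu is authoritative (cf. FIXED_MTU)
};

// Returns the MTU to signal when forwarding a link request.
// The signalled value is never raised, only lowered to what the
// inbound and outbound interfaces can both carry.
inline uint32_t clamp_signalled_mtu(uint32_t signalled,
                                    const IfaceInfo& inbound,
                                    const IfaceInfo& outbound) {
    uint32_t limit = signalled;
    if (inbound.fixed_mtu)  limit = std::min(limit, inbound.hw_mtu);
    if (outbound.fixed_mtu) limit = std::min(limit, outbound.hw_mtu);
    return limit;
}
```

With an 8192-byte request arriving on an interface that declares a fixed
1064-byte MTU, the forwarded request would signal 1064, so neither endpoint
sends frames the HDLC buffer cannot hold.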
Oversized frame detection (Bug 8c): Track truncated HDLC frames and
drop them with a diagnostic log instead of silently delivering corrupt
data to Transport.
Echo-back prevention (v1.0.10): Track which TCP client originated each
inbound frame and skip that client in send_outgoing() to prevent
flooding TCP buffers.
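A minimal sketch of the echo-back check, assuming a client table where
unused slots hold -1. select_recipients and the integer client ids are
hypothetical; the real send_outgoing() works on socket objects.

```cpp
#include <vector>

// Hypothetical client table: each slot holds a client id, -1 when unused.
// Returns the ids that should receive a frame, skipping the originator.
inline std::vector<int> select_recipients(const std::vector<int>& clients,
                                          int origin_client) {
    std::vector<int> out;
    for (int id : clients) {
        if (id < 0) continue;               // unused slot
        if (id == origin_client) continue;  // do not echo back to the sender
        out.push_back(id);
    }
    return out;
}
```

Passing an origin id that matches no client (for example -1 for frames that
did not arrive over TCP) leaves the recipient list unfiltered.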
Register local client interface: Enable Transport forwarding of
announces, link packets, and proofs to TCP clients.
Document all discovered microReticulum bugs in MICRORETICULUM_BUGS.md.
Root cause: Python Reticulum trims random_blobs per destination entry
(MAX_RANDOM_BLOBS=64 in-memory, PERSIST_RANDOM_BLOBS=32 on disk).
The C++ firmware had these constants defined but NEVER enforced them,
causing unbounded growth. With 21 paths x 60+ blobs x ~90 bytes each,
the destination table alone consumed ~57KB of the ESP32 324KB heap.
Fixes:
- Trim random_blobs after insert (matching Python behavior)
- Trim random_blobs on deserialization from flash
- Trim random_blobs to PERSIST_RANDOM_BLOBS on serialization
- Enforce _path_table_maxpersist when writing path table (was declared
but never used; write_path_table saved everything)
- Reduce MCU constants: MAX_RANDOM_BLOBS 64->16, PERSIST_RANDOM_BLOBS 32->8
- Reduce path_table_maxsize 128->24, maxpersist 32->12
- Add memory diagnostic after path table load
- Trim loaded paths to maxsize on startup via cull_path_table()
Results: destination_table 21KB -> 5.8KB, free heap 63KB (22%) -> 156KB (49%)
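The trim step shared by the insert, deserialize and serialize fixes could
look roughly like this. It is a sketch under the assumption that new blobs
are appended at the back, so the oldest entries sit at the front; Blob and
trim_random_blobs are hypothetical names, not the microReticulum types.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Blob = std::vector<uint8_t>;

// Keeps only the newest `max_blobs` entries, assuming new blobs are
// appended at the back (so the oldest sit at the front).
inline void trim_random_blobs(std::vector<Blob>& blobs, std::size_t max_blobs) {
    if (blobs.size() > max_blobs) {
        blobs.erase(blobs.begin(),
                    blobs.end() - static_cast<std::ptrdiff_t>(max_blobs));
    }
}
```

Calling it with MAX_RANDOM_BLOBS after each insert and with
PERSIST_RANDOM_BLOBS before writing to flash bounds both the in-memory and
the persisted size per destination entry.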
Bug fixes:
- Fix path_request_handler hops: use DestinationEntry._hops instead of
stale cached announce_packet.hops(). The cached packet retains its
original wire hops (pre-increment), but Python Transport.py explicitly
overwrites packet.hops from path_table after retrieval (line 2736).
This caused PATH_RESPONSE to report fewer hops than actual, making
the sender's expected_hops too low, which caused LRPROOF hop-count
validation to silently fail. (ROOT CAUSE of link timeout)
- Fix std::map::insert() no-op: erase before insert at 3 locations in
_announce_table. Unlike Python dict assignment, C++ map::insert()
does not overwrite existing keys. This prevented announce table
updates from taking effect. (Caused PATH-RESP delivery failure)
- Defer packet hash filtering for link table entries and LRPROOF
packets. Matching Python Transport behavior (line 1544), packets
belonging to active links are not added to the filter hashlist
until link transport processing determines it is our turn to
forward them. Prevents premature filtering that breaks link transport.
- Pass DestinationEntry and LinkEntry by reference instead of by value
to avoid stale copies and unnecessary allocations.
- Add link_table check before requesting paths for link_id destinations.
Link data packets are handled by link transport, not standard path
lookup, so spurious path requests are avoided.
- Add culling for _held_announces (60s timeout, cap 32) and
_boundary_local_addresses to prevent unbounded memory growth.
- TcpInterface: detect and log partial writes.
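The std::map::insert() behaviour behind the _announce_table fix is easy to
reproduce in isolation. The helper below is a hypothetical sketch (put is
not a microReticulum function) of the erase-before-insert pattern; on C++17
toolchains std::map::insert_or_assign achieves the same when the mapped type
is assignable.

```cpp
#include <map>

// Overwrites any existing value, like a Python dict assignment.
// std::map::insert() alone leaves an existing entry untouched and
// reports that through the bool in its return value.
template <typename K, typename V>
void put(std::map<K, V>& table, const K& key, const V& value) {
    table.erase(key);            // no-op if the key is absent
    table.insert({key, value});  // now guaranteed to take effect
}
```

Erasing first also works for mapped types that cannot be assigned or
default-constructed, which rules out operator[] in those cases.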
Root cause: heltec_V4_boundary build was missing -DRNS_USE_TLSF=1 and
-DRNS_USE_ALLOCATOR=1 flags, causing ALL C++ new/delete to use internal
SRAM (239KB) instead of the PSRAM-backed TLSF pool (~1.6MB). Transport
data structures consumed internal heap until the WiFi driver could not
allocate RX buffers (ESP_ERR_NO_MEM).
Changes:
- platformio.ini: Add TLSF/allocator flags to heltec_V4_boundary env,
re-enable NDEBUG
- Transport.cpp: Add periodic culling of _path_requests (was unbounded,
grew one entry per unique destination forever). Cull entries older than
DESTINATION_TIMEOUT. Also cull _pending_local_path_requests for removed
interfaces, and fix missing .erase() (Python .pop() equivalent).
- RNode_Firmware.ino: Replace WiFi watchdog halt-serial with auto-reboot.
Add heap pressure check (reboot if free heap < 20KB). Increase WiFi
grace period from 5s to 15s. Remove orphaned boundary_done label.
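Age-based culling of a table such as _path_requests can be sketched as
follows. The key and value types are assumptions made for illustration (a
string key and a timestamp in seconds), and cull_older_than is a
hypothetical helper rather than the Transport.cpp code.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Removes entries whose timestamp is more than `timeout` seconds old.
// `now` and the stored values are in seconds; returns the number removed.
inline std::size_t cull_older_than(std::map<std::string, double>& table,
                                   double now, double timeout) {
    std::size_t removed = 0;
    for (auto it = table.begin(); it != table.end(); ) {
        if (now - it->second > timeout) {
            it = table.erase(it);   // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Advancing through the iterator returned by erase() keeps the loop valid
while entries are removed, so the cull can run periodically from the main
transport job without a second pass.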
Performance optimizations:
- Move TLSF allocator pool to PSRAM (frees ~170KB internal SRAM)
- Raise TCP_IF_MAX_CLIENTS from 4 to 8 in BOUNDARY_MODE
- Raise path_table_maxsize from 48 to 128, persist from 16 to 32
- Add -DNDEBUG to boundary build: compiles out TRACE/DEBUG macros
- Log level defaults to LOG_VERBOSE when NDEBUG is defined
- Serial baud 115200 -> 921600 in BOUNDARY_MODE (reduces CPU blocking)
Previous changes included in this commit:
- Comprehensive boundary filter with transitive whitelisting (7 checks)
- destination_table erase+insert fix (std::map::insert no-overwrite bug)
- Backbone-to-backbone routing guard in next-hop forwarding
- KISS serial output disabled for boundary mode
- flash.py updates for boundary mode support
Vendor microReticulum library with boundary mode transport fixes:
- Two-whitelist system gates backbone traffic (local addresses +
mentioned addresses from local devices)
- Allow control_hashes and local destinations through boundary filter
(fixes backbone→LoRa path discovery)
- Fix get_cached_packet() to call unpack() instead of update_hash()
(fixes empty destination_hash in path responses)
- LRPROOF Identity::recall null guard
- remaining_hops HEADER_1/BROADCAST fix for final-hop delivery
- PROOF packets excluded from boundary wrapping
- Iterator invalidation fix in transport table cleanup
- is_backbone flag replaces string matching for interface identification
Firmware changes:
- Set is_backbone(true) on backbone TCP interface
- Rename default TcpInterface name to BackboneInterface
- Update comments for dual-use TcpInterface (backbone + local AP)
- Use vendored lib/microReticulum instead of PlatformIO registry