Per the symbol-bloat analysis of the latest ESP32-S3 Blink build (fbuild, post #2968):
Total flash: 330,054 B across 3,428 live symbols
8 distinct fl::* functions land in the 1-4 KB band, totaling 13,738 B
ESP-IDF / libc cluster (~28 KB libc printf, ~10 KB ESP-IDF drivers, ~5 KB Arduino HAL) dominates the absolute top, but those are vendor-controlled. The 1-4 KB band is where FastLED has real leverage.
The smoking-gun pattern
Every single 1-4 KB FastLED symbol is dominated by FL_WARN/FL_ERROR/FL_INFO machinery. All 8 were checked, and all call into the same set of logging callees.
Those callees are referenced, so each costs its full size only once. The per-call-site inline cost is what bloats individual functions:
Per FL_WARN/FL_ERROR call site:
~200-300 B inline (sstream ctor + format << chain + log_emit call)
~100-200 B exception landing pad metadata (string/sstream dtors can throw)
A function with 3-5 log sites ends up with ~1-2 KB of pure logging overhead.
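As a rough illustration of where the per-call-site bytes come from, the sketch below mimics the shape of a stream-based log macro. It is not FastLED's actual FL_WARN definition: SKETCH_WARN, log_emit_sketch, g_last, and configure are hypothetical names, and std::ostringstream stands in for fl::sstream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical sink: records the last message so the behaviour can be checked.
static std::string g_last;

// Hypothetical stand-in for fl::detail::log_emit. It exists once in the binary
// and is shared by every call site.
static void log_emit_sketch(const char* file, int line, std::ostringstream& s) {
    assert(file != nullptr);
    g_last = std::string(file) + ":" + std::to_string(line) + " " + s.str();
}

// Each expansion constructs a stream, runs the << chain, calls the emitter,
// and destroys the stream. All of that is inlined into the calling function,
// and the compiler also emits cleanup code for the stream in case of a throw.
#define SKETCH_WARN(X)                                       \
    do {                                                     \
        std::ostringstream sketch_stream_;                   \
        sketch_stream_ << X;                                 \
        log_emit_sketch(__FILE__, __LINE__, sketch_stream_); \
    } while (0)

// A caller with one log site. Every additional SKETCH_WARN repeats the
// inline sequence above inside this function's body.
static int configure(int channels) {
    if (channels > 8) {
        SKETCH_WARN("too many channels: " << channels);
        return -1;
    }
    return 0;
}
```

The emitter is paid for once, while the macro body is paid for at every use, which is the asymmetry the per-call-site numbers above describe.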
Constraint
Logging must stay enabled: FASTLED_LOG_VERBOSITY=1 (the default for debug builds and when SKETCH_HAS_LARGE_MEMORY) must continue to produce full log output. Sub-issues that try to no-op log call sites are off the table.
What IS in scope: make the logging infrastructure itself cheaper. Examples:
Mark string/sstream destructors noexcept -> eliminate landing pads (saves ~100-200 B per call site)
Push more inline boilerplate into log_emit -> shrink the per-call-site burst
Consolidate multiple log sites in one function through expected<>::failure(...) carrying the error info to a single bottom-of-function log
Specialize basic_string hot path so writes from sstream don't pay the variant-storage dispatch
Extract validation/error paths into [[gnu::cold]] helpers (out-of-line, doesn't bloat the caller)
Convert dynamic-driver dispatch (which currently inlines multiple writer templates) to virtual dispatch
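One of the options listed above, moving validation and error paths into [[gnu::cold]] helpers, might look roughly like the following. This is a minimal sketch under stated assumptions, not code from FastLED: report_bad_priority, add_driver_sketch, and g_emit_count are hypothetical, and plain snprintf stands in for the real logging backend. The attributes are GCC/Clang extensions.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical counter so the sketch can be checked without parsing stderr.
static int g_emit_count = 0;

// Cold and never inlined: the formatting code exists once, out of line,
// instead of being expanded inside every caller that can hit this error.
[[gnu::cold, gnu::noinline]]
static void report_bad_priority(const char* file, int line, int priority) {
    assert(file != nullptr);
    char buf[128];
    std::snprintf(buf, sizeof(buf), "%s:%d: invalid driver priority %d",
                  file, line, priority);
    std::fputs(buf, stderr);
    std::fputc('\n', stderr);
    ++g_emit_count;
}

// The caller keeps only a compare, a call, and a return on its error path.
static bool add_driver_sketch(int priority) {
    if (priority < 0) {
        report_bad_priority(__FILE__, __LINE__, priority);
        return false;
    }
    return true;
}
```

The log output is unchanged, which keeps this approach within the constraint above; only the location of the formatting code moves.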
Background
Logging callees shared by all 8 functions in the 1-4 KB band:
fl::detail::log_emit(log_kind, char const*, int, fl::sstream&)
fl::fastled_file_offset(char const*)
fl::sstream::appendFormatted(long)
fl::sstream::appendFormatted(unsigned long)
fl::string::string()
fl::basic_string::~basic_string()
fl::basic_string::append(char const*)
_Unwind_Resume (C++ exception unwinding)
__stack_chk_fail + __stack_chk_guard
Sub-issues (will be linked as filed)
Cross-cutting (affects all 8 functions):
A. Mark fl::basic_string + fl::sstream destructors noexcept to eliminate per-call-site landing pad metadata
B. Push more inline boilerplate into log_emit (shrink call-site cost from ~200 B to ~50 B)
C. Specialize the fl::basic_string::write(char const*, unsigned int) hot path
Per-function:
fl::ChannelEngineRMTImpl::reconfigureForNetwork() (2,378 B) + createChannel(...) (2,198 B): consolidate error paths through a single log site via expected<>::failure
fl::Channel::showPixels(PixelController<RGB,1,-1>&) (1,884 B): virtualize dynamic driver dispatch to stop inlining multiple writer templates
fl::ChannelManager::addDriver(int, shared_ptr<IChannelDriver>) (1,756 B): extract validation/log paths into a [[gnu::cold]] helper
fl::RmtMemoryManager::handleAllocateTxFailure(...) (1,710 B): extract format-and-log machinery into a shared cold helper
fl::detail::Rmt5EncoderImpl::initialize(...) (1,466 B): consolidate multiple log sites + extract encoder-builder error wrapping
fl::ChannelEngineRMTImpl::processPendingChannels() (1,078 B): reduce dispatch-loop landing pad count
Expected combined savings
If the cross-cutting fixes (A + B + C) land and at least 4 of the per-function fixes ship, we should recover ~6-8 KB of flash on ESP32-S3 Blink — roughly 2% of total flash, but more importantly it removes the pattern that makes every FastLED function pay the logging tax twice (call site + cleanup metadata).
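Several of the per-function items rely on routing multiple error paths to one log site. A minimal sketch of that shape follows, assuming a simplified result type rather than the real expected<> API, whose exact interface is not shown here. Result, Failure, log_failure, the check_* helpers, and the pin limit of 48 are all hypothetical.

```cpp
#include <cassert>

// Hypothetical error payload carried from the failing check to the log site.
struct Failure { const char* what; int value; };

// Simplified stand-in for an expected<>-style return value.
struct Result { bool ok; Failure failure; };

static int g_log_calls = 0;                     // how often the log site ran
static Failure g_last_failure = {nullptr, 0};   // what it was asked to log

// The only place that would expand the logging macro in the real code.
static void log_failure(const Failure& f) {
    assert(f.what != nullptr);
    g_last_failure = f;
    ++g_log_calls;
}

// Checks return a failure value instead of logging on the spot.
static Result check_pin(int pin) {
    if (pin < 0 || pin > 48) return {false, {"pin out of range", pin}};
    return {true, {nullptr, 0}};
}

static Result check_blocks(int blocks) {
    if (blocks < 1) return {false, {"no memory blocks requested", blocks}};
    return {true, {nullptr, 0}};
}

// Two failure sources, one log site at the bottom of the function.
static bool create_channel_sketch(int pin, int blocks) {
    Result r = check_pin(pin);
    if (r.ok) r = check_blocks(blocks);
    if (!r.ok) {
        log_failure(r.failure);
        return false;
    }
    return true;
}
```

With this layout the inline logging sequence and its cleanup metadata appear once per function, however many checks feed into it.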
Investigation tooling
.build/symbols/esp32s3-fbuild/report.md (3,428 symbols)
.build/symbols/esp32s3-fbuild/graphs/0001..0269.dot
ci/tmp/fl_bloat_investigation.py
.build/fl_bloat_investigation.md
Refs