cpu-o3: O3 CPU inter-stage unblock signal causes 1-cycle bubble due to propagation delay #3191
Open
zhongchengyong wants to merge 1 commit into
When a downstream stage (decode/rename/IEW) blocks the upstream stage, it previously only sent the unblock signal after its skid buffer was completely drained. Due to signal propagation delay (e.g., decodeToFetchDelay), this caused a 1-cycle bubble before the upstream stage could resume sending instructions.

Fix: send unblock early when skidBuffer.size() <= stageDelay * stageWidth, i.e., when the remaining entries can be consumed during the signal propagation time. The stage only transitions to Running state when the skid buffer is truly empty, maintaining correctness.

Applied uniformly to decode->fetch, rename->decode, and IEW->rename paths. Remove stale assertions that assumed unblock is only received once.
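The early-unblock condition described above can be sketched in isolation. This is a minimal illustration, not the code in this PR: `StageStatus`, `UnblockDecision`, `decideOld`, and `decideEarly` are hypothetical names, and the sketch assumes the decision depends only on the current skid buffer occupancy, the signal delay in cycles, and the stage width.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for a stage's status; not gem5's actual enums.
enum class StageStatus { Running, Unblocking };

struct UnblockDecision {
    bool sendUnblock;    // whether to signal the upstream stage this cycle
    StageStatus status;  // state this stage should be in afterwards
};

// Old behavior as described: signal upstream only once the skid buffer
// has been drained completely.
inline UnblockDecision decideOld(std::size_t skidSize)
{
    const bool empty = (skidSize == 0);
    return {empty, empty ? StageStatus::Running : StageStatus::Unblocking};
}

// Early behavior as described: signal upstream as soon as the remaining
// entries can be consumed while the signal is in flight, i.e. when
// skidSize <= stageDelay * stageWidth. The stage itself still stays in
// Unblocking until the buffer is truly empty.
inline UnblockDecision decideEarly(std::size_t skidSize,
                                   unsigned stageDelay,
                                   unsigned stageWidth)
{
    assert(stageWidth > 0 && "a stage must process at least one entry per cycle");
    const std::size_t drainable =
        static_cast<std::size_t>(stageDelay) * stageWidth;
    const bool canSignal = (skidSize <= drainable);
    const bool empty = (skidSize == 0);
    return {canSignal, empty ? StageStatus::Running : StageStatus::Unblocking};
}
```

With an assumed delay of 1 cycle and width of 4, a buffer holding 4 entries already qualifies for the early signal while the stage remains in Unblocking; with 5 entries it does not. In this sketch the condition can also hold on more than one consecutive cycle, which is consistent with the note about assertions that assumed a single unblock.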
Contributor
@zhongchengyong: Hello, could you investigate the CI test failures on this PR when you get a chance? Thank you!
Fixes #3190