
fix: log eigenvalue monitor values #8049

Open
he-yufeng wants to merge 1 commit into deepspeedai:master from he-yufeng:fix/eigenvalue-monitor-values

Conversation

@he-yufeng

Summary

  • replace the nonexistent self.ev_values reference in eigenvalue monitor logging
  • avoid indexing the dict_values view returned by self.block_eigenvalue.values()
  • add a small regression helper/test for the event payloads
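A minimal sketch of the two fixes listed above, not the PR's actual code. It assumes block_eigenvalue is a dict whose values are tuples with the eigenvalue in the first position; the helper name build_eigenvalue_events and the tag prefix are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Tuple

# (tag, value, step) triple, the shape assumed here for a monitor event.
Event = Tuple[str, float, int]


def build_eigenvalue_events(
    block_eigenvalue: Dict[Any, Tuple[float, int]],
    global_samples: int,
    tag_prefix: str = "Train/Eigenvalues/ModelBlockParam_",  # illustrative tag
) -> List[Event]:
    """Build one event per block without indexing the dict_values view."""
    events: List[Event] = []
    # Iterate the view directly; enumerate supplies the positional index
    # that the original code tried to obtain via ev_values[i].
    for i, entry in enumerate(block_eigenvalue.values()):
        events.append((f"{tag_prefix}{i}", entry[0], global_samples))
    return events


def indexing_a_values_view_fails() -> bool:
    """Show why ev_values[i] cannot work on a dict_values view."""
    view = {"block0": (0.5, 0)}.values()
    try:
        view[0]  # dict_values is not subscriptable
    except TypeError:
        return True
    return False
```

Wrapping the view in list(...) before indexing would also work; iterating with enumerate just avoids the extra copy.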

Fixes #7983.

To verify

  • python -m py_compile deepspeed\runtime\engine.py tests\unit\runtime\test_engine_eigenvalue.py
  • git diff --check
  • AST-extracted helper smoke test: passed

I also tried python -m pytest tests\unit\runtime\test_engine_eigenvalue.py -q on this Windows checkout. Collection failed before the test body ran because importing deepspeed.accelerator does not resolve correctly in this clone/environment. The touched code and the helper itself both compile, and the helper smoke test exercises the exact event construction without importing the full package.
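For readers unfamiliar with the AST-extracted smoke test approach, here is a rough sketch of how a single helper can be pulled out of a source file and run without importing its package. The function load_function_from_source, the sample source text, and the helper name _eigenvalue_events are all hypothetical; the actual script used for this PR is not shown in the conversation.

```python
import ast
from typing import Callable


def load_function_from_source(source: str, name: str) -> Callable:
    """Compile only the named function definition from `source`.

    Module-level imports in `source` are never executed, so a package
    that fails to import does not block the smoke test.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            # Wrap the single definition in a fresh module and compile it.
            module = ast.Module(body=[node], type_ignores=[])
            ast.fix_missing_locations(module)
            namespace: dict = {}
            exec(compile(module, filename="<extracted>", mode="exec"), namespace)
            return namespace[name]
    raise LookupError(f"function {name!r} not found")


# Stand-in for a real source file; the import below would fail if executed.
SAMPLE_SOURCE = '''
import package_that_cannot_be_imported_here

def _eigenvalue_events(block_eigenvalue, global_samples):
    return [
        (f"block_{i}", entry[0], global_samples)
        for i, entry in enumerate(block_eigenvalue.values())
    ]
'''

helper = load_function_from_source(SAMPLE_SOURCE, "_eigenvalue_events")
smoke_result = helper({"p0": (0.5, 0)}, 7)
```

This only works for helpers that do not depend on module-level names, which is one reason to keep such a helper free of package imports.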

@he-yufeng force-pushed the fix/eigenvalue-monitor-values branch from e1d9360 to 9b2996a on June 5, 2026 08:30
Signed-off-by: Yufeng He <40085740+he-yufeng@users.noreply.github.com>
@he-yufeng force-pushed the fix/eigenvalue-monitor-values branch from 9b2996a to 483fbba on June 6, 2026 19:03
@he-yufeng
Author

Rebased this branch onto the latest upstream master and force-pushed the cleaned head.

Validated locally:

  • python -m py_compile deepspeed/runtime/engine.py tests/unit/runtime/test_engine_eigenvalue.py
  • git diff --check origin/master..HEAD

I also tried the focused pytest target locally:

  • python -m pytest -q tests/unit/runtime/test_engine_eigenvalue.py

That collection failed in this Windows checkout before running the test because deepspeed/accelerator is checked out as a small link file containing ../accelerator/, so importing deepspeed.accelerator fails locally. The prior upstream CI for this PR had the DeepSpeed test job green; this rebase should let CI rerun on a normal Linux checkout.
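The link-file situation described above can be detected with a small check like the following. This is only a sketch under the assumption that an unresolved link shows up as a tiny regular file holding a single relative path; the function name and the size threshold are hypothetical.

```python
from pathlib import Path


def looks_like_unresolved_link(path: Path, max_bytes: int = 256) -> bool:
    """Heuristic: a small regular file whose only content is a relative path."""
    if path.is_symlink() or not path.is_file():
        return False  # real symlinks and directories are fine
    if path.stat().st_size > max_bytes:
        return False
    text = path.read_text(errors="replace").strip()
    if not text or "\n" in text:
        return False
    return text.startswith("../") or text.startswith("./")
```

Running it against deepspeed/accelerator in a checkout would indicate whether the import failure comes from the checkout rather than from the change under test.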

Development

Successfully merging this pull request may close these issues.

Incorrect variable name: self.ev_values
