gem5-dev@gem5.org

The gem5 Developer List

Request for Assistance with Cache Wear-Leveling Modifications in gem5

Jatin Verma
Mon, Mar 3, 2025 11:01 AM

Dear gem5 Developers,

I am implementing a wear-leveling mechanism in the cache subsystem of
gem5 as part of my research. My approach combines intra-set and inter-set
wear-leveling by dynamically selecting cache sets based on write counts.

I have modified several functions in CacheMemory.cc, including:

  • allocateL2(), cacheAvailL2(), cacheProbeL2(), and isTagPresentL2() to
    ensure allocations are directed toward the least-worn sets.
  • incrementWriteCount() to track write operations and maintain per-set
    and per-way write counts.
  • checkAndUpdateInterval() and analyzeWriteCounts() to periodically
    analyze write patterns and update a bitmap marking high-wear sets.
  • deallocate() to handle deallocation across adjacent sets.
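
For concreteness, the write-count bookkeeping described above can be sketched roughly as follows. This is a simplified standalone illustration, not the attached code: the class name WearTracker, the threshold-based marking rule, and every member other than the function names already listed above are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone sketch of per-set and per-way write tracking.
// It does not use any gem5 types and is not the attached CacheMemory code.
class WearTracker {
  public:
    WearTracker(int numSets, int assoc, uint64_t interval, uint64_t threshold)
        : m_assoc(assoc), m_interval(interval), m_threshold(threshold),
          m_wayWrites(static_cast<std::size_t>(numSets) * assoc, 0),
          m_setWrites(numSets, 0), m_highWear(numSets, false) {}

    // Record one write to (set, way); re-analyze once every m_interval writes.
    void incrementWriteCount(int set, int way) {
        ++m_wayWrites[index(set, way)];
        ++m_setWrites[set];
        if (++m_writesSinceAnalysis >= m_interval) {
            analyzeWriteCounts();
            m_writesSinceAnalysis = 0;
        }
    }

    // Rebuild the bitmap: a set is high-wear if its count exceeds the
    // threshold (assumed rule; the real criterion may differ).
    void analyzeWriteCounts() {
        for (std::size_t s = 0; s < m_setWrites.size(); ++s)
            m_highWear[s] = m_setWrites[s] > m_threshold;
    }

    bool isHighWear(int set) const { return m_highWear[set]; }

    // Intra-set choice: the way with the fewest recorded writes.
    int leastWornWay(int set) const {
        int best = 0;
        for (int w = 1; w < m_assoc; ++w)
            if (m_wayWrites[index(set, w)] < m_wayWrites[index(set, best)])
                best = w;
        return best;
    }

  private:
    std::size_t index(int set, int way) const {
        return static_cast<std::size_t>(set) * m_assoc + way;
    }

    int m_assoc;
    uint64_t m_interval;
    uint64_t m_threshold;
    uint64_t m_writesSinceAnalysis = 0;
    std::vector<uint64_t> m_wayWrites;  // one counter per (set, way)
    std::vector<uint64_t> m_setWrites;  // one counter per set
    std::vector<bool> m_highWear;       // bitmap of high-wear sets
};
```

All storage in this sketch is sized once in the constructor, so its memory footprint stays fixed no matter how long the simulation runs.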

Issues Faced:

  1. Process gets killed: Occasionally, during execution, the simulation
    terminates with the message "Killed," which might indicate excessive memory
    usage.
  2. Panic: Invalid Transition: I encounter an "Invalid transition"
    panic during cache accesses, particularly in allocateL2(). I suspect
    this is related to handling of cache permissions or an incorrect state
    update in the coherence protocol.

I would greatly appreciate any insights into debugging these issues.
Specifically:

  • Could modifying allocateL2() to select neighboring sets for allocation
    be causing unintended state transitions?
  • Are there any best practices for debugging "Killed" errors in gem5?
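
To make the first question concrete, here is a minimal toy model of the situation I have in mind. It is only a sketch under simplifying assumptions (one tag per set, 64-byte lines, allocation always redirected to the next set), and ToyCache and all of its members are hypothetical names rather than anything in gem5 or in the attached files. It shows how a block allocated in a neighboring set becomes invisible to a lookup that probes only the home set, which is the kind of mismatch that might lead the protocol to see an unexpected state for an address.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical toy model: one tag per set, 64-byte cache lines.
struct ToyCache {
    explicit ToyCache(int numSets)
        : tags(numSets, 0), valid(numSets, false) {}

    // Conventional index: line address modulo the number of sets.
    int homeSet(uint64_t addr) const {
        return static_cast<int>((addr >> 6) % tags.size());
    }

    // Allocation redirected to the neighboring set (assumed policy);
    // the chosen set is remembered in a side table.
    void allocateRedirected(uint64_t addr) {
        int set = (homeSet(addr) + 1) % static_cast<int>(tags.size());
        tags[set] = addr;
        valid[set] = true;
        redirect[addr] = set;
    }

    // Lookup that probes only the home set, as an unmodified lookup would.
    bool isTagPresentHomeOnly(uint64_t addr) const {
        int set = homeSet(addr);
        return valid[set] && tags[set] == addr;
    }

    // Lookup that follows the recorded redirect before probing.
    bool isTagPresentRedirectAware(uint64_t addr) const {
        auto it = redirect.find(addr);
        int set = (it != redirect.end()) ? it->second : homeSet(addr);
        return valid[set] && tags[set] == addr;
    }

    std::vector<uint64_t> tags;
    std::vector<bool> valid;
    std::unordered_map<uint64_t, int> redirect;  // grows with every new address
};
```

In this toy model, after allocateRedirected(0x1000) on an 8-set cache, isTagPresentHomeOnly(0x1000) returns false while isTagPresentRedirectAware(0x1000) returns true. The redirect table here is never pruned, so a structure like it would also grow for the whole length of a run, which may be relevant to the second question.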

I have attached the modified CacheMemory.cc and CacheMemory.hh files,
along with MESI_Two_Level_L2-cache.sm.

Any guidance or suggestions on debugging strategies would be immensely
helpful.

Best regards,
Jatin Verma
