Hello everyone,
I’m an undergraduate student working on a project where I am evaluating a hardware accelerator, developed by the PhD student I work with, in a chiplet-based system. We are running garnet2.0, and porting our codebase to garnet3.0 would be infeasible.
So far, I’ve successfully set up a chiplet configuration in Python and achieved a 1:n (n chiplets) memory-to-LLC mapping. Unfortunately, certain workloads deadlocked, and this led us to discover that cache lines could not be shared across multiple chiplets (from memory -> cache).
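For concreteness, here is a simplified sketch of the kind of topology configuration I mean, written in the style of gem5’s configs/topologies/ files (e.g., Mesh_XY.py). The class name, the num_chiplets option, and the latency values are placeholders, not our actual code:

# A simplified, illustrative chiplet topology in the style of gem5's
# configs/topologies/ files. The class name, the num_chiplets option,
# and the latency values are placeholders.
from m5.params import *
from m5.objects import *

from topologies.BaseTopology import SimpleTopology

class SimpleChiplets(SimpleTopology):
    description = 'SimpleChiplets'

    def makeTopology(self, options, network, IntLink, ExtLink, Router):
        nodes = self.nodes
        num_chiplets = options.num_chiplets   # hypothetical option
        # Assumes the controller count divides evenly among chiplets.
        per_chiplet = len(nodes) // num_chiplets

        # One router per controller, and an external link attaching
        # each controller to its router.
        routers = [Router(router_id=i) for i in range(len(nodes))]
        network.routers = routers
        network.ext_links = [
            ExtLink(link_id=i, ext_node=n, int_node=routers[i])
            for i, n in enumerate(nodes)
        ]

        int_links = []

        def connect(src, dst, lat):
            # Garnet internal links are unidirectional, so create both
            # directions explicitly.
            for a, b in ((src, dst), (dst, src)):
                int_links.append(IntLink(link_id=len(int_links),
                                         src_node=a, dst_node=b,
                                         latency=lat))

        # Intra-chiplet: chain each chiplet's routers together
        # (a placeholder for a real on-chiplet mesh).
        for c in range(num_chiplets):
            base = c * per_chiplet
            for i in range(per_chiplet - 1):
                connect(routers[base + i], routers[base + i + 1], 1)

        # Inter-chiplet: slower die-to-die links between the first
        # router of adjacent chiplets (latency value is an assumption).
        for c in range(num_chiplets - 1):
            connect(routers[c * per_chiplet],
                    routers[(c + 1) * per_chiplet], 8)

        network.int_links = int_links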
Sharing cache lines across chiplets is a requirement for accurate simulation. I’ve been investigating the issue, but I wanted to ask here for insight. I did manage to create “cluster-private” LLC chiplets using built-in gem5 configurations, and it seems unlikely that feature would be provided without also allowing cache lines to be shared across the clusters.
Could anyone point me in the right direction?
Thanks,
Arteen
Hi Arteen,
This depends on the Ruby coherence protocol you are using. It sounds like you're using MESI_Three_Level or MESI_Two_Level, in which the LLC is a single logical cache banked across different instances. If you want something more like "cluster-private" LLC slices, you'll have to use a different coherence protocol. CHI may be the most flexible for your use case.
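To make "banked" concrete: in those protocols the home bank for a line is a pure function of the address, along these lines (a conceptual sketch, not gem5's actual code):

# Conceptual sketch (not gem5's actual code): a banked LLC gives every
# physical address exactly one home bank, typically by interleaving on
# the address bits just above the block offset.
BLOCK_BITS = 6   # 64-byte cache lines
NUM_BANKS = 4    # assume a power of two

def llc_bank_for(addr: int) -> int:
    return (addr >> BLOCK_BITS) & (NUM_BANKS - 1)

# Two requests to the same line always reach the same bank, so no line
# can be "owned" by two different LLC slices: the banks form a single
# logical cache, not per-cluster private caches.
print(llc_bank_for(0x1040))  # bank 1
print(llc_bank_for(0x1043))  # same line -> bank 1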
See the following documentation for more information:
https://www.gem5.org/documentation/general_docs/ruby/
https://www.gem5.org/documentation/general_docs/ruby/CHI/
In the sidebar you can find descriptions of each of the other protocols as well.
Cheers,
Jason
Hi Jason,
Thanks for your response. I’ve been looking into CHI and the other protocols, but there’s something I’m confused about.
You are right that we are using an extension of MESI_Three_Level, one my advisor has spent years developing. However, the standard MESI_Three_Level configuration specifically lets you specify the number of clusters in a design with cluster-private caches. As I mentioned in my original message, I did find that the clusters could be made private after I modified some of the SLICC protocol files.
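To be concrete about what I mean, here is a rough, standalone illustration of how I understand the cluster option to partition CPUs and L2 banks; this is my approximation, not gem5’s actual config code:

# Rough, standalone illustration (an approximation, not gem5's actual
# config code) of how the cluster option appears to partition CPUs and
# L2 banks in MESI_Three_Level.
def cluster_assignment(num_cpus, num_l2caches, num_clusters):
    cpus_per_cluster = num_cpus // num_clusters
    l2s_per_cluster = num_l2caches // num_clusters
    return [
        {
            "cpus": list(range(c * cpus_per_cluster,
                               (c + 1) * cpus_per_cluster)),
            "l2_banks": list(range(c * l2s_per_cluster,
                                   (c + 1) * l2s_per_cluster)),
        }
        for c in range(num_clusters)
    ]

# 8 CPUs and 8 L2 banks across 2 clusters: 4 CPUs and 4 L2 banks each.
for cluster in cluster_assignment(8, 8, 2):
    print(cluster)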
What is the motivation for that option if it is not possible to have the memory controller share cache lines across the clusters?
That question is the fundamental problem I am trying to solve to get our simulation working.
Best,
Arteen