gem5-users@gem5.org

The gem5 Users mailing list

Creating Checkpoint with KVM cpu and restore with Timing cpu

Hansika Weerasena
Fri, Jun 30, 2023 10:30 PM

Hi,

I want to boot Linux with X86KvmCPU and then switch to TimingCPU to run some benchmarks, which need to run with the Garnet interconnection network. I’m planning to do the switch via a checkpoint (restoring the checkpoint with TimingCPU).

Following is the command I used to run gem5 in full system mode and create the checkpoint. The details on versions can be found at the end.

./build/X86/gem5.opt configs/example/fs.py --disk-image=dist/x86-ubuntu.img --kernel=dist/vmlinux-4.4.186 --num-cpus=4 --num-dirs=4 --cpu-type=X86KvmCPU --cpu-clock=2GHz --caches --l1d_size=16kB --l1i_size=16kB --l2cache --num-l2cache=4 --mem-type=SimpleMemory --mem-size=2GB --ruby --network=garnet --topology=Mesh_XY --mesh-rows=2 --script=config/boot/hack_back_ckpt.rcS

The last message in system.pc.com_1.device is "Checkpointing simulation...", and I can see that a cpt.<clocknumber> folder has been created in the m5out directory. However, the simulation has been hanging there for more than two hours. I would like to know:

  1. Is there anything I'm doing wrong or missing, or does checkpointing usually take hours like this?
  2. Does checkpointing work with KvmCPU?
  3. Is there a better way to do fast forwarding without using checkpoints? (I cannot use the new gem5 standard library because it does not support Garnet, so I need to use fs.py.)

gem5 version : 22.0.0.1

disk image : x86-ubuntu-18.04-img from the gem5 resources : https://resources.gem5.org/resources/x86-ubuntu-18.04-img?database=gem5-resources&version=1.0.0

kernel : x86-linux-kernel-4.4.186 from gem5 resources : https://resources.gem5.org/resources/x86-linux-kernel-4.4.186?version=1.0.0

Regards,
Hansika Weerasena

Ayaz Akram
Sat, Jul 1, 2023 6:39 PM

Hi Hansika,

> 1. Is there anything I'm doing wrong or missing or does checkpointing
> usually take hours like this?

I don't think there is anything wrong with how you are trying to create
the checkpoint. If the cpt.tick_value directory is created and has contents
in it, that means the checkpoint has been created. Depending on the size of
the checkpoint and your filesystem, taking the checkpoint can be slow, but it
should not take hours. I actually tried your command with the same
kernel/disk versions on the gem5 "release-staging-v23-0" branch, and
things seem to work fine for me, i.e., the checkpoint is taken and the
simulation terminates after that.
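
As a quick sanity check, the contents of each cpt.* directory under m5out
can be listed with a few lines of Python. This is only a minimal sketch and
not part of gem5. It assumes the usual layout in which a checkpoint
directory holds an m5.cpt file; the function name inspect_checkpoints is
made up for illustration, and a non-empty m5.cpt does not by itself prove
that the checkpoint was written completely.

```python
from pathlib import Path


def inspect_checkpoints(outdir):
    """Return (directory name, m5.cpt present and non-empty, total bytes)
    for every cpt.* directory directly under outdir.

    Assumption: a checkpoint directory is named cpt.<tick> and holds an
    m5.cpt file next to the memory image files.
    """
    root = Path(outdir)
    if not root.is_dir():
        return []

    results = []
    for cpt_dir in sorted(root.glob("cpt.*")):
        if not cpt_dir.is_dir():
            continue
        m5_cpt = cpt_dir / "m5.cpt"
        has_m5_cpt = m5_cpt.is_file() and m5_cpt.stat().st_size > 0
        # Sum the sizes of all regular files in the checkpoint directory.
        total_bytes = sum(
            f.stat().st_size for f in cpt_dir.rglob("*") if f.is_file()
        )
        results.append((cpt_dir.name, has_m5_cpt, total_bytes))
    return results


if __name__ == "__main__":
    for name, has_m5_cpt, size in inspect_checkpoints("m5out"):
        status = "m5.cpt present" if has_m5_cpt else "m5.cpt missing or empty"
        print(f"{name}: {size} bytes, {status}")
```

If a directory shows up with m5.cpt missing or empty, or its size is still
growing, the checkpoint is probably still being written or was interrupted.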

> 3. Is there a better way to do fast forwarding without using
> checkpoints? (I cannot use new gem5 standard library because it does
> not support garnet, therefore I need to use fs.py)

The alternative is to just switch CPUs at runtime (once you are about to
start your benchmark of interest). An example of how to do that (assuming
that you cannot use the standard library) can be found in older versions of
gem5-resources (for example:
https://gem5.googlesource.com/public/gem5-resources/+/refs/tags/v21.0.1.0/src/npb/configs/run_npb.py).
However, please note that those scripts might not work with the latest gem5
version and may require small changes.
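
To give a rough idea of the runtime switch, here is a minimal sketch. It is
not taken from run_npb.py. m5.simulate() and m5.switchCpus() are gem5's
Python interface, while the function name run_with_cpu_switch and its
parameter names are hypothetical. The m5 module is passed in as an argument
only so that the sketch can be read and tried outside a gem5 binary. It
assumes that m5.instantiate() has already been called, that the detailed
CPUs were added to the system with switched_out=True before instantiation,
and that the guest script calls "m5 exit" at the point where the switch
should happen.

```python
def run_with_cpu_switch(m5, system, fast_cpus, detailed_cpus):
    """Boot on fast_cpus, then continue the simulation on detailed_cpus.

    Assumptions (not checked here): `m5` is gem5's Python module,
    m5.instantiate() was already called, and every CPU in detailed_cpus
    was created with switched_out=True.
    """
    if len(fast_cpus) != len(detailed_cpus):
        raise ValueError("need exactly one detailed CPU per fast CPU")

    # Phase 1: run on the fast (e.g. KVM) CPUs until the guest triggers an
    # exit event, for example via `m5 exit` right before the benchmark.
    event = m5.simulate()
    print("First exit:", event.getCause())

    # Phase 2: hand each fast CPU's state over to its detailed counterpart.
    # m5.switchCpus expects a list of (old_cpu, new_cpu) pairs.
    m5.switchCpus(system, list(zip(fast_cpus, detailed_cpus)))

    # Phase 3: run the region of interest on the detailed CPUs.
    event = m5.simulate()
    print("Second exit:", event.getCause())
    return event
```

Inside a real config script this would be called after "import m5" and
m5.instantiate(), with the script's own CPU lists passed in place of
fast_cpus and detailed_cpus.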

-Ayaz


gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-leave@gem5.org

Hansika Weerasena
Sat, Jul 1, 2023 9:45 PM

Hi Akram,

Thank you very much. Somehow, in my setup it seems to hang forever. Even though the checkpoint directory is created, I can’t load from it, so I think the checkpointing does not complete. If you can run the same command to completion, it must be an issue with my setup.

I will try the alternative method with switching CPUs at runtime. Thank you for the pointer to the script.
