gem5-users@gem5.org

The gem5 Users mailing list

Transient execution during Page Fault - X86 O3 FS simulation

RG
reverent.green@web.de
Fri, Dec 29, 2023 6:32 PM

Hello everyone,

I am currently looking into transient execution vulnerabilities using the gem5 simulator. I successfully tried out the Spectre example and want to create something similar for Meltdown.

For these experiments, I am using the O3 CPU model.

Now my question:

When I remove the user/supervisor check in gem5 specifically for my address, the PoC leaks the "secret" value via Flush+Reload without a problem, because the value can be accessed and should therefore be in the cache. But when the page fault is raised, the value is not leaked.
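For readers unfamiliar with the receiver side, here is a minimal sketch of the Flush+Reload idea as a toy model. It does not measure real timing: the "cache" is just a Python set of line indices, and `ToyCache`, `flush_reload`, `leaking_victim`, the 4096-byte stride and the byte 0x53 are all illustrative assumptions, not part of the PoC or of gem5.

```python
# Toy model of Flush+Reload. A real PoC would use clflush and a cycle counter;
# here a hit is simply "the line index is in the set".
LINE_SIZE = 64
STRIDE = 4096  # one probe slot per page (assumed PoC layout)


class ToyCache:
    def __init__(self):
        self.lines = set()

    def flush(self, addr):
        # Evict the line holding addr, if present.
        self.lines.discard(addr // LINE_SIZE)

    def access(self, addr):
        # Return True on a hit; a miss fills the line.
        line = addr // LINE_SIZE
        hit = line in self.lines
        self.lines.add(line)
        return hit


def flush_reload(cache, victim):
    # Flush: evict every probe slot.
    for value in range(256):
        cache.flush(value * STRIDE)
    # Victim step: the secret-dependent access (possibly transient).
    victim(cache)
    # Reload: every slot that hits is a candidate for the leaked byte.
    return [v for v in range(256) if cache.access(v * STRIDE)]


def leaking_victim(cache, secret=0x53):
    # Touch the probe slot selected by the secret byte.
    cache.access(secret * STRIDE)


recovered = flush_reload(ToyCache(), leaking_victim)
no_leak = flush_reload(ToyCache(), lambda cache: None)
```

If the secret-dependent access never reaches the cache, the reload step finds no hit at all, which matches the "value is not leaked" symptom described above.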

I have been trying for multiple weeks to follow the execution chain for the specific PC and address shown for the triggered page fault, but so far I have failed to identify the exact problem.

Can someone explain how gem5 handles an instruction after this specific point? In my understanding, the effects of transient execution should still be visible after the security check. Where exactly are the instruction results "removed", such that they are no longer available in the cache, and in which stage of the out-of-order pipeline does this happen?

If you have further questions, I will try to answer them in as much detail as possible.

Any tips would be really appreciated, thank you in advance.

Kind regards

Robin

RG
reverent.green@web.de
Mon, Jan 8, 2024 9:04 AM

I analyzed a few debug log outputs.

The control flow diverges here: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/commit.cc#L1172, when the instruction triggers the page fault in the TLB. Whether or not I remove the fault check in the commit stage for my address, the instruction's behavior remains unchanged if the page fault was already triggered earlier.

Without PageFault: "MOVSX_B_R_M : ld t1b, DS:[rax] : MemRead : D=0x0000000000401753 A=0x7f5bd1b11000"

With PageFault: "MOVSX_B_R_M : ld t1b, DS:[rax] : MemRead : A=0x7f5bd1b11000"
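The difference between the two lines is easier to see when the fields are extracted programmatically. The sketch below assumes that `D=` is the data result and `A=` the effective address in this trace format; `trace_fields` is a hypothetical helper, not a gem5 utility.

```python
import re

# Matches the D=<hex> and A=<hex> fields at the end of a trace line.
FIELD = re.compile(r"\b([DA])=(0x[0-9a-fA-F]+)")


def trace_fields(line):
    # Return the D= and A= fields of one trace line as integers.
    return {key: int(value, 16) for key, value in FIELD.findall(line)}


no_fault = ("MOVSX_B_R_M : ld t1b, DS:[rax] : MemRead : "
            "D=0x0000000000401753 A=0x7f5bd1b11000")
with_fault = "MOVSX_B_R_M : ld t1b, DS:[rax] : MemRead : A=0x7f5bd1b11000"

no_fault_fields = trace_fields(no_fault)
with_fault_fields = trace_fields(with_fault)

# True when the trace line reports a data result for the load.
has_data_without_fault = "D" in no_fault_fields
has_data_with_fault = "D" in with_fault_fields
```

Both lines carry the same address, but only the line without the page fault reports a data value.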

Is my assumption correct that a page fault causes the instruction to be "ignored" at an earlier point, so that the value is never even computed, rather than computed up front and "deleted" afterwards?
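The two possibilities in this question can be written down as a toy model. This is only a sketch of the hypotheses, not gem5 code: `run_gadget`, its parameters and the byte 0x53 are made up for illustration, and the model does not settle which behavior gem5 implements.

```python
def run_gadget(fault_blocks_access, faults, secret=0x53):
    """Return the set of probe slots left in a toy cache.

    fault_blocks_access=True  models "the fault is detected early and the
                              load never returns data".
    fault_blocks_access=False models "the load returns data and the fault
                              is only acted on later, at commit".
    """
    probe_cache = set()

    # Step 1: the (possibly faulting) load of the secret.
    if faults and fault_blocks_access:
        data = None      # no memory access, so no value is produced
    else:
        data = secret    # value is produced, at least transiently

    # Step 2: the dependent probe access can only run if data exists.
    if data is not None:
        probe_cache.add(data)

    # Step 3: a fault at commit squashes architectural state, but this toy
    # cache is deliberately not rolled back.
    return probe_cache


early_ignore = run_gadget(fault_blocks_access=True, faults=True)
late_delete = run_gadget(fault_blocks_access=False, faults=True)
no_fault = run_gadget(fault_blocks_access=True, faults=False)
```

Under the first hypothesis the probe array is never touched, so Flush+Reload has nothing to observe; under the second, the footprint survives the squash. The two trace lines above, where the faulting load shows no `D=` value, look more like the first case, but that is an interpretation of the trace rather than a confirmed description of the simulator.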

Only without the page fault is the next part of the instruction also executed:

"MOVSX_B_R_M : sexti rcx, t1, 0x7 : IntAlu : D=0000000000000053"

Kind regards

Robin

