/* SPDX-License-Identifier: GPL-2.0 */
#if !defined(KVM_X86_OP) || !defined(KVM_X86_OP_OPTIONAL)
BUILD_BUG_ON(1)
#endif

/*
 * KVM_X86_OP() and KVM_X86_OP_OPTIONAL() are used to help generate
 * both DECLARE/DEFINE_STATIC_CALL() invocations and
 * "static_call_update()" calls.
 *
 * KVM_X86_OP_OPTIONAL() can be used for those functions that can have
 * a NULL definition, for example if "static_call_cond()" will be used
 * at the call sites. KVM_X86_OP_OPTIONAL_RET0() can be used likewise
 * to make a definition optional, but in this case the default will
 * be __static_call_return0.
*/
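/*
 * Illustrative sketch only: the macro definitions are owned by the
 * including files (e.g. arch/x86/include/asm/kvm_host.h for the
 * DECLARE_STATIC_CALL() side); the lines below are a simplified
 * approximation of how a consumer instantiates this header, not the
 * authoritative definitions:
 *
 *	#define KVM_X86_OP(func) \
 *		DECLARE_STATIC_CALL(kvm_x86_##func, \
 *				    *(((struct kvm_x86_ops *)0)->func));
 *	#define KVM_X86_OP_OPTIONAL KVM_X86_OP
 *	#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
 *	#include <asm/kvm-x86-ops.h>
 *
 * Each KVM_X86_OP(name) line below then expands once per hook,
 * generating one static call declaration per kvm_x86_ops member.
 */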
KVM_X86_OP(check_processor_compatibility)
KVM_X86_OP(hardware_enable)
KVM_X86_OP(hardware_disable)
KVM_X86_OP(hardware_unsetup)
KVM_X86_OP(has_emulated_msr)
KVM_X86_OP(vcpu_after_set_cpuid)
KVM_X86_OP(vm_init)
KVM_X86_OP_OPTIONAL(vm_destroy)
KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate)
KVM_X86_OP(vcpu_create)
KVM_X86_OP(vcpu_free)
KVM_X86_OP(vcpu_reset)
KVM_X86_OP(prepare_switch_to_guest)
KVM_X86_OP(vcpu_load)
KVM_X86_OP(vcpu_put)
KVM_X86_OP(update_exception_bitmap)
KVM_X86_OP(get_msr)
KVM_X86_OP(set_msr)
KVM_X86_OP(get_segment_base)
KVM_X86_OP(get_segment)
KVM_X86_OP(get_cpl)
KVM_X86_OP(set_segment)
KVM_X86_OP(get_cs_db_l_bits)
KVM_X86_OP(is_valid_cr0)
KVM_X86_OP(set_cr0)
KVM_X86_OP_OPTIONAL(post_set_cr3)
KVM_X86_OP(is_valid_cr4)
KVM_X86_OP(set_cr4)
KVM_X86_OP(set_efer)
KVM_X86_OP(get_idt)
KVM_X86_OP(set_idt)
KVM_X86_OP(get_gdt)
KVM_X86_OP(set_gdt)
KVM_X86_OP(sync_dirty_debug_regs)
KVM_X86_OP(set_dr7)
KVM_X86_OP(cache_reg)
KVM_X86_OP(get_rflags)
KVM_X86_OP(set_rflags)
KVM_X86_OP(get_if_flag)
KVM_X86_OP(flush_tlb_all)
KVM_X86_OP(flush_tlb_current)
#if IS_ENABLED(CONFIG_HYPERV)
KVM_X86_OP_OPTIONAL(flush_remote_tlbs)
KVM_X86_OP_OPTIONAL(flush_remote_tlbs_range)
#endif
KVM_X86_OP(flush_tlb_gva)
KVM_X86_OP(flush_tlb_guest)
KVM_X86_OP(vcpu_pre_run)
KVM_X86_OP(vcpu_run)
KVM_X86_OP(handle_exit)
KVM_X86_OP(skip_emulated_instruction)
KVM_X86_OP_OPTIONAL(update_emulated_instruction)
KVM_X86_OP(set_interrupt_shadow)
KVM_X86_OP(get_interrupt_shadow)
KVM_X86_OP(patch_hypercall)
KVM_X86_OP(inject_irq)
KVM_X86_OP(inject_nmi)
KVM_X86_OP_OPTIONAL_RET0(is_vnmi_pending)
KVM_X86_OP_OPTIONAL_RET0(set_vnmi_pending)
KVM_X86_OP(inject_exception)
KVM_X86_OP(cancel_injection)
KVM_X86_OP(interrupt_allowed)
KVM_X86_OP(nmi_allowed)
KVM_X86_OP(get_nmi_mask)
KVM_X86_OP(set_nmi_mask)
KVM_X86_OP(enable_nmi_window)
KVM_X86_OP(enable_irq_window)
KVM_X86_OP_OPTIONAL(update_cr8_intercept)
KVM_X86_OP(refresh_apicv_exec_ctrl)
KVM_X86_OP_OPTIONAL(hwapic_irr_update)
KVM_X86_OP_OPTIONAL(hwapic_isr_update)
KVM_X86_OP_OPTIONAL_RET0(guest_apic_has_interrupt)
KVM_X86_OP_OPTIONAL(load_eoi_exitmap)
KVM_X86_OP_OPTIONAL(set_virtual_apic_mode)
KVM_X86_OP_OPTIONAL(set_apic_access_page_addr)
KVM_X86_OP(deliver_interrupt)
KVM_X86_OP_OPTIONAL(sync_pir_to_irr)
KVM_X86_OP_OPTIONAL_RET0(set_tss_addr)
KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr)
KVM_X86_OP_OPTIONAL_RET0(get_mt_mask)
KVM_X86_OP(load_mmu_pgd)
KVM_X86_OP(has_wbinvd_exit)
KVM_X86_OP(get_l2_tsc_offset)
KVM_X86_OP(get_l2_tsc_multiplier)
KVM_X86_OP(write_tsc_offset)
KVM_X86_OP(write_tsc_multiplier)
KVM_X86_OP(get_exit_info)
KVM_X86_OP(check_intercept)
KVM_X86_OP(handle_exit_irqoff)
KVM_X86_OP(sched_in)
KVM_X86_OP_OPTIONAL(update_cpu_dirty_logging)
KVM_X86_OP_OPTIONAL(vcpu_blocking)
KVM_X86_OP_OPTIONAL(vcpu_unblocking)
KVM_X86_OP_OPTIONAL(pi_update_irte)
KVM_X86_OP_OPTIONAL(pi_start_assignment)
KVM_X86_OP_OPTIONAL(apicv_pre_state_restore)
KVM_X86_OP_OPTIONAL(apicv_post_state_restore)
KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt)
KVM_X86_OP_OPTIONAL(set_hv_timer)
KVM_X86_OP_OPTIONAL(cancel_hv_timer)
KVM_X86_OP(setup_mce)
#ifdef CONFIG_KVM_SMM
KVM_X86_OP(smi_allowed)
KVM_X86_OP(enter_smm)
KVM_X86_OP(leave_smm)
KVM_X86_OP(enable_smi_window)
#endif
KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
KVM_X86_OP_OPTIONAL(mem_enc_register_region)
KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
KVM_X86_OP_OPTIONAL(vm_move_enc_context_from)
KVM_X86_OP_OPTIONAL(guest_memory_reclaimed)
KVM_X86_OP(get_msr_feature)
KVM_X86_OP(check_emulate_instruction)
KVM_X86_OP(apic_init_signal_blocked)
KVM_X86_OP_OPTIONAL(enable_l2_tlb_flush)
KVM_X86_OP_OPTIONAL(migrate_timers)
KVM_X86_OP(msr_filter_changed)
KVM_X86_OP(complete_emulated_msr)
KVM_X86_OP(vcpu_deliver_sipi_vector)
KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
KVM_X86_OP_OPTIONAL(get_untagged_addr)
KVM_X86_OP_OPTIONAL(alloc_apic_backing_page)

#undef KVM_X86_OP
#undef KVM_X86_OP_OPTIONAL
#undef KVM_X86_OP_OPTIONAL_RET0