Notes on Linux#
1 June, 2026 Issue with AMD GPU Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]#
When doing VFIO passthrough this can sometimes happen:
(...)
amdgpu 0000:03:00.0: failed to read discovery info from memory, vram size read: 0
amdgpu 0000:03:00.0: [drm] *ERROR* discovery failed: -2
amdgpu 0000:03:00.0: Fatal error during GPU init
amdgpu 0000:03:00.0: probe with driver amdgpu failed with error -2
^^^^^^^ dedicated GPU
amdgpu 0000:12:00.0: enabling device (0006 -> 0007)
amdgpu 0000:12:00.0: initializing kernel modesetting (IP DISCOVERY 0x1002:0x164E 0x1849:0x364E 0xC1).
^^^^^^^ integrated GPU
(...)
The card is left in a bad state and needs to be recovered, but the current kernel drivers do not do that. Re-attaching it to the host may then hang the system:
(...)
kernel: pci 0000:03:00.0: [1002:73bf] type 00 class 0x030000 PCIe Legacy Endpoint
kernel: pci 0000:03:00.0: BAR 0 [mem 0x00000000-0x0fffffff 64bit pref]
kernel: pci 0000:03:00.0: BAR 2 [mem 0x00000000-0x001fffff 64bit pref]
kernel: pci 0000:03:00.0: BAR 4 [io 0x0000-0x00ff]
kernel: pci 0000:03:00.0: BAR 5 [mem 0x00000000-0x000fffff]
kernel: pci 0000:03:00.0: ROM [mem 0x00000000-0x0001ffff pref]
kernel: pci 0000:03:00.0: PME# supported from D1 D2 D3hot D3cold
kernel: pci 0000:03:00.0: Adding to iommu group 14
kernel: pci 0000:03:00.0: vgaarb: bridge control possible
kernel: pci 0000:03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
kernel: pci 0000:03:00.0: BAR 0 [mem 0xf800000000-0xf80fffffff 64bit pref]: assigned
kernel: pci 0000:03:00.0: BAR 2 [mem 0xf810000000-0xf8101fffff 64bit pref]: assigned
kernel: pci 0000:03:00.0: BAR 5 [mem 0xf6800000-0xf68fffff]: assigned
kernel: pci 0000:03:00.0: ROM [mem 0xf6900000-0xf691ffff pref]: assigned
kernel: pci 0000:03:00.0: BAR 4 [io 0xf000-0xf0ff]: assigned
kernel: amdgpu 0000:03:00.0: enabling device (0000 -> 0003)
kernel: amdgpu 0000:03:00.0: initializing kernel modesetting (SIENNA_CICHLID 0x1002:0x73BF 0x148C:0x2406 0xC1).
kernel: amdgpu 0000:03:00.0: register mmio base: 0xF6800000
kernel: amdgpu 0000:03:00.0: register mmio size: 1048576
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: amdgpu 0000:03:00.0: trn=2 ACK should not assert! wait again !
kernel: xgpu_nv_mailbox_trans_msg: 2271 callbacks suppressed
In some instances this can happen:
(...)
[drm:amdgpu_preempt_mgr_init [amdgpu]] *ERROR* Failed to create device file mem_info_preempt_used
[drm:amdgpu_ttm_init.cold [amdgpu]] *ERROR* Failed initializing PREEMPT heap.
[drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP block <gmc_v10_0> failed -17
(...)
[drm:amdgpu_preempt_mgr_init [amdgpu]] *ERROR* Failed to create device file mem_info_preempt_used
amdgpu 0000:03:00.0: Failed initializing PREEMPT heap.
amdgpu 0000:03:00.0: sw_init of IP block <gmc_v10_0> failed -17
amdgpu 0000:03:00.0: amdgpu_device_ip_init failed
amdgpu 0000:03:00.0: Fatal error during GPU init
amdgpu 0000:03:00.0: finishing device.
amdgpu 0000:03:00.0: probe with driver amdgpu failed with error -17
BUG: kernel NULL pointer dereference, address: 0000000000000050
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: Oops: 0002 [#1] SMP NOPTI
CPU: 3 UID: 0 PID: 25412 Comm: rpc-libvirtd Tainted: G OE 7.0.10-arch1-1 #1 PREEMPT(full) b38726df0ec1c5aec6f05d4eab858505a5944d02
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: ASRock B650M Pro RS/B650M Pro RS, BIOS 3.08 09/18/2024
RIP: 0010:_raw_spin_lock+0x17/0x30
Code: 0b 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 ff 05 b0 29 f6 01 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 c3 cc cc cc cc 89 c6 e8 f7 01 00 00 90 c3 cc cc
RSP: 0018:ffffd144ca41fa38 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8e8955112480
RDX: 0000000000000001 RSI: ffffffff86ac8f80 RDI: 0000000000000050
RBP: ffff8e8f52adcb18 R08: 0000000000000000 R09: ffffffffc1804e16
R10: fffff70845544480 R11: ffff8e8900042200 R12: ffff8e8f52adcb18
R13: ffff8e8902fbb0d0 R14: ffffffff85f5e4ff R15: 0000000000000000
FS: 00007f68b37fe6c0(0000) GS:ffff8e9877627000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 000000048ca5f000 CR4: 0000000000f50ef0
PKRU: 55555554
Call Trace:
<TASK>
amdgpu_discovery_fini+0x39/0x290 [amdgpu f9977e37bbc2a3aa5df6ace61c7932d7dd0975b9]
? free_vmap_area_noflush+0xe7/0x1e0
amdgpu_device_fini_sw+0x337/0x4e0 [amdgpu f9977e37bbc2a3aa5df6ace61c7932d7dd0975b9]
amdgpu_driver_release_kms+0x16/0x30 [amdgpu f9977e37bbc2a3aa5df6ace61c7932d7dd0975b9]
devm_drm_dev_init_release+0x4d/0x80
release_nodes+0x5d/0x100
devres_release_all+0x9e/0xf0
device_unbind_cleanup+0xe/0xa0
really_probe+0x231/0x3a0
? __pm_runtime_resume+0x5f/0x90
? __pfx___device_attach_driver+0x10/0x10
__driver_probe_device+0x8b/0x190
driver_probe_device+0x1f/0xa0
__device_attach_driver+0x7e/0x120
bus_for_each_drv+0xa0/0x100
__device_attach+0xb0/0x1c0
bus_rescan_devices_helper+0x3c/0x90
drivers_probe_store+0x3c/0x80
kernfs_fop_write_iter+0x171/0x220
vfs_write+0x281/0x570
ksys_write+0x7b/0x110
do_syscall_64+0x119/0x1640
? do_getname+0x62/0x1d0
? dput.part.0+0x27/0x110
? do_readlinkat+0xcd/0x190
? __x64_sys_readlink+0x1e/0x30
? do_syscall_64+0x119/0x1640
? exc_page_fault+0x90/0x1f0
? __irq_exit_rcu+0x4c/0xf0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f68b86a0a52
Code: 08 0f 85 c1 3f ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 bf 01 00
RSP: 002b:00007f68b37fd418 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000001d RCX: 00007f68b86a0a52
RDX: 000000000000000c RSI: 00007f68a404e920 RDI: 000000000000001d
RBP: 00007f68b37fd440 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007f68a404e920 R14: 000000000000001d R15: 0000000000000000
</TASK>
The solution is to use the vendor-reset kernel module. Additional details are in this Reddit thread.
Arch Linux users have a handy AUR package which builds the out-of-tree kernel module with DKMS: vendor-reset-git
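Once the module is installed, it can help to confirm that the kernel will actually use a device-specific reset for the card. The following is a minimal sketch, assuming a kernel new enough to expose the per-device `reset_method` sysfs attribute. The helper names and the `SYSFS_ROOT` override are hypothetical and exist only so the check can be tried without the hardware. The PCI address is the dedicated GPU from the logs above.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT is a hypothetical override so the helpers can be
# tried against a scratch directory; on a real host it defaults to /sys.

# Print the reset method(s) the kernel has selected for a PCI device,
# or "unknown" if the attribute is missing (older kernel or wrong address).
pci_reset_method() {
    attr="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/reset_method"
    if [ -r "$attr" ]; then
        cat "$attr"
    else
        echo "unknown"
    fi
}

# Succeed only if device_specific is among the selected reset methods.
uses_device_specific_reset() {
    case " $(pci_reset_method "$1") " in
        *" device_specific "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the host this would be called as `uses_device_specific_reset 0000:03:00.0` after `modprobe vendor-reset`. If it fails, writing `device_specific` to `/sys/bus/pci/devices/0000:03:00.0/reset_method` as root is the usual way to select that method on recent kernels, though this should be checked against the vendor-reset documentation for the kernel in use.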
18 May, 2026 Issue with TPM2 and libvirtd.service on Arch Linux#
For some reason this happened on a boot after an abrupt system crash:
~ # journalctl -u libvirtd
(...)
systemd[1]: Starting libvirt legacy monolithic daemon...
libvirtd[25267]: WARNING:esys:src/tss2-esys/api/Esys_Load.c:324:Esys_Load_Finish() Received TPM Error
libvirtd[25267]: ERROR:esys:src/tss2-esys/api/Esys_Load.c:112:Esys_Load() Esys Finish ErrorCode (0x00000921)
(libvirtd)[25267]: libvirtd.service: Failed to unseal secret using TPM2: No locks available
(libvirtd)[25267]: libvirtd.service: Failed to set up credentials: No locks available
(libvirtd)[25267]: libvirtd.service: Failed at step CREDENTIALS spawning /usr/bin/libvirtd: No locks available
systemd[1]: libvirtd.service: Main process exited, code=exited, status=243/CREDENTIALS
systemd[1]: libvirtd.service: Failed with result 'exit-code'.
systemd[1]: Failed to start libvirt legacy monolithic daemon.
(...)
This is the same issue as in this Reddit thread: https://old.reddit.com/r/VFIO/comments/1rlxbnc/tpm_key_integrity_check_failed_following_vm_crash/
The main culprit is virt-secret-init-encryption.service trying to use systemd-creds:
~ # journalctl -u virt-secret-init-encryption
systemd[1]: Starting virt-secret-init-encryption.service...
sh[17582]: WARNING:esys:src/tss2-esys/api/Esys_Create.c:399:Esys_Create_Finish() Received TPM Error
sh[17582]: ERROR:esys:src/tss2-esys/api/Esys_Create.c:134:Esys_Create() Esys Finish ErrorCode (0x00000921)
systemd-creds[17582]: TPM2 sealing didn't work, continuing without TPM2: State not recoverable
sh[17582]: systemd-creds: ../systemd/src/shared/creds-util.c:987: encrypt_credential_and_warn: Assertion `n_blobs == 1' failed.
systemd-coredump[17585]: [🡕] Process 17582 (systemd-creds) of user 0 dumped core.
Stack trace of thread 17582:
#0 0x00007b8da929a29c n/a (libc.so.6 + 0x9a29c)
#1 0x00007b8da923e7d0 raise (libc.so.6 + 0x3e7d0)
#2 0x00007b8da9225681 abort (libc.so.6 + 0x25681)
#3 0x00007b8da9226700 n/a (libc.so.6 + 0x26700)
#4 0x00007b8da9236532 __assert_fail (libc.so.6 + 0x36532)
#5 0x00007b8da967bc6b n/a (libsystemd-shared-260.1-2.so + 0x7bc6b)
#6 0x000056335dc8e776 n/a (/usr/bin/systemd-creds + 0x5776)
#7 0x000056335dc8c67f n/a (/usr/bin/systemd-creds + 0x367f)
#8 0x00007b8da9227741 n/a (libc.so.6 + 0x27741)
#9 0x00007b8da9227879 __libc_start_main (libc.so.6 + 0x27879)
#10 0x000056335dc8c8e5 n/a (/usr/bin/systemd-creds + 0x38e5)
ELF object binary architecture: AMD x86-64
systemd[1]: virt-secret-init-encryption.service: Main process exited, code=exited, status=134/n/a
systemd[1]: virt-secret-init-encryption.service: Failed with result 'exit-code'.
systemd[1]: Failed to start virt-secret-init-encryption.service.
However, none of the commands succeeded, apart from some of these:
~ # tpm2_shutdown --clear
~ # echo 5 | tee /sys/class/tpm/tpm0/ppi/request
~ # tpm2_dictionarylockout -Tdevice:/dev/tpmrm0 --setup-parameters --max-tries=5 --clear-lockout
After all that, restarting virt-secret-init-encryption.service, virtsecretd.service and libvirtd.service should succeed.
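The restart of the three units could be scripted roughly as follows. This is a minimal sketch: the function name and the `SYSTEMCTL` override are made up for illustration (the override only allows a dry run), and the unit order simply follows the order in which the units are named in this note.

```shell
#!/bin/sh
# Restart the libvirt-related units one by one and stop at the first failure.
# SYSTEMCTL is a hypothetical override for dry runs; it defaults to systemctl.
restart_libvirt_stack() {
    for unit in virt-secret-init-encryption.service \
                virtsecretd.service \
                libvirtd.service; do
        "${SYSTEMCTL:-systemctl}" restart "$unit" || {
            echo "failed: $unit" >&2
            return 1
        }
        echo "restarted: $unit"
    done
}
```

Run as root with `restart_libvirt_stack`. If a unit still fails, `journalctl -u <unit>` shows whether the TPM errors from above are still present.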
25 March, 2026 Setting up Cloud-Init with Proxmox#
~ # wget 'https://dl-cdn.alpinelinux.org/alpine/v3.23/releases/cloud/nocloud_alpine-3.23.3-x86_64-uefi-cloudinit-r0.qcow2' \
&& qemu-img resize ./*alpine*.qcow2 2G
~ # qm create 8002 --name "alpine-3-nocloud-template" --ostype l26 \
--cpu host --sockets 1 --cores $(nproc) --numa 1 --memory 2096 \
--bios ovmf --machine q35 --efidisk0 local-zfs:0,pre-enrolled-keys=0 \
--scsihw virtio-scsi-pci --virtio0 local-zfs:0,discard=on,import-from=$(pwd)/nocloud_alpine-3.23.3-x86_64-uefi-cloudinit-r0.qcow2 \
--boot order=virtio0 --scsi1 local-zfs:cloudinit \
--net0 virtio,bridge=vmbr0 --vga serial0 --serial0 socket \
--agent 1
~ # qm set 8002 --tags alpine-template,3,cloudinit \
--ipconfig0 ip=dhcp \
--ciuser <user> --cipassword <password> --sshkeys ~/.ssh/authorized_keys
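Before booting the VM for the first time, the generated Cloud-Init data can be reviewed with `qm cloudinit dump`. A minimal sketch follows, assuming the VM ID 8002 from the commands above. The function name and the `QM` override are hypothetical, with the override present only for dry runs.

```shell
#!/bin/sh
# Print the user, network and meta sections Proxmox generates for a VM.
# QM is a hypothetical override for dry runs; it defaults to the real qm binary.
dump_cloudinit() {
    vmid="$1"
    for section in user network meta; do
        echo "### $section"
        "${QM:-qm}" cloudinit dump "$vmid" "$section" || return 1
    done
}
```

Calling `dump_cloudinit 8002` on the Proxmox host should show, among other things, the user and SSH keys set with `qm set`.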