Compiling a custom kernel on Proxmox
Why would you build your own kernel?
Because I needed two patches that aren’t in the stock Proxmox kernel, plus a ZFS commit that hadn’t made it into a release yet. I was trying to pass a GPU and a couple of onboard USB controllers through to a VM on my Ryzen box (elysium, the 2700X on a Crosshair VII Hero), and second-gen Ryzen has opinions about that. The full why (IOMMU groups, the reset bug) is its own post: VFIO passthrough on 2nd-gen Ryzen. This one is how to build it.
Every “compile your own kernel” guide I found assumes you want a vanilla mainline kernel. I didn’t. I wanted the Proxmox kernel (pve-kernel), with all of Proxmox’s patches intact, plus a few of mine bolted on. What I tripped on wasn’t in any of the generic guides.
Getting the source
Proxmox keeps its kernel build tree in its own git repo:
git://git.proxmox.com/git/pve-kernel.git
Clone that onto the host. The repo doesn’t contain the kernel source itself. It’s a build harness plus a submodules/ directory and a patches/ directory. The actual Ubuntu kernel source comes in as a git submodule (for the Proxmox 6.x line I was on, ubuntu-focal), and make submodule is supposed to fetch it for you.
While I was in here I also wanted a ZFS fix sitting in openzfs master but not yet in a tagged release: this commit from openzfs/zfs.
Trap #1: the submodule that won’t fetch
Ran make submodule:
root@elysium:/usr/src/pve-kernel# make submodule
test -f "submodules/ubuntu-focal/README" || git submodule update --init submodules/ubuntu-focal
fatal: the remote end hung up unexpectedly
Fetched in submodule path 'submodules/ubuntu-focal', but it did not contain 51ee04d9d3b464e9aa8509013779491f0b001ebc. Direct fetching of that commit failed.
make: *** [Makefile:120: submodule] Error 1
Git fetched something from the submodule’s pinned remote, just not the specific commit (51ee04d9d3b4...) the build wants. “Direct fetching of that commit failed” usually means the commit is no longer reachable from any branch or tag that remote advertises (rebased away or pruned), so the normal fetch doesn’t bring it in and the fallback fetch-by-hash fails too.
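To see which commit the build tree actually pins, and whether a given clone has it, something along these lines should do. This is a minimal sketch: the function names pinned_commit and has_commit are mine and not part of the Proxmox build harness, and it assumes the submodule is recorded in the superproject as an ordinary gitlink entry. The git subcommands themselves (ls-tree, cat-file -e) are standard.

```shell
# pinned_commit REPO PATH
# Print the commit that superproject REPO records for the submodule at PATH.
# A submodule is stored as a "gitlink" tree entry: mode 160000, type "commit".
pinned_commit() {
    git -C "$1" ls-tree HEAD "$2" | awk '$2 == "commit" { print $3 }'
}

# has_commit REPO SHA
# Succeed only if SHA exists as a commit object in REPO's object store.
has_commit() {
    git -C "$1" cat-file -e "$2^{commit}" 2>/dev/null
}
```

On my layout that would be pinned_commit /usr/src/pve-kernel submodules/ubuntu-focal, followed by has_commit on whichever clone you want to check against the printed hash.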
Fix: skip the submodule machinery and clone ubuntu-focal straight from the Ubuntu kernel git, where that commit lives:
git clone git://kernel.ubuntu.com/ubuntu/ubuntu-focal.git
The commit is reachable there: here in the upstream tree. Drop the full clone into submodules/ubuntu-focal, check out the pinned commit (51ee04d9d3b4...), and make submodule stops whining: its test -f on submodules/ubuntu-focal/README passes, so it never runs git submodule update at all.
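Roughly what that swap looks like as a script. Treat it as a sketch under assumptions: use_full_clone is a hypothetical helper name, it reads the pinned hash from the superproject instead of hard-coding it, and it deletes whatever is currently sitting at the submodule path before cloning.

```shell
# use_full_clone PVE_DIR SUBMODULE_PATH CLONE_URL
# Replace a half-fetched submodule checkout with a full clone, then detach
# HEAD at whatever commit the superproject pins for that path.
use_full_clone() {
    pve_dir=$1
    sub_path=$2
    url=$3
    if [ -z "$pve_dir" ] || [ -z "$sub_path" ] || [ -z "$url" ]; then
        echo "usage: use_full_clone PVE_DIR SUBMODULE_PATH CLONE_URL" >&2
        return 2
    fi

    # Read the pinned commit from the gitlink entry in the superproject.
    pin=$(git -C "$pve_dir" ls-tree HEAD "$sub_path" | awk '$2 == "commit" { print $3 }')
    [ -n "$pin" ] || { echo "no submodule recorded at $sub_path" >&2; return 1; }

    # Clear out whatever the failed submodule update left behind.
    rm -rf "${pve_dir:?}/$sub_path"

    # Full (non-shallow) clone, so every reachable commit is present locally.
    git clone --quiet "$url" "$pve_dir/$sub_path" || return 1

    # Detached checkout of the exact commit the build expects.
    git -C "$pve_dir/$sub_path" checkout --quiet --detach "$pin"
}
```

For this build the call would be use_full_clone /usr/src/pve-kernel submodules/ubuntu-focal git://kernel.ubuntu.com/ubuntu/ubuntu-focal.git. A full kernel clone is large, so expect it to take a while.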
Adding your patches
Proxmox applies a numbered series of patches from patches/kernel/ during the build, so adding your own is just dropping a 00NN-something.patch in there. Mine:
root@elysium:~# cat /usr/src/pve-kernel/patches/kernel/000
0001-Make-mkcompile_h-accept-an-alternate-timestamp-strin.patch 0005-Revert-KVM-VMX-enable-nested-virtualization-by-defau.patch
0002-bridge-keep-MAC-of-first-assigned-port.patch 0006-Revert-scsi-lpfc-Fix-broken-Credit-Recovery-after-dr.patch
0003-pci-Enable-overrides-for-missing-ACS-capabilities-4..patch 0007-cgroup-fix-cgroup_sk_alloc-for-sk_clone_lock.patch
0004-kvm-disable-default-dynamic-halt-polling-growth.patch 0008-VCiYJ-ryzen-pci.patch
0001 through 0007 are Proxmox’s own (the listing reads down each column; the ACS-override patch at 0003 ships with the pve-kernel already, I just needed it to stay). The one I added is 0008, the Ryzen PCI reset patch. Both 0003 and 0008 exist for VFIO reasons. Gory detail on what each one fixes is in the VFIO post. Mine is a plain patch file sitting next to Proxmox’s, and the build picks it up along with the rest of the series.
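If you’d rather not eyeball the directory to pick the next number, a small helper can do it. This is purely illustrative: next_patch_name is a name I made up, and it assumes the series follows the four-digit NNNN-description.patch pattern shown in the listing.

```shell
# next_patch_name DIR SLUG
# Print the next free NNNN-SLUG.patch file name for the patch series in DIR.
next_patch_name() {
    last=0
    for f in "$1"/[0-9][0-9][0-9][0-9]-*.patch; do
        [ -e "$f" ] || continue         # the glob matched nothing
        n=${f##*/}                      # strip the directory part
        n=${n%%-*}                      # keep only the numeric prefix
        n=${n#"${n%%[!0]*}"}            # drop leading zeros (0008 is not valid octal)
        if [ "${n:-0}" -gt "$last" ]; then
            last=${n:-0}
        fi
    done
    printf '%04d-%s.patch\n' "$((last + 1))" "$2"
}
```

Something like cp my.patch "/usr/src/pve-kernel/patches/kernel/$(next_patch_name /usr/src/pve-kernel/patches/kernel my-fix)" then drops a patch in at the end of the series; my.patch and my-fix are placeholders.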
Trap #2: the version string keeps growing a “+”
Building a kernel out of a git tree makes the build append a + to the version string (and -dirty if your tree has uncommitted changes), because the scripts sniff the git state. So instead of 5.4.something-pve you get 5.4.something-pve+, and that suffix leaks into uname -r, the modules directory name, and GRUB entries. Your modules land in /lib/modules/<version>+/, the kernel goes looking in /lib/modules/<version>/, nothing lines up. Great.
The exact problem is described in this Stack Overflow thread. The fix is to explicitly blank out LOCALVERSION:
LOCALVERSION= make
Empty value, not unset. With LOCALVERSION= in the make environment, the suffix logic gets a defined-but-empty string and stops tacking on the +. Clean version string, modules path matches, the .debs install.
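The difference between empty and unset is easy to see in isolation. The function below is a simplified paraphrase of the check the kernel’s scripts/setlocalversion makes, not the real script (which also looks at the git state before deciding); plus_suffix is my own name for it.

```shell
# plus_suffix
# Print "+" unless LOCALVERSION is set. An empty value still counts as set,
# which is why "LOCALVERSION= make" suppresses the suffix.
plus_suffix() {
    if [ "${LOCALVERSION+set}" != "set" ]; then
        printf '+'
    fi
}
```

With the variable unset the function prints a +; after LOCALVERSION= it prints nothing. The ${VAR+set} expansion only asks whether the variable exists, not whether it has a value.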
Build, install, reboot
From there it’s a normal Proxmox kernel build. Let make chew through it (go make coffee, or in my case go stare at the IOMMU groups some more), and you get a set of pve-kernel-*.deb packages. dpkg -i them, update GRUB if it didn’t already, pin the new kernel as default, reboot into it. uname -r should show your clean version string with no rogue +.
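A quick sanity check after the reboot, as a sketch: check_kernel_version is a hypothetical helper, and the optional second argument exists only so it can be pointed at a directory other than /lib/modules.

```shell
# check_kernel_version VERSION [MODULES_ROOT]
# Fail if VERSION carries a "+" or "-dirty" suffix, or if there is no
# matching modules directory under MODULES_ROOT (default: /lib/modules).
check_kernel_version() {
    ver=$1
    root=${2:-/lib/modules}
    case $ver in
        *+*|*-dirty*)
            echo "unexpected suffix in $ver" >&2
            return 1
            ;;
    esac
    if [ ! -d "$root/$ver" ]; then
        echo "no modules directory $root/$ver" >&2
        return 1
    fi
    echo "ok: $ver"
}
```

Run it as check_kernel_version "$(uname -r)" once you’re booted into the new kernel.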
Why I patched a kernel just to plug a USB controller into a VM: the VFIO post.