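AIO (All in Boom [explosions are art]) builds are popular these days, usually on ESXi or PVE. I don't much like either, so I simply use Rocky Linux as the host and manage it over SSH plus the Cockpit web console. Installing Rocky Linux is straightforward, so I won't cover it here.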


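NIC passthrough is nothing unusual, but most guides configure it with the vfio-pci `ids` option, which matches PCI devices by their vendor:device ID. The problem is that with several identical cards, or one multi-port card, every matching device gets taken over. Also pay attention to which PCIe slot you use: only slots on PCH lanes get fully split into separate IOMMU groups. On CPU lanes the card will most likely be placed in one IOMMU group together with several other devices.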
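
To make the ID-matching problem concrete, here is a minimal sketch that counts how many devices a given vendor:device ID would catch. The helper name `count_id_matches` and the `sample` variable are hypothetical and exist only for illustration; the sample lines are taken from the `lspci -nn` style output in this article.

```shell
#!/bin/sh
# count_id_matches ID: read `lspci -nn` style lines on stdin and print how many
# of them carry the given vendor:device ID (hypothetical helper, illustration only).
count_id_matches() {
    # -F matches the bracketed ID as a fixed string; grep exits 1 on zero hits,
    # so "|| true" keeps the function's exit status clean.
    grep -c -F "[$1]" || true
}

# Sample input: the four X710 functions plus one unrelated device.
sample='0a:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)
0a:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)
0a:00.2 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)
0a:00.3 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP108 [GeForce GT 1030] [10de:1d01] (rev a1)'

printf '%s\n' "$sample" | count_id_matches 8086:1572   # prints 4
```

On a real host the same check would be `lspci -nn | count_id_matches 8086:1572`. Any result above 1 means an `ids=` match cannot single out one port.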
Use a shell one-liner to list the IOMMU groups. If the device we want to pass through is not in a group of its own, it cannot be passed through by itself.

    # for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU Group %s ' "$n"; lspci -nns "${d##*/}"; done;
    IOMMU Group 1 00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
    IOMMU Group 1 00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 07)
    IOMMU Group 1 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP108 [GeForce GT 1030] [10de:1d01] (rev a1)
    IOMMU Group 1 01:00.1 Audio device [0403]: NVIDIA Corporation GP108 High Definition Audio Controller [10de:0fb8] (rev a1)
    IOMMU Group 1 02:00.0 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)
    IOMMU Group 1 02:00.1 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)
    IOMMU Group 1 02:00.2 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)
    IOMMU Group 1 02:00.3 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)

This is what it looks like on CPU lanes: all four ports of the Intel 82580 gigabit NIC share one IOMMU group with the GPU and the PCIe bridges, so they cannot be passed through individually. Now let's look at the X710-DA4 10GbE NIC sitting on PCH lanes.

    # for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU Group %s ' "$n"; lspci -nns "${d##*/}"; done;
    IOMMU Group 0 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:1918] (rev 07)
    ...
    IOMMU Group 20 0a:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)
    IOMMU Group 21 0a:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)
    IOMMU Group 22 0a:00.2 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)
    IOMMU Group 23 0a:00.3 Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 02)

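Here 0a:00.0, 0a:00.1, 0a:00.2 and 0a:00.3 are ports D, C, B and A of the X710-DA4. Yes, the PCI function numbers run in exactly the reverse order of the optical ports. Each function has its own IOMMU group, so the ports can be passed through separately. Note down the addresses to pass through. Here that is ports C, B and A, i.e. 0000:0a:00.1 0000:0a:00.2 0000:0a:00.3.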
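
The "is this device alone in its group" check can also be scripted per device. Below is a minimal sketch that reads the `iommu_group` link sysfs exposes for each PCI device. The function names and the optional sysfs-root argument are assumptions added for illustration, so the functions can be tried against a fake directory tree.

```shell
#!/bin/sh
# List every PCI address that shares an IOMMU group with the given device.
# $1: full PCI address, e.g. 0000:0a:00.1
# $2: sysfs root (hypothetical extra argument, defaults to /sys)
iommu_group_members() {
    group_dir="${2:-/sys}/bus/pci/devices/$1/iommu_group/devices"
    [ -d "$group_dir" ] || return 1
    ls "$group_dir"
}

# Succeed only if the device is the sole member of its IOMMU group.
is_isolated() {
    members=$(iommu_group_members "$1" "$2") || return 1
    [ "$members" = "$1" ]
}
```

A possible use on the host: `for dev in 0000:0a:00.1 0000:0a:00.2 0000:0a:00.3; do is_isolated "$dev" && echo "$dev ok"; done`.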
Create the dracut module that will carry the boot-time script

    # mkdir -p /usr/lib/dracut/modules.d/99vfio-pci
    # touch /usr/lib/dracut/modules.d/99vfio-pci/module-setup.sh
    # chmod +x /usr/lib/dracut/modules.d/99vfio-pci/module-setup.sh

    # vi /usr/lib/dracut/modules.d/99vfio-pci/module-setup.sh

Enter the dracut module code.

    #!/bin/bash
    # Include this module only when vfio-pci is loaded on the host running dracut
    check() {
        if [ -d "/sys/module/vfio_pci" ]; then
            return 0
        else
            return 1
        fi
    }

    # No dependencies on other dracut modules
    depends() {
        return 0
    }

    # Run the binding script inside the initramfs before udev starts
    install() {
        inst_hook pre-udev 00 "${moddir}/vfio-pci-init-script.sh"
    }

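Edit the boot-time script

    # vi /usr/lib/dracut/modules.d/99vfio-pci/vfio-pci-init-script.sh

    #!/bin/sh
    # Full PCI addresses (domain:bus:slot.function) to hand over to vfio-pci
    DEVS="0000:0a:00.1 0000:0a:00.2 0000:0a:00.3"
    # Load vfio-pci first so that its bind file exists
    modprobe -i vfio-pci
    for DEV in $DEVS; do
        # Allow only vfio-pci to claim this device, then bind it
        echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
        echo "$DEV" > /sys/bus/pci/drivers/vfio-pci/bind
    done
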
DEVS holds the PCI addresses we want to pass through, i.e. the ones found when listing the IOMMU groups earlier.

    ; make the script executable
    # chmod +x /usr/lib/dracut/modules.d/99vfio-pci/vfio-pci-init-script.sh
    ; regenerate the Linux initial RAM filesystem (initramfs)
    # dracut /boot/initramfs-$(uname -r).img $(uname -r) --force

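
A mistake in DEVS only shows up at the next boot, and `lspci` prints the short form (`0a:00.1`) while the sysfs paths need the full form with the domain (`0000:0a:00.1`). As a rough sketch, the list could be checked before running dracut. The helper names below are hypothetical and not part of dracut or vfio.

```shell
#!/bin/sh
# Succeed if $1 looks like a full PCI address: domain:bus:slot.function
is_pci_address() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'
}

# Print every entry of the space-separated list $1 that is not a full address;
# return non-zero if any such entry was found.
check_devs() {
    bad=0
    for dev in $1; do
        if ! is_pci_address "$dev"; then
            echo "not a full PCI address: $dev"
            bad=1
        fi
    done
    return "$bad"
}
```

For example, `check_devs "0000:0a:00.1 0a:00.2"` reports the second entry and returns a non-zero status. This only checks the format, not whether the device actually exists.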
After a reboot, look at the vfio-pci driver directory. The three devices are listed there, so the driver has loaded and bound them successfully.

    # ls /sys/bus/pci/drivers/vfio-pci -l
    total 0
    lrwxrwxrwx. 1 root root    0 Mar  7 22:09 0000:0a:00.1 -> ../../../../devices/pci0000:00/0000:00:1d.0/0000:0a:00.1
    lrwxrwxrwx. 1 root root    0 Mar  7 22:09 0000:0a:00.2 -> ../../../../devices/pci0000:00/0000:00:1d.0/0000:0a:00.2
    lrwxrwxrwx. 1 root root    0 Mar  7 22:09 0000:0a:00.3 -> ../../../../devices/pci0000:00/0000:00:1d.0/0000:0a:00.3
    --w-------. 1 root root 4096 Mar  4 23:01 bind
    lrwxrwxrwx. 1 root root    0 Mar  7 22:09 module -> ../../../../module/vfio_pci
    --w-------. 1 root root 4096 Mar  7 22:09 new_id
    --w-------. 1 root root 4096 Mar  7 22:09 remove_id
    --w-------. 1 root root 4096 Mar  7 22:09 uevent
    --w-------. 1 root root 4096 Mar  7 22:09 unbind

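
The same check can be made per device by following the `driver` symlink that sysfs keeps in each device directory. This is a small sketch. The function name and the optional sysfs-root argument are assumptions added so it can be tried on a fake tree.

```shell
#!/bin/sh
# Print the name of the driver bound to a PCI device, or "none" if it is unbound.
# $1: full PCI address, e.g. 0000:0a:00.1
# $2: sysfs root (hypothetical extra argument, defaults to /sys)
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The link target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

On the host, `bound_driver 0000:0a:00.1` should print `vfio-pci` for the passed-through ports, while port D (0000:0a:00.0), which was left out of DEVS, should show whatever driver the host uses for it.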
Find the node devices for the PCIe functions to pass through.

    # virsh nodedev-list --tree | grep pci
    ...
      +- pci_0000_00_1d_0
      |   +- pci_0000_0a_00_0
      |   +- pci_0000_0a_00_1
      |   +- pci_0000_0a_00_2
      |   +- pci_0000_0a_00_3
    ...

Dump the details of the devices to pass through, pci_0000_0a_00_3 and pci_0000_0a_00_2, i.e. ports A and B of the X710.

    # virsh nodedev-dumpxml pci_0000_0a_00_3
    <device>
      <name>pci_0000_0a_00_3</name>
      <path>/sys/devices/pci0000:00/0000:00:1d.0/0000:0a:00.3</path>
      <parent>pci_0000_00_1d_0</parent>
      <driver>
        <name>vfio-pci</name>
      </driver>
      <capability type='pci'>
        <class>0x020000</class>
        <domain>0</domain>
        <bus>10</bus>
        <slot>0</slot>
        <function>3</function>
        <product id='0x1572'>Ethernet Controller X710 for 10GbE SFP+</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <capability type='virt_functions' maxCount='32'/>
        <iommuGroup number='23'>
          <address domain='0x0000' bus='0x0a' slot='0x00' function='0x3'/>
        </iommuGroup>
        <pci-express>
          <link validity='cap' port='0' speed='8' width='4'/>
          <link validity='sta' speed='8' width='4'/>
        </pci-express>
      </capability>
    </device>

    # virsh nodedev-dumpxml pci_0000_0a_00_2
    <device>
      <name>pci_0000_0a_00_2</name>
      <path>/sys/devices/pci0000:00/0000:00:1d.0/0000:0a:00.2</path>
      <parent>pci_0000_00_1d_0</parent>
      <driver>
        <name>vfio-pci</name>
      </driver>
      <capability type='pci'>
        <class>0x020000</class>
        <domain>0</domain>
        <bus>10</bus>
        <slot>0</slot>
        <function>2</function>
        <product id='0x1572'>Ethernet Controller X710 for 10GbE SFP+</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <capability type='virt_functions' maxCount='32'/>
        <iommuGroup number='22'>
          <address domain='0x0000' bus='0x0a' slot='0x00' function='0x2'/>
        </iommuGroup>
        <pci-express>
          <link validity='cap' port='0' speed='8' width='4'/>
          <link validity='sta' speed='8' width='4'/>
        </pci-express>
      </capability>
    </device>

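
The node device names and the `<address>` fields are both just the PCI address written differently, so they can be derived instead of copied by hand. The sketch below assumes the naming seen in the `virsh nodedev-list` output above. The two function names are hypothetical, and the `<hostdev>` layout is the usual libvirt form for a managed PCI device.

```shell
#!/bin/sh
# 0000:0a:00.3 -> pci_0000_0a_00_3 (the name format virsh nodedev-list prints)
nodedev_name() {
    printf 'pci_%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}

# Print a <hostdev> element for one device, splitting the address into its fields.
hostdev_xml() {
    addr=$1
    domain=${addr%%:*}; rest=${addr#*:}     # 0000 | 0a:00.3
    bus=${rest%%:*};    rest=${rest#*:}     # 0a   | 00.3
    slot=${rest%%.*};   func=${rest#*.}     # 00   | 3
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}
```

For instance, `virsh nodedev-dumpxml "$(nodedev_name 0000:0a:00.3)"` reproduces the first dump, and `hostdev_xml 0000:0a:00.3` prints an element that can be pasted into a domain definition.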

    ; download the OpenWrt x86 ext4 firmware

    ; decompress it
    # gzip -d openwrt-24.10.0-x86-64-generic-ext4-combined.img.gz
    gzip: openwrt-24.10.0-x86-64-generic-ext4-combined.img.gz: decompression OK, trailing garbage ignored

    ; convert the raw img into a qcow2 disk image
    # qemu-img convert -p -f raw -O qcow2 openwrt-24.10.0-x86-64-generic-ext4-combined.img openwrt-24.10.0-x86-64-generic-ext4-combined.qcow2

    ; grow the virtual disk to 20G
    # qemu-img resize openwrt-24.10.0-x86-64-generic-ext4-combined.qcow2 20G

    ; pad the image file itself out to the full 20G on disk (the file is treated as raw here)
    # qemu-img resize -f raw openwrt-24.10.0-x86-64-generic-ext4-combined.qcow2 20G

Write an XML file, openwrt.xml, to define the new virtual machine.

    # vi openwrt.xml
    <domain type='kvm'>
      <name>openwrt_default_gateway</name>
      <memory unit='MiB'>2048</memory>
      <currentMemory unit='MiB'>2048</currentMemory>
      <vcpu>2</vcpu>
      <os>
        <type arch='x86_64' machine='pc-i440fx-8.2'>hvm</type>
        <boot dev='hd'/>
      </os>
      <features>
        <acpi/>
        <apic/>
      </features>
      <clock offset='utc'/>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>destroy</on_crash>
      <devices>
        <emulator>/usr/bin/qemu-system-x86_64</emulator>
        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2'/>
          <source file='/data/libvirt/disk/openwrt-24.10.0-x86-64-generic-ext4-combined.qcow2'/>
          <target dev='vda' bus='virtio'/>
        </disk>
        <!-- X710 port A -->
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0a' slot='0x00' function='0x3'/>
          </source>
        </hostdev>
        <!-- X710 port B -->
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0a' slot='0x00' function='0x2'/>
          </source>
        </hostdev>
        <interface type='network'>
          <source network='v10gswitch'/>
          <model type='virtio'/>
        </interface>
        <!-- PCI addresses of the built-in controllers are left for libvirt to assign -->
        <controller type='usb' index='0'/>
        <input type='mouse' bus='ps2'/>
        <controller type='ide' index='0'/>
        <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
          <listen type='address' address='0.0.0.0'/>
        </graphics>
      </devices>
    </domain>

Create the virtual machine

    # virsh define openwrt.xml

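OpenWrt does not ship the X710's i40e driver by default. It has to be downloaded by hand and copied into the virtual machine, otherwise the guest has no network and you cannot continue.
Copy the i40e driver into the virtual machine

    # virt-copy-in -d openwrt_default_gateway kmod-i40e_6.6.73-r1_x86_64.ipk /root
    ; if you get "bash: virt-copy-in: command not found", install the package
    # dnf install -y libguestfs-tools
    ; if you get "libguestfs: error: stat: /usr/libexec/qemu-kvm: No such file or directory", create a symlink
    # ln -s $(which qemu-kvm) /usr/libexec/qemu-kvm
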
Run inside the virtual machine

    # opkg install kmod-i40e_6.6.73-r1_x86_64.ipk

Run the automatic root expansion script inside the virtual machine. The OpenWrt site describes it in detail, see https://openwrt.org/docs/guide-user/advanced/expand_root

    # Configure the auto-expand scripts directly
    cat << "EOF" > /etc/uci-defaults/70-rootpt-resize
    if [ ! -e /etc/rootpt-resize ] \
    && type parted > /dev/null \
    && lock -n /var/lock/root-resize
    then
    ROOT_BLK="$(readlink -f /sys/dev/block/"$(awk -e \
    '$9=="/dev/root"{print $3}' /proc/self/mountinfo)")"
    ROOT_DISK="/dev/$(basename "${ROOT_BLK%/*}")"
    ROOT_PART="${ROOT_BLK##*[^0-9]}"
    parted -f -s "${ROOT_DISK}" \
    resizepart "${ROOT_PART}" 100%
    mount_root done
    touch /etc/rootpt-resize
    reboot
    fi
    exit 1
    EOF
    cat << "EOF" > /etc/uci-defaults/80-rootfs-resize
    if [ ! -e /etc/rootfs-resize ] \
    && [ -e /etc/rootpt-resize ] \
    && type losetup > /dev/null \
    && type resize2fs > /dev/null \
    && lock -n /var/lock/root-resize
    then
    ROOT_BLK="$(readlink -f /sys/dev/block/"$(awk -e \
    '$9=="/dev/root"{print $3}' /proc/self/mountinfo)")"
    ROOT_DEV="/dev/${ROOT_BLK##*/}"
    LOOP_DEV="$(awk -e '$5=="/overlay"{print $9}' \
    /proc/self/mountinfo)"
    if [ -z "${LOOP_DEV}" ]
    then
    LOOP_DEV="$(losetup -f)"
    losetup "${LOOP_DEV}" "${ROOT_DEV}"
    fi
    resize2fs -f "${LOOP_DEV}"
    mount_root done
    touch /etc/rootfs-resize
    reboot
    fi
    exit 1
    EOF
    cat << "EOF" >> /etc/sysupgrade.conf
    /etc/uci-defaults/70-rootpt-resize
    /etc/uci-defaults/80-rootfs-resize
    EOF

    ; or fetch the auto-expand script with wget
    # wget -U "" -O expand-root.sh "https://openwrt.org/_export/code/docs/guide-user/advanced/expand_root?codeblock=0"
    ; run the expand script
    # . ./expand-root.sh
    ; install the required packages
    # opkg update
    # opkg install parted losetup resize2fs
    ; start the expansion; the VM reboots automatically, just wait for it to finish
    # sh /etc/uci-defaults/70-rootpt-resize

OpenWrt is now up and running. What remains is updating the package feeds, installing Simplified Chinese language support and adding whatever features you need.

    # opkg update
    # opkg install luci-i18n-base-zh-cn luci-i18n-package-manager-zh-cn

    ; Wake-on-LAN app
    # opkg install luci-i18n-wol-zh-cn

    ; network traffic monitor, based on the interface statistics provided by the kernel
    # opkg install luci-i18n-vnstat2-zh-cn

    ; Universal Plug and Play, UPnP (automatic port forwarding)
    # opkg install luci-i18n-upnp-zh-cn

    ; BitTorrent download tool
    # opkg install luci-i18n-transmission-zh-cn

    ; traffic statistics tool
    # opkg install luci-i18n-statistics-zh-cn

    ; network shares (Samba4)
    # opkg install luci-i18n-samba4-zh-cn

    ; MWAN3 load balancing
    # opkg install luci-i18n-mwan3-zh-cn

    ; web file manager
    # opkg install luci-i18n-filebrowser-zh-cn

    ; dynamic DNS
    # opkg install luci-i18n-ddns-zh-cn

    ; Aria2 download tool
    # opkg install luci-i18n-aria2-zh-cn

    ; BanIP
    # opkg install luci-i18n-banip-zh-cn

Finally, a quick show-off of the dual-mode optical module from 深水宝.
