Proxmox VE pve-firewall: status update error: iptables_restore_cmdlist: Try `iptables-restore -h'
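In recent operations, customers with VMs provisioned on our PVE nodes reported several anomalies:
1. A PVE VM has firewall / macfilter / ipfilter enabled and its ipset rules are correct, but it has no network. Reinstalling the OS or switching the firewall off does not restore it, yet disabling the firewall option on the VM's network device in PVE restores connectivity immediately.
2. The PVE VM's network works, but its firewall rules do not take effect.
3. pve-firewall reports the error: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
For the first problem, basic troubleshooting makes it clear that the firewall is at fault, yet the PVE GUI shows the firewall as running. Log in over SSH and run:
iptables-save -c | grep 2804    # 2804 is the vmid
This finds the iptables rule of the affected VM: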
[0:0] -A tap2804i0-OUT -m mac ! --mac-source bc:24:11:b8:97:88 -j DROP
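In tap2804i0, 2804 is the vmid and i0 is NIC index 0. Comparing the MAC in this rule with the MAC configured on the VM showed that the two do not match, which confirms that the firewall is malfunctioning and can no longer update the iptables rules. (Tip: pve-firewall is not a standalone packet filter; it generates commands that are ultimately loaded into iptables.)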
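The MAC comparison described above can be scripted. The following is a minimal sketch, not part of pve-firewall: the function names are hypothetical, and it assumes that `qm config` prints NIC lines in the usual `net0: virtio=<MAC>,bridge=...` layout.

```shell
#!/usr/bin/env bash
# Sketch: compare MACs in the loaded MAC-filter rules with the VM configs.

# Reads `iptables-save` output on stdin and prints "<vmid> net<N> <mac>"
# for every tap<vmid>i<N>-OUT rule that carries --mac-source.
list_fw_macs() {
    sed -nE 's/.*-A tap([0-9]+)i([0-9]+)-OUT .*--mac-source ([0-9A-Fa-f:]{17}) .*/\1 net\2 \3/p' |
        tr 'A-F' 'a-f'
}

# Reads `qm config <vmid>` output on stdin and prints the MAC of one NIC.
# Assumption: the NIC line looks like "net0: virtio=BC:24:...,bridge=vmbr0".
config_mac() {
    local nic="$1"
    sed -nE "s/^${nic}: [^=]+=([0-9A-Fa-f:]{17}).*/\1/p" | tr 'A-F' 'a-f'
}

# Run on a PVE node: reports every NIC whose loaded rule has a stale MAC.
report_stale_macs() {
    iptables-save | list_fw_macs | while read -r vmid nic mac; do
        actual=$(qm config "$vmid" </dev/null | config_mac "$nic")
        [ "$mac" = "$actual" ] || echo "VM $vmid $nic: rule=$mac config=${actual:-unknown}"
    done
}
```

Calling `report_stale_macs` on the node prints nothing when every loaded rule matches its VM config; any line it prints points at a VM whose rules were not refreshed.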
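A plain pve-firewall restart brings the VM's network back right away. Everything looks fine, but this is where the second problem appears: all firewall rules stop working. Running iptables-save -c again now returns only: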
~# iptables-save -c
# Generated by iptables-save v1.8.9 on Wed Aug 21 19:00:45 2024
*raw
:PREROUTING ACCEPT [348735666585:213325671547273]
:OUTPUT ACCEPT [4591886294:3380972084311]
COMMIT
# Completed on Wed Aug 21 19:00:45 2024
# Generated by iptables-save v1.8.9 on Wed Aug 21 19:00:45 2024
*filter
:INPUT ACCEPT [6813060:3638198461]
:FORWARD ACCEPT [215129855:125354233180]
:OUTPUT ACCEPT [5768979:4211817661]
COMMIT
This happens because the iptables rules generated by pve-firewall contain an error and cannot be loaded into iptables. Running systemctl status pvefw-logger pve-firewall shows error logs like these:
root@testnode:~# systemctl status pvefw-logger pve-firewall
● pvefw-logger.service - Proxmox VE firewall logger
Loaded: loaded (/lib/systemd/system/pvefw-logger.service; enabled; preset: enabled)
Active: active (running) since Wed 2024-08-21 00:00:09 HKT; 19h ago
Main PID: 1649446 (pvefw-logger)
Tasks: 2 (limit: 629145)
Memory: 444.0K
CPU: 8.328s
CGroup: /system.slice/pvefw-logger.service
└─1649446 /usr/sbin/pvefw-logger
Aug 21 00:00:09 testnode systemd[1]: Starting pvefw-logger.service - Proxmox VE firewall logger...
Aug 21 00:00:09 testnode pvefw-logger[1649446]: starting pvefw logger
Aug 21 00:00:09 testnode systemd[1]: Started pvefw-logger.service - Proxmox VE firewall logger.
● pve-firewall.service - Proxmox VE firewall
Loaded: loaded (/lib/systemd/system/pve-firewall.service; enabled; preset: enabled)
Active: active (running) since Wed 2024-08-21 19:07:43 HKT; 16min ago
Process: 4058179 ExecStartPre=/usr/bin/update-alternatives --set ebtables /usr/sbin/ebtables-legacy (code=exited, status=0/SUCCESS)
Process: 4058181 ExecStartPre=/usr/bin/update-alternatives --set iptables /usr/sbin/iptables-legacy (code=exited, status=0/SUCCESS)
Process: 4058182 ExecStartPre=/usr/bin/update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy (code=exited, status=0/SUCCESS)
Process: 4058183 ExecStart=/usr/sbin/pve-firewall start (code=exited, status=0/SUCCESS)
Process: 4088640 ExecReload=/usr/sbin/pve-firewall restart (code=exited, status=0/SUCCESS)
Main PID: 4058255 (pve-firewall)
Tasks: 1 (limit: 629145)
Memory: 117.2M
CPU: 1min 52.846s
CGroup: /system.slice/pve-firewall.service
└─4058255 pve-firewall
Aug 21 19:21:32 testnode pve-firewall[4058255]: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Aug 21 19:21:42 testnode pve-firewall[4058255]: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Aug 21 19:21:52 testnode pve-firewall[4058255]: status update error: iptables_restore_cmdlist: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
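Both symptoms seen so far (a filter table without any pve-firewall chains, and repeated status update errors in the journal) can be checked from a script. This is a rough sketch with hypothetical function names; it assumes that the chains pve-firewall creates carry the PVEFW- prefix.

```shell
#!/usr/bin/env bash
# Sketch: two quick health checks for a node where the GUI says "running".

# Succeeds if the iptables-save output on stdin declares a PVEFW- chain.
has_pvefw_chains() {
    grep -q '^:PVEFW-'
}

# Prints how many "status update error" lines the journal text on stdin holds.
count_status_errors() {
    # grep -c exits non-zero on zero matches; keep the function's status clean.
    grep -c 'pve-firewall\[[0-9]*\]: status update error' || true
}

# Example use on a node (not executed here):
#   iptables-save | has_pvefw_chains || echo "no PVEFW chains loaded"
#   journalctl -u pve-firewall --since "10 min ago" | count_status_errors
```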
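Beyond that, we can also start the firewall in debug mode: pve-firewall stop; pve-firewall start -debug
This reports the specific error type (for example --dport) and a line number, but the information is of little use on its own: I searched a lot of documentation and found no way to see what that line contains. The only option left is to run pve-firewall compile and check every customer's firewall rules for mistakes.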
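When checking the pve-firewall compile output by hand, one heuristic is to count which protocols appear in the generated rules and look at the rare ones first. The sketch below is an illustration only; the function name is hypothetical and it simply counts the token that follows each `-p`.

```shell
#!/usr/bin/env bash
# Sketch: histogram of "-p <proto>" values in generated iptables rules.

# Reads rule lines on stdin, prints "<count> <proto>", rarest first.
proto_histogram() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "-p") count[$(i + 1)]++
    }
    END { for (p in count) print count[p], p }' | sort -k1,1n -k2,2
}

# Example use on a node (not executed here):
#   pve-firewall compile | proto_histogram
```

An unusual protocol near the top of the list is a reasonable first suspect when the reported error type relates to a protocol-specific option.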
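To pin down the faulty rule quickly: if -debug can report the failing line, the complete iptables rule list must be available at that point. I looked at the pve-firewall source (https://github.com/proxmox/pve-firewall/blob/master/src/PVE/Firewall.pm), and we can patch Firewall.pm directly. Open /usr/share/perl5/PVE/Firewall.pm in vi and find this sub: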
sub iptables_restore_cmdlist {
    my ($cmdlist, $table) = @_;
    $table = 'filter' if !$table;
    run_command(['iptables-restore', '-T', $table, '-n'], input => $cmdlist, errmsg => "iptables_restore_cmdlist");
}
Add a line after the $table assignment that prints the whole cmdlist (the iptables rules):
sub iptables_restore_cmdlist {
    my ($cmdlist, $table) = @_;
    $table = 'filter' if !$table;
    # Temporary debugging aid: print the full cmdlist
    warn "Restoring iptables rules: $cmdlist\n"; # warn writes to standard error
    run_command(['iptables-restore', '-T', $table, '-n'], input => $cmdlist, errmsg => "iptables_restore_cmdlist");
}
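Now run it again: pve-firewall stop; pve-firewall start -debug
This time the output contains every iptables rule it is about to load, plus the error type and the error line number, so with the complete rule list the offending line is easy to find. In the end it turned out that rules a customer had added for the udplite protocol were causing the iptables error. The rest is straightforward: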
~# pve-firewall compile | grep udplite
-A tap2744i0-IN -p udplite --dport 19885 -j ACCEPT
-A tap2744i0-IN -p udplite --dport 20715 -j ACCEPT
-A tap2744i0-IN -p udplite --dport 19885 -j ACCEPT
-A tap2744i0-IN -p udplite --dport 20715 -j ACCEPT
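With the faulty rules located, go to the Firewall rules of customer VM 2744 and delete the udplite entries, then run pve-firewall restart. The Try `iptables-restore -h' or 'iptables-restore --help' for more information message no longer appears, the firewall starts normally, and the problem is solved. Remember to comment out or delete the warn "Restoring iptables rules: $cmdlist\n" line in /usr/share/perl5/PVE/Firewall.pm.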
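To check whether other VMs on the node carry the same kind of rule, the per-VM rule files can be searched directly. This is a minimal sketch: the function name is hypothetical, and it assumes the rules live in /etc/pve/firewall/<vmid>.fw and spell the protocol as `-p <proto>`.

```shell
#!/usr/bin/env bash
# Sketch: list firewall rules that use a given protocol.

# Usage: find_proto_rules <proto> [dir]
# Prints "<file>:<line>:<rule>" for every rule with "-p <proto>".
find_proto_rules() {
    local proto="$1" dir="${2:-/etc/pve/firewall}"
    # Require whitespace or end of line after the name so that
    # searching for "udp" does not also report "udplite".
    grep -HnE "(^|[[:space:]])-p[[:space:]]+${proto}([[:space:]]|\$)" \
        "$dir"/*.fw 2>/dev/null
}

# Example use on a node (not executed here):
#   find_proto_rules udplite
```

The file name in each hit gives the vmid whose rules need to be cleaned up.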
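To avoid this problem entirely, stop allowing udplite rules to be added in the customer management platform. If that is not possible for some reason, you can also edit /usr/share/perl5/PVE/Firewall.pm (not recommended, because the change may be overwritten whenever pve-firewall is updated): find the verify_rule method and add a check to its proto handling. The modified block: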
if ($rule->{proto}) {
    eval { pve_fw_verify_protocol_spec($rule->{proto}); };
    &$add_error('proto', $@) if $@;
    &$set_ip_version(4) if $rule->{proto} eq 'icmp';
    &$set_ip_version(6) if $rule->{proto} eq 'icmpv6';
    &$set_ip_version(6) if $rule->{proto} eq 'ipv6-icmp';
    $is_icmp = $proto_is_icmp->($rule->{proto});
    # Added check: reject the udplite protocol
    if ($rule->{proto} eq 'udplite') {
        &$add_error('proto', "'udplite' protocol is not supported for firewall rules.");
    }
}
After pve-firewall is restarted, a udplite rule added by a customer no longer breaks iptables; instead the rule is rejected with an error, as shown below:
~# systemctl status pvefw-logger pve-firewall
● pvefw-logger.service - Proxmox VE firewall logger
Loaded: loaded (/lib/systemd/system/pvefw-logger.service; enabled; preset: enabled)
Active: active (running) since Thu 2024-08-22 13:13:18 HKT; 17s ago
Process: 4165930 ExecStart=/usr/sbin/pvefw-logger (code=exited, status=0/SUCCESS)
Main PID: 4165933 (pvefw-logger)
Tasks: 2 (limit: 629145)
Memory: 480.0K
CPU: 88ms
CGroup: /system.slice/pvefw-logger.service
└─4165933 /usr/sbin/pvefw-logger
Aug 22 13:13:18 testnode systemd[1]: Starting pvefw-logger.service - Proxmox VE firewall logger...
Aug 22 13:13:18 testnode pvefw-logger[4165933]: starting pvefw logger
Aug 22 13:13:18 testnode systemd[1]: Started pvefw-logger.service - Proxmox VE firewall logger.
● pve-firewall.service - Proxmox VE firewall
Loaded: loaded (/lib/systemd/system/pve-firewall.service; enabled; preset: enabled)
Active: active (running) since Fri 2024-07-05 06:35:42 HKT; 1 month 17 days ago
Process: 3723 ExecStartPre=/usr/bin/update-alternatives --set ebtables /usr/sbin/ebtables-legacy (code=exited, status=0/SUCCESS)
Process: 3725 ExecStartPre=/usr/bin/update-alternatives --set iptables /usr/sbin/iptables-legacy (code=exited, status=0/SUCCESS)
Process: 3727 ExecStartPre=/usr/bin/update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy (code=exited, status=0/SUCCESS)
Process: 3729 ExecStart=/usr/sbin/pve-firewall start (code=exited, status=0/SUCCESS)
Process: 4165880 ExecReload=/usr/sbin/pve-firewall restart (code=exited, status=0/SUCCESS)
Main PID: 3744 (pve-firewall)
Tasks: 1 (limit: 629145)
Memory: 123.0M
CPU: 4d 18h 11min 46.192s
CGroup: /system.slice/pve-firewall.service
└─3744 pve-firewall
Aug 22 13:13:15 testnode pve-firewall[3744]: received signal HUP
Aug 22 13:13:15 testnode pve-firewall[3744]: server shutdown (restart)
Aug 22 13:13:15 testnode systemd[1]: Reloaded pve-firewall.service - Proxmox VE firewall.
Aug 22 13:13:15 testnode pve-firewall[3744]: restarting server
Aug 22 13:13:16 testnode pve-firewall[3744]: /etc/pve/firewall/4076.fw (line 37) - errors in rule parameters: IN ACCEPT -i net4 -p udpl>
Aug 22 13:13:16 testnode pve-firewall[3744]: proto: 'udplite' protocol is not supported for firewall rules.