burnley
Regular Pleskian
Sorry, folks, for posting this on the Plesk forum; I couldn't find an equivalent forum for Virtuozzo (the OpenVZ forum, maybe?).
But here it is: over the weekend we had a pretty nasty crash on one of our VZ 4.7.0 nodes. The panic is (or was at the time, at least for me) 100% reproducible. Here are the steps:
1. Initial crash, no backtrace.
2. After rebooting the node, all the services on the node are coming up properly, up to the point where the containers are started. As soon as the first container starts, panic.
3. Rinse and repeat step 2.
I've narrowed it down to the NFS service starting *before* the 3 VZ containers are started. To avoid the kernel panic I had to disable automatic start of the NFS server by running "chkconfig --level 2345 nfs off" and start it manually after all the containers are up and running.
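For anyone who wants to automate the manual part of this workaround, here is a minimal sketch, not something I've verified against this panic. It assumes the nfs init script is already disabled as above, that every container on the node is expected to reach the "running" state, and that "vzlist -a -H -o status" prints one status per line. The function names, the 60 x 5 second timeout, and the "run" argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: start the NFS server only after every container reports "running".
# Assumes the nfs init script has been disabled with:
#   chkconfig --level 2345 nfs off

# Read one container status per line on stdin. Succeed only if at least
# one status was read and every status is exactly "running".
all_running() {
    seen=0
    while read -r status; do
        [ -z "$status" ] && continue
        seen=1
        [ "$status" = "running" ] || return 1
    done
    [ "$seen" -eq 1 ]
}

# Hypothetical wrapper: poll vzlist until all containers are running,
# then start NFS. Gives up after 60 attempts, 5 seconds apart.
start_nfs_after_containers() {
    tries=0
    until vzlist -a -H -o status | all_running; do
        tries=$((tries + 1))
        [ "$tries" -ge 60 ] && return 1
        sleep 5
    done
    service nfs start
}

# Do nothing unless explicitly asked to, e.g. "sh start-nfs-late.sh run".
if [ "${1:-}" = "run" ]; then
    start_nfs_after_containers
fi
```

Called with "run" from something like /etc/rc.local, this only delays the NFS start. It does nothing about the underlying kernel bug, and a container that is (re)started later while NFS is running may still hit the same path.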
uname -a
Linux vz-node 2.6.32-042stab117.10 #1 SMP Fri Jul 29 23:55:56 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux
Stacktrace follows:
[...]
Aug 6 20:46:32 vz-node kernel: [ 657.370188] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
Aug 6 20:46:32 vz-node kernel: [ 657.370981] IP: [<ffffffffa04bf998>] nfsd_inetaddr_event+0x68/0xa0 [nfsd]
Aug 6 20:46:32 vz-node kernel: [ 657.371441] PGD edc904067 PUD f3a203067 PMD 0
Aug 6 20:46:32 vz-node kernel: [ 657.372050] Oops: 0000 [#1] SMP
Aug 6 20:46:32 vz-node kernel: [ 657.372566] last sysfs file: /sys/devices/virtual/net/venet0/address
Aug 6 20:46:32 vz-node kernel: [ 657.372922] CPU 15
Aug 6 20:46:32 vz-node kernel: [ 657.373016] Modules linked in: ip_vzredir(P)(U) vzredir(P)(U) vzcompat(P)(U) vzrst nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 vznetdev ip6_vzprivnet(P)(U) ip6_vzredir(P)(U) ip6_vznetstat(P)(U) ip_vzprivnet(P)(U) vziolimit vzsnap(P)(U) vzfs(P)(U) vzcpt vzlinkdev(P)(U) vzethdev vzevent vzlist(P)(U) vzstat(P)(U) vzmon ip_vznetstat(P)(U) vznetstat(P)(U) vzdquota vzdev xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle xt_multiport xt_limit xt_dscp ipt_REJECT iptable_filter ip_tables nfsd coretemp nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 tun ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas power_meter acpi_ipmi ipmi_si ipmi_msghandler joydev sb_edac edac_core lpc_ich mfd_core shpchp igb i2c_algo_bit i2c_core sg ixgbe dca ptp pps_core mdio tcp_htcp ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip_vzprivnet]
Aug 6 20:46:32 vz-node kernel: [ 657.386398]
Aug 6 20:46:32 vz-node kernel: [ 657.386741] Pid: 8497, comm: ifconfig veid: 206 Tainted: P -- ------------ 2.6.32-042stab117.10 #1 042stab117_9 Dell Inc. PowerEdge R620/0PXXHP
Aug 6 20:46:32 vz-node kernel: [ 657.387623] RIP: 0010:[<ffffffffa04bf998>] [<ffffffffa04bf998>] nfsd_inetaddr_event+0x68/0xa0 [nfsd]
[...]
This and another similar kernel stack trace are detailed in the attached file. I also have 3 similar crash dumps, which I'll hand over to your Virtuozzo team by opening a support ticket.