server constantly rebooting

phingers

Guest
How do I troubleshoot a box that reboots itself about 8 times a day? Where do I start, and how?

/var/log/messages has the following entries:

May 22 09:08:41 bry syslog: syslogd startup succeeded
May 23 01:10:55 bry syslog: syslogd startup succeeded
May 23 16:53:58 bry syslog: syslogd startup succeeded
May 24 04:09:06 bry syslog: syslogd startup succeeded
May 25 04:54:08 bry syslog: syslogd startup succeeded
May 25 13:09:18 bry syslog: syslogd startup succeeded
May 25 14:15:06 bry syslog: syslogd startup succeeded
May 25 15:19:44 bry syslog: syslogd startup succeeded
May 25 19:05:37 bry syslog: syslogd startup succeeded

May 26 17:45:00 bry syslog: syslogd startup succeeded
May 26 23:32:09 bry syslog: syslogd startup succeeded
May 26 21:34:55 bry syslog: syslogd startup succeeded
May 26 22:52:27 bry syslog: syslogd startup succeeded

and the 'last' command shows the following:

last -a | grep boot
reboot system boot Fri May 27 00:39 (10:16) 2.4.21-27.EL
reboot system boot Thu May 26 22:52 (12:04) 2.4.21-27.EL
reboot system boot Thu May 26 21:34 (13:21) 2.4.21-27.EL
reboot system boot Thu May 26 23:32 (-2:00) 2.4.21-27.EL
reboot system boot Thu May 26 17:44 (03:46) 2.4.21-27.EL
reboot system boot Wed May 25 19:05 (1+02:25) 2.4.21-27.EL
reboot system boot Wed May 25 15:19 (1+06:11) 2.4.21-27.EL
reboot system boot Wed May 25 14:15 (1+07:16) 2.4.21-27.EL
reboot system boot Wed May 25 13:09 (1+08:22) 2.4.21-27.EL
reboot system boot Wed May 25 04:54 (1+16:37) 2.4.21-27.EL
reboot system boot Tue May 24 04:09 (2+17:22) 2.4.21-27.EL
reboot system boot Mon May 23 16:53 (3+04:37) 2.4.21-27.EL
reboot system boot Mon May 23 01:10 (3+20:20) 2.4.21-27.EL
reboot system boot Sun May 22 09:08 (4+12:22) 2.4.21-27.EL
reboot system boot Sun May 22 01:12 (4+20:18) 2.4.21-27.EL
reboot system boot Sat May 21 09:10 (5+12:20) 2.4.21-27.EL
reboot system boot Fri May 20 14:51 (6+06:39) 2.4.21-27.EL
reboot system boot Thu May 19 18:16 (7+03:14) 2.4.21-27.EL
reboot system boot Thu May 19 18:15 (00:00) 2.4.21-27.EL
reboot system boot Thu May 19 13:04 (05:11) 2.4.21-27.EL
reboot system boot Thu May 19 11:43 (06:32) 2.4.21-27.EL
reboot system boot Thu May 19 09:54 (08:21) 2.4.21-27.EL
reboot system boot Thu May 19 07:37 (10:38) 2.4.21-27.EL
reboot system boot Tue May 17 09:23 (2+08:52) 2.4.21-27.EL
reboot system boot Mon May 16 09:00 (3+09:15) 2.4.21-27.EL
reboot system boot Sun May 15 21:00 (3+21:15) 2.4.21-27.EL
reboot system boot Sun May 15 09:24 (4+08:51) 2.4.21-27.EL
reboot system boot Sat May 14 09:07 (5+09:08) 2.4.21-27.EL
reboot system boot Fri May 13 06:59 (6+11:16) 2.4.21-27.EL
reboot system boot Thu May 12 11:31 (7+06:44) 2.4.21-27.EL
reboot system boot Wed May 4 07:54 (15+10:21) 2.4.21-27.EL
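To see at a glance which days are the worst, the listing above can be tallied per day. This is only a rough sketch: boots_per_day is a made-up helper name, and it assumes the 'last -a' column layout shown above (reboot, system, boot, weekday, month, day, ...). Other versions of 'last' may print the columns differently.

```shell
#!/bin/sh
# Hypothetical helper: count "reboot" records per calendar day.
# Reads `last -a` style output on stdin and assumes this layout:
#   reboot system boot <Dow> <Mon> <Day> <HH:MM> (<uptime>) <kernel>
boots_per_day() {
    awk '$1 == "reboot" && $2 == "system" && $3 == "boot" {
             key = $5 " " $6                    # e.g. "May 26"
             if (!(key in count)) order[++n] = key
             count[key]++
         }
         END {
             # Print days in the order they first appeared (newest first).
             for (i = 1; i <= n; i++) print order[i], count[order[i]]
         }'
}
```

It would be used as 'last -a | boots_per_day'. A day that stands out may line up with a cron job, a backup window, or a load peak, which is worth ruling out even if hardware turns out to be the cause.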
 
update and bump

I thought maybe it was the MySQL process: I had set a new flag for it to use a different database type, and MySQL CPU usage was quite high.

The server then stayed up for almost 2 days straight last week. Then something made it reboot early in the morning.

And it's been downhill since. Today at 12:30 it didn't even reboot, it just hung.

I NEED ideas on how to troubleshoot.

I've disabled almost all the cron jobs.

I'm running Red Hat with Plesk 7.5.2.

reboot system boot Wed Jun 8 12:48 (00:20) 2.4.21-27.EL
reboot system boot Wed Jun 8 11:08 (02:01) 2.4.21-27.EL
reboot system boot Wed Jun 8 05:25 (07:44) 2.4.21-27.EL
reboot system boot Wed Jun 8 02:49 (10:19) 2.4.21-27.EL
reboot system boot Tue Jun 7 02:52 (1+10:17) 2.4.21-27.EL
reboot system boot Tue Jun 7 02:36 (1+10:33) 2.4.21-27.EL
reboot system boot Tue Jun 7 02:20 (1+10:49) 2.4.21-27.EL
reboot system boot Mon Jun 6 03:24 (2+09:44) 2.4.21-27.EL
reboot system boot Sun Jun 5 16:43 (2+20:25) 2.4.21-27.EL
reboot system boot Sun Jun 5 04:36 (3+08:33) 2.4.21-27.EL
reboot system boot Sat Jun 4 10:11 (4+02:57) 2.4.21-27.EL
reboot system boot Fri Jun 3 04:09 (5+09:00) 2.4.21-27.EL
 
Run the command 'dmesg' and see what it says. It'll probably have more useful info than what you see in the logs.
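As a starting point for what to look for, the dmesg output can be filtered for messages that often go along with hardware or filesystem trouble. This is a sketch only: scan_kernel_log is a made-up name and the pattern list is an illustrative guess, not a complete list of what the kernel can report.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that commonly accompany
# hardware faults, kernel crashes, or filesystem damage.
# The patterns are an assumed starting list; extend them as needed.
scan_kernel_log() {
    grep -i -E 'machine check|oops|panic|i/o error|out of memory|ext[23]-fs (error|warning)|e2fsck is recommended'
}
```

It would be used as 'dmesg | scan_kernel_log'. Keep in mind that the kernel ring buffer starts fresh on every boot, so messages printed just before a crash will not be in dmesg afterwards; whatever made it to /var/log/messages before the reset is worth filtering the same way.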
 
dmesg, but what to look for?

I see a few things to troubleshoot in the dmesg output. After all the reboots, the file system is whacked.

I'll run e2fsck on reboot tonight. But how does this help me find the culprit process, or whatever else is causing it?

EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide1(22,1), internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
loop: loaded (max 8 devices)
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
 
Well, this assumes one particular process is constantly corrupting the file system. The file system may have been corrupted once and an fsck will clean it up.

But, I'm not sure what would be the best way to see if a particular process is doing the damage. :(
 
Generally, if your server is randomly rebooting, it's not going to be a process causing this. It's going to be either (a) a kernel bug or (b) hardware (CPU/RAM/HD).

Obviously, upgrading the kernel is the easier/cheaper option, so I'd try that first. Otherwise, I'd contact your host and let them know they probably need to look into possible hardware issues with your server.

Regular processes can NOT cause the server to hard reboot like yours is doing. The fact that it's forcing an fsck on reboot shows that the system never has time to do a proper shutdown (i.e., unmounting the filesystems fully).
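One way to check the "no proper shutdown" point is 'last -x', which also lists shutdown records. A boot with no shutdown record before it suggests the box was reset or lost power rather than being shut down normally. The sketch below counts such boots; unclean_boots is a made-up name, and it assumes the first column reads "reboot" or "shutdown" and that the listing is newest first, as 'last' normally prints it.

```shell
#!/bin/sh
# Hypothetical helper: count boots that were not preceded by a shutdown record.
# Reads `last -x` style output on stdin, newest record first.
unclean_boots() {
    awk '$1 == "reboot" || $1 == "shutdown" {
             # "prev" is the next-newer record. Two reboot records in a row
             # mean the newer boot had no shutdown logged before it.
             if (prev == "reboot" && $1 == "reboot") unclean++
             prev = $1
         }
         END { print unclean + 0 }'
}
```

It would be used as 'last -x | unclean_boots'. The oldest boot in the log is never counted, since there is nothing earlier to compare it with, and a rotated wtmp file will hide older history.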
 