
Issue Server Health email alerts

Agustin Cuadro

New Pleskian
Hello,

I discovered that Server Health Monitor email alerts are not being sent on any of my Plesk servers. Does anyone know what the cause could be, or has anyone had the same problem?

I checked all the possible configuration settings and could not find the problem.

Regards.
 
Have you at least found any sending attempts in the Plesk maillog?
 
On the failing server, is the Panel admin's e-mail address set properly?

Could you post two of the configuration files, one from a working server and one from a server that is not working? Please use the "Download Configuration File" button.
 
I think it is correct; otherwise the Plesk Scheduler notifications would not arrive, would they?

Here is the configuration file of the Server Health Monitor:
<?xml version="1.0" encoding="UTF-8"?>
<configdata>
<!--
IMPORTANT: To be able to change the configuration file properly,
you must be familiar with XML and XML Schema.
For information, refer to the W3Schools tutorials at
XML Tutorial and http://www.w3schools.com/schema/.

This configuration file format is described by the XML schema PRODUCT_ROOT_D/admin/plib/Health/Config/custom-health-config.xsd. Refer to the XML Schema for the information on the elements, their relationship, attributes, data types of the elements and their attributes, and fixed values for the elements and attributes. The following text explains elements and attributes the meaning of which might be not clear.


=trendInterval=
This attribute of a parameter element sets the time period (in hours) for calculating the parameter trend. Default trend interval is 1 hour.

Trend is a way to show how a parameter value changes over time. Health monitor compares the value average for the last trend interval with the value average for the interval before that.

For example, if the trend interval for the "Total number of processes" parameter ("Cpu/Processes" in this config) is 3 hours, Health Monitor calculates the average number for the last 3 hours (say, 120) and for the 3-hour period before that (say, 60). Comparing the averages, Health Monitor displays a rising trend, showing that the total number of processes increased by a factor of 2.

=minvalue=
This attribute of a parameter element affects how the trend of the parameter is displayed. In particular, it makes Health Monitor show a zero trend when a parameter changes greatly but only within a range of small, insignificant values, which would otherwise show a strongly rising trend and raise a false alarm.

The "minvalue" attribute sets a minimum value for the parameter: if a value is less than "minvalue", Health Monitor treats the value as equal to the defined minimum value.

= alarm =
This element defines when an alarm should appear. By default, every parameter has both yellow and red alarms set up. You may want to remove the alarms altogether; in that case the parameter is still monitored, but no alarms appear even if the parameter state becomes critical.

The "level" attribute defines how critical the condition is for the server health: the "yellow" alarm is meant for warnings (displayed in the Panel with label "Requires attention"), and the "red" is intended for more critical conditions (displayed in the Panel with label "Problem"). It is recommended to set lower values for yellow alarms than for the red ones.

The "type" and "threshold" attributes set the condition upon which the alarm should appear. "Type" defines the way in which the parameter or its change is evaluated, and "threshold" defines the value:

* "absolute" means the parameter value in its units of measurement.
For example:
<alarm level="yellow" type="absolute" threshold="300" />
set for the "CPU > Processes" parameter means that a yellow alarm should appear when the total number of processes on the server reaches 300.

* "percent" means that the "threshold" value is a result of dividing a current parameter value by the maximum value. The parameter maximum value is either defined in this config (by the "maxvalue" attribute of the parameter or "device" element), or taken from the system.
For example:
<alarm level="red" threshold="90" type="percent"/>
set for the "HDD > Utilization" parameter means that a red alarm should appear when the disk utilization reaches 90%.

* "trend" means that the "threshold" value is a number of times by which the parameter value has increased. The parameter values compared here are the value average for the last "trendInterval" (1 hour by default) and the average for the "trendInterval" before that.
For example:
<Services>
<CpuTime>
<service name="MySql" monitor="true">
<alarm level="red" threshold="2" type="trend"/>
</service>
...
</CpuTime>
...
</Services>

means that a red alarm should appear when CPU utilization by the MySQL service during the last hour increases by 2 times as compared to the previous hour.

= Queue =
Nested in "Cpu", this element defines settings for monitoring the "Load average" parameter. Its attribute "period" defines the amount of time against which the load averages are calculated:
* "shortterm" - for the past 1 minute
* "midterm" - for the past 5 minutes
* "longterm" - for the past 15 minutes

= Misc =
This element defines how the alarm notification e-mails should be sent.
The "alarmsInterval" attribute defines how often (in minutes) the health monitor should be checked for new alarms. As soon as such a check detects a new alarm, a notification e-mail is sent.
The "notificationSubject" attribute sets the Subject line for alarm notification e-mails.

The "notificationEmail" child element defines an alarm notification recipient. If the element is omitted, the alarm notifications are sent to the Panel admin's e-mail address.
The "address" attribute defines e-mail address to which the notifications will be sent.
The "name" attribute defines the recipient name. For example, "John Doe".

PS If you need to mass-upload the health monitor config to multiple servers, you can do so by placing it directly to the server health system.
It is very IMPORTANT that you validate your custom configuration files against the XML schema PRODUCT_ROOT_D/admin/plib/Health/Config/custom-health-config.xsd.

The custom configuration locations are as follows:
* On Unix: PRODUCT_ROOT_D/var/custom-health-config.xml
* On Windows: %plesk_dir%\admin\conf\custom-health-config.xml
-->
<Misc alarmsInterval="5"/>
<Hdd>
<TotalUsage>
<device name="disk-C" monitor="true">
<alarm level="red" threshold="90" type="percent"/>
<alarm level="yellow" threshold="80" type="percent"/>
</device>
<device name="disk-E" monitor="true">
<alarm level="red" threshold="90" type="percent"/>
<alarm level="yellow" threshold="80" type="percent"/>
</device>
</TotalUsage>
<Bandwidth>
<device name="disk-C" monitor="true"/>
<device name="disk-E" monitor="true"/>
</Bandwidth>
</Hdd>
<Ram>
<TotalUsage>
<alarm level="red" threshold="95" type="percent"/>
<alarm level="yellow" threshold="90" type="percent"/>
</TotalUsage>
<SwapUsage>
<alarm level="red" threshold="90" type="percent"/>
<alarm level="yellow" threshold="80" type="percent"/>
</SwapUsage>
<SwapInOut/>
</Ram>
<Cpu>
<TotalUsage>
</TotalUsage>
<Queue period="shortterm"/>
<Processes/>
</Cpu>
<Network>
<Bandwidth>
<device name="Ethernet 2" monitor="true"/>
</Bandwidth>
</Network>
<Services>
<CpuTime>
<service name="Web" monitor="true"/>
<service name="Mail" monitor="true"/>
<service name="MySql" monitor="true"/>
<service name="Panel" monitor="true"/>
</CpuTime>
<MemoryUsage>
<service name="Web" monitor="true">
<alarm level="red" threshold="25" type="percent"/>
<alarm level="yellow" threshold="20" type="percent"/>
</service>
<service name="Mail" monitor="true">
<alarm level="red" threshold="25" type="percent"/>
<alarm level="yellow" threshold="20" type="percent"/>
</service>
<service name="MySql" monitor="true">
<alarm level="red" threshold="25" type="percent"/>
<alarm level="yellow" threshold="20" type="percent"/>
</service>
<service name="Panel" monitor="true">
<alarm level="red" threshold="25" type="percent"/>
<alarm level="yellow" threshold="20" type="percent"/>
</service>
</MemoryUsage>
</Services>
</configdata>

It always worked, but one day it stopped, supposedly without any changes being made.
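
The posted file contains only <Misc alarmsInterval="5"/> with no "notificationEmail" child element, so according to the comment in the file the alarm e-mails should go to the Panel admin's address. A minimal sketch for checking this on any configuration file, assuming a POSIX shell with grep; the function name alert_recipient_kind is made up for illustration, and it matches text rather than parsing the XML:

```shell
#!/bin/sh
# Sketch: report where Health Monitor alarm e-mails would go for a config file.
# Text matching only, not a real XML parse; the function name is illustrative.

# Prints "custom" if the file contains a <notificationEmail ...> element,
# otherwise "panel-admin" (the fallback described in the config comment).
alert_recipient_kind() {
    if grep -q '<notificationEmail' "$1"; then
        echo "custom"
    else
        echo "panel-admin"
    fi
}

# When a path is given as the first argument, report on that file.
if [ -n "${1:-}" ]; then
    alert_recipient_kind "$1"
fi
```

Saved as a script, it takes the path of the downloaded or custom configuration file as its only argument. If it prints "panel-admin", the next thing to verify is the Panel admin's e-mail address itself.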
 
For anyone finding this thread, this might be the solution:
Log in to your server via SSH as root and run:
service psa-health-monitor-notificationd status

The output should say something like this:
Active: active (running)

On my server the current status was:
Active: failed

To fix, type:
service psa-health-monitor-notificationd restart

Now check the status again.
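
The status check and restart above can be combined into a small script. This is only a sketch: it assumes the service name from this post, that a healthy service prints a line containing "Active: active (running)" as shown above, and that it is run as root. The function names and the --run switch are illustrative.

```shell
#!/bin/sh
# Sketch: restart the Health Monitor notification daemon only if it is not running.
SERVICE="psa-health-monitor-notificationd"   # service name taken from the post above

# Succeeds only when the given status text contains the line a running service prints.
is_running() {
    printf '%s\n' "$1" | grep -q 'Active: active (running)'
}

# Query the service status and restart the service when it is not reported as running.
ensure_running() {
    status_output=$(service "$SERVICE" status 2>&1 || true)
    if is_running "$status_output"; then
        echo "$SERVICE is running"
    else
        echo "$SERVICE is not running, restarting"
        service "$SERVICE" restart
    fi
}

# Only touch the service when explicitly asked to with --run.
if [ "${1:-}" = "--run" ]; then
    ensure_running
fi
```

Saved as a script and started with --run as root, it leaves a running service alone and restarts a failed one. It does not explain why the service failed, so the logs are still worth a look if the failure comes back.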
 