Knowledge Base

CRMd High CPU Load Detected in a Pacemaker Cluster

This article explains the “High CPU load detected” messages logged by Pacemaker’s Cluster Resource Management daemon (CRMd), and how you can adjust the threshold that triggers them.

The following question gets asked fairly often:

What sets the threshold for this load average?

crmd: notice: throttle_handle_load: High CPU load detected

This message never indicates a problem with the cluster itself. It indicates that the cluster has observed a high operating system load.

The Pacemaker project lead gave this authoritative answer a few years ago:

Those messages indicate there is a real issue with the CPU load. When the cluster notices high load, it reduces the number of actions it will execute at the same time. This is generally a good idea, to avoid making the load worse.

The messages don’t hurt anything, they just let you know that there is something worth investigating.

If you’ve investigated the load and it’s not something to be concerned about, you can change load-threshold to adjust what the cluster considers high. The load-threshold cluster property works like this:

  • It defaults to 0.8 (which means Pacemaker should try to avoid consuming more than 80% of the system’s resources).
  • On a single-core machine, load-threshold is multiplied by 0.6 (because with only one core you really don’t want to consume too many resources). On a multi-core machine, load-threshold is multiplied by the number of cores (to normalize the system load per core).
  • That number is then multiplied by 1.2 to get the level that triggers the Noticeable CPU load detected message (debug level), by 1.6 for the Moderate CPU load message, and by 2.0 for the High CPU load message. These levels are compared against the 1-minute system load average (the same number that top, uptime, and similar tools report).
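The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this article, not Pacemaker's actual code: the function names and the FACTORS table are made up for the example, and the sketch assumes a message appears once the load exceeds the computed level.

```python
import os

# Multipliers as described in this article; the names are illustrative only.
FACTORS = {"Noticeable": 1.2, "Moderate": 1.6, "High": 2.0}


def load_message_thresholds(cores, load_threshold=0.8):
    """Return the 1-minute load average level for each message."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Single-core machines scale by 0.6; multi-core machines by the core count.
    scale = 0.6 if cores == 1 else cores
    target = load_threshold * scale
    return {name: target * factor for name, factor in FACTORS.items()}


def classify(load, thresholds):
    """Return the highest message level that `load` exceeds, or None."""
    level = None
    for name, limit in sorted(thresholds.items(), key=lambda item: item[1]):
        if load > limit:  # assumption: a message needs the load to exceed the level
            level = name
    return level


def current_level(load_threshold=0.8):
    """Classify this machine's current 1-minute load average (Unix only)."""
    cores = os.cpu_count() or 1
    one_minute_load = os.getloadavg()[0]
    return classify(one_minute_load, load_message_thresholds(cores, load_threshold))
```

With the default of 0.8 on a 4-core machine, the sketch puts the High level at 0.8 × 4 × 2.0 = 6.4, so a 1-minute load average of 7.0 would be classified as High, while 5.5 would only be Moderate.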

So, if you raise load-threshold above 0.8, you won’t see the log messages until the load gets even higher. However, that does nothing about the actual load problem, if there is one.
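If you have decided that a given load is acceptable, the same rules can be inverted to estimate how far load-threshold would have to be raised. The helper below is a hypothetical sketch under the multipliers described in this article (0.6 for a single core, the core count otherwise, and 2.0 for the High message); it is not part of Pacemaker.

```python
def min_load_threshold(acceptable_load, cores, factor=2.0):
    """Estimate the load-threshold at which `acceptable_load` just reaches
    the level for a message with the given factor (2.0 = High)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Same scaling as in the article: 0.6 for one core, core count otherwise.
    scale = 0.6 if cores == 1 else cores
    return acceptable_load / (scale * factor)
```

For example, on an 8-core machine where a 1-minute load average of 16 is considered normal, the sketch gives 16 / (8 × 2.0) = 1.0, so the High message would only appear once the load climbs above 16 with load-threshold set to that value.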


Reviewed 2020/12/02 - DGT