# PagerDuty Incident: CPU Utilization GreaterThanOrEqualToThreshold

### Incident Detail

#### Symptoms

CloudWatch raises an alarm when a Grid node's CPU utilization reaches a dangerous level.

The PagerDuty incident is triggered with a title like:
- `Average CPUUtilization of 94.5 GreaterThanOrEqualToThreshold 75.0 for InstanceId i-0523c0f82a756c25f`
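When triaging, it can help to pull the measured value, the threshold, and the instance ID out of the incident title. The following is a minimal sketch that assumes the title always follows the format shown above. The function name `parse_cpu_alert` and the regular expression are illustrative and are not part of any existing tooling.

```python
import re

# Pattern for the alert title format shown above; other alarms may differ.
ALERT_PATTERN = re.compile(
    r"Average CPUUtilization of (?P<value>\d+(?:\.\d+)?) "
    r"GreaterThanOrEqualToThreshold (?P<threshold>\d+(?:\.\d+)?) "
    r"for InstanceId (?P<instance_id>i-[0-9a-f]+)"
)


def parse_cpu_alert(title):
    """Return (value, threshold, instance_id), or None if the title does not match."""
    match = ALERT_PATTERN.search(title)
    if match is None:
        return None
    return (
        float(match.group("value")),
        float(match.group("threshold")),
        match.group("instance_id"),
    )


example = (
    "Average CPUUtilization of 94.5 GreaterThanOrEqualToThreshold 75.0 "
    "for InstanceId i-0523c0f82a756c25f"
)
print(parse_cpu_alert(example))
```

The instance ID is the value to look up when deciding which node to connect to.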

[Grafana "Average CPU Utilization"](https://grafana.internal.justin.tv/dashboard/db/smoca?panelId=15&fullscreen) will also show a spike.

This degrades performance on the Grid system: browsers time out or take longer to respond to requests.

#### Likelihood of Occurrence

Low

#### Example

https://twitchoncall.pagerduty.com/incidents/P1MV5M3

### Cause

This often occurs when browsers are unable to close, so old browser processes linger while new tests create new ones.

Generally this shouldn't happen, but if the connection to a browser is interrupted, the browser may not be able to close.

### Resolution

Step 1:
Check Grid Average CPU Utilization on [Health Metrics][Health Metrics].

Step 2:
Determine the problem severity.

- Severe: Utilization is high. Smoca tests are also failing. The node is extremely unstable.
  - **Action:** Reboot the box immediately.
- Dangerous: Utilization is high. Tests are **not** failing.
  - **Action:** Reboot the box when it's not being used. See the platform-specific steps below.
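The severity decision above can be summarized as a small function. This is only a sketch of the rule as written in this runbook. The function name `recommended_action` and its boolean inputs are hypothetical, and judging whether utilization is "high" or tests are failing is still left to the on-call engineer.

```python
def recommended_action(utilization_high, tests_failing):
    """Map the two observations from Step 2 to the action this runbook recommends."""
    if not utilization_high:
        # Outside the two cases described in this runbook.
        return "Not covered by this runbook: keep monitoring."
    if tests_failing:
        return "Severe: reboot the box immediately."
    return "Dangerous: reboot the box when it is not being used."


print(recommended_action(utilization_high=True, tests_failing=True))
print(recommended_action(utilization_high=True, tests_failing=False))
```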

#### Linux

[SSH into the machine][Connecting to Nodes]

Find out which processes are using the most resources by running `top` ([documentation on top](https://www.lifewire.com/linux-top-command-2201163)).
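If a non-interactive snapshot is more convenient than `top`, the same question can be answered from `ps` output. The sketch below assumes Python 3.7+ and a `ps` that supports `-eo pid,pcpu,comm` are available on the node. The helper names are illustrative. Note that `ps` reports CPU usage averaged over each process's lifetime, so it complements `top` rather than replacing it.

```python
import subprocess


def parse_ps_output(text, limit=5):
    """Parse `ps -eo pid,pcpu,comm` output into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, cpu, command = parts
        try:
            rows.append((int(pid), float(cpu), command))
        except ValueError:
            continue  # ignore rows that do not parse as numbers
    # Highest CPU consumers first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def top_cpu_processes(limit=5):
    """Run ps on the local machine and return the heaviest CPU consumers."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps_output(result.stdout, limit)
```

Calling `top_cpu_processes()` on the node returns the busiest processes first. Many lingering browser processes near the top of that list would be consistent with the cause described in this runbook.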

[Prepare to perform maintenance](/resources/docs/grid/grid_maintenance.md), then remove the Docker container and recreate it.

Still happening a few hours later? Try [changing to a debug node and VNC'ing into the container](/resources/docs/grid/connecting_to_nodes.md#vnc-into-linux-grid-node).

#### Windows

[VNC into the machine][Connecting to Nodes]

Are browsers dead and hanging? To see whether any sessions are currently active, focus the Command Prompt window and check whether the last message is something like "session completed".

If there are no active sessions, you can close the browsers.

[Connecting to Nodes]: /resources/docs/grid/connecting_to_nodes.md
[Health Metrics]: https://grafana.internal.justin.tv/dashboard/db/smoca
