# Emergency Contact

Create an incident on the [Notify PagerDuty service](https://twitchoncall.pagerduty.com/services/P2Z0TGZ).

To find the person on call, type `!oncall` in any channel, then look for the "Growth" entry.

# Deployments

After merging, deployments are automatically promoted to prod via [Amazon Pipelines](https://pipelines.amazon.com/pipelines/TwitchEmailValidatorService). Promotions respect working hours.
 
# Checking service health

In most cases, PagerDuty alerts on service issues. If users are impacted but PagerDuty has not alerted, check the following:

1. Check the [Grafana Dashboard](https://grafana.xarth.tv/d/000001420/growth-email-validator) for changes in trends and increases in errors.
1. Check [Rollbar](https://rollbar.com/Twitch/EmailValidator/) for errors.
1. Check [AWS Status](https://status.aws.amazon.com/). This service depends on EC2, ELB, SNS, and DynamoDB in Oregon (us-west-2).

# Data Recovery Plan

The Email Validator relies on a DynamoDB table to store validation information. This table is backed up continuously
via [Point-in-Time Recovery](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/PointInTimeRecovery_Howitworks.html).
In case of data loss, the table can be recovered by following this [Data Recovery Plan](https://docs.google.com/document/d/1L2F81PkV33xAqEvBfNF88OJSzNdnxW-hUyvo5wKRlHc/edit?usp=sharing).

## DynamoDB Tables

* `validations-prod`

# Service-Level Agreement

The Email Validator API has the following latency SLA for each endpoint.

| Endpoint                                     | SLA (ms) |
|----------------------------------------------|----------|
| AddVerificationRequest                       | 1000     |
| GetVerificationRequest                       | 1000     |
| GetVerificationRequestByOpaqueID             | 1000     |
| Verify                                       | 100      |
| VerifyCode                                   | 100      |
| RegenerateCode                               | 350      |
| Delete                                       | 1000     |
| DeleteVerificationRequestByKey               | 1000     |
| ListVerificationRequestByKey                 | 1000     |
| Reject                                       | 1000     |
| Unreject                                     | 1000     |
| NotMe                                        | 1000     |
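
The table above can be turned into a quick check when comparing dashboard numbers against the SLA. The following is a minimal sketch, not part of the service: `SLA_MS` copies the thresholds from the table, while `sla_breaches` and its `observed_ms` input are hypothetical names used for illustration. This document does not state which latency percentile the SLA applies to, so the sketch simply compares whatever value is supplied.

```python
# SLA thresholds in milliseconds, copied from the table above.
SLA_MS = {
    "AddVerificationRequest": 1000,
    "GetVerificationRequest": 1000,
    "GetVerificationRequestByOpaqueID": 1000,
    "Verify": 100,
    "VerifyCode": 100,
    "RegenerateCode": 350,
    "Delete": 1000,
    "DeleteVerificationRequestByKey": 1000,
    "ListVerificationRequestByKey": 1000,
    "Reject": 1000,
    "Unreject": 1000,
    "NotMe": 1000,
}


def sla_breaches(observed_ms):
    """Return {endpoint: (observed, limit)} for endpoints over their SLA.

    `observed_ms` is a hypothetical mapping of endpoint name to an observed
    latency in milliseconds (for example, a value read off the dashboard).
    Endpoints that are not in SLA_MS are ignored.
    """
    return {
        endpoint: (latency, SLA_MS[endpoint])
        for endpoint, latency in observed_ms.items()
        if endpoint in SLA_MS and latency > SLA_MS[endpoint]
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {"Verify": 140, "RegenerateCode": 300, "Delete": 1000}
    print(sla_breaches(sample))
```

A latency exactly equal to the threshold is treated as within SLA here; that is an assumption of this sketch.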


# Troubleshooting

Please don't hesitate to ask in #notifications for help in troubleshooting.

## Common errors

**DynamoDB Read/Writes Throttling Exception**

What it means: This can happen for either of two reasons:

1. The table has reached its provisioned throughput capacity for reads or writes. Capacity can be increased at additional cost ([more info](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ProvisionedThroughput.html)).
1. If throttling occurs even though capacity limits have not been reached, the cause may be "hot" keys (or partitions) that are accessed frequently ([more info](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html#GuidelinesForTables.AdaptiveCapacity)).

What to do: Depending on the cause and whether the throttling is sustained, either increase the read/write capacity of the DynamoDB table using the console, or investigate which keys or partitions are being accessed frequently.
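
When investigating hot keys, it can help to tally which key values dominate a sample of requests. The following is a minimal sketch under stated assumptions: it expects a list of partition-key values collected by some means (for example, from application logs), and the `hot_keys` function, its `top_n` parameter, and the 20% `min_share` threshold are all hypothetical choices for illustration, not values taken from this service.

```python
from collections import Counter


def hot_keys(accessed_keys, top_n=3, min_share=0.2):
    """Return (key, count, share) for keys that dominate a sample of accesses.

    `accessed_keys` is a hypothetical iterable of partition-key values, one
    per sampled request. `min_share` is an assumed threshold: a key is
    reported only if it accounts for at least that fraction of the sample.
    """
    counts = Counter(accessed_keys)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (key, count, count / total)
        for key, count in counts.most_common(top_n)
        if count / total >= min_share
    ]


if __name__ == "__main__":
    # Made-up sample: key "a" receives most of the traffic.
    sample = ["a"] * 6 + ["b"] * 3 + ["c"]
    for key, count, share in hot_keys(sample):
        print(f"{key}: {count} accesses ({share:.0%} of sample)")
```

A key with a large share of the sample is only a hint that one partition may be overloaded; confirm against the table's metrics before changing capacity.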

**Sum HTTPCode_(Backend | ELB)_XYZ GreaterThanOrEqualToThreshold XYZ for LoadBalancerName XYZ**

What it means: The load balancer is seeing more HTTP errors than expected. pushy, users-service, and visage call this API.

Is it urgent?: Not if the errors are temporary.

What to do: Check Rollbar and Grafana.
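
The alarm title indicates where the errors originate: `HTTPCode_Backend_*` metrics count responses generated by the instances behind the load balancer, while `HTTPCode_ELB_*` metrics count responses generated by the load balancer itself. The following is a minimal sketch that pulls those fields out of a title. It assumes the title follows the template in the heading above; `parse_alarm` is a hypothetical helper, and the load balancer name and threshold in the example are made up.

```python
import re

# Pattern based on the alarm title template in the heading above.
ALARM_RE = re.compile(
    r"HTTPCode_(?P<source>Backend|ELB)_(?P<status>[2-5]XX)\s+"
    r"GreaterThanOrEqualToThreshold\s+(?P<threshold>\S+)\s+"
    r"for\s+LoadBalancerName\s+(?P<load_balancer>\S+)"
)


def parse_alarm(title):
    """Return the alarm fields as a dict, or None if the title does not match."""
    match = ALARM_RE.search(title)
    if match is None:
        return None
    fields = match.groupdict()
    # Backend codes come from the service instances; ELB codes come from
    # the load balancer itself.
    fields["origin"] = (
        "service instances"
        if fields["source"] == "Backend"
        else "load balancer"
    )
    return fields


if __name__ == "__main__":
    # Made-up alarm title, for illustration only.
    example = (
        "Sum HTTPCode_Backend_5XX GreaterThanOrEqualToThreshold 50.0 "
        "for LoadBalancerName example-lb"
    )
    print(parse_alarm(example))
```

Backend errors point toward the service itself (start with Rollbar), while ELB errors point toward the load balancer or instance health.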
