

### RunBook for Rechat Ingester

#### Where do I find Rechat Ingester?
 * Three AWS EC2 boxes run the Rechat ingester: [EC2 console, filtered to rechat-ingester](https://us-west-2.console.aws.amazon.com/ec2/v2/home?region=us-west-2#Instances:search=rechat-ingester;sort=Name)
 * Only one of the boxes runs the ingester at any given time (the other two are backups). The current leader can be found at [Consul](http://consul.internal.justin.tv/ui/dist/#/sfo01/kv/service/rechat_ingester/leader/edit)
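The leader key can also be read from the command line. The sketch below assumes the Consul HTTP API is reachable at the same host as the UI link above, and that the key and datacenter match the ones in that link (`service/rechat_ingester/leader` in `sfo01`). The variable and function names are illustrative, and the format of the stored value is not documented here.

```shell
#!/bin/sh
# Assumption: the Consul HTTP API answers on the same host as the UI link.
CONSUL_ADDR="${CONSUL_ADDR:-http://consul.internal.justin.tv}"
CONSUL_DC="${CONSUL_DC:-sfo01}"
LEADER_KEY="service/rechat_ingester/leader"

# Build the KV read URL; "raw" asks Consul for the bare value instead of JSON.
leader_url() {
    echo "$CONSUL_ADDR/v1/kv/$LEADER_KEY?dc=$CONSUL_DC&raw"
}

# Print whatever value is stored under the leader key.
current_leader() {
    curl -fsS "$(leader_url)"
}
```

Calling `current_leader` from a machine inside the internal network should print the stored value; `leader_url` alone is handy for checking the request before sending it.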

#### Where can I see the ingester graphs?
 * [http://grafana.prod.us-west2.justin.tv/dashboard/db/rechat-ingester](http://grafana.prod.us-west2.justin.tv/dashboard/db/rechat-ingester)
 * Alarms are set on the regular chat and event chat ingest rates (Graph 1, green and yellow), the chat index rate (Graph 2, the rate at which chats are indexed into Elasticsearch), and the Clearchat delete rate (Graph 6).

#### Where can I see the ingester output?
* The ingester writes its output to `/var/log/jtv/rechat_ingester.log`. Note that the two backup ingesters produce no output; they just wait in a loop to become the leader.
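Since only the leader writes to the log, a quick way to tell whether a box is actively ingesting is to check whether the log was modified recently. This is a minimal sketch: the function name and the 5-minute default are assumptions, and `find -mmin` is a GNU/BSD extension rather than POSIX.

```shell
#!/bin/sh
LOG_FILE="${LOG_FILE:-/var/log/jtv/rechat_ingester.log}"

# Succeeds if the log was modified within the last N minutes (default 5).
# A missing log file counts as "not recently written".
log_recently_written() {
    minutes="${1:-5}"
    [ -n "$(find "$LOG_FILE" -mmin "-$minutes" 2>/dev/null)" ]
}
```

On the leader, `log_recently_written && tail -n 50 "$LOG_FILE"` shows the latest output. On a backup box a stale or empty log is expected.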

#### Where is the Rechat Elasticsearch cluster?
* [https://us-west-2.console.aws.amazon.com/es/home?region=us-west-2#rechat:dashboard](https://us-west-2.console.aws.amazon.com/es/home?region=us-west-2#rechat:dashboard)


#### Common troubleshooting
* The easiest first step when alarms go off is to restart the ingester on the boxes: `sudo svc -d /etc/service/rechat_ingester`, then `sudo svc -u /etc/service/rechat_ingester`.
* It's possible that either the regular chat firehose or the event chat firehose is down. The chat team will know their status.
* If the indexing alarm goes off, something may be wrong with Elasticsearch. Look at the graphs for the Elasticsearch cluster (link above) and troubleshoot from there.
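The restart step can be wrapped in a small helper. This is a sketch rather than an established script: the `restart_ingester` name and the `DRY_RUN` switch are assumptions, while the service path and the `svc -d` / `svc -u` calls come from the step above.

```shell
#!/bin/sh
SERVICE_DIR="${SERVICE_DIR:-/etc/service/rechat_ingester}"

# Bring the daemontools service down, then back up.
# With DRY_RUN=1 the commands are printed instead of executed.
restart_ingester() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo svc -d $SERVICE_DIR"
        echo "sudo svc -u $SERVICE_DIR"
        return 0
    fi
    sudo svc -d "$SERVICE_DIR" && sudo svc -u "$SERVICE_DIR"
}
```

Running `DRY_RUN=1 restart_ingester` first shows exactly what will be executed. After a real restart, `sudo svstat /etc/service/rechat_ingester` (also part of daemontools) reports whether the service is up and for how long.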


#### Elasticsearch troubleshooting
* Elasticsearch graphs can be found in the Monitoring tab.
* The two indicators to look at are CPU Utilization and JVM Memory Pressure (normally not more than 85%). If either is high, a quick fix is to configure the cluster to increase the number of instances and/or upgrade to larger instance types. [This link](https://aws.amazon.com/elasticsearch-service/pricing/) lists the available Elasticsearch instance types and their specifications.
