# Statsite 

* [Main Build](../statsite/Dockerfile)
* [Plumbago Build](../statsite/build-plumbago.sh)

### carbon-relay-ng


## Container Env Vars

**required**

* `GRAPHITE_HOST` and `GRAPHITE_PORT` - configure the upstream Graphite endpoint (typically the carbon-relay-ng cluster)

**optional**

* `STATSD_PORT` - port for the UDP (statsd) protocol. (default `8125`)
* `TCP_PORT` - port for TCP stats. (default `8125`)
* `LOG_LEVEL` - maps to [log_level](https://github.com/statsite/statsite#configuration-options). (default `INFO`)
* `FLUSH_INTERVAL` - how often to flush stats, in seconds. It is critically important to keep this aligned with the carbon-relay interval. (default `10`)
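
As a rough illustration of how the required and optional variables above combine, here is a minimal Python sketch. It is not how the container actually resolves its configuration (that happens in the Dockerize template); the `resolve_settings` helper and the dictionary layout are hypothetical, and only the variable names and defaults come from the list above.

```python
import os

# Defaults as documented above. The dict layout is illustrative only.
DEFAULTS = {
    "STATSD_PORT": "8125",
    "TCP_PORT": "8125",
    "LOG_LEVEL": "INFO",
    "FLUSH_INTERVAL": "10",
}
REQUIRED = ("GRAPHITE_HOST", "GRAPHITE_PORT")


def resolve_settings(env=None):
    """Return the effective settings, raising if a required variable is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required env vars: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # Optional variables fall back to their documented default.
        settings[name] = env.get(name, default)
    return settings
```

Calling `resolve_settings({"GRAPHITE_HOST": "relay.example", "GRAPHITE_PORT": "2003"})` (a made-up host and port) would fill in the four optional values with their defaults.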


## Configuration File Explanation

[The config file](../statsite/statsite.conf.tmpl) is templated using [Dockerize](https://github.com/jwilder/dockerize), which uses a Go-template-based language that looks similar to Jinja. Env vars can be hoisted in from the container. The [statsite config docs](https://github.com/statsite/statsite#configuration-options) explain the output config in detail.

[The main block](https://git.xarth.tv/edge/voncount/blob/0936696296dd90ed7a027a2537fcbfc0f71cee07/statsite/statsite.conf.tmpl#L2-L5) pulls in the basic env var configuration (hosts/ports).

[The `stream_cmd`](https://git.xarth.tv/edge/voncount/blob/0936696296dd90ed7a027a2537fcbfc0f71cee07/statsite/statsite.conf.tmpl#L6) sets up [Plumbago](https://git.xarth.tv/edge/plumbago) to be invoked every `FLUSH_INTERVAL` seconds to send stats.


### Prefixes

The `global_prefix` sets the prefix placed before all stats. We set it to `stats.` to adhere to our old setup.

Furthermore, `counts_prefix` is set to `counters.`, which is different from the statsd defaults. This means that emitting a counter `foo` uses both prefixes and comes out as `stats.counters.foo` in graphite-land.
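
The way the two prefixes stack can be sketched in a few lines of Python. This only mirrors the concatenation described above; the `counter_path` helper is a hypothetical name, not something statsite exposes.

```python
# Prefix values as described above.
GLOBAL_PREFIX = "stats."
COUNTS_PREFIX = "counters."


def counter_path(name):
    """Build the Graphite path for a counter: global prefix, then counts prefix, then name."""
    return GLOBAL_PREFIX + COUNTS_PREFIX + name


print(counter_path("foo"))  # stats.counters.foo
```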

### Aggregation choices

`extended_counters_include=count,rate,sum` includes those three sub-aggregations. That means every counter ends up emitting four metrics to Graphite: the original counter plus the three aggregations.

```
timers_include=count,lower,upper
quantiles=0.5,0.9,0.99
```

These settings emit six Graphite metrics for every timer: count, lower, upper, p50, p90, and p99.
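
The metric counts in this section can be checked with a small Python sketch that parses the same comma-separated values. It is only an arithmetic aid under the assumptions stated in the text: the helper names are hypothetical, and the `p50`-style labels follow the shorthand used above rather than asserting the exact suffixes statsite writes to Graphite.

```python
# Values copied from the config lines discussed above.
EXTENDED_COUNTERS_INCLUDE = "count,rate,sum"
TIMERS_INCLUDE = "count,lower,upper"
QUANTILES = "0.5,0.9,0.99"


def timer_aggregations(timers_include, quantiles):
    """List the per-timer aggregations: the included stats plus one label per quantile."""
    names = timers_include.split(",")
    # 0.5 -> p50, 0.9 -> p90, 0.99 -> p99 (labels are shorthand, not exact suffixes)
    names += ["p%d" % round(float(q) * 100) for q in quantiles.split(",")]
    return names


def counter_metric_count(extended_include):
    """The original counter plus one metric per extended aggregation."""
    return 1 + len(extended_include.split(","))


print(timer_aggregations(TIMERS_INCLUDE, QUANTILES))
print(counter_metric_count(EXTENDED_COUNTERS_INCLUDE))
```

Adding an aggregation to either setting grows the per-metric fan-out by one, which is worth keeping in mind when estimating Graphite load.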

## Scaling

The number of statsite shards can be increased or decreased in response to
changes in throughput requirements with minimal downtime (in practice only 1-2
minutes of inaccurate metrics).

1. Decide how many shards you want. Recall that each shard should have its own vCPU.
1. Add necessary EC2 capacity by editing the Auto Scaling Group for statsite.
1. Increase the task count in the ECS console. The scheduling strategy will BinPack these tasks onto the hosts you added in step 2.
1. Tail the `go-statsd-proxy` logs to make sure it detects the change.
1. Backport the changes to `cluster_size` into Terraform so we don't undo the changes in the future.
1. Go get yourself a cookie, that was hard work!
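
For the capacity step, a back-of-the-envelope calculation can help decide how many hosts the Auto Scaling Group needs. The sketch below assumes identical hosts dedicated to statsite and one vCPU per shard, as recommended above; `hosts_needed` and `vcpus_per_host` are hypothetical names, so substitute the real vCPU count of your instance type.

```python
import math


def hosts_needed(shards, vcpus_per_host):
    """Smallest number of identical hosts that gives every shard its own vCPU."""
    if shards < 1 or vcpus_per_host < 1:
        raise ValueError("shards and vcpus_per_host must be positive")
    return math.ceil(shards / vcpus_per_host)


# Example: 6 shards on hypothetical 4-vCPU hosts need 2 hosts.
print(hosts_needed(6, 4))
```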
