With any application, it's important to have metrics that measure the health of that application.

Smoca currently supports monitoring metrics within [qa-smoca-lambchops](http://jenkins-master-0.prod.us-west2.justin.tv/job/qa-smoca-lambchops/). This is enabled by the `TRACKING=true` environment variable in that Jenkins job. It is **not recommended** to set this variable locally or in any other project, as it will skew the results.

A [health dashboard already exists in Grafana][1].

## Metric Configuration

These are Statsd Metrics specifically for health monitoring. Mixpanel metrics for jobs other than lambchops are not currently supported.

Tracking metrics are sent to Statsd on port 8125. The port is defined in [StatsHelpers](#resources).
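For orientation, here is a minimal sketch of what sending a metric to Statsd looks like on the wire. It is not the StatsHelpers implementation: the `statsd_line` and `send_stat` names and the `localhost` host are assumptions made for illustration, and only port 8125 comes from this document. Statsd accepts plain-text UDP datagrams of the form `<name>:<value>|<type>`, where `g` marks a gauge and `c` a counter.

```ruby
require 'socket'

# Hypothetical defaults; only the port (8125) comes from this document.
STATSD_HOST = 'localhost'
STATSD_PORT = 8125

# Build a Statsd datagram: "<name>:<value>|<type>".
# The type is "g" for a gauge and "c" for a counter.
def statsd_line(name, value, type)
  "smoca.#{name}:#{value}|#{type}"
end

# Send one datagram over UDP. Statsd is fire-and-forget,
# so there is no response to read back.
def send_stat(line, host: STATSD_HOST, port: STATSD_PORT)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
ensure
  socket&.close
end

puts statsd_line('duration.login', 1.5, 'g') # smoca.duration.login:1.5|g
```

Because the transport is UDP, a missing or unreachable Statsd server does not fail the test run; the metric is simply dropped.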

## Existing Tracking Metrics

### Datasource: Graphite

#### Durations

| Description | Endpoint | Notes |
|-------------|----------|-------|
| Successful test's rspec duration | `stats.gauges.smoca.sucess.duration` | |
| Failed test's rspec duration | `stats.gauges.smoca.failure.duration` | Keeping success and failure separate lets the chart show when tests are failing |
| Duration by each parallel process | `stats.gauges.smoca.processes.*.*` | `*.*` matches every series; each one is named `#{process_num}.#{result}` |

#### Failure Indications

| Description | Endpoint | Notes |
|-------------|----------|-------|
| Failed Login Rate | `stats.gauges.smoca.debug.login_failure.count` | Represented as a percentage |
| Login Failure Errors | `stats.counters.smoca.warnings.login.*.count` | `*` combines all of the different errors reported |
| Net Read Timeouts | `stats.gauges.smoca.warnings.net_read_timeout` | |
| API Failures - Gmail | `stats.counters.smoca.warnings.api.gmail.*.count` | `*` combines all of the different errors reported |
| API Failures - Recurly | `stats.counters.smoca.warnings.api.recurly.*.count` | `*` combines all of the different errors reported |

### Datasource: Cloudwatch
#### [Selenium Grid](docs/grid/README.md)
| Description | Endpoint | Notes |
|-----------|---------|-------|
| CPU Utilization | `us-west-2.AWS/EC2.CPUUtilization` | Instance ID must be provided in the Dimensions
| AWS Network In | `us-west-2.AWS/EC2.NetworkIn` | Instance ID must be provided in the Dimensions
| AWS Network Out | `us-west-2.AWS/EC2.NetworkOut` | Instance ID must be provided in the Dimensions


## Contributing to monitoring stats
Our [StatsHelpers](#resources) file already contains methods for interacting with Statsd. These methods automatically prepend `smoca.` to the metric name.

As an example, if you wanted to gauge an arbitrary value, such as seconds taken to log in, you could do this:

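```ruby
require './core/helpers/stats.rb'
include StatsHelpers

# Time the login step
pre_start = Time.now
login('test', 'test')
total_duration = (Time.now - pre_start) # elapsed seconds, as a Float

# Report the duration as a gauge; the `smoca.` prefix is added automatically
gauge('duration.login', total_duration)
```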

The example above sends the stat to the following endpoint:
`stats.gauges.smoca.duration.login`
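To work out where a new metric will appear in Graphite, the mapping can be sketched as below. `graphite_path` is a hypothetical helper written for illustration and is not part of StatsHelpers. It assumes Statsd's default Graphite namespacing, which matches the endpoints listed in the tables above: gauges land under `stats.gauges.`, and counters land under `stats.counters.` with a `.count` suffix. The `timeout` error name in the second call is also made up.

```ruby
# Hypothetical helper: maps a metric name passed to a StatsHelpers-style
# method onto the Graphite path where Statsd publishes it.
PREFIX = 'smoca'

def graphite_path(type, name)
  case type
  when :gauge   then "stats.gauges.#{PREFIX}.#{name}"
  when :counter then "stats.counters.#{PREFIX}.#{name}.count"
  else raise ArgumentError, "unsupported metric type: #{type}"
  end
end

puts graphite_path(:gauge, 'duration.login')
# stats.gauges.smoca.duration.login

puts graphite_path(:counter, 'warnings.api.gmail.timeout')
# stats.counters.smoca.warnings.api.gmail.timeout.count
```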


## Resources
- [Health Dashboard][1]
- [StatsUtils File](core/utils/stats_utils.rb)

[1]: http://grafana.prod.us-west2.justin.tv/dashboard/db/smoca
