# Necrophos

![necrophos hero](https://dota2.gamepedia.com/media/dota2.gamepedia.com/a/a6/Necrophos_icon.png)

## Overview

Necrophos is a long-term storage adapter for Prometheus. It receives scraped samples through the Prometheus remote write API
and stores those samples in CloudWatch.

Data can, in turn, be read back out through the Prometheus remote read API.

## Compatibility

Querying CloudWatch data through `necrophos` is limited by the restrictions of the CloudWatch API.
Queries must use only exact-match comparators (`=`); not-equals and regular expression comparators (`!=`, `=~`, `!~`) are not supported.
If a query matches multiple CloudWatch timeseries, each one is returned, unless the query matches more than the configured maximum (default: 25).
When issuing queries, the `statistic` label can be specified with a value of `Sum`, `Average`, `Minimum`, or `Maximum` to return the corresponding CloudWatch statistic for the matched metrics; if not specified, `Average` is used.
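
The query restrictions above can be sketched as a small validation step. This is only an illustration and not the adapter's actual code: the `Matcher` type, the `plan_query` function, and the constant names are hypothetical, chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical names for illustration; not taken from the necrophos source.
ALLOWED_STATISTICS = {"Sum", "Average", "Minimum", "Maximum"}
DEFAULT_STATISTIC = "Average"


@dataclass(frozen=True)
class Matcher:
    name: str
    op: str  # one of "=", "!=", "=~", "!~"
    value: str


def plan_query(matchers):
    """Split matchers into (statistic, equality filters), rejecting unsupported ones."""
    statistic = DEFAULT_STATISTIC
    filters = {}
    for m in matchers:
        # Only exact-match comparators can be served.
        if m.op != "=":
            raise ValueError(f"unsupported matcher {m.name}{m.op}{m.value!r}: only '=' is allowed")
        if m.name == "statistic":
            # The statistic label selects the CloudWatch statistic; it is not a filter.
            if m.value not in ALLOWED_STATISTICS:
                raise ValueError(f"unsupported statistic {m.value!r}")
            statistic = m.value
        else:
            filters[m.name] = m.value
    return statistic, filters
```

Under these assumptions, a query such as `{__name__="cpu", host="a"}` yields the default `Average` statistic plus two equality filters, while `{host=~"a.*"}` is rejected.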

## Conversion Model, Prometheus <--> CloudWatch

Prometheus and CloudWatch have surprisingly similar data models, as far as naming goes. Here is a mapping between Prometheus and CloudWatch used in this adapter:

| Concept | Prometheus | CloudWatch |
| --- | --- | --- |
| Metric Name | `__name__` label | `MetricName` API field |
| Dimensional tags | labels | `Dimensions` API field |
| Computed statistic (queries) | `statistic` label | `Statistic` API field |
| Namespace | N/A | Configured in Necrophos |
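
As a rough sketch of the mapping in the table, the function below converts one Prometheus sample into a dictionary shaped like a CloudWatch `PutMetricData` datum. The function name `to_metric_datum` is hypothetical, and sorting the dimensions is an assumption made for determinism in this example; the adapter may do this differently. The namespace is not part of the datum, since it comes from the Necrophos configuration.

```python
from datetime import datetime, timezone


def to_metric_datum(labels, value, timestamp_ms):
    """Convert one Prometheus sample into a PutMetricData-style datum dict."""
    # Every label except __name__ becomes a CloudWatch dimension.
    dimensions = [
        {"Name": name, "Value": val}
        for name, val in sorted(labels.items())
        if name != "__name__"
    ]
    return {
        # The __name__ label carries the metric name.
        "MetricName": labels["__name__"],
        "Dimensions": dimensions,
        # Prometheus remote write timestamps are milliseconds since the epoch.
        "Timestamp": datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc),
        "Value": value,
    }
```

For example, a sample with labels `{__name__="http_requests", job="api"}` would become a datum with `MetricName` set to `http_requests` and a single `job` dimension.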

## Gotchas

Raw Prometheus counters are a poor choice to store in CloudWatch. Counters increase monotonically, while CloudWatch expects to ingest individual samples,
so sending ever-increasing counter values makes CloudWatch's statistics useless (consider asking for `Sum` over samples that keep getting bigger; the result is closer to an integral).
The suggested strategy is to pre-aggregate the data with a recording rule that applies the `rate` function, so that a more gauge-like timeseries is stored in CloudWatch.
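
To make the counter problem concrete, here is a small arithmetic illustration with made-up sample values. It compares what a `Sum` over raw counter samples would report with a simple per-second rate between consecutive samples. The rate calculation is a simplification: the real Prometheus `rate` function also handles counter resets and extrapolation, which this sketch ignores.

```python
# Hypothetical counter scraped every 10 seconds over one minute;
# the underlying event rate is a steady 5 events per second.
interval_s = 10
counter = [1000, 1050, 1100, 1150, 1200, 1250]

# What a Sum statistic over the raw counter samples would report.
raw_sum = sum(counter)

# Simplified per-second rate between consecutive samples,
# roughly what a rate-based recording rule would store instead.
rates = [(b - a) / interval_s for a, b in zip(counter, counter[1:])]
average_rate = sum(rates) / len(rates)

print(raw_sum)       # 6750
print(average_rate)  # 5.0
```

The raw `Sum` of 6750 says nothing useful about the workload, while the rate series stays at the meaningful value of 5 events per second.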

Note that the figures here assume a Prometheus scrape interval of 10 seconds. Since CloudWatch buckets samples into 1 minute windows,
you can expect the `Sample Count` statistic for metrics sent through `necrophos` to be `6`. This makes the accuracy of CloudWatch `Extended Statistics` a bit dubious,
since at the 1 minute resolution they will only ever be computed over 6 samples. As a result, support for both `Sample Count` and `Extended Statistics` queries is currently not implemented.
If you would like quantile data, consider tracking the data in Prometheus as quantile data before exporting to CloudWatch.


## API

The remote read API listens at `POST /read`, and the remote write API listens at `POST /write`.

## Todo
* Create a user config
   * Namespace
   * Max metrics per ListMetrics call
   * Specify default statistic type
* Allow queries to specify a `namespace` to query from in CloudWatch
* CI pipeline
   * Deployable
   * Automatic testing / code coverage
* Implement non-equality label matcher logic
   * e.g. apply label matchers manually to resulting metrics found with `ListMetrics` API call
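
One possible shape for the last item, applying label matchers manually to `ListMetrics` results, is sketched below. Nothing here is implemented in the adapter: `matches` and `filter_metrics` are hypothetical names, matchers are represented as plain `(name, op, pattern)` tuples, and Python's `re` module stands in for the RE2 syntax that Prometheus uses, so some patterns may behave differently. The sketch assumes Prometheus semantics where regular expressions are fully anchored and a missing label is treated as an empty string.

```python
import re


def matches(op, pattern, actual):
    """Evaluate a single Prometheus-style matcher against a label value."""
    if op == "=":
        return actual == pattern
    if op == "!=":
        return actual != pattern
    if op == "=~":
        # Prometheus regex matchers are fully anchored, hence fullmatch.
        return re.fullmatch(pattern, actual) is not None
    if op == "!~":
        return re.fullmatch(pattern, actual) is None
    raise ValueError(f"unknown matcher operator {op!r}")


def filter_metrics(metrics, matchers):
    """Keep the ListMetrics-style entries that satisfy every matcher."""
    kept = []
    for metric in metrics:
        # Rebuild a Prometheus-style label set from the CloudWatch metric.
        labels = {d["Name"]: d["Value"] for d in metric.get("Dimensions", [])}
        labels["__name__"] = metric["MetricName"]
        # A missing label is compared as the empty string.
        if all(matches(op, pattern, labels.get(name, "")) for name, op, pattern in matchers):
            kept.append(metric)
    return kept
```

With this approach, the adapter would list candidate metrics using only the equality matchers CloudWatch can serve, then narrow the result client-side before requesting statistics.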
