# Valence Logs And Metrics

## Real-Time Metrics

| Type                                                                                                                                               | Example Metrics                         | Notes                                           |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- | ----------------------------------------------- |
| [ECS Cluster Stats](https://grafana.xarth.tv/d/W_7jvyyZz/valence)                                                                                  | Memory, CPU, Response Times, etc        |                                                 |
| [CloudWatch Dashboard]()                                                                                                                           | GQL Errors / Timings, Request by Region | Requires Isengard: "twitch-cpe+mweb@amazon.com" |
| [Fastly Grafana (TODO)]()                                                                                                                          | Total Requests, Status Code Rates, etc  |                                                 |
| [Sentry Clientside Errors](https://sentry.io/organizations/twitch/releases/?environment=production&project=5207323)                                | Error Rates by Device Segmentation      |                                                 |
| [Video Playback Metrics](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#dashboards:name=VLC-Product-Metrics;start=PT1H) | video-play, minute_watched, video_error |                                                 |
| [Fastly Dashboard (TODO)]()                                                                                                                        |                                         |                                                 |
| [GraphQL Dependency Health](https://grafana.xarth.tv/d/qe2X2XAmz/graphql-dependency-overview?from=now-1h&to=now-1m)                                | Availability %, Error Rates, etc        |                                                 |

## Weekly Metrics

| Type                           | Example Metrics             | Notes |
| ------------------------------ | --------------------------- | ----- |
| [Release Adoption (TODO)]()    | Build ID traffic over time  |       |
| [Device Segmentation (TODO)]() | OS, OS Version, Device, etc |       |

## Accessing Logs

### Server Logs

To access logs for Valence production and canary environments, authorize for the
"twitch-cpe+mweb@amazon.com" account via Isengard.

Note that the prebuilt Insights queries linked below all default to the last 15
minutes, which is generally what you want when you've just been paged.

#### Filtering On Unexpected Errors

To search for errors in the tachyon production and canary logs, use this
[prebuilt Insights query](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logsV2:logs-insights$3FqueryDetail$3D$257E$2528end$257E0$257Estart$257E-900$257EtimeType$257E$2527RELATIVE$257Eunit$257E$2527seconds$257EeditorString$257E$2527fields*20*40timestamp*2c*20*40message*0a*7c*20filter*20*40message*20like*20*2f*28*3fi*29error*2f*0a*7c*20filter*20*40message*20not*20like*20*2f*28Failed*20GQL*20request*7cHandling*20request*20for*29*2f*0a*7c*20sort*20*40timestamp*20desc$257EisLiveTail$257Efalse$257EqueryId$257E$252768348535-e20a-484d-bc0a-c25f8f9b37d0$257Esource$257E$2528$257E$2527valence_wxytehjpbfegnjhpvtrn$2529$2529).
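For reference, the query text is embedded in the link's `editorString` parameter. The sketch below decodes it, assuming the console escapes characters as `*XX` hex pairs (percent-encoding with `*` in place of `%`). That scheme is inferred from the link itself, and `decode_editor_string` is a hypothetical helper, not an AWS API.

```python
from urllib.parse import unquote

# editorString value copied from the prebuilt link above.
ENCODED_QUERY = (
    "fields*20*40timestamp*2c*20*40message*0a"
    "*7c*20filter*20*40message*20like*20*2f*28*3fi*29error*2f*0a"
    "*7c*20filter*20*40message*20not*20like*20"
    "*2f*28Failed*20GQL*20request*7cHandling*20request*20for*29*2f*0a"
    "*7c*20sort*20*40timestamp*20desc"
)


def decode_editor_string(encoded: str) -> str:
    """Turn the console's `*XX` escapes back into plain query text.

    Assumption: `*XX` is a hex escape equivalent to `%XX`.
    """
    return unquote(encoded.replace("*", "%"))


print(decode_editor_string(ENCODED_QUERY))
```

Decoded, the query keeps any message containing "error" (case-insensitive), drops lines matching `Failed GQL request` or `Handling request for`, and sorts newest first.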

After you've found an error instance that needs a closer look, an easy way to
get more context is to run an Insights query like
[this one](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logsV2:logs-insights$3FqueryDetail$3D$257E$2528end$257E0$257Estart$257E-900$257EtimeType$257E$2527RELATIVE$257Eunit$257E$2527seconds$257EeditorString$257E$2527fields*20*40message*0a*7c*20filter*20*40logStream*20*3d*20*27sin*2ftachyon*2f8c3a413f-3fae-4a4c-80c8-5d8280432c70*27*0a*7c*20filter*20*40ingestionTime*20*3e*201571534285120*20and*20*40ingestionTime*20*3c*201571534285130*0a$257EisLiveTail$257Efalse$257EqueryId$257E$252768348535-e20a-484d-bc0a-c25f8f9b37d0$257Esource$257E$2528$257E$2527valence_wxytehjpbfegnjhpvtrn$2529$2529),
copying the error's `@logStream` value and narrowing to within 5-10 ms of the
error's `@ingestionTime`.
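The follow-up query can also be generated rather than edited by hand. This is a minimal sketch: `follow_up_query` and its `margin_ms` parameter are hypothetical names, and the query shape mirrors the linked example.

```python
def follow_up_query(log_stream: str, ingestion_time_ms: int, margin_ms: int = 5) -> str:
    """Build an Insights query for one log stream around one ingestion time.

    `ingestion_time_ms` is the error's @ingestionTime (epoch milliseconds);
    `margin_ms` is how far to look on either side of it.
    """
    start = ingestion_time_ms - margin_ms
    end = ingestion_time_ms + margin_ms
    return (
        "fields @message\n"
        f"| filter @logStream = '{log_stream}'\n"
        f"| filter @ingestionTime > {start} and @ingestionTime < {end}"
    )


# The log stream comes from the linked example. The ingestion time is an
# assumed midpoint, chosen so the bounds match the ones in that link.
print(follow_up_query("sin/tachyon/8c3a413f-3fae-4a4c-80c8-5d8280432c70", 1571534285125))
```

Paste the printed query into the Insights editor with the same log group selected. Widen `margin_ms` if the surrounding lines don't show up.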

#### GraphQL Errors

To search for GQL errors in the tachyon production and canary logs, use this
[prebuilt Insights query](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logsV2:logs-insights$3FqueryDetail$3D$257E$2528end$257E0$257Estart$257E-900$257EtimeType$257E$2527RELATIVE$257Eunit$257E$2527seconds$257EeditorString$257E$2527fields*20*40timestamp*2c*20*40message*0a*7c*20filter*20*40message*20like*20*2f*28*3fi*29error*2f*0a*7c*20filter*20*40message*20like*20*2f*28Failed*20GQL*20request*29*2f*0a*7c*20sort*20*40timestamp*20desc$257EisLiveTail$257Efalse$257EqueryId$257E$252768348535-e20a-484d-bc0a-c25f8f9b37d0$257Esource$257E$2528$257E$2527valence_wxytehjpbfegnjhpvtrn$2529$2529).
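If the console is inconvenient, the same query could in principle be run through the CloudWatch Logs API. The sketch below assumes a `client` object that behaves like `boto3.client("logs")` (using its `start_query` and `get_query_results` calls). `run_insights_query` is a hypothetical helper, and the query text is decoded from the link above.

```python
import time

# Query text decoded from the prebuilt GQL errors link.
GQL_ERRORS_QUERY = """fields @timestamp, @message
| filter @message like /(?i)error/
| filter @message like /(Failed GQL request)/
| sort @timestamp desc"""


def run_insights_query(client, log_group: str, query: str,
                       window_seconds: int = 900, poll_seconds: float = 1.0):
    """Run an Insights query over the last `window_seconds`; return rows as dicts.

    Assumption: `client` exposes start_query / get_query_results like boto3's
    CloudWatch Logs client.
    """
    end = int(time.time())
    started = client.start_query(
        logGroupName=log_group,
        startTime=end - window_seconds,  # epoch seconds
        endTime=end,
        queryString=query,
    )
    # Poll until the query leaves the in-progress states.
    while True:
        response = client.get_query_results(queryId=started["queryId"])
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    # Each row is a list of {"field": ..., "value": ...} cells.
    return [{cell["field"]: cell["value"] for cell in row} for row in response["results"]]
```

A call might look like `run_insights_query(boto3.client("logs", region_name="us-west-2"), "valence_wxytehjpbfegnjhpvtrn", GQL_ERRORS_QUERY)`. The log group name is taken from the link's source parameter, and credentials for the Isengard account are still required.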

To see if GraphQL errors were localized to a specific region, use this
[prebuilt Insights query](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logsV2:logs-insights$3FqueryDetail$3D$257E$2528end$257E0$257Estart$257E-900$257EtimeType$257E$2527RELATIVE$257Eunit$257E$2527seconds$257EeditorString$257E$2527filter*20*40message*20like*20*2f*28Failed*20GQL*20request*29*2f*0a*7c*20parse*20*40logStream*20*22*2a*2ftachyon*2f*2a*22*20as*20*40region*2c*20*40whatever*0a*7c*20stats*20count*28*2a*29*20as*20*40count*20by*20*40region*0a*7c*20sort*20*40count*20desc$257EisLiveTail$257Efalse$257EqueryId$257E$252768348535-e20a-484d-bc0a-c25f8f9b37d0$257Esource$257E$2528$257E$2527valence_wxytehjpbfegnjhpvtrn$2529$2529).
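
Decoded, the regional query parses `@logStream` with the pattern `*/tachyon/*`, treats the first capture as the region, and counts matching lines per region. The sketch below mimics that grouping locally to show what the result means. `count_by_region` is a hypothetical helper, and every sample stream name is made up (the `sin` prefix is borrowed from the example stream in the link above).

```python
from collections import Counter


def count_by_region(log_streams):
    """Count entries per region, taking the region to be the text before '/tachyon/'."""
    regions = Counter()
    for stream in log_streams:
        region, separator, _rest = stream.partition("/tachyon/")
        if separator:  # skip streams that do not match the expected pattern
            regions[region] += 1
    # Highest count first, like `sort @count desc`.
    return regions.most_common()


# Hypothetical @logStream values, one per failed GQL request log line.
sample_streams = [
    "sin/tachyon/aaaa",
    "sjc/tachyon/bbbb",
    "sin/tachyon/cccc",
]
print(count_by_region(sample_streams))
```

If one region's count dominates, the failures are likely localized to that region. Roughly even counts suggest a wider problem.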
