# Tomorrow Logs & Metrics

## Real-Time Metrics

| Type                                                                                                                             | Example Metrics                         | Notes                                           |
| -------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- | ----------------------------------------------- |
| [ECS Cluster Stats](https://grafana.xarth.tv/d/43ogtYfWk/tomorrow)                                                               | Memory, CPU, Response Times, etc        |                                                 |
| [CloudWatch Dashboard](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#dashboards:name=TMW;start=PT3H) | GQL Errors / Timings, Request by Region | Requires Isengard: "twitch-cpe+mweb@amazon.com" |
| [Fastly Grafana](https://grafana.xarth.tv/d/oAzqN1dMz/twilight-fastly-realtime-cdn?orgId=1&var-Service=Mobile%20Web)             | Total Requests, Status Code Rates, etc  |                                                 |
| [Video Playback Metrics](https://grafana.internal.justin.tv/d/AP0EfFkWz/mobile-web-video-playback-metrics?refresh=1m&orgId=1)    | video-play, minute_watched, video_error |                                                 |
| [Sentry Clientside Errors](https://sentry.io/organizations/twitch/issues/?project=5214452)                                       | Error Rates by Device Segmentation      |                                                 |
| [GraphQL Dependency Health](https://grafana.xarth.tv/d/qe2X2XAmz/graphql-dependency-overview?from=now-1h&to=now-1m)              | Availability %, Error Rates, etc        |                                                 |
| [Fastly Dashboard](https://manage.fastly.com/stats/real-time/services/5qKeZ5T0TKLd66StWKgM9p/datacenters/all)                    |                                         |                                                 |

## Weekly Metrics

| Type                                                                    | Example Metrics             | Notes |
| ----------------------------------------------------------------------- | --------------------------- | ----- |
| [Release Adoption](https://app.mode.com/twitch/reports/c5c08f791138)    | Build ID traffic over time  |       |
| [Device Segmentation](https://app.mode.com/twitch/reports/42aa480753a0) | OS, OS Version, Device, etc |       |
| [Network Stats](https://app.mode.com/twitch/reports/c142a1a91a8d)       | Network Type, Speed, etc    |       |

## Accessing Logs

### Server Logs

To access logs for Tomorrow production and canary environments, authorize for
the "twitch-cpe+mweb@amazon.com" account via Isengard.

Note that the Insights queries linked below default to the last 15 minutes,
which is usually the window you want when you've just been paged.

#### Filtering On Unexpected Errors

To search for errors in the tachyon production and canary logs, use this
[prebuilt Insights query](<https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logs-insights:queryDetail=~(end~0~start~-900~timeType~'RELATIVE~unit~'seconds~editorString~'fields*20*40timestamp*2c*20*40message*0a*7c*20filter*20*40message*20like*20*2f*28*3fi*29error*2f*0a*7c*20filter*20*40message*20not*20like*20*2f*28Failed*20GQL*20request*7cHandling*20request*20for*29*2f*0a*7c*20sort*20*40timestamp*20desc~isLiveTail~false~queryId~'68348535-e20a-484d-bc0a-c25f8f9b37d0~source~(~'canary~'tachyon))>).
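
It can help to see the query text behind the link without opening the console.
The console URL appears to escape characters in `editorString` as `*` followed
by two hex digits (for example, `*20` for a space). A minimal sketch that
decodes it, assuming that escaping scheme holds; `decode_editor_string` is a
hypothetical helper name, not part of any AWS tooling:

```python
import re


def decode_editor_string(encoded: str) -> str:
    """Turn the console's `*XX` hex escapes back into plain query text."""
    return re.sub(
        r"\*([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        encoded,
    )


# editorString value copied from the prebuilt unexpected-errors link above.
ENCODED_QUERY = (
    "fields*20*40timestamp*2c*20*40message*0a"
    "*7c*20filter*20*40message*20like*20*2f*28*3fi*29error*2f*0a"
    "*7c*20filter*20*40message*20not*20like*20*2f*28Failed*20GQL*20request"
    "*7cHandling*20request*20for*29*2f*0a"
    "*7c*20sort*20*40timestamp*20desc"
)

if __name__ == "__main__":
    print(decode_editor_string(ENCODED_QUERY))
```

The decoded query matches any message containing "error" (case-insensitive),
excludes lines matching `Failed GQL request` or `Handling request for`, and
sorts newest first.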

Once you've found an error instance that needs closer investigation, an easy
way to get more context is to run an Insights query like
[this one](<https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logs-insights:queryDetail=~(end~0~start~-900~timeType~'RELATIVE~unit~'seconds~editorString~'fields*20*40message*0a*7c*20filter*20*40logStream*20*3d*20*27sin*2ftachyon*2f8c3a413f-3fae-4a4c-80c8-5d8280432c70*27*0a*7c*20filter*20*40ingestionTime*20*3e*201571534285120*20and*20*40ingestionTime*20*3c*201571534285130*0a~isLiveTail~false~queryId~'68348535-e20a-484d-bc0a-c25f8f9b37d0~source~(~'canary~'tachyon))>),
copying the error's `@logStream` value and narrowing to within 5-10 ms of the
error's `@ingestionTime`.
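
The narrowing step can be sketched as a small helper that fills in the query
text for you. This is only an illustration: `build_context_query` and its
`window_ms` parameter are hypothetical names, and the query shape is taken
from the linked example.

```python
def build_context_query(log_stream: str, ingestion_time_ms: int, window_ms: int = 5) -> str:
    """Build an Insights query for one log stream around an ingestion time.

    `ingestion_time_ms` is the error's `@ingestionTime` (epoch milliseconds);
    `window_ms` is how far to look on either side of it.
    """
    start = ingestion_time_ms - window_ms
    end = ingestion_time_ms + window_ms
    return (
        "fields @message\n"
        f"| filter @logStream = '{log_stream}'\n"
        f"| filter @ingestionTime > {start} and @ingestionTime < {end}"
    )


if __name__ == "__main__":
    # Log stream from the linked example; the timestamp is the midpoint of
    # the range used in that example query.
    print(
        build_context_query(
            "sin/tachyon/8c3a413f-3fae-4a4c-80c8-5d8280432c70",
            1571534285125,
        )
    )
```

Paste the printed text into the Insights editor with the `canary` and
`tachyon` log groups selected, and widen `window_ms` if the surrounding
messages are not enough.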

#### GraphQL Errors

To search for GQL errors in the tachyon production and canary logs, use this
[prebuilt Insights query](<https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logs-insights:queryDetail=~(end~0~start~-900~timeType~'RELATIVE~unit~'seconds~editorString~'fields*20*40timestamp*2c*20*40message*0a*7c*20filter*20*40message*20like*20*2f*28*3fi*29error*2f*0a*7c*20filter*20*40message*20like*20*2f*28Failed*20GQL*20request*29*2f*0a*7c*20sort*20*40timestamp*20desc~isLiveTail~false~queryId~'68348535-e20a-484d-bc0a-c25f8f9b37d0~source~(~'canary~'tachyon)))>).

To see if GraphQL errors were localized to a specific region, use this
[prebuilt Insights query](<https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logs-insights:queryDetail=~(end~0~start~-900~timeType~'RELATIVE~unit~'seconds~editorString~'filter*20*40message*20like*20*2f*28Failed*20GQL*20request*29*2f*0a*7c*20parse*20*40logStream*20*22*2a*2ftachyon*2f*2a*22*20as*20*40region*2c*20*40whatever*0a*7c*20stats*20count*28*2a*29*20as*20*40count*20by*20*40region*0a*7c*20sort*20*40count*20desc~isLiveTail~false~queryId~'68348535-e20a-484d-bc0a-c25f8f9b37d0~source~(~'canary~'tachyon))>)
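
The region query works by splitting each `@logStream` name around
`/tachyon/` and counting failed GQL requests per prefix. The same grouping can
be sketched locally, for example against log stream names exported from a
query result. `count_by_region` is a hypothetical helper, and the
`<region>/tachyon/<id>` layout is assumed from the parse pattern in the linked
query.

```python
from collections import Counter


def count_by_region(log_streams):
    """Count log streams per region prefix, most frequent first.

    Mirrors: parse @logStream "*/tachyon/*" as @region, @whatever
             | stats count(*) as @count by @region
             | sort @count desc
    """
    counts = Counter()
    for stream in log_streams:
        region, separator, _rest = stream.partition("/tachyon/")
        if separator:  # skip names that do not match the pattern
            counts[region] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical stream names; only the "sin" prefix appears in this document.
    sample = ["sin/tachyon/aaa", "sin/tachyon/bbb", "pdx/tachyon/ccc"]
    for region, count in count_by_region(sample):
        print(region, count)
```

If one region dominates the counts, the failures are likely localized rather
than global.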
