# Analyzing Fastly Logs

Fastly fronts most of the tachyon projects, so it is useful to be able to
process Fastly logs for ad hoc analysis. IP-Sys ships Fastly logs to the
`fastly-access-logs-default-uswest2-dcd8b4d3bc4084fbd017deff4f10` bucket in the
`twitch-cpe+tools@amazon.com` account and provides us with `S3LogAccess` to
download and view them.

## Downloading Logs for a Range

Fastly produces multiple log files per day, which makes downloading logs by
hand cumbersome. The easiest way to download a useful number of logs at once is
the `aws s3 sync` command, for example:

```
aws s3 sync s3://fastly-access-logs-default-uswest2-dcd8b4d3bc4084fbd017deff4f10/mobile-web-upsell/ ./ --exclude '*' --include '2019-10-29*'
```

This downloads all the `mobile-web-upsell` Fastly logs for `2019-10-29`.
It works because the AWS CLI applies its filters in order: `--exclude '*'`
first excludes every object, and `--include '2019-10-29*'` then adds back the
objects whose names match the pattern, in this case the date October 29th, 2019.
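To pull more than one day, the same filter pair can be repeated per date. The following is a minimal sketch, not an established tool: `download_days` and `BUCKET` are hypothetical names introduced here, and it assumes the log object names begin with a `YYYY-MM-DD` date as in the example above.

```shell
# Hypothetical helper: download one service's logs for several days.
# Usage: download_days SERVICE DATE [DATE...]   (dates as YYYY-MM-DD)
BUCKET="s3://fastly-access-logs-default-uswest2-dcd8b4d3bc4084fbd017deff4f10"

download_days() {
  service=$1
  shift
  for day in "$@"; do
    # Exclude everything, then re-include only objects whose name starts with this date.
    aws s3 sync "$BUCKET/$service/" ./ --exclude '*' --include "${day}*"
  done
}
```

For example, `download_days mobile-web-upsell 2019-10-28 2019-10-29 2019-10-30` would run one sync per day. Adding `--dryrun` to the `aws s3 sync` line prints what would be copied without downloading anything, which is a cheap way to check the pattern first.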

## Processing Logs

All logs are stored gzipped, in JSON format. You can decompress the whole
directory with `gunzip *`. Once the files are unzipped, you can use `jq` to
process all the log files. A simple, yet inefficient, way to do this is:

```
cat *.log | jq .request.url | sort | uniq -c
```

Breaking this down, it will:

1. Concatenate all the log files in the directory together and pipe them to…
2. The `jq` command which will fetch the `url` field off of the `request` field
   of each object and pipe that to…
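3. `sort`, which orders the URLs so that identical ones sit next to each other
   and pipes them to…
4. `uniq -c`, which collapses each run of identical lines into a single line
   prefixed with its count.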

The result is a list of every path and query string requested in the logs,
along with the number of times each was requested.
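One possible way to avoid decompressing the files on disk is to stream them straight into `jq`. This is a sketch under a few assumptions: `top_urls` is a hypothetical helper name, the files are still compressed, and every log entry has the `request.url` field used above.

```shell
# Hypothetical helper: print the N most requested URLs from gzipped logs.
# Usage: top_urls N FILE.gz [FILE.gz...]
top_urls() {
  n=$1
  shift
  # gunzip -c writes decompressed data to stdout and leaves the .gz files in place.
  # jq -r prints raw strings, without the surrounding JSON quotes.
  # The second sort orders the counted lines numerically, highest count first.
  gunzip -c "$@" |
    jq -r '.request.url' |
    sort |
    uniq -c |
    sort -rn |
    head -n "$n"
}
```

Run in a directory of freshly downloaded logs, `top_urls 20 *.gz` would print the twenty most frequently requested URLs, each preceded by its count.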
