# deploying

To deploy changes to Zenyatta code, use [clean-deploy](https://clean-deploy.internal.justin.tv/#/d8a/zenyatta). You may
need to run Puppet afterwards.

To run Puppet, run `sudo puppet agent --test` on each instance (currently master and 4 workers).
You can get hostnames from [clean-deploy](https://clean-deploy.internal.justin.tv/#/d8a/zenyatta/status?env=production).
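
A minimal sketch of looping over the instances, assuming you have SSH access to each host and pass in the hostnames copied from clean-deploy. The function name and the `SSH_CMD` override (for dry runs) are hypothetical. Because `--test` implies `--detailed-exitcodes`, Puppet exits with 2 when it applied changes, so the loop treats both 0 and 2 as success.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run Puppet on every host given as an argument.
# SSH_CMD is an assumption for dry runs, e.g. SSH_CMD=echo.
run_puppet_on_hosts() {
    local ssh_cmd="${SSH_CMD:-ssh}"
    local host rc failed=0
    for host in "$@"; do
        echo "== ${host}"
        rc=0
        $ssh_cmd "$host" sudo puppet agent --test || rc=$?
        # 0 = no changes, 2 = changes applied; anything else is a failure.
        if [ "$rc" -ne 0 ] && [ "$rc" -ne 2 ]; then
            echo "puppet failed on ${host} (exit ${rc})" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Call it as `run_puppet_on_hosts <master-hostname> <worker-hostname> ...`. If sudo needs a terminal on the remote side, `SSH_CMD="ssh -t"` should work since the variable is expanded unquoted.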

# new cluster

* uses terraform in `zenyatta/terraform`
* `zenyatta/terraform/<environment>/variables.tf` contains settings such as the AMI, which you may need to update (for example, because of Meltdown)
    * https://git-aws.internal.justin.tv/d8a/zenyatta/blob/master/terraform/production/zenyatta/variables.tf#L37
    * If you just need to change the postgres configuration on the AMI, you can just spin up the existing AMI using awscli
      and modify it in-place to make a new AMI.
* bring up 1 more worker than you need
    * edit the value here to change how many workers there are: https://git-aws.internal.justin.tv/d8a/zenyatta/blob/master/terraform/production/zenyatta/variables.tf#L2
* ssh into this temporary worker, and run:
```bash
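sudo su - airflow
# bring the supervised worker down (-d)
sudo svc -d /etc/service/airflow-worker
# remove the airflow service directories from this instance
rm -rf /etc/service/airflow*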
```
* and now create a new ami from this worker. once the ami is created, terminate this extra worker
```bash
# using the awscli you can check out the existing image:
aws --region us-west-2 --profile twitch-web-aws ec2 describe-images --image-ids ami-c5f00abd
{
    "Images": [
        {
            "Architecture": "x86_64",
            "CreationDate": "2017-10-03T19:05:50.000Z",
            "ImageId": "ami-c5f00abd",
            "ImageLocation": "641044725657/zenyatta-ephemeral-6",
            "ImageType": "machine",
            "Public": false,
            "OwnerId": "641044725657",
            "State": "available",
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/sda1",
                    "Ebs": {
                        "Encrypted": false,
                        "DeleteOnTermination": true,
                        "SnapshotId": "snap-0f8171f87c96a250d",
                        "VolumeSize": 32,
                        "VolumeType": "gp2"
                    }
                }
            ],
            "Description": "",
            "EnaSupport": true,
            "Hypervisor": "xen",
            "Name": "zenyatta-ephemeral-6",
            "RootDeviceName": "/dev/sda1",
            "RootDeviceType": "ebs",
            "SriovNetSupport": "simple",
            "VirtualizationType": "hvm"
        }
    ]
}
# now create new ami
aws --region us-west-2 --profile twitch-web-aws ec2 create-image --instance-id i-1234567890abcdef0 --name "zenyatta-ephemeral" --description "node for zenyatta pitr of non-rds instances"
```
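
`create-image` returns before the AMI is usable, so it helps to wait for the image before terminating the temporary worker. A minimal sketch, reusing the region and profile from the commands above; the function name and the `AWS` override (for dry runs) are hypothetical.

```bash
# AWS defaults to the real CLI; AWS=echo gives a dry run (assumption).
AWS="${AWS:-aws}"

# Hypothetical helper: wait for the new AMI, then terminate the temp worker.
finish_ami_build() {
    local image_id="$1" instance_id="$2"
    # Block until the image is in the "available" state; only then
    # terminate the instance the image was built from.
    $AWS --region us-west-2 --profile twitch-web-aws \
        ec2 wait image-available --image-ids "$image_id" &&
    $AWS --region us-west-2 --profile twitch-web-aws \
        ec2 terminate-instances --instance-ids "$instance_id"
}
```

Call it as `finish_ami_build <new-ami-id> <temporary-worker-instance-id>`, using the `ImageId` that `create-image` printed.
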
* make the ami accessible across accounts by adding the account number that will be using it to the existing policy for the ami.
* update the default ami value here: https://git-aws.internal.justin.tv/systems/puppet/blob/master/modules/zenyatta/manifests/params.pp#L10
    * this is the simplest way to ensure it gets rolled out everywhere
* run puppet on zenyatta nodes you just created (master and workers)
* check the status of airflow at `ip.of.new.master:8080/`
* if there are no connections or visible DAGs, you might have to run `init_db.py`
```bash
sudo su - airflow
cd /opt/twitch/zenyatta/current
source /etc/zenyatta/zenyatta.env
python init_db.py
```
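
For the cross-account step earlier in this section, here is a sketch using the awscli. It assumes the "policy" in question is the AMI's launch permissions, which is the usual way to share an AMI with another account. The function names and the `AWS` override are hypothetical.

```bash
# AWS defaults to the real CLI; AWS=echo gives a dry run (assumption).
AWS="${AWS:-aws}"

# Grant one more account permission to launch instances from the AMI.
share_ami_with_account() {
    local image_id="$1" account_id="$2"
    $AWS --region us-west-2 --profile twitch-web-aws ec2 modify-image-attribute \
        --image-id "$image_id" \
        --launch-permission "Add=[{UserId=${account_id}}]"
}

# List the accounts that can currently launch the AMI.
show_ami_launch_permissions() {
    local image_id="$1"
    $AWS --region us-west-2 --profile twitch-web-aws ec2 describe-image-attribute \
        --image-id "$image_id" --attribute launchPermission
}
```

`Add=` leaves the existing permissions in place, so running `show_ami_launch_permissions <ami-id>` before and after is a quick way to confirm that only the new account was added.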

# zenyatta config
* for cluster-wide config changes, there are 2 files you have to edit:
    * `puppet/hiera/cluster/zenyatta-master.yaml`
    * `puppet/hiera/cluster/zenyatta-worker.yaml`
* the `zenyatta:connections:` hash will populate a file `/etc/zenyatta/connections.yaml`
    * this file is the input for `/opt/twitch/zenyatta/current/init_db.py`

# workers aren't picking up new tasks
* ssh into a worker
* run:
```bash
sudo su - airflow
sudo svc -d /etc/service/airflow-worker

ps -ef | grep airflow
# check the output for any tasks like [celeryd: celery@zenyatta-worker-0567f94025330a93b:ForkPoolWorker-49]
# that are still running
cd /opt/twitch/zenyatta/current
source /etc/zenyatta/zenyatta.env
# this will start 1 celery worker
airflow worker -c 1
# now you should see some output, and eventually an error or the worker will successfully pick
# up new tasks.
# if it picks up new tasks, the worker likely missed a config change and/or a refresh and just
# had to be restarted, so restart the worker now after ctrl+c'ing out of this one you just started
sudo svc -u /etc/service/airflow-worker
```
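
If you would rather not eyeball the `ps` output, the check for leftover pool processes can be sketched as below. It assumes the process titles contain `celeryd: ` as in the example above; both function names are hypothetical.

```bash
# Count Celery pool processes in `ps -ef` output read from stdin.
# The [c] bracket keeps grep from matching its own command line.
count_celery_procs() {
    grep -c '[c]eleryd: ' || true
}

# Hypothetical helper: poll until no pool processes remain, or give up
# after the given number of seconds (default 120).
wait_for_celery_exit() {
    local timeout="${1:-120}" waited=0
    while [ "$(ps -ef | count_celery_procs)" -gt 0 ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 5
        waited=$((waited + 5))
    done
}
```

Run `wait_for_celery_exit` after `svc -d` and before starting the manual worker. A non-zero return means tasks were still running when the timeout expired.
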
* in other cases you may need to restart the scheduler, or run it manually
* to do that, ssh into the master and run:
```bash
sudo su - airflow
sudo svc -d /etc/service/airflow-scheduler

ps -ef | grep airflow | grep scheduler
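# this will start the scheduler manually so you can inspect its output
source /etc/zenyatta/zenyatta.env
airflow scheduler
# when you are done, ctrl+c out and bring the supervised scheduler back up
sudo svc -u /etc/service/airflow-scheduler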
```
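
To confirm whether a supervised service is up, you can check its state with `svstat`. This sketch assumes daemontools' `svstat` is installed alongside `svc` and prints lines of the form `<dir>: up (pid N) S seconds`. The function name is hypothetical.

```bash
# Succeeds if the svstat line read from stdin reports the service as up.
service_is_up() {
    grep -q ': up (pid [0-9]*)'
}
```

For example, `sudo svstat /etc/service/airflow-scheduler | service_is_up && echo "scheduler is up"`. The same check should work for `/etc/service/airflow-worker` on the workers.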
