# iceman

A database migration tool for relational databases such as PostgreSQL and MySQL.

Installation
------------
You can get iceman by running `go get code.justin.tv/d8a/iceman/cmd/iceman` or `apt-get install iceman`.

Setup
-----
In the root directory where you want to manage your database migrations, make sure there is a subdirectory named "iceman". Inside it, create a configuration file called "dbconf.yaml" that specifies your environments and database details. The driver parameter is optional; if omitted, `postgres` is assumed. A sample config file is shown below:
```
development:
  [driver: <mysql> | <postgres> | ...]
  database: host=1.2.3.4 port=5678 dbname=mydb user=dev sslmode=disable

staging:
  [driver: <mysql> | <postgres> | ...]
  database: host=1.2.3.4 port=5678 dbname=mydb user=staging sslmode=disable

production:
  [driver: <mysql> | <postgres> | ...]
  database: host=1.2.3.4 port=5678 dbname=mydb user=prod sslmode=disable
  
<environment>:
  [driver: <mysql> | <postgres> | ...]
  database: host=1.2.3.4 or /var/run/postgresql port=5678 dbname=mydb user=prod sslmode=disable

```

Migrations Workflow
--------
#### Create a New Migration
To create a new migration file, run:
```iceman create migration <YourMigrationName>```
This generates a YAML file in [migration management root directory] > iceman > migrations, named `[Timestamp of Creation]_<YourMigrationName>.yaml`. If there is no "migrations" directory, it will be created. Edit the up and down sections by filling in the queries with valid SQL and specifying a timeout length. You can include multiple queries, and YAML supports multi-line strings if you want to format long queries. Here is a simple example with a single query:
```
# Up is executed when this migration is applied.
up:
  operations:
    -
      queries:
        - CREATE TABLE IF NOT EXISTS my_table (
          user_id integer primary key,
          name varchar not null
          )
      timeout: 
      txn: true
  timeout: 2000

# Down is executed when this migration is rolled back.
down:
  operations:
    -
      queries:
        - DROP TABLE my_table
      timeout:
      txn: true
  timeout: 500

```
* **_operations_**: the list of individual SQL batches to be executed for the migration. Most migrations have only one operation, executed as a transaction. However, sometimes you may want a migration to include multiple transactions, or to run some sections outside of a transaction.
* **_queries_**: the SQL you want to run (for up/down).
* **_timeout_**: time in milliseconds. Each operation has its own timeout value, and up/down each have a global timeout value used as the default for their operations. If an operation's timeout is left blank, the global value is used. If the global timeout is also left blank (or set to 0), there is no timeout. To give a single operation no timeout, set its timeout to 0 explicitly.
* **_txn_**: set this to false if you want the operation to be non-transactional (true by default). Each individual query in a non-transactional operation may be executed on a separate connection, so connection state should not be modified. Additionally, manually beginning or ending transactions in any operation is forbidden.
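The timeout rules above can be summarized in a few lines. This is only an illustrative sketch of the described behavior, not iceman's actual code; the function name `effective_timeout_ms` is hypothetical, and `None` stands in for a value left blank in the YAML file.

```python
from typing import Optional


def effective_timeout_ms(op_timeout: Optional[int],
                         global_timeout: Optional[int]) -> Optional[int]:
    """Return the timeout in milliseconds for one operation, or None for no timeout.

    None models a blank YAML value (an assumption made for this sketch).
    """
    # A blank operation timeout falls back to the up/down global value.
    chosen = op_timeout if op_timeout is not None else global_timeout
    # Blank at both levels, or an explicit 0, means no timeout at all.
    if chosen is None or chosen == 0:
        return None
    return chosen


print(effective_timeout_ms(None, 2000))  # operation inherits the global value
print(effective_timeout_ms(0, 2000))     # explicit 0 disables the timeout
```

In the first example migration above, the single operation leaves `timeout` blank, so under these rules it would run with the global value of 2000 ms.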

#### Apply Migrations
Running ```iceman up``` applies all unapplied migrations in your local directory to the database. If there are several, they are executed in order of ascending creation time. Within a migration, multiple queries in a transaction are executed serially. When a migration succeeds, a corresponding entry is added to a database table called "iceman_migrations", which acts as a changelog. Each entry records the migration's file name (also the primary key), name, creation time, and applied time.

If an error occurs at any point during a migration, the transaction is rolled back and the remaining migrations queued in this session do not proceed. Changes made by migrations that completed before the error remain in place.
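
The ordering and stop-on-error behavior can be pictured with a small sketch. It is not iceman's implementation: `apply_pending` and the `run` callback are hypothetical names, and it assumes the timestamp prefix in each file name is fixed-width so that sorting file names also sorts by creation time.

```python
from typing import Callable, Iterable, List, Set


def apply_pending(local_files: Iterable[str],
                  applied: Set[str],
                  run: Callable[[str], None]) -> List[str]:
    """Apply unapplied migrations oldest first, stopping at the first failure."""
    # Assumption: names look like "<timestamp>_<name>.yaml" and sort by timestamp.
    pending = sorted(f for f in local_files if f not in applied)
    done = []
    for name in pending:
        try:
            run(name)  # stands in for executing the migration's "up" section
        except Exception:
            break  # later migrations in this session do not proceed
        applied.add(name)  # models the new changelog entry
        done.append(name)
    return done
```

Migrations that succeeded before the failure stay recorded in `applied`, mirroring how their changes stay in the database.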
#### Roll Back
Running ```iceman down``` rolls back the most recently applied migration. If your local directory does not contain the file associated with this migration, an error message prompts you to update your local migrations. Otherwise the rollback proceeds, and on completion the corresponding entry in the "iceman_migrations" table is deleted.

#### View Status
Running ```iceman status``` shows the status of your database: migration names, creation times, and applied times. Unapplied migrations are marked as "pending" and appear yellow in the console. Migrations that have been applied to the database but do not exist in your local directory are also displayed, in red; they are a good indicator that you should update your local migrations directory.
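
The three groups shown by the status view amount to set comparisons between local files and changelog entries. The sketch below is only an illustration with hypothetical names (`classify`, `missing_locally`); it is not how iceman itself is written.

```python
from typing import Dict, Iterable, List


def classify(local_files: Iterable[str],
             applied_files: Iterable[str]) -> Dict[str, List[str]]:
    """Group migration file names the way the status view describes them."""
    local, applied = set(local_files), set(applied_files)
    return {
        "applied": sorted(local & applied),          # in both places
        "pending": sorted(local - applied),          # shown in yellow
        "missing_locally": sorted(applied - local),  # shown in red
    }
```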

#### Migration Examples

**_Non-Transaction Migration_** - Some statements, such as PostgreSQL's `CREATE INDEX CONCURRENTLY`, cannot be executed inside a transaction. In these cases, execute the operation outside transaction mode by setting `txn: false`:

```
# Up is executed when this migration is applied.
up:
  operations:
    -
      queries:
        - CREATE INDEX CONCURRENTLY my_idx ON my_table (user_id)
      timeout: 
      txn: false
  timeout: 2000

# Down is executed when this migration is rolled back.
down:
  operations:
    -
      queries:
        - DROP INDEX my_idx
      timeout:
      txn: true
  timeout: 500
```

**_Multiple Operations_** - There are plenty of reasons to use multiple operations in a migration. One is to weave a non-transactional operation among transactional ones.

```
# Up is executed when this migration is applied.
up:
  operations:
    - 
      queries:
        - CREATE TABLE IF NOT EXISTS my_table (
          id serial primary key,
          user_id integer unique,
          nom varchar not null
          )
        - UPDATE my_table SET nom='bob'
      timeout:
      txn: true
    -
      queries:
        - CREATE INDEX CONCURRENTLY my_idx ON my_table (user_id)
      timeout: 
      txn: false
    -
      queries:
        - DELETE FROM my_table WHERE user_id=4
        - UPDATE my_table SET user_id=4
      timeout:
      txn: true
  timeout: 2000

# Down is executed when this migration is rolled back.
down:
  operations:
    -
      queries:
        - DROP TABLE my_table
      timeout:
      txn: true
  timeout: 500
```

**_Varying Timeouts_** - Maybe you have a section that needs a different timeout than the others.  An easy way to do this is to run them as different operations, with most operations using the global migration timeout and the other having its own timeout.

```
# Up is executed when this migration is applied.
up:
  operations:
    - 
      queries:
        - CREATE TABLE IF NOT EXISTS my_table (
          id serial primary key,
          user_id integer unique,
          nom varchar not null
          )
        - UPDATE my_table SET nom='bob'
      timeout:
      txn: true
    -
      queries:
        - CREATE INDEX my_idx ON my_table (user_id)
      timeout: 0
      txn: true
    -
      queries:
        - DELETE FROM my_table WHERE user_id=4
        - UPDATE my_table SET user_id=4
      timeout:
      txn: true
  timeout: 2000

# Down is executed when this migration is rolled back.
down:
  operations:
    -
      queries:
        - DROP TABLE my_table
      timeout:
      txn: true
  timeout: 500
```

Bulk Operations
---------------
iceman can also be used to manage bulk operations!
#### Create a New Bulk Operation
Run ```iceman create bulk <YourBulkName>``` to generate a YAML file that you can edit as needed. The file is placed in a subdirectory called "bulk", which is created if it does not already exist. Here is an example of a completed one:
```
relation: user_properties

read:
  query: >
    SELECT
      id,
      CONCAT('deleteme_', name) AS new_name
    FROM
      user_properties
    WHERE
      name IN(
        'displayname', 'description'
      )
  start_id: 0

write:
  query: >
    UPDATE
      user_properties
    SET
      name = $2
    WHERE 
      id = $1
  max_writes: 100000

batch:
  size: 10000
  batch: false  

sleep: 100

verbose: true

```
* **_relation_**: the main relation that will be read
* **_read_**: initial part of operation to select necessary information for subsequent write
  * **_query_**: input your desired SQL
  * **_start_id_**: set the starting id to read from the relation
* **_write_**: segment of operation that writes to database
  * **_query_**: input your desired SQL
  * **_max_writes_**: set a cap on the number of writes; leaving this blank (or setting it to 0) means no cap
* **_batch_**: batch details
  * **_size_**: set batch size
  * **_batch<sup>1</sup>_**: set to true if you want to pass in a list of ids to the write query
* **_sleep_**: time (in milliseconds) to wait between loops
* **_verbose_**: set to true if you want to print the status during the operation

**1**: When **_batch_** is false, the write query can accept any number of placeholders. When it is set to true, the write query expects a single placeholder `$1` wrapped in an `ANY()` clause, so it will look something like:
```
DELETE FROM user_properties WHERE id = ANY($1::int[])
```
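
To make the two modes concrete, the sketch below shows how rows returned by the read query might map onto write-query parameters. It is inferred from the examples above (read columns bound to `$1`, `$2`, ... in order; a list of ids bound to `$1` in batch mode) and is an assumption about behavior, not iceman's code. The name `write_params` is hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def write_params(rows: Sequence[Tuple[Any, ...]], batch: bool) -> List[Tuple[Any, ...]]:
    """Return one parameter tuple per write-query execution."""
    if batch:
        # One execution for the whole batch: $1 is the list of ids,
        # assuming the id is the first column of each read row.
        return [([row[0] for row in rows],)]
    # One execution per row: columns bind to $1, $2, ... in order.
    return [tuple(row) for row in rows]


rows = [(1, "deleteme_displayname"), (2, "deleteme_description")]
print(write_params(rows, batch=False))
print(write_params(rows, batch=True))
```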
#### Run the Bulk Operation
To execute, run ```iceman bulk```, which runs all unfinished operations. An operation may not finish in a single run because of long runtimes or failures. Because bulk operations are non-transactional, changes already applied are not rolled back. However, a table called ```iceman_bulk``` records enough detail to let you continue an interrupted job.

#### Status Tracking
As mentioned above, a table called ```iceman_bulk``` contains information about bulk operations, both completed and unfinished. Its primary key is the name of the operation. It records the start time of execution and whether the job has been completed. There is also a ```nextRow``` column, whose value you can set as the ```start_id``` for the next run.
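
The resume pattern can be sketched as a loop that remembers the next id to process. Everything here is hypothetical and simplified (`run_bulk`, `read_batch`, and `write_row` are made-up names, and ids are assumed to be increasing integers in the first column); it only illustrates how a `nextRow`-style value lets a later run pick up where an earlier one stopped.

```python
import time
from typing import Callable, List, Tuple

Row = Tuple[int, str]


def run_bulk(read_batch: Callable[[int, int], List[Row]],
             write_row: Callable[[Row], None],
             start_id: int, batch_size: int,
             max_writes: int = 0, sleep_ms: int = 0) -> Tuple[int, bool]:
    """Process rows in batches; return (next_row, completed)."""
    next_row, writes = start_id, 0
    while True:
        # read_batch(from_id, limit) is assumed to return rows with
        # id >= from_id, ordered by id, at most `limit` of them.
        rows = read_batch(next_row, batch_size)
        if not rows:
            return next_row, True  # nothing left to read
        for row in rows:
            if max_writes and writes >= max_writes:
                return next_row, False  # cap reached; resume from next_row
            write_row(row)
            writes += 1
            next_row = row[0] + 1
        time.sleep(sleep_ms / 1000)  # pause between loops
```

A follow-up call with `start_id` set to the returned `next_row` continues after the last written row, which is the role the ```nextRow``` column plays.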
