# E2E Testing Suite Runtime

This directory contains code and data for running a fully end-to-end set of tests against the controlplane API and converger. Tests can be as large or small as desired, and the tooling supports JSON seed data definitions for populating the controlplane DB.


## Getting Started

The following steps show creating a very simple end-to-end test and integrating it into the test runtime.

### Credentials

To run the E2E suite, credentials are needed that are allowed to assume the role `arn:aws:iam::793846415324:role/e2e-test-runner`. For users running this on their workstation, this can be achieved by having Isengard admin credentials for the `twitch-eventbus-main-test` account in the shell environment (`isencreds` can be used here: https://git-aws.internal.justin.tv/gist/nherson/673ea40dad2d23f56b43f95bf77cad1e).

### Debugging Converger Issues

If your end-to-end tests are failing due to a converger issue, it's helpful to see the converger output. To do this, simply add the `SHOW_CONVERGER_OUTPUT=true` environment variable. For example:

```
SHOW_CONVERGER_OUTPUT=true make e2e
```

### Seed Data

Use JSON seed data definitions to outline a basic infrastructure in which to perform tests. Each distinct test is seeded from JSON files in a directory named `e2e/internal/seed/data/$testName`. `$testName` is then the name of the test suite using that seed data. The following section shows a simple example. For a more complete seed schema, see [Supported Seed Schema](#supported-seed-schema).

#### `event_definitions.json`

```
[
    {
        "event_type": "MyE2ETestEvent"
    }
]
```

#### `services.json`

```
[
    {
        "service_catalog_url": "https://servicecatalog.internal.justin.tv/services/666",
        "ldap_group": "e2e-group1",
        "description": "Basic E2E seed service",
        "accounts": [
            {
                "id": "567312298267",
                "label": "twitch-eventbus-pub-test"
            }
        ]
    }
]
```
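To make the shape of the seed files concrete, here is a minimal sketch of how they could be decoded in Go. The struct names and the `parseSeed` helper are hypothetical (the real loader lives in the seed package and may look quite different), and the generic function assumes Go 1.18 or newer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EventDefinition mirrors one entry of event_definitions.json.
type EventDefinition struct {
	EventType string `json:"event_type"`
}

// Account mirrors one entry of "accounts" in services.json.
type Account struct {
	ID    string `json:"id"`
	Label string `json:"label"`
}

// Service mirrors one entry of services.json.
type Service struct {
	ServiceCatalogURL string    `json:"service_catalog_url"`
	LDAPGroup         string    `json:"ldap_group"`
	Description       string    `json:"description"`
	Accounts          []Account `json:"accounts"`
}

// parseSeed decodes a seed file, which must be a top-level JSON array,
// into a slice of T. Hypothetical helper for illustration only.
func parseSeed[T any](data []byte) ([]T, error) {
	var out []T
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, fmt.Errorf("decoding seed: %w", err)
	}
	return out, nil
}

func main() {
	events, err := parseSeed[EventDefinition]([]byte(`[{"event_type": "MyE2ETestEvent"}]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(events[0].EventType)
}
```

Seed files decoded this way need to be strict JSON: `encoding/json` rejects trailing commas.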

### Create a test using the seed

#### Using the default suite runner

In the `e2e/internal/suite` package there is a struct called `DefaultTestSuite` which implements the `test.Runner` interface and performs the most basic set of steps outlined [here](#defaulttestsuite). For a simple test that requires no additional setup or test logic, this struct can be used:

Add the following snippet to `e2e/main.go`:
```
basic, err := suite.NewDefaultTestSuite("$testName")
if err != nil {
    log.Fatal("Could not prepare test", zap.String("testName", "$testName"), zap.Error(err))
}
util.LogErrorsFatally(log, test.Run(basic, log))
```
The above code will load the seed data in `e2e/internal/seed/data/$testName`, and call the default setup, test, and clean routines for that seed data. In this case, using the above seed data, the suite will:
1. Setup - Spin up SNS topics for the `MyE2ETestEvent` event definition in each of the supported environments; place config objects in S3 for each event stream; call out to the converger
2. Test - Ensure the expected SNS topics exist; ensure the expected S3 objects exist, and that they contain the right data
3. Clean - Delete the SNS topics; delete the S3 objects

The above steps can get arbitrarily complex as more seed data is added to any given test suite.

#### Implementing custom test logic

To make a test that requires more complex logic than what the `DefaultTestSuite` provides, a wrapping struct can be used to provide additional functionality. Suppose a suite were to be created using the above seed data, but with an additional test that checks the permissions of the SNS topic given that it has no publishers declared.

First, create a wrapping struct that uses the `DefaultTestSuite` as a base, and implement the same interface:

`e2e/internal/suite/snspermissions.go`:
```
package suite

import (
    // ...
)

var _ test.Runner = &SNSPermissionsTestSuite{}

type SNSPermissionsTestSuite struct {
	Default *DefaultTestSuite // Use the default suite as ground work

	errors []error // record errors local to this test suite's checks
}

func NewSNSPermissionsTestSuite(testName string) (*SNSPermissionsTestSuite, error) {
	defaultTest, err := NewDefaultTestSuite(testName)
	if err != nil {
		return nil, err
	}
	return &SNSPermissionsTestSuite{
		Default: defaultTest,
	}, nil
}

// Setup just defers to the default setup procedure
func (t *SNSPermissionsTestSuite) Setup(log *logger.Logger) error {
	return t.Default.Setup(log)
}

// Test uses the default tests as a base, with an additional test added
func (t *SNSPermissionsTestSuite) Test(log *logger.Logger) {
	t.Default.Test(log)
	// do some extra tests (myCustomSNSPermissionCheck is not shown here;
	// it should record failures via t.Error)
	t.myCustomSNSPermissionCheck(log)
}

// Clean defers to the default cleaning procedure
func (t *SNSPermissionsTestSuite) Clean(log *logger.Logger) error {
	return t.Default.Clean(log)
}

// Errors reports errors after Test() is called
func (t *SNSPermissionsTestSuite) Errors() []error {
	if t.errors != nil {
		return append(t.errors, t.Default.Errors()...)
	}
	return t.Default.Errors()
}

// JobID defers to the job ID of the default test suite
func (t *SNSPermissionsTestSuite) JobID() string {
	return t.Default.JobID()
}

// TestName defers to the test name held by the default test suite
func (t *SNSPermissionsTestSuite) TestName() string {
	return t.Default.TestName()
}

// Error records an error found by this suite's own checks
func (t *SNSPermissionsTestSuite) Error(err error) {
	// append handles a nil slice, so no explicit initialization is needed
	t.errors = append(t.errors, err)
}
```

The above strategy can be expanded to deal with test suites that need additional setup steps (e.g. testing a flow where the converger must be run multiple times, as is the case when unsubscribing). In some cases it can also be used to set up infrastructure in an obsolete way, when backwards compatibility is to be tested. The core `test.Runner` interface allows for a sane default implementation, `DefaultTestSuite`, which can be wrapped and expanded upon to implement arbitrarily complex test scenarios.


## Test Runner Interface

`test.Runner` is an interface that outlines the base requirements for constructing an E2E test.

`Setup() error` should carry out all the base construction of the thing being tested. For low level resources, this means populating the controlplane database with the necessary records. For higher order tests, this means calling `Setup()` on the lower level resources, and then calling out to a one-off converger process (complex tests can do this many times for multi-converge tests).

`Test()` should run all checks associated with the `test.Runner`. For a low level resource, this means making sure that the converger properly created all expected infrastructure and associated permissions. For higher order `test.Runner` implementations, this means calling `Test()` on contained lower level resources and aggregating resulting errors.

`Errors()` should return all errors that were encountered during `Test()`. This means that `test.Runner` implementations should internally store all errors that occur during `Test()`.

`Clean()` should tear down all infrastructure associated with the test. Tests should not leak any infrastructure! For resources whose infrastructure consists strictly of permissions (e.g. `publications` only add permissions), no cleanup is necessary.
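
The lifecycle described above can be sketched as a small driver. This is a simplified stand-in, not the real `test.Runner` or `test.Run`: the `Runner` interface here omits the logger argument as well as `JobID()` and `TestName()`, and the `run` function and `fakeSuite` type are hypothetical names. The ordering shown (always attempting `Clean()` once `Setup()` has been called) is one reasonable choice and may differ from what the actual runner does.

```go
package main

import (
	"errors"
	"fmt"
)

// Runner is a simplified stand-in for test.Runner.
type Runner interface {
	Setup() error
	Test()
	Errors() []error
	Clean() error
}

// run drives one Runner through Setup, Test, and Clean and returns every
// error encountered. Clean is attempted even when Setup or the checks fail,
// so that a broken run does not leak infrastructure.
func run(r Runner) []error {
	var errs []error
	if err := r.Setup(); err != nil {
		errs = append(errs, fmt.Errorf("setup: %w", err))
	} else {
		r.Test()
		errs = append(errs, r.Errors()...)
	}
	if err := r.Clean(); err != nil {
		errs = append(errs, fmt.Errorf("clean: %w", err))
	}
	return errs
}

// fakeSuite is a toy Runner used only to exercise run.
type fakeSuite struct {
	failCheck bool
	cleaned   bool
	errs      []error
}

func (f *fakeSuite) Setup() error { return nil }

// Test stores failures internally instead of returning them, matching the
// contract that Errors() reports what happened during Test().
func (f *fakeSuite) Test() {
	if f.failCheck {
		f.errs = append(f.errs, errors.New("expected SNS topic not found"))
	}
}

func (f *fakeSuite) Errors() []error { return f.errs }

func (f *fakeSuite) Clean() error {
	f.cleaned = true
	return nil
}

func main() {
	s := &fakeSuite{failCheck: true}
	for _, err := range run(s) {
		fmt.Println("error:", err)
	}
	fmt.Println("cleaned:", s.cleaned)
}
```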

## Supported Seed Schema

Each seed file should be a JSON array at the top level.

### Event Definitions
#### `event_definitions.json`
```
[
	{
		"event_type": <string>
	},
	...
]
```

#### `DefaultTestSuite`

- `Setup()` - Adds one EventStream to the DB per supported environment (production, staging, development)
- `Test()` - Checks topic existence per environment, checks S3 object containing SNS topic ARN in JSON format
- `Clean()` - Removes SNS topic and S3 object

### Services
#### `services.json`
```
[
	{
		"service_catalog_url": <string>,
		"ldap_group": <string>,
		"description": <string>,
		"accounts": [
			{
				"id": <string>,
				"label": <string>
			}
		]
	}
]
```

#### `DefaultTestSuite`
- `Setup()` - Creates the service in the DB
- `Test()` - N/A
- `Clean()` - N/A

### Subscription Targets
#### `subscription_targets.json`
```
[
    {
        "name": <string>,
        "aws_account_id": <string>,
        "service_catalog_id": <int>
    }
]
```

#### `DefaultTestSuite`
- `Setup()` - Adds the subscription target to the DB
- `Test()` - Checks SQS queue existence
- `Clean()` - Removes the SQS target

### Subscriptions
#### `subscriptions.json`
```
[
	{
		"target_name": <string>,
		"environment": <string>,
		"event_type": <string>,
		"service_catalog_id": <int>
	}
]
```
#### `DefaultTestSuite`
- `Setup()` - Adds subscription to DB
- `Test()` - Checks SNS+SQS subscription exists (ARN check)
- `Clean()` - Unsubscribes the queue from the topic

### Publications
#### `publications.json`
```
[
    {
        "event_type": <string>,
        "environment": <string>,
        "service_catalog_id": <string>
    },
    ...
]
```
#### `DefaultTestSuite`
- `Setup()` - Adds the publication to the DB, fetches service AWS account IDs for `Test()` phase
- `Test()` - Checks that service's account IDs from `Setup()` exist in the SNS topic policy for publish permissions
- `Clean()` - N/A
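
The publish-permission check in `Test()` could look roughly like the sketch below, which inspects a topic policy document for an `Allow` statement granting `sns:Publish` to a given account. Everything here is an assumption for illustration: the function and type names are hypothetical, the policy is a hard-coded sample rather than one fetched from SNS, and the matching is deliberately simplified (it handles only the object form of `Principal` with an `AWS` key, and does not evaluate wildcards, conditions, or `Deny` statements).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// stringOrList decodes a JSON value that is either a single string or a
// list of strings; policy documents allow both for Action and Principal.AWS.
type stringOrList []string

func (s *stringOrList) UnmarshalJSON(b []byte) error {
	var one string
	if err := json.Unmarshal(b, &one); err == nil {
		*s = []string{one}
		return nil
	}
	var many []string
	if err := json.Unmarshal(b, &many); err != nil {
		return err
	}
	*s = many
	return nil
}

type statement struct {
	Effect    string `json:"Effect"`
	Principal struct {
		AWS stringOrList `json:"AWS"`
	} `json:"Principal"`
	Action stringOrList `json:"Action"`
}

type policy struct {
	Statement []statement `json:"Statement"`
}

// samplePolicy is illustrative data, not a policy produced by the converger.
const samplePolicy = `{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam::567312298267:root"]},
    "Action": "sns:Publish",
    "Resource": "arn:aws:sns:us-west-2:111111111111:example-topic"
  }]
}`

// canPublish reports whether some Allow statement lists sns:Publish as an
// action and names the given account ID as an AWS principal.
func canPublish(policyJSON, accountID string) (bool, error) {
	var p policy
	if err := json.Unmarshal([]byte(policyJSON), &p); err != nil {
		return false, fmt.Errorf("decoding topic policy: %w", err)
	}
	for _, st := range p.Statement {
		if st.Effect != "Allow" || !hasAction(st.Action, "sns:Publish") {
			continue
		}
		for _, principal := range st.Principal.AWS {
			// A principal may be a bare account ID or an ARN that embeds it.
			if principal == accountID || strings.Contains(principal, "::"+accountID+":") {
				return true, nil
			}
		}
	}
	return false, nil
}

// hasAction does a case-insensitive exact match; wildcards are not expanded.
func hasAction(actions []string, want string) bool {
	for _, a := range actions {
		if strings.EqualFold(a, want) {
			return true
		}
	}
	return false
}

func main() {
	ok, err := canPublish(samplePolicy, "567312298267")
	if err != nil {
		panic(err)
	}
	fmt.Println("publisher allowed:", ok)
}
```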


## Job ID and Parallel Runs

Every test suite should be run with a JobID (the `JobID() string` method is required to meet the `test.Runner` interface). These JobIDs are used to namespace test runs, so any given E2E test can run in parallel with itself or with other E2E test suites. `DefaultTestSuite` generates a JobID using the internal `uuid` package. Test suites that wrap and extend `DefaultTestSuite` functionality should call `JobID()` of the contained `DefaultTestSuite` to get and reuse the same JobID within the test suite run.
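
A minimal sketch of the namespacing idea follows. It is not the suite's implementation: `crypto/rand` stands in for the internal `uuid` package, and the `newJobID` and `namespaced` helpers, along with the `<jobID>-<name>` naming scheme, are hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newJobID returns a random 16-character hex identifier. The real suite
// uses an internal uuid package; crypto/rand is a stand-in here.
func newJobID() (string, error) {
	buf := make([]byte, 8)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generating job ID: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

// namespaced prefixes a resource name with the job ID so that two runs of
// the same suite do not collide on resource names (hypothetical scheme).
func namespaced(jobID, name string) string {
	return jobID + "-" + name
}

func main() {
	first, err := newJobID()
	if err != nil {
		panic(err)
	}
	second, err := newJobID()
	if err != nil {
		panic(err)
	}
	// Two parallel runs of the same suite get distinct resource names.
	fmt.Println(namespaced(first, "MyE2ETestEvent"))
	fmt.Println(namespaced(second, "MyE2ETestEvent"))
}
```

A wrapping suite would reuse the ID from its contained `DefaultTestSuite` rather than generating a second one, so that every resource in a run shares a single namespace.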

## TODO
- Rework seed loading so not every resource type must have a JSON file present
- Unify service catalog seed data fields
   - Only use `"service_catalog_id": <string>` instead of `"service_catalog_url": <string>`, and don't use `int`
- Propagate errors better, such that error output at the end of the E2E run provides context about _where_ the error happened
- Default subscription check should include policy checks

