Instant Datasets
This guide describes the purpose of using instant datasets in your applications and how to set them up in Release.
One of the most commonly encountered challenges when developing software is access to data. With seed data, you can generate a simple and consistent dataset, but it is normally small and not fully reflective of what your application accumulates in production. As your application changes, the seed data also becomes one more thing to maintain. A better solution is a pool of resources (for example, data in a database, cache, or search infrastructure) that is ready to be used immediately by an environment for the life of that environment. This is exactly what instant datasets do for you and your teammates.
Release currently supports this feature for databases on AWS (RDS). We will be expanding this offering with additional supported platforms and services in the future. With just a small amount of setup, you can have production replica data available to any of your environments, instantly, regardless of dataset size.

Why use Instant Datasets with Release?

RDS is a great technology that we have used at Release from the beginning to power our production application. It automatically takes daily snapshots of your application databases and makes them readily accessible. One of the major drawbacks to using RDS for environments other than production is the long spin-up time for any particular database, which makes it less than ideal for ephemeral and staging environments. Instant Datasets solve both this problem and the challenge of giving your staging environments production-like data.
An Instant Dataset is a collection of databases that are ready to be used by an ephemeral environment. The databases are created from the production snapshots that RDS already takes for you. Each time an environment is created that requires an instant dataset, another database is generated from the production snapshot so that you never run out of databases or have to wait for one to spin up. This process gives any environment you create instant access to production-like data.
High-level diagram of using Instant Datasets

Setting up an Instant Dataset in Release

To use Instant Datasets in Release, we need a snapshot to restore from, which means you must have automatic snapshots set up. Please refer to this document to set up snapshots in RDS.
Each Instant Dataset is limited to a single account.

Creating an Instant Dataset

Log in to Release and click on the account settings button in the top right corner. Then open the Datasets tab on the left and click Create Dataset.
Instant Dataset creation prompt
  • Name: Anything you like to help you remember what this dataset contains.
  • Cluster: Instant Datasets must be assigned to a cluster. For most people this will be the default cluster. This cluster must have access to the snapshot.
  • Instance Size: This is the database instance in RDS. Refer to this document for additional information.
  • Snapshot: These are the snapshots available and accessible by the cluster you currently selected.
  • Database Password: This is what an application will use to connect to the underlying database in the instant dataset.
  • Number of available Instant Databases: This is the target number of databases in the Instant Dataset pool that are free to check out at any time. Each time an application environment is created that uses one of the instances, the system starts creating another database in the set. The ideal size is the number of "spare" instances you need to keep up with environment creation. For example, if you create 2 environments within a few minutes of each other, before the pool is replenished, you would likely want at least 2 instances available at any given time.
If you ever find that environments are delayed or fail to be created during Instant Dataset checkouts, you will need to increase the value so that this does not happen. If you find that you have a lot of database instances sitting idle all day, you can try reducing the number (with care to make sure you don't make the number too small).
We generally recommend starting with a value of 2 if you are unsure, because you can increase it later. We do not recommend setting the pool size to 1 unless you are sure that you will not need a database instance for a new environment while a replacement is still being generated.
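The sizing advice above can be explored with a small simulation. This is only a sketch under simplifying assumptions, not a description of how Release schedules replacements: it assumes every checkout immediately starts one replacement that becomes available after a fixed `replenish_minutes`, and that an environment arriving at an empty pool takes the next replacement to finish. The function names and the 30-minute figure are hypothetical.

```python
import heapq


def checkouts_that_wait(creation_times, pool_size, replenish_minutes):
    """Count environment creations that find the pool empty (simplified model)."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    available = pool_size
    pending = []  # min-heap of times at which replacement databases become ready
    waits = 0
    for t in sorted(creation_times):
        # Replacements that finished before this checkout rejoin the pool.
        while pending and pending[0] <= t:
            heapq.heappop(pending)
            available += 1
        if available > 0:
            # Normal checkout: take a database and start a replacement.
            available -= 1
            heapq.heappush(pending, t + replenish_minutes)
        else:
            # Pool is empty: the environment waits for the next replacement,
            # and taking it starts yet another replacement.
            waits += 1
            ready_at = heapq.heappop(pending)
            heapq.heappush(pending, ready_at + replenish_minutes)
    return waits


def min_pool_size(creation_times, replenish_minutes):
    """Smallest pool size for which no environment creation has to wait."""
    for size in range(1, len(creation_times) + 1):
        if checkouts_that_wait(creation_times, size, replenish_minutes) == 0:
            return size
    return 1  # no environments were created, so any size works


# Two environments created 2 minutes apart, assuming a 30-minute replenish time.
print(min_pool_size([0, 2], replenish_minutes=30))
```

Feeding in a day's worth of real environment creation times (in minutes) and a measured replenish time gives a rough starting point for the pool size, which you can then adjust based on what you observe.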
Click Create to begin initializing the database instances. How long this takes depends on the size of the database: expect at least a few minutes, and possibly hours.
RDS takes around 5-6 minutes to create the database. After that, restoring the data from the snapshot takes additional time that depends on how large the snapshot is.
Once the dataset is ready to be used it will transition to an Active state and environments can now use this dataset when they are deployed.
List of Instant Datasets
Click on a dataset to see more details about it.
Instant Dataset details

Set up your application to use Instant Datasets

For each environment that you want to use Instant Datasets with, you will need to make an addition to the application's template and environment config.
If we didn't set up the Instant Dataset to work with every ephemeral environment through our default configuration, we can also add it explicitly to an environment we create. The syntax is the same whether Release creates the environment automatically or we do it manually.
Essentially, you will need to do the following:
In the application template, add a datasets line with a name property that has a value set as the name of your instant dataset.
    environment_templates:
    - name: ephemeral
      datasets:
      - name: release-prod-for-development
Linking an Instant Dataset to an ephemeral environment in the application's template config
In the environment config, create a mapping using the Generated Environment Variables section in the Instant Dataset details page.
    ---
    mapping:
      DATABASE_HOST: RELEASE_PROD_FOR_DEVELOPMENT_RDS_DB_POOL_HOST
      DATABASE_PASSWORD: RELEASE_PROD_FOR_DEVELOPMENT_RDS_DB_POOL_PASS
      DATABASE_USER: RELEASE_PROD_FOR_DEVELOPMENT_RDS_DB_POOL_USER
Your application's environment variables are on the left side of the colon, and the environment variables generated by Release are on the right.
Linking an Instant dataset to an ephemeral environment in the application's environment config
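As an illustration of what the mapping expresses, the following sketch resolves application variable names to the values of the generated variables. It is not Release's implementation. The naming pattern in `generated_variable_names` is inferred only from the single example in this guide (dataset name upper-cased, hyphens replaced by underscores, followed by `_RDS_DB_POOL_` and a suffix), so treat the Generated Environment Variables section of the dataset details page as the authoritative source. Both function names and all sample values are hypothetical.

```python
def generated_variable_names(dataset_name):
    """Guess the generated variable names for a dataset (pattern is an assumption)."""
    prefix = dataset_name.upper().replace("-", "_")
    return {
        suffix: f"{prefix}_RDS_DB_POOL_{suffix}"
        for suffix in ("HOST", "PASS", "USER")
    }


def resolve_mapping(mapping, generated_values):
    """Give each application variable the value of the generated variable it maps to."""
    return {
        app_var: generated_values[generated_var]
        for app_var, generated_var in mapping.items()
    }


names = generated_variable_names("release-prod-for-development")

# Same shape as the mapping in the environment config above.
mapping = {
    "DATABASE_HOST": names["HOST"],
    "DATABASE_PASSWORD": names["PASS"],
    "DATABASE_USER": names["USER"],
}

# Made-up values standing in for what a checked-out database would provide.
generated_values = {
    names["HOST"]: "db.example.internal",
    names["PASS"]: "example-password",
    names["USER"]: "example-user",
}

app_env = resolve_mapping(mapping, generated_values)
print(app_env["DATABASE_HOST"])
```

The application only ever reads its own variable names (the left side of the mapping), so pointing an environment at a different dataset should only require changing the right side.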

Deploy your application to use your dataset

Notice: These changes will only be propagated to newly created environments. You cannot apply a new Instant Dataset configuration retroactively to an existing environment. We are working on fixes to update this experience in the future.
Your application is now ready to use your Instant Dataset!
  • Whenever you deploy an ephemeral environment, it will check out one of the available (active) databases and use it for as long as the environment exists.
  • The Instant Dataset will then create an additional database to maintain the target available database count.

Modifying an existing Instant Dataset

Changing the target available size

Reducing the size will not destroy any databases currently in use by an environment.
To change the target size of an Instant Dataset, go to the Datasets tab in account settings and click on the edit button on the right (pencil icon) for the dataset that you want to modify.
List of Instant Datasets
Instant Dataset modification prompt

Changing the RDS Instance type

Changing the instance type will destroy ALL standby databases in the Instant Dataset. You might not be able to create a new environment during this window because there will be no database instances available to check out. You should only perform this step during off-hours, when you are unlikely to create new environments.
To change the instance type for an Instant Dataset, go to the Datasets tab in account settings and click on the edit button on the right (pencil icon) for the dataset that you want to modify.
List of Instant Datasets
Instant Dataset modification prompt
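The difference between the two modifications above can be pictured with a toy model that separates standby databases from databases in use. This is a sketch for intuition only, not Release's implementation. The class and field names are hypothetical, and two behaviors are assumptions that this guide does not state: that reducing the target trims surplus standby databases, and that databases already in use are left alone when the instance type changes.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetPool:
    """Toy model of an Instant Dataset pool (names are hypothetical)."""

    target_standby: int
    instance_type: str
    standby: list = field(default_factory=list)
    in_use: list = field(default_factory=list)

    def resize(self, new_target):
        """Change the target size; databases in use are never touched."""
        self.target_standby = new_target
        # Assumption: surplus standby databases are dropped when shrinking.
        del self.standby[new_target:]

    def change_instance_type(self, new_type):
        """Destroy all standby databases so they can be rebuilt on the new type."""
        destroyed = len(self.standby)
        self.standby.clear()
        self.instance_type = new_type
        # Until replacements are ready, there is nothing to check out.
        return destroyed


pool = DatasetPool(
    target_standby=3,
    instance_type="db.t3.medium",  # example RDS instance class
    standby=["db-a", "db-b", "db-c"],
    in_use=["db-x"],
)

pool.resize(1)
print(pool.standby, pool.in_use)

destroyed = pool.change_instance_type("db.t3.large")
print(destroyed, pool.standby, pool.in_use)
```

In this model the pool is empty right after `change_instance_type`, which is the window during which new environments may be unable to check out a database.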

Changing the Instant Dataset master password

To change the master password on an Instant Dataset, go to the Datasets tab in account settings, click on the dataset you want to modify, then click the Change Master Password button.
Instant Dataset details
Password change prompt

Conclusion

Instant Datasets give any of your environments on Release instant access to production-like data, regardless of its size or complexity. You can create multiple RDS datasets and add them to your application by linking them through the application template config and environment config. This allows you to have services using different datasets or a single service accessing multiple datasets. You can also modify an Instant Dataset by changing its size, instance type, and master password.