Introduction
Databases are growing in both size and complexity to meet the increasing demands of business, and so are the applications that process their data. As complexity grows, solid testing becomes more and more important for assuring software quality. Ideally we would like to test all changes against up-to-date production data, so the general practice is to use a copy of the production database for all testing.
The Problem
When a database exceeds a certain size, it becomes very expensive to provide full-size copies of the production database for development and testing. One solution to this problem is to have fewer full-size copies than are really needed, often only one, which is then shared between the development and testing teams.
Of course this is far from optimal. Data in the database is left in an unknown state when passed from one team to the other. It takes a long time to provide a refresh of the production copy when it’s required. Always having an up-to-date production copy is almost impossible.
A Solution
The databases required for development and testing rarely need to be full size; it is often easier to work on a small copy. Unfortunately, it is very hard to extract a small subset of the production data manually. It is not possible to just take 10% of each table to get a database of 10% size: the rows taken from one table would not match the rows taken from the other tables, so foreign keys would point to rows that are missing from the copy. The result would not be referentially intact.
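The difference between the two approaches can be seen in a small experiment. The following is only a minimal sketch using Python's built-in sqlite3 module. It is not how Jailer works internally, and the tables customer and orders, their columns, and the helper orphan_count are hypothetical names made up for illustration. It takes 10 rows from each of two related tables independently, then builds a second subset by starting from the same 10 orders and following the foreign key to their parent rows.

```python
import sqlite3

# Hypothetical two-table schema: every order references a customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id)
    );
""")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(i, f"customer-{i}") for i in range(1, 101)],
)
# Order i belongs to customer 101 - i, so low order ids point at high customer ids.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, 101 - i) for i in range(1, 101)],
)


def orphan_count(customer_ids, order_rows):
    """Count orders whose customer is missing from the extracted customers."""
    return sum(1 for _, customer_id in order_rows if customer_id not in customer_ids)


# Naive subset: take 10% of each table independently.
naive_customers = {
    row[0] for row in conn.execute("SELECT id FROM customer ORDER BY id LIMIT 10")
}
naive_orders = conn.execute(
    "SELECT id, customer_id FROM orders ORDER BY id LIMIT 10"
).fetchall()
naive_orphans = orphan_count(naive_customers, naive_orders)

# Reference-following subset: start from the same orders, then fetch their parents.
subset_orders = naive_orders
subset_customers = {
    row[0]
    for row in conn.execute(
        "SELECT id FROM customer WHERE id IN "
        "(SELECT customer_id FROM orders WHERE id <= 10)"
    )
}
subset_orphans = orphan_count(subset_customers, subset_orders)

print("orphaned orders, naive subset:", naive_orphans)
print("orphaned orders, reference-following subset:", subset_orphans)
```

In this contrived data set every order in the naive subset is orphaned, while the second subset has none. A real schema has many tables and long chains of dependencies, which is what makes doing this by hand so laborious.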
Jailer simplifies the extraction of referentially intact data. Once you have defined an extraction model, you can use it to extract data from the production database quickly and easily whenever up-to-date test data is required.
The open-source tool Jailer is available here.