I've never seen a DB with so-called mock data in it actually not get mangled and abused by other people.
At a previous job we avoided that issue by having the setup step of the test suite create and initialize a fresh DB for running the unit tests with each execution. Which was great for avoiding DB abuse, but meant the startup time before running any tests was kinda blah.
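For anyone who hasn't seen that pattern, here is a minimal sketch of the idea in Python, assuming an in-memory SQLite database stands in for the real one. The `customers` table, the seed rows and the `create_fresh_db` helper are all made up for illustration. This version rebuilds the DB for every test; building it once per run (for example in `setUpClass`) trades some isolation for a shorter startup.

```python
import sqlite3
import unittest

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
"""


def create_fresh_db():
    """Build a brand-new in-memory database with the schema and seed rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO customers (name) VALUES (?)",
        [("Alice",), ("Bob",)],
    )
    conn.commit()
    return conn


class CustomerTests(unittest.TestCase):
    def setUp(self):
        # Every test gets its own database, so no test sees another's changes.
        self.conn = create_fresh_db()

    def tearDown(self):
        self.conn.close()

    def test_seed_rows_present(self):
        count = self.conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
        self.assertEqual(count, 2)

    def test_delete_does_not_leak(self):
        # Wiping the table here cannot affect test_seed_rows_present.
        self.conn.execute("DELETE FROM customers")
        count = self.conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
        self.assertEqual(count, 0)
```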
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason?
Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful?
Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies.
-- Sarah Hoyt
The short answer is that ideally you need both unit tests and integration tests.
The unit tests will affirm that a dev has not broken any specific methods within the system and the integration test will affirm the general 'health' of a system.
With unit tests it's sometimes necessary to mock data/functions: if you are testing just one aspect of the system, you may need to provide its inputs via mocking.
The reason unit tests are useful is that they let a dev know immediately if they have broken something - integration tests can take a long time to run and they are much harder to write.
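As a rough illustration of providing inputs via mocking, here is a small Python sketch using the standard library's `unittest.mock`. The `order_total` function and the `lines_for` method are hypothetical names invented for the example.

```python
from unittest.mock import Mock


def order_total(order_id, price_lookup):
    """Sum the price of every line on an order, using an injected lookup."""
    lines = price_lookup.lines_for(order_id)
    return sum(qty * unit_price for qty, unit_price in lines)


# The real lookup would hit a database; the mock supplies canned input instead.
fake_lookup = Mock()
fake_lookup.lines_for.return_value = [(2, 5.0), (1, 2.5)]

total = order_total(42, fake_lookup)

# Besides checking the result, the mock can confirm how it was called.
fake_lookup.lines_for.assert_called_once_with(42)
```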
“That which can be asserted without evidence, can be dismissed without evidence.”
We draw a distinction in style of testing, as in we would write tests in an integration style but not as far as a database. We have a common data access layer and mock out data returned with test data. Code above the data layer is all tested with concretes, so our tests do end up with lots of data setup (but using Builders/Mothers/Factories we can share common data providers across many tests). The tests all run from the highest level possible and pass through all units - we have abstracted away the http/message infrastructure so our starting units are after the "command" reaches the domain.
In the past we would test each unit individually then moved over to a more concrete implementation testing.
Our conclusion is:
1. We have far fewer tests to maintain
2. Our tests are more resilient to change - e.g. when a dependency change/refactoring has no material effect on the expected output, it doesn't result in 45 tests needing to be changed, just an update to the common test data (our tests focus on behaviours being met).
3. We can test integration of units quicker - e.g. using strategy/command patterns and only testing units that are mocked has bitten us badly, so testing full concrete implementations ensured correctness.
Where we deviate from the above pattern is where our tests need to test multiple paths in a specific unit (i.e. when the result is null given 3 out of 6 inputs, constructor testing, builder testing, etc.) and the cost of testing this from the upper level would have been infeasible. Then we hit that unit directly with tests.
With this approach we have seen our developers making better testing decisions and code evolving better as they are not swamped with updating tests just because they need to do a refactoring for new functionality.
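A rough Python sketch of that shape, assuming a hypothetical `Customer` record, a builder with defaults, and an in-memory stand-in for the data access layer. Every name here is invented for illustration and is not the actual code base being described.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Customer:
    name: str = "Test Customer"
    country: str = "GB"
    credit_limit: float = 1000.0


class CustomerBuilder:
    """Test data builder: sensible defaults, override only what a test cares about."""

    def __init__(self):
        self._customer = Customer()

    def with_credit_limit(self, limit):
        self._customer = replace(self._customer, credit_limit=limit)
        return self

    def build(self):
        return self._customer


class InMemoryCustomerStore:
    """Stands in for the data access layer; everything above it is real code."""

    def __init__(self, customers):
        self._customers = dict(customers)

    def get(self, customer_id):
        return self._customers[customer_id]


class OrderService:
    """Concrete domain code under test, wired to the fake store."""

    def __init__(self, store):
        self._store = store

    def can_place_order(self, customer_id, amount):
        return amount <= self._store.get(customer_id).credit_limit


store = InMemoryCustomerStore({1: CustomerBuilder().with_credit_limit(50.0).build()})
service = OrderService(store)
```

If `Customer` later gains a field, only the dataclass default changes; tests that never mention that field stay untouched.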
How do you define mock data? We are not allowed to use real-life customer data in our tests. But we sometimes use data from a test system, for example when it is too complex to craft ourselves. However, this has nothing to do with where you store it: hard-coded in your test, in a file, or in a database. For me these are two different aspects.
I suspect anybody who's done a significant amount of automated testing will have experienced the frustration of spending more time maintaining the tests than the code itself. Too many, or too complicated tests can become a burden, so for me it's always a tradeoff between coverage and simplicity.
I've found that testing the full stack with a test DB (what I would call end-to-end tests) gives great test coverage - without these, it's quite possible to have lots of passing unit tests but a system that doesn't actually work when put together. However, maintaining the schema and data in the test DB is an overhead, and worse, end-to-end tests can be brittle and very hard to debug when you get a failure.
On the other hand, I've found that with unit or integration tests, getting a useful test often requires quite a bit of mocking, which can quickly get quite complex, and lead to tests that can be... brittle and hard to debug when you get a failure.
Supporting end-to-end tests with good unit/integration tests gives the best of both worlds, but leads to lots of tests and lots of complexity, all of which requires maintenance.
I'm curious: You mentioned that you have preferences, and as you've experienced both strategies in several places, I'd be really interested to know what these are?
Design so you can mock the outside world as far as you can (database, filesystem, the system clock, even), but have some level of integration testing so that you verify the connections between your system and the outside world work.
I've done the 'use a test database' thing as well in the past, and that really wasn't a good thing...
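To make the system clock part concrete, here is a minimal Python sketch, assuming the code under test accepts the clock as a parameter. `is_expired` and `frozen_clock` are hypothetical names for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_expired(issued_at, ttl, now=None):
    """Return True once ttl has elapsed since issued_at.

    `now` is a callable returning the current time. Production code leaves it
    unset and gets the real clock; tests pass a fixed one.
    """
    now = now or (lambda: datetime.now(timezone.utc))
    return now() - issued_at >= ttl


issued = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)


def frozen_clock():
    # Always exactly 30 minutes after `issued`.
    return datetime(2020, 1, 1, 12, 30, tzinfo=timezone.utc)
```

With the clock injected, the boundary case (exactly 30 minutes) can be tested without sleeping or patching globals.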
Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p
We have a really complex database (uses four databases, with almost 1000 stored procs and hundreds of tables).
We've started redesigning it (and the app that uses it).
Due to the high complexity of the database itself, we will create a staging database that we can pull data from which will exercise specific use/edge cases, and when we start a test run, it will copy that staged data into the actual database (in our dev environment of course), and that way, the stored procs in the actual database(s) can be used without any special "testing sql" in the database itself, short of some stored procs that verify certain data is present.
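The copy step could be sketched like this in Python. This is only an illustration of the idea, assuming SQLite with an attached second database in place of the real multi-database setup; the `orders` table and the `load_staged_data` helper are hypothetical.

```python
import sqlite3

# "main" plays the role of the actual dev database,
# the attached "staging" schema plays the staging database.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS staging")

conn.execute("CREATE TABLE main.orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE staging.orders (id INTEGER PRIMARY KEY, status TEXT)")

# Staged rows chosen to exercise specific use/edge cases (e.g. a NULL status).
conn.executemany(
    "INSERT INTO staging.orders VALUES (?, ?)",
    [(1, "open"), (2, "cancelled"), (3, None)],
)


def load_staged_data(conn, tables):
    """Reset each target table and refill it from its staging twin."""
    with conn:  # one transaction: commit on success, roll back on error
        for table in tables:
            # Table names come from a trusted, hard-coded list, never user input.
            conn.execute(f"DELETE FROM main.{table}")
            conn.execute(f"INSERT INTO main.{table} SELECT * FROM staging.{table}")


load_staged_data(conn, ["orders"])
```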
".45 ACP - because shooting twice is just silly" - JSOP, 2010 ----- You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010 ----- When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013
By definition unit tests should be small, focused, fast and should not use external files, databases, rest.. anything like that. If you need to hit the file system create an interface and mock it. Unit tests can run on any machine without configurations, files, database connections .. They can run on DevOps or your choice of cloud with no extra configs. They just work.
Integration tests on the other hand don't have that restriction. For me it's basically a sandbox to make sure I can run that tracer thru the code correctly. Maybe figure out how to do something, benchmarks, etc. They are not run during CI/CD builds. I mark them as ignored just to be sure.
Also.. I keep them in separate assemblies just to be sure of no leaks.
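A small Python sketch of the "create an interface and mock it" idea for the file system. `FileReader`, `DiskFileReader`, `FakeFileReader` and `count_non_blank_lines` are all hypothetical names, chosen just for the example.

```python
from typing import Protocol


class FileReader(Protocol):
    """The interface the code under test depends on."""

    def read_text(self, path: str) -> str: ...


class DiskFileReader:
    """Production implementation that really touches the disk."""

    def read_text(self, path: str) -> str:
        with open(path, encoding="utf-8") as handle:
            return handle.read()


class FakeFileReader:
    """Test double backed by a dict, so no file ever has to exist."""

    def __init__(self, files: dict):
        self._files = files

    def read_text(self, path: str) -> str:
        return self._files[path]


def count_non_blank_lines(reader: FileReader, path: str) -> int:
    """Logic under test: it only knows about the interface."""
    return sum(1 for line in reader.read_text(path).splitlines() if line.strip())


fake = FakeFileReader({"settings.ini": "a=1\n\nb=2\n"})
```

Because the test only ever sees `FakeFileReader`, it runs on any machine with no files or configuration in place.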
~"Watch your thoughts; they become your words. Watch your words they become your actions.
Watch your actions; they become your habits. Watch your habits; they become your character.
Watch your character; it becomes your destiny."
Largely covered, but ( note, not my area, so I likely have some terms wrong ) mocked data and requests let you test for corrupted data, "little Johnny Tables", spaces where they don't belong, stupidly long inputs...
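Something like the following Python sketch is one way to throw those inputs at a lookup, assuming an in-memory SQLite table and a hypothetical `find_student` function that uses bound parameters.

```python
import sqlite3


def find_student(conn, name):
    """Look a student up by name using a bound parameter, never string concatenation."""
    return conn.execute(
        "SELECT id, name FROM students WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students (name) VALUES ('Alice')")

hostile_inputs = [
    "Robert'); DROP TABLE students;--",  # the "little Johnny Tables" case
    "  Alice  ",                          # spaces where they don't belong
    "x" * 100_000,                        # stupidly long input
    "",                                   # empty string
]

for value in hostile_inputs:
    # None of these should match a row, raise, or damage the table.
    assert find_student(conn, value) == []

# The table and its one legitimate row must still be there afterwards.
remaining = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```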
In my stuff, I first try to get it working so I can use it, then try to break it, then watch the operators and production parts to see what happens. ( Pro tip, while checking to make sure you can't drop a transmission ( gearbox ), have several cardboard boxes there to catch it. )
I understand the theory and conventional wisdom about unit testing and integration testing. But I don't just go by what the "experts" say or where the herd goes. I apply value engineering and a long-term support consideration to implement those "conventional wisdom" principles in a way that maximizes benefit today and tomorrow on a project by project basis.
When I use mock data:
1 - early in the development process before a test database and any external services (for test mode) are available. In some cases, those test sources are not available, at least to developers.
2 - when the test database is unmaintained, or difficult to update (e.g. developers have to go through the DBA's team to do anything to a DB)
3 - when the project is so large, or so distributed among development teams that mock data is necessary for the benefit of other teams.
When I use test sources and NOT mock data:
1 - when available and easily updateable by developers (e.g. I never hire a developer who cannot at least create a T-SQL table with indices and stored procedures).
2 - when PII (Personally Identifiable Information) can be auto-generated to fill DB tables. Not hard to do, but creating such utility apps is helpful and can avoid lawsuits should the company be audited or test data leaks out.
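As a rough idea of what such a utility does, here is a minimal Python sketch. It is an illustration under assumed field names (`name`, `email`, `phone`), not the utility described above; real tables would need generators matched to their own columns.

```python
import random

FIRST_NAMES = ["Alex", "Sam", "Jordan", "Casey", "Morgan"]
LAST_NAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson"]


def fake_people(count, seed=0):
    """Generate deterministic, obviously fake person records for a test DB."""
    rng = random.Random(seed)  # seeded so every run produces the same rows
    people = []
    for i in range(count):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        people.append({
            "id": i + 1,
            "name": f"{first} {last}",
            # example.com is a reserved domain, so these addresses belong to nobody.
            "email": f"{first}.{last}.{i + 1}@example.com".lower(),
            # 555-0100 to 555-0199 is the range set aside for fictional US numbers.
            "phone": f"555-01{rng.randrange(100):02d}",
        })
    return people
```

The rows can then be bulk-inserted into the test tables in place of anything copied from production.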
When I use test sources AND mock data:
1 - Same reasons as above to use test sources - as available.
2 - I use mock data for those connection points where test sources are not reliably available.
In most situations, I use test sources only because I do not need mock data. As part of the initial design, the UI layout is created and approved at some point. As an offshoot, the DB design (tables, indices, PK-FK relationships) is done in parallel. I use my utility to generate the SPs for POCO CRUD, the POCO classes in C#, and the factory/manager classes in C# (I intentionally choose NOT to use the Entity Framework for a host of reasons not a part of this context). Once the UI and DB are relatively stable quantities, then I can work on the controllers and views for the WebAPI (I generally support both Blazor and Xamarin.Forms clients) using the generated code. Thus, I can write my unit tests to use actual sources.
I prefer to unit test the "stack" from WebAPI down to the source and back. Because I use an IDE that supports easy and thorough debugging (Visual Studio 2019 as of this writing), it is as easy for me to debug and fix a given unit with test sources as it is with mock data. By top-down testing throughout the "stack", I can know for certain that not only do all the parts work (in various testing case scenarios), but the integrated system works. And I only have to write one test project. What test database or service is used is configured (via transform in the project file) based on the environment chosen (typically dev, qa, uat, prod).
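The per-environment selection can be sketched outside of project-file transforms too. This Python version is only an analogy: the `APP_ENV` variable, the connection strings and `resolve_test_source` are assumptions made up for the example, not the actual configuration.

```python
import os

# Hypothetical connection strings; real ones would come from each environment's config.
TEST_SOURCES = {
    "dev": "sqlite:///dev_test.db",
    "qa": "sqlite:///qa_test.db",
    "uat": "sqlite:///uat_test.db",
    "prod": "sqlite:///prod_test.db",
}


def resolve_test_source(environment=None):
    """Pick the test source for the given environment.

    Falls back to the APP_ENV environment variable (an assumed name),
    and to "dev" when nothing is set.
    """
    env = (environment or os.environ.get("APP_ENV", "dev")).lower()
    try:
        return TEST_SOURCES[env]
    except KeyError:
        raise ValueError(f"No test source configured for environment {env!r}") from None
```

The same test project then runs unchanged against whichever source the chosen environment resolves to.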
However, that is my approach. I find added value in this approach that I do not achieve with mock data alone. I have used this approach in simple and complex systems, with and without CI/CD. I have used this in systems with multiple databases/sources. It reduces testing complexity and coding time while finding issues not found with mock data alone.
Whatever approach works for you is probably your best choice. I appreciate the insights given by those who took the time to post.
I agree with what you said. I ran into that at two employers with PII. The utilities to create fake data for test and QA allowed me to include test databases instead of mock data. I guess, in a way, that is technically "mock" data, just without using a Mock functionality in testing.
Don't know if that one will razor lower status, but I'm certain we can't brush it off.
"the debugger doesn't tell me anything because this code compiles just fine" - random QA comment
"Facebook is where you tell lies to your friends. Twitter is where you tell the truth to strangers." - chriselst
"I don't drink any more... then again, I don't drink any less." - Mike Mullikins uncle
I go to a "hipster" barber in Europe, and they always seem to put stuff in a few areas of the hair to make it go in a certain direction. It's pretty cool until I let it grow out by not getting the next cut for 6-8 months.
Nope, I can tell you, and be reasonably accurate about it: Friday, 10 March 1989, at 21:30 (give or take ten minutes). Location: 51.097775, -0.849655
My head was shaved for the second Comic Relief Day: the first raised £11 at my local pub, so I (foolishly and drunkenly) said that if they raised £500 for the second one, they could shave my head. The landlord chipped in the last £20 to get it over, the b*st*ard!
So on a busy Friday night the band was stopped, and my hair, beard and even eyebrows went.
Mind you ... I got a lot of free beer that night ...
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!