Please, no "mocking" of the question or questioner.
At several places where I have worked, there were those who made a common practice of using mock data in their unit tests, and those who made a practice of integrating a test database into their unit tests.
The former draw a clear distinction between unit testing and integrated testing. Where there are multi-tiered objects (e.g. controller, services, repositories, etc.) each level gets tested. The theory is that each object is tested independently, so however and by whatever it is used, it will succeed. Integrated testing, with an actual database, is the next step of testing.
The latter see what is being tested as a connected group of systems, so to them integrated testing is part of the process of unit testing. The theory seems to go: Test the outermost connection point (the exposed Web API method) which in turn tests the objects and methods down the stack, as well as the connections between them. That includes using a test database (and test services for 3rd party APIs) so that part of the connected system is tested. The end result they seek is that when the outermost connection point is used, it and everything below it has been tested.
I see a good reason to use either approach, depending on what I am testing. I do have preferences, but I am interested in what this community thinks about using mock data.
The end result they seek is that when the outermost connection point is used, it and everything below it has been tested.
That's a shortcut.
You mock so that each thing is tested individually so you get an immediate indication what in that stack is responsible for the fail. If you take the shortcut, you probably spend some extra time debugging when a fail eventually occurs.
Bastard Programmer from Hell
If you can't read my code, try converting it here
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
This is not an either-or situation.
A test that connects to a database is not a unit test by definition because it tests more than a unit.
Unit tests should use mock data.
It's quick, it's easy and you can think of all manner of weird mock data that may be hard to get in or out of a database.
So use unit tests to check whether A + B == C or if the IEmailService.SendMail is called for a (mock) customer who has "InvoiceByEmail" checked (and that it throws if the email address is empty) and that sort of stuff.
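A minimal sketch of the mail example, written in Python with unittest.mock rather than the .NET stack discussed here. The names Customer, send_invoice and send_mail are made up for illustration; only the idea (a mocked mail service, a mock customer with "InvoiceByEmail" checked, an error on an empty address) comes from the post.

```python
from dataclasses import dataclass
from unittest.mock import Mock


@dataclass
class Customer:
    # Hypothetical stand-in for a real customer record.
    name: str
    email: str
    invoice_by_email: bool


def send_invoice(customer, email_service):
    """Mail the invoice only to customers who opted in to e-mail invoicing."""
    if not customer.invoice_by_email:
        return
    if not customer.email:
        raise ValueError("customer has no e-mail address")
    email_service.send_mail(customer.email, "Your invoice")


def test_sends_mail_when_opted_in():
    service = Mock()  # no real mail server is involved
    send_invoice(Customer("Ann", "ann@example.com", True), service)
    service.send_mail.assert_called_once_with("ann@example.com", "Your invoice")


def test_skips_mail_when_not_opted_in():
    service = Mock()
    send_invoice(Customer("Bob", "bob@example.com", False), service)
    service.send_mail.assert_not_called()


def test_raises_on_empty_address():
    service = Mock()
    try:
        send_invoice(Customer("Cy", "", True), service)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    service.send_mail.assert_not_called()
```

If one of these fails, the only suspects are the test and send_invoice itself, which is the point being made.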
If something fails you know either your test or your logic is wrong, but never some third party component.
Use database and service connections for integration tests.
At this point your unit tests should've done their work and you can assume that the code is correct.
If something fails you know there's probably a problem with the connection and you can focus your efforts on finding the problem there.
I once used a library that could mock HTTP requests.
We used it to test whether the correct OData queries would be sent to Microsoft Dynamics CRM (including headers and everything).
It's like unit testing your integration
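Something along these lines, sketched in Python with the standard library only (not the library we used back then). The endpoint, entity and filter are illustrative assumptions; the test patches urllib.request.urlopen so nothing leaves the machine, then inspects the request that would have been sent.

```python
import urllib.request
from unittest.mock import patch
from urllib.parse import parse_qs, urlencode, urlsplit


def fetch_active_accounts(base_url):
    """Ask a (hypothetical) OData endpoint for the names of active accounts."""
    query = urlencode({"$select": "name", "$filter": "statecode eq 0"})
    request = urllib.request.Request(
        f"{base_url}/accounts?{query}",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def test_sends_expected_odata_query():
    with patch("urllib.request.urlopen") as fake_urlopen:
        # urlopen is used as a context manager, so configure __enter__ to
        # hand back a response object with a canned body.
        response = fake_urlopen.return_value.__enter__.return_value
        response.read.return_value = b'{"value": []}'
        body = fetch_active_accounts("https://crm.example.com/api")

    sent = fake_urlopen.call_args[0][0]  # the Request object passed to urlopen
    params = parse_qs(urlsplit(sent.full_url).query)
    assert params == {"$select": ["name"], "$filter": ["statecode eq 0"]}
    assert sent.get_header("Accept") == "application/json"
    assert body == b'{"value": []}'
```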
Mock data, because it's a lot easier to create edge cases, and I've never seen a DB with so-called mock data in it actually not get mangled and abused by other people.
The only exception is a weird middle ground, where you store your mock data in a table, not in code. Meaning, you don't rely on the real data structures, you just use the DB like a flat file to feed all your mock data test cases. Hope that makes sense. So yeah, a flat file would be a middle ground too.
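A rough sketch of that flat-file idea in Python: the cases live in CSV text (inlined here so the example is self-contained, but it could just as well be a file or a single table), and one loop feeds them to the code under test. parse_quantity and the column names are invented for the example.

```python
import csv
import io

# Each row is one test case: a raw input and the expected outcome.
CASES_CSV = """\
raw_quantity,expected
1,1
 7 ,7
0,0
-3,error
abc,error
,error
"""


def parse_quantity(raw):
    """Hypothetical code under test: parse a non-negative integer quantity."""
    value = int(raw.strip())  # raises ValueError for "abc" or ""
    if value < 0:
        raise ValueError("quantity must not be negative")
    return value


def run_cases(source):
    """Run every case in the CSV source and return the ones that failed."""
    failures = []
    for row in csv.DictReader(source):
        raw, expected = row["raw_quantity"], row["expected"]
        try:
            actual = str(parse_quantity(raw))
        except ValueError:
            actual = "error"
        if actual != expected:
            failures.append((raw, expected, actual))
    return failures


def test_all_cases_pass():
    assert run_cases(io.StringIO(CASES_CSV)) == []
```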
I've never seen a DB with so-called mock data in it actually not get mangled and abused by other people.
At a previous job we avoided that issue by having the setup step of the test suite create and initialize a fresh DB for running the unit tests with each execution. Which was great for avoiding DB abuse, but meant the startup time before running any tests was kinda blah.
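Roughly the shape of it, sketched in Python with an in-memory SQLite database standing in for the real server (schema and seed rows are invented). Because an in-memory DB is cheap, this version rebuilds it for every test rather than once per run, which sidesteps the slow startup we had.

```python
import sqlite3
import unittest

# Hypothetical schema and seed data for the example.
SCHEMA = """
CREATE TABLE customer (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT
);
"""
SEED_ROWS = [(1, "Ann", "ann@example.com"), (2, "Bob", None)]


def create_fresh_db():
    """Build a brand-new database containing only the known seed data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", SEED_ROWS)
    conn.commit()
    return conn


class CustomerQueryTests(unittest.TestCase):
    def setUp(self):
        # Every test starts from the same untouched data.
        self.conn = create_fresh_db()

    def tearDown(self):
        self.conn.close()

    def _count(self):
        return self.conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

    def test_seed_data_is_present(self):
        self.assertEqual(self._count(), 2)

    def test_delete_only_affects_this_test(self):
        self.conn.execute("DELETE FROM customer")
        self.assertEqual(self._count(), 0)
```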
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason?
Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful?
Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies.
-- Sarah Hoyt
The short answer is that ideally you need both unit tests and integration tests.
The unit tests will affirm that a dev has not broken any specific methods within the system and the integration test will affirm the general 'health' of a system.
With unit tests it's sometimes necessary to mock data/functions: if you are testing just one aspect of the system, you may need to provide its inputs via mocking.
The reason unit tests are useful is that they immediately let a dev know if they have broken something - integration tests can take a long time to run and they are much harder to write.
“That which can be asserted without evidence, can be dismissed without evidence.”
We draw a distinction in style of testing, as in we would write tests in an integration style but not as far as a database. We have a common data access layer and mock out the data it returns with test data. Code above the data layer is all tested with concretes, so our tests do end up with lots of data setup (but using Builders/Mothers/Factories we can extend common data providers to many tests). The tests all run from the highest level possible and pass through all units - we have abstracted away the http/message infrastructure so our starting units are after the "command" reaches the domain.
In the past we would test each unit individually then moved over to a more concrete implementation testing.
Our conclusion is:
1. We have far fewer tests to maintain
2. Our tests are more resilient to change - e.g. one dependency change/refactoring doesn't result in 45 tests needing to be changed, just an update to the common test data (our tests focus on behaviours being met), provided the change makes no material difference to the expected output.
3. We can test integration of units quicker - e.g. using strategy/command patterns and only testing units that are mocked has bitten us badly, so testing full concrete implementations ensured correctness.
Where we deviate from the above pattern is where our tests need to test multiple paths in a specific unit (i.e. when the result is null given 3 out of 6 inputs, constructor testing, builder testing, etc.) and the cost of testing this from the upper level would have been infeasible. Then we hit that unit directly with tests.
With this approach we have seen our developers making better testing decisions and code evolving better as they are not swamped with updating tests just because they need to do a refactoring for new functionality.
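To make the Builders/Mothers/Factories point concrete, here is a small sketch in Python (our real code looks nothing like this; Order, OrderMother, FakeOrderRepository and total_revenue are invented names). Only the data access layer is faked; everything above it runs as the concrete implementation, and the shared "mother" supplies the test data.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    # Defaults describe a "typical" order so tests only state what differs.
    customer: str = "Ann"
    quantity: int = 1
    unit_price: float = 10.0
    express: bool = False


class OrderMother:
    """Object Mother: one shared place that knows how to build test orders."""

    @staticmethod
    def typical():
        return Order()

    @staticmethod
    def bulk_express():
        return replace(Order(), quantity=100, express=True)


class FakeOrderRepository:
    """Stands in for the common data access layer; returns canned orders."""

    def __init__(self, orders):
        self._orders = list(orders)

    def all_orders(self):
        return list(self._orders)


def total_revenue(repository):
    """Concrete domain code under test, sitting above the data layer."""
    return sum(o.quantity * o.unit_price for o in repository.all_orders())


def test_total_revenue_over_mixed_orders():
    repo = FakeOrderRepository([OrderMother.typical(), OrderMother.bulk_express()])
    assert total_revenue(repo) == 1010.0
```

If Order gains a field, only the defaults and the mother change, not every test that needs an order.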
How do you define mock data? We are not allowed to use real-life customer data in our tests. But we sometimes use data from a test system, for example when it is too complex to craft ourselves. However, this has nothing to do with where you store it: hard-coded in your test, in a file, or in a database. For me these are two different aspects.
I suspect anybody who's done a significant amount of automated testing will have experienced the frustration of spending more time maintaining the tests than the code itself. Too many, or too complicated tests can become a burden, so for me it's always a tradeoff between coverage and simplicity.
I've found that testing the full stack with a test DB (what I would call end-to-end tests) gives great test coverage - without these, it's quite possible to have lots of passing unit tests but a system that doesn't actually work when put together. However, maintaining the schema and data in the test DB is an overhead, and worse, end-to-end tests can be brittle and very hard to debug when you get a failure.
On the other hand, I've found that getting a useful unit or integration test often requires quite a bit of mocking, which can quickly get quite complex, and lead to tests that can be... brittle and hard to debug when you get a failure.
Supporting end-to-end tests with good unit/integration tests gives the best of both worlds, but leads to lots of tests and lots of complexity, all of which requires maintenance.
I'm curious: You mentioned that you have preferences, and as you've experienced both strategies in several places, I'd be really interested to know what these are.
Design so you can mock the outside world as far as you can (database, filesystem, the system clock, even), but have some level of integration testing so that you verify the connections between your system and the outside world work.
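For the system clock, a minimal sketch in Python: the code under test takes the clock as a parameter (defaulting to the real one), so a test can pin "now" to a fixed instant. is_overdue and now_fn are names invented for the example.

```python
from datetime import datetime, timedelta


def is_overdue(due, now_fn=datetime.now):
    """Return True when the due date has passed, according to the given clock."""
    return now_fn() > due


def test_is_overdue_with_fixed_clock():
    fixed_now = datetime(2020, 1, 15, 12, 0, 0)

    def clock():
        # The "mocked" clock: always the same instant.
        return fixed_now

    assert is_overdue(fixed_now - timedelta(days=1), now_fn=clock)
    assert not is_overdue(fixed_now + timedelta(days=1), now_fn=clock)
    assert not is_overdue(fixed_now, now_fn=clock)  # exactly on time is not overdue
```

Production callers just call is_overdue(due) and get the real clock.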
I've done the 'use a test database' thing as well in the past, and that really wasn't a good thing...
Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p
We have a really complex database (uses four databases, with almost 1000 stored procs and hundreds of tables).
We've started redesigning it (and the app that uses it).
Due to the high complexity of the database itself, we will create a staging database holding data that exercises specific use/edge cases. When we start a test run, it will copy that staged data into the actual database (in our dev environment of course). That way, the stored procs in the actual database(s) can be used without any special "testing sql" in the database itself, short of some stored procs that verify certain data is present.
".45 ACP - because shooting twice is just silly" - JSOP, 2010 ----- You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010 ----- When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013
By definition unit tests should be small, focused and fast, and should not use external files, databases, REST services or anything like that. If you need to hit the file system, create an interface and mock it. Unit tests can run on any machine without configurations, files or database connections. They can run on DevOps or your choice of cloud with no extra configs. They just work.
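A sketch of the file system case in Python (FileStore, InMemoryFileStore and count_nonblank_lines are hypothetical names): the code depends on a small interface, and the test supplies an in-memory implementation, so no file ever has to exist on the build machine.

```python
from typing import Protocol


class FileStore(Protocol):
    """The interface the code under test depends on."""

    def read_text(self, path: str) -> str: ...


class InMemoryFileStore:
    """Test double: serves file contents from a dict instead of the disk."""

    def __init__(self, files):
        self._files = dict(files)

    def read_text(self, path):
        return self._files[path]


def count_nonblank_lines(store: FileStore, path: str) -> int:
    """Code under test: it never touches the real file system directly."""
    return sum(1 for line in store.read_text(path).splitlines() if line.strip())


def test_counts_only_nonblank_lines():
    store = InMemoryFileStore({"orders.csv": "id,total\n\n1,9.99\n   \n2,5.00\n"})
    assert count_nonblank_lines(store, "orders.csv") == 3
```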
Integration tests on the other hand don't have that restriction. For me it's basically a sandbox to make sure I can run that tracer thru the code correctly. Maybe figure out how to do something, benchmarks, etc. They are not run during CI/CD builds. I mark them as ignored just to be sure.
Also.. I keep them in separate assemblies just to be sure of no leaks.
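In Python terms, the "mark them as ignored" part might look like the sketch below; the RUN_INTEGRATION_TESTS variable and the test classes are assumptions made for the example. Integration tests are skipped unless someone opts in, so a CI build only ever runs the unit tests.

```python
import os
import unittest

# Opt-in switch (hypothetical name): unset on CI, set by hand in the sandbox.
RUN_INTEGRATION = os.environ.get("RUN_INTEGRATION_TESTS") == "1"


def apply_discount(price_cents, percent):
    """Tiny piece of pure logic used by the unit test below."""
    return price_cents * (100 - percent) // 100


class PriceRuleUnitTests(unittest.TestCase):
    # Always runs: no files, databases or network involved.
    def test_discount_is_applied(self):
        self.assertEqual(apply_discount(100, 10), 90)


@unittest.skipUnless(RUN_INTEGRATION, "integration tests are opt-in")
class DatabaseIntegrationTests(unittest.TestCase):
    def test_can_reach_database(self):
        # Placeholder body: a real test would open a connection here.
        self.assertTrue(True)
```

Keeping the second class in its own module or package would be the rough equivalent of the separate assembly.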
"Watch your thoughts; they become your words. Watch your words; they become your actions.
Watch your actions; they become your habits. Watch your habits; they become your character.
Watch your character; it becomes your destiny."
Largely covered, but ( note, not my area, so I likely have some terms wrong ) mocked data and requests lets you test for corrupted data, "little Johnny Tables", spaces where they don't belong, stupidly long inputs...
In my stuff, I first try to get it working so I can use it, then try to break it, then watch the operators and production parts to see what happens. ( Pro tip, while checking to make sure you can't drop a transmission ( gearbox ), have several cardboard boxes there to catch it. )
I understand the theory and conventional wisdom about unit testing and integration testing. But I don't just go by what the "experts" say or where the herd goes. I apply value engineering and a long-term support consideration to implement those "conventional wisdom" principles in a way that maximizes benefit today and tomorrow on a project by project basis.
When I use mock data:
1 - early in the development process before a test database and any external services (for test mode) are available. In some cases, those test sources are not available, at least to developers.
2 - when the test database is unmaintained, or difficult to update (e.g. developers have to go through the DBA's team to do anything to a DB)
3 - when the project is so large, or so distributed among development teams that mock data is necessary for the benefit of other teams.
When I use test sources and NOT mock data:
1 - when available and easily updateable to developers (e.g. I never hire a developer who cannot at least create a T-SQL table with indices and stored procedures).
2 - when PII (Personally Identifiable Information) can be auto-generated to fill DB tables. Not hard to do, but creating such utility apps is helpful and can avoid lawsuits should the company be audited or test data leaks out.
When I use test sources AND mock data:
1 - Same reasons as above to use test sources - as available.
2 - I use mock data for those connection points where test sources are not reliably available.
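On the PII point above, a minimal sketch of such a generator, in Python with SQLite rather than C# and T-SQL. The table layout and name lists are invented; the generator is seeded so a test run is repeatable, and it sticks to example.com addresses and 555-01xx numbers so nothing can collide with a real person.

```python
import random
import sqlite3

# Small invented pools; a real utility would use larger ones.
FIRST_NAMES = ["Alex", "Blake", "Casey", "Devon", "Emery"]
LAST_NAMES = ["Archer", "Brook", "Calder", "Dale", "Ellis"]


def fake_person(rng):
    """Return one synthetic (first, last, email, phone) record."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    # example.com is reserved for documentation, so no real mailbox is used.
    email = f"{first}.{last}{rng.randrange(1000)}@example.com".lower()
    # 555-0100 through 555-0199 are set aside for fictional use.
    phone = f"555-01{rng.randrange(100):02d}"
    return first, last, email, phone


def fill_test_table(conn, count, seed=0):
    """Create the person table and fill it with `count` generated rows."""
    rng = random.Random(seed)  # same seed, same data on every run
    conn.execute(
        "CREATE TABLE person (first TEXT, last TEXT, email TEXT, phone TEXT)"
    )
    rows = [fake_person(rng) for _ in range(count)]
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?)", rows)
    conn.commit()
```

Usage would be something like fill_test_table(sqlite3.connect(":memory:"), 500) in the test setup.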
In most situations, I use test sources only because I do not need mock data. As part of the initial design, the UI layout is created and approved at some point. As an offshoot, the DB design (tables, indices, PK-FK relationships) is done in parallel. I use my utility to generate the SPs for POCO CRUD, the POCO classes in C#, and the factory/manager classes in C# (I intentionally choose NOT to use the Entity Framework for a host of reasons not a part of this context). Once the UI and DB are relatively stable quantities, then I can work on the controllers and views for the WebAPI (I generally support both Blazor and Xamarin.Forms clients) using the generated code. Thus, I can write my unit tests to use actual sources.
I prefer to unit test the "stack" from WebAPI down to the source and back. Because I use an IDE that supports easy and thorough debugging (Visual Studio 2019 as of this writing), it is as easy for me to debug and fix a given unit with test sources as it is with mock data. By top-down testing throughout the "stack", I can know for certain that not only do all the parts work (in various testing case scenarios), but the integrated system works. And I only have to write one test project. Which test database or service is used is configured (via a transform in the project file) based on the environment chosen (typically dev, qa, uat, prod).
However, that is my approach. I find added value in this approach that I do not achieve with mock data alone. I have used this approach in simple and complex systems, with and without CI/CD. I have used this in systems with multiple databases/sources. It reduces testing complexity and coding time while finding issues not found with mock data alone.
Whatever approach works for you is probably your best choice. I appreciate the insights given by those who took the time to post.
I agree with what you said. I ran into that at two employers with PII. The utilities to create fake data for test and QA allowed me to include test databases instead of mock data. I guess, in a way, that is technically "mock" data, just without using a Mock functionality in testing.