We are in the process of firmly establishing the Docker Registry from docker.com. We develop low-level software, so we mostly do system builds and testing - no orchestration or swarming, no microservices. The reason for using Docker is to keep control over the build tools: We must be able to pick up an old project and rebuild a delivery from two or three years ago, bit for bit identical. Dockerized tools are one element in reaching this goal.
We did initial trials on Windows, just to learn what it is, but our IT guys want central servers on Linux (they seem to prefer CentOS, but I think the developers pressured for Ubuntu on the registry server). We already have a small handful of production lines and half a dozen repositories (i.e. image names) using the registry. It seems to be fairly stable.
But: The free version has no access control whatsoever: any developer can push any self-built garbage image to the server. Young software developers are as rebellious as a teenage son, always trying to ignore rules and sneak around blocks. So we must either switch to the paid version (our budget guys prefer not to), or investigate the open-source Portus solution - our IT guys are evaluating Portus right now.
When you delete an image, only pointers are deleted. The garbage collector is sort of lazy; you have to wake him up manually (or by an alarm clock). He is rather careless, too: first he makes a round to mark what to dispose of, then a second round to pick it up. If someone pushes another image between the rounds, saying "But I would like to use that layer!", he may pick it up and dispose of the layer in his second round anyway. Their own words are "Stop the world GC". We will set the cron-ometer to Monday morning at 04:00, and all who are supposed to upload images will know to sleep tight on Monday mornings, rather than pushing images.
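To make the race between the two rounds concrete, here is a toy model in Python. It is only a sketch of the general mark-then-sweep idea, not the registry's actual code, and the layer and image names are made up for the illustration.

```python
# Toy model of a two-phase ("mark, then sweep") garbage collector.
# This is NOT the registry's implementation; all names are invented.

def mark(manifests):
    """Round one: collect every layer referenced by some image manifest."""
    referenced = set()
    for layers in manifests.values():
        referenced.update(layers)
    return referenced

def sweep(blobs, referenced):
    """Round two: delete every stored layer that was not marked in round one."""
    for layer in sorted(blobs - referenced):
        blobs.discard(layer)

def broken_images(manifests, blobs):
    """Return the images that reference a layer no longer in storage."""
    return sorted(name for name, layers in manifests.items()
                  if not set(layers) <= blobs)

# Storage holds three layers; "old-python" became orphaned by a delete.
blobs = {"ubuntu", "gcc", "old-python"}
manifests = {"gcc-build:1": ["ubuntu", "gcc"]}

referenced = mark(manifests)   # round one: "old-python" is not marked

# Someone pushes between the rounds. The layer is already on the server,
# so only the new manifest gets stored.
manifests["py-build:2"] = ["ubuntu", "old-python"]

sweep(blobs, referenced)       # round two: "old-python" is disposed of anyway

print(broken_images(manifests, blobs))  # prints ['py-build:2']
```

This is why nobody should push while the collector is running. For the alarm clock, a crontab schedule of `0 4 * * 1` corresponds to Monday at 04:00.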
The free version provides neither a web interface to the registry nor a stand-alone UI - not even a command-line version (but you would definitely want a GUI of sorts to get an overview of the registry). For now we are using curl for REST calls ... which is slightly above drawing the bit pattern to send out on the line, but not by much. You can find a number of free front ends on GitHub, but I haven't seen any ready-to-use binaries, and most certainly not for Windows. While I sure can retrieve the source code, set up Linux in a virtual machine, pick up all the tools required for the build and run it, doing that for twelve alternatives is a little cumbersome. I haven't done that yet.
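As a small step up from raw curl, the same REST calls can be wrapped in a few lines of Python. This is a minimal sketch, assuming the registry speaks the HTTP API v2 (`/v2/_catalog` and `/v2/<name>/tags/list`) and needs no authentication; the server address and the function names are my own placeholders.

```python
import json
import urllib.request

# Hypothetical registry address; replace it with your own server.
REGISTRY = "https://registry.example.local:5000"

def catalog_url(registry):
    """URL that lists all repositories (image names) in the registry."""
    return f"{registry}/v2/_catalog"

def tags_url(registry, repository):
    """URL that lists all tags of one repository."""
    return f"{registry}/v2/{repository}/tags/list"

def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)

def overview(registry, fetch=fetch_json):
    """Return {repository: [sorted tags]} for the whole registry."""
    repos = fetch(catalog_url(registry)).get("repositories", [])
    result = {}
    for repo in repos:
        # "tags" can be null for a repository whose tags were all deleted.
        tags = fetch(tags_url(registry, repo)).get("tags") or []
        result[repo] = sorted(tags)
    return result
```

Calling `overview(REGISTRY)` gives a dictionary you can print or feed into whatever report you like. The `fetch` parameter is only there so the function can be tried without a live server. Note that the catalog call is paginated on large registries, which this sketch ignores.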
One point that is independent of which registry solution you choose:
In the experimentation phase, images were built without any discipline or order, so except for the Ubuntu base layer, almost every image had its own set of layers. And they were huge - each version tended to be 4-5 gigabytes.
There are two reasons for this. First: We decided against "One image, one tool" with an "external" build script calling the tools in turn. Rather, we put all the tools for a build step into a single container; this makes it much easier to keep track of consistent toolboxes where we know that the various tools' versions go together. The build step is controlled by a bash script running inside the container (it is located in the checked-out source tree, which is mounted in the container at startup). So, images tend to be large (but there are not as many of them).
Second: The experimenting developers seemed to be scared of layers, trying to reduce the number by loading as many tools, as many Python packages and what have you, as possible in one single build step for the image. So every layer was different, nothing was shared (except for the Ubuntu base), and disk space requirements were huge, since the tiniest little version update required a complete 5 GB image build from the bottom.
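The "one container per build step, script lives in the source tree" setup from the first point could be started roughly like this. It is only a sketch: the mount point `/src`, the script name `build.sh` and the function names are assumptions of mine, not our actual setup.

```python
import subprocess
from pathlib import Path

def build_command(image, source_tree, script="build.sh"):
    """Compose a `docker run` command that runs a script from the mounted source tree."""
    source = Path(source_tree).resolve()
    return [
        "docker", "run", "--rm",   # remove the container when the step is done
        "-v", f"{source}:/src",    # mount the checked-out tree (assumed mount point)
        "-w", "/src",              # start in the source tree
        image,
        "bash", f"./{script}",     # the build step is a bash script in the tree
    ]

def run_build(image, source_tree, script="build.sh"):
    """Run the build step and raise if the script exits with an error."""
    subprocess.run(build_command(image, source_tree, script), check=True)
```

Since the script is checked out together with the sources, an old delivery brings its own build recipe along; only the image tag has to be recorded next to it.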
So: We are now establishing a tree structure of base layers: With Ubuntu 18.04 LTS at the bottom, we create an image with a stable set of basic build management tools, common to all build tasks and not expected to change, and we use this "ubuntutools" (rather than the raw Ubuntu) layer as a base to build on. Then we add a fairly stable gcc, and a set of C/C++ related tools to make a "gcc base layer" for the more specialized images to be based on. On the ubuntutools base we also build a Python branch with a fairly large set of pre-installed Python packages (we currently use around 150 of them) and a set of Python tools. Our developers frequently request new packages; then we lay a thin "veneer" layer on top of the common Python layer, adding to the large set already in the base.
The art is in sorting the tools into three groups: super-stable tools that can be put in the lower layers (like CMake and Ninja - they do come in new versions, but we rarely require the update), medium-stable tools (like gcc - we do not switch to a new release until the old one doesn't work for us), and volatile elements (like Python packages under development) that must be placed in the leaf nodes. When we have to update a low or intermediate level layer, the tree must grow a new branch, but we require a documented need for that update before we accept it - a developer's wish to always run "the latest and greatest version" is not sufficient. (In many cases, when the update requirement is for a single component, we can also provide a veneer layer that replaces the version in the lower layer.)
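A veneer layer really is thin. As an illustration, here is a small Python helper that writes out the Dockerfile for one; the image name and the package versions are hypothetical, and pinning every version exactly is my assumption about how one would keep the result reproducible.

```python
def veneer_dockerfile(base_image, packages):
    """Render a Dockerfile that adds pinned Python packages on top of a base image."""
    pins = " ".join(f"{name}=={version}"
                    for name, version in sorted(packages.items()))
    return (
        f"FROM {base_image}\n"
        f"RUN pip install --no-cache-dir {pins}\n"
    )

# Hypothetical base image tag and package versions, for illustration only.
print(veneer_dockerfile(
    "registry.example.local:5000/pythonbase:1",
    {"requests": "2.31.0", "attrs": "23.1.0"},
))
```

The output is a two-line Dockerfile: a `FROM` line naming the common Python base by an explicit tag, and one `RUN` line that installs the requested packages. Everything below the `FROM` line comes from layers that are already in the registry and in the engine's cache.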
This structure has a number of benefits:
Using an already-built, complex image as a base reduces (leaf) image build time drastically.
Dockerfiles for the top layers are very simple.
A lot of disk space is saved, both in the registry and in the Docker engine.
The layer cache in the engine is used far more efficiently.
Network traffic to retrieve layers/images from the registry is significantly reduced.
When several containers run simultaneously, they share code segments in RAM to a much larger degree, even if they run different images, as long as those images are built on the same "high level" base image.
Startup times may be somewhat reduced: The probability of a (medium layer) image already being present in RAM increases.
The only significant disadvantage is that to see all the tool versions in your image, you have to trace backwards through multiple levels of base images. We are documenting the entire tree on our intranet, so you can click your way backwards layer by layer and get far more information than you could find in a huge "single level" Dockerfile, and certainly in a much more readable format!
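The walk backwards through the base images is easy to automate once the tree is written down somewhere. A minimal sketch, assuming the tree is kept as a plain table of "image, its base, tools added in that layer"; the image tags and version numbers below are invented for the example.

```python
# Hypothetical tree: image -> (base image, tools added in that layer).
TREE = {
    "ubuntu:18.04":  (None,            {}),
    "ubuntutools:1": ("ubuntu:18.04",  {"cmake": "3.10", "ninja": "1.8"}),
    "gccbase:1":     ("ubuntutools:1", {"gcc": "7.5"}),
    "pythonbase:1":  ("ubuntutools:1", {"python": "3.6"}),
    "pyveneer:3":    ("pythonbase:1",  {"cmake": "3.16"}),  # veneer replacing cmake
}

def lineage(tree, image):
    """Return the chain of images from the given image back to the root."""
    chain = []
    while image is not None:
        chain.append(image)
        image = tree[image][0]
    return chain

def effective_tools(tree, image):
    """Merge tool versions bottom-up, so upper layers override lower ones."""
    tools = {}
    for name in reversed(lineage(tree, image)):
        tools.update(tree[name][1])
    return tools

print(lineage(TREE, "pyveneer:3"))
print(effective_tools(TREE, "pyveneer:3"))
```

The first line printed is the path from the leaf down to Ubuntu, the second is the complete tool list for the leaf, with the veneer's cmake taking precedence over the one in the lower layer.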
But most of all: Enforcing this tree structure helps us keep those unruly developers under control so they don't go wild with plethoras of incompatible tool versions (which kills the idea of reproducible builds!).