When Git isn't good enough: Version control for enterprises

Built for development teams, Git can’t meet enterprise scalability and security requirements on its own

The road to product release is forked, twisted, and winding -- anything but straight. That’s because modern product development is increasingly multidisciplinary, rapid, iterative, and geographically distributed. Art, design, prototyping, manufacturing, programming, and so forth all use different tools, yet need to work together more closely than ever before. Critical to success, therefore, is having what agile devotees call a “single source of truth”: one and only one place where all of the product development content is stored, revised, secured, and synchronized, even when contributors are spread around the world.

Git has long been popular among developers, but attempts to make it work for the enterprise have exposed many challenges and spawned a variety of workarounds. This article explains a few of the more salient challenges and their workarounds, and outlines the shape of a better solution.

Git limitations

The largest challenge Git faces in the enterprise arguably stems from its performance problems when dealing with large numbers of files or very large files. Git repositories become slow and unwieldy as they grow, so much so that the largest practical size is often cited as between 1GB and 2GB of content.

This is precisely what drives the phenomenon known as “Git sprawl,” or the tendency to break what should be one large repository into dozens, hundreds, or even thousands of smaller repos. Managing so many repos introduces its own problems and has unsurprisingly given rise to a variety of tools (such as Git submodules and Google’s repo) to tame such complexity.
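To make the sprawl concrete, here is a minimal sketch of how a size cap forces one logical project into several repos. The component names, the sizes, and the `plan_repos` helper are all hypothetical, and first-fit decreasing is only one simple packing strategy, not something Git or any particular tool prescribes.

```python
def plan_repos(component_sizes, cap):
    """Assign components to repos so no repo exceeds `cap` (first-fit decreasing).

    component_sizes: dict mapping a hypothetical component name to its size.
    cap: assumed maximum practical repo size, in the same unit as the sizes.
    """
    repos = []  # each entry is [remaining_capacity, [component names]]
    # Place the largest components first; break ties by name for determinism.
    ordered = sorted(component_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        if size > cap:
            # A single oversized component cannot be placed in any repo.
            raise ValueError(f"{name} ({size}) exceeds the cap ({cap})")
        for repo in repos:
            if repo[0] >= size:
                repo[0] -= size
                repo[1].append(name)
                break
        else:
            # No existing repo has room, so the project sprawls by one more repo.
            repos.append([cap - size, [name]])
    return [names for _, names in repos]


# Illustrative sizes in MB, with an assumed 2,000 MB cap per repo.
sizes_mb = {"art": 1800, "audio": 1200, "engine": 700, "tools": 300, "docs": 100}
plan = plan_repos(sizes_mb, cap=2000)
```

With these made-up numbers, five components that belong together end up spread across three repos, and every additional gigabyte of content pushes the count higher.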

This isn’t the only workaround. Another is to store the larger files outside the repository itself, using a tool like the Git Large File Storage extension. The idea is to store only a small “pointer” file in the actual repository, yet retrieve the large data when needed from a completely different system.
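As a rough illustration of the “pointer” idea, the sketch below builds and parses a small text stub in the style of a Git LFS version 1 pointer file (a version line, a SHA-256 object ID, and a size). The function names are hypothetical, and the real extension handles this itself through Git’s clean and smudge filters; this is only a simplified picture of what ends up in the repository.

```python
import hashlib

# Version line used by Git LFS v1 pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_pointer(content: bytes) -> str:
    """Return pointer text standing in for `content` (hypothetical helper)."""
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(content)}\n"


def parse_pointer(text: str) -> dict:
    """Parse pointer text back into its fields (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    if fields.get("version") != LFS_SPEC_URL:
        raise ValueError("not a Git LFS v1 pointer")
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "oid": digest, "size": int(fields["size"])}


# A 5 MB stand-in for a large binary asset.
big_asset = b"\x00" * (5 * 1024 * 1024)
pointer = make_pointer(big_asset)
info = parse_pointer(pointer)
```

The repository stores only the pointer, which is well under a kilobyte, while the object ID and size tell the external store which blob to fetch on checkout. The trade-off is that the “completely different system” now has to be hosted, secured, and backed up alongside the repository.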

Another set of challenges revolves around hosting so many repositories and protecting their content. Git includes its own daemon for easy hosting, but such hosting is completely open. It performs no authentication or authorization, so anyone on the network can read every repository it exports and, if pushing is enabled, write to them as well. That works nicely for a variety of use cases, but it’s a recipe for ulcers in the enterprise.

These limitations stem from the fact that Git concerns itself only with authenticating authorship, leaving authorization to the file system. In other words, Git provides tools that can be used to ensure commits are cryptographically signed by the individuals making them, but it offers no options for locking down particular files, folders, branches, and so on via the usual access control lists or other mechanisms.
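For a sense of what is missing, here is a minimal sketch of the kind of path-level access check that a hosting layer or server-side hook would have to supply on Git’s behalf. The rule table, group names, paths, and the `may_write` function are all hypothetical; nothing here is part of Git itself.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (group, path pattern, write allowed?).
# Rules are evaluated in order and the last matching rule wins.
RULES = [
    ("everyone", "*", False),               # read-only by default
    ("developers", "src/*", True),          # developers may change source
    ("artists", "assets/*", True),          # artists may change assets
    ("developers", "src/crypto/*", False),  # except this protected folder
]


def may_write(groups, path, rules=RULES):
    """Return True if a user in `groups` may modify `path`."""
    allowed = False
    for group, pattern, grant in rules:
        # fnmatchcase's "*" also matches "/", so "src/*" covers subfolders.
        if group in groups and fnmatchcase(path, pattern):
            allowed = grant
    return allowed


dev = {"everyone", "developers"}
artist = {"everyone", "artists"}
```

Under these assumed rules, `may_write(dev, "src/main.c")` is allowed while `may_write(dev, "src/crypto/keys.c")` and `may_write(artist, "src/main.c")` are refused. Someone has to write, deploy, and maintain that logic, and it covers only writes: once a user can clone a repository, Git offers no way to hide individual files from them.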

These shortcomings explain the explosion of third-party hosting tools and services such as GitHub, GitLab, and Atlassian Stash, all of which may involve their own trade-offs. Free hosting in the cloud usually means limited space or privacy. Free local hosting behind your own firewall usually means limited features and IT headaches. As with so many aspects of life, you get what you pay for, but even the paid offerings can’t magically dispel the limitations of the underlying tool.

Version control for the enterprise

These are only some of the challenges facing Git adoption in the enterprise, but they’re enough to establish a pattern, one that is overlooked surprisingly often. Each limitation of Git ineluctably burdens the enterprise with the weight of a corresponding workaround.

It doesn’t take long before that metaphorical weight meets or exceeds that of the original problems a version control system was supposed to solve. Storing many files or large files means you need to split your repos or perhaps host the large files externally. The need for secure hosting means you need to embrace a third-party solution or build your own. These needs -- as well as those for more granular access control, synchronization, scalability, high availability, and disaster recovery -- add more extensions, tools, and processes to the picture in very short order.

The result is effectively another kind of Git sprawl, what we might call “IT sprawl,” as IT departments that embrace Git must learn, adopt, and support a variety of elements from the larger Git ecosystem.

Modern product development requires us to store, revise, secure, and synchronize content across teams. Any version control system that needs a large number of add-on tools and workaround processes to meet these requirements misses the mark for the enterprise.

The shape of a better solution should already be pretty clear. An enterprise-grade version control system will let you store any number of files of any size and type. It should easily version that content and synchronize it across multiple teams, no matter where they’re located. It should be flexible in its hosting configuration, yet offer the benefits of centralized administration, making it easy to define groups, users, and permissions with fine-grained access control. And of course, it should be open and flexible enough to adapt to different workflows, getting out of the way as much as possible and letting artists, designers, programmers, and other contributors develop content using their favorite tools.

By itself, Git paints only a small part of this picture. Adding multiple extensions and tools brings more of it into focus. But for the enterprise, there comes a point where the workarounds -- with their additional demands, complexity, and limitations -- begin to cause as many problems as they solve. This is the point where you realize you needed an enterprise solution all along.

A better solution

An enterprise-oriented version control system such as Perforce Helix meets enterprise needs without forcing you to embrace a host of unsupported extensions or burn precious development cycles on home-brewed tools to fill gaps. Rather, right out of the box, it completes the picture we’ve been framing, with a single vendor and a single source of support. At the same time, the solution speaks native Git and supports the full ecosystem of Git tools that developers know and love.

For example, Helix can handle any number of digital assets in addition to source code: any type of data, stored in files of any size. Helix supports tens of thousands of concurrent users pushing millions of transactions each day. And with its federated architecture, clustering, and high-availability options, it can automatically synchronize all of that content to remote teams located around the world and keep users working even when hardware is down. This is possible because the Helix Versioning Engine is the product of decades of careful development and tuning aimed at maximum scalability, performance, and reliability.

What’s more, Helix is not only a great Git solution for developers, but a much broader platform for all the stakeholders in the enterprise. It offers its own native distributed version control features, flexible and granular security down to the file level, locking features for digital assets that can’t be merged, collaboration and review capabilities, and analytics and insight into the complete production pipeline. The new Helix Threat Detection component even leverages advanced behavioral analysis to detect and report potential risks to your intellectual property before it walks out the door.

Choosing the right version control system is all about empowering your teams to build better products faster and more cheaply. Git may be free, but making Git work for the enterprise isn’t, particularly once you count the hidden costs of lost productivity, the burdens of various workarounds, and the learning curves for Git and its supplemental tools and extensions.

Commercial offerings have more obvious up-front costs, but they also more completely paint the picture of a better solution, which Git merely sketches. Organizations should therefore consider all the relevant factors before making a final decision. In short, if your version control system doesn’t easily and intrinsically address all of the reasons it’s needed, then it’s clearly not an ideal solution. You should look elsewhere.

John Williston, Ph.D., is a veteran software developer for Windows, .Net, and the Web. He is currently a product marketing manager at Perforce Software.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

Copyright © 2015 IDG Communications, Inc.