DevNation sneak peek is a behind-the-scenes look at the sessions that will take place at DevNation 2016. Sign up for DevNation: Learn more. Code more. Share more. Join the Nation.

Tracking huge files with Git LFS

Hi! I'm Tim Pettersen, a passionate Git evangelist and Atlassian developer with a decade of experience working on JIRA and Bitbucket, and I have a problem.

In the past, I've presented on almost every aspect of "getting Git right": the value of feature branching, the power of rebase, the dangers of submodules. Most audience members were receptive and excited to hear about how distributed version control could change their lives! But after every talk, there'd always be a few glum faces in the crowd, unmoved by the rapturous beauty and raw power of the world's favorite distributed version control system.

I soon discovered why. These unfortunate souls were held hostage by their legacy centralized version control systems.

Not because of the technical challenge of migration: Git is now eleven years old, and migration from any given VCS is not only a solved problem but a well-worn path that countless software teams have trodden. These unlucky few were trapped by the very nature of Git's philosophy of distributed version control.

Many of Git's benefits come from the fact that each developer has their own local copy of the entire history of their codebase. If that repository contains large binary files, fetching and pushing eventually slow down to the point of impracticality. The upshot is that game designers with binary assets, researchers with large data sets, web developers with rich media, QA engineers with data snapshots, and any other software teams that need to version large content are stuck on centralized Subversion, or CVS, or Perforce, or ClearCase.

At least, they were. Today, we have Git LFS.

Git LFS (Large File Storage) is an open source extension to native Git, jointly developed by engineers from Atlassian Bitbucket and GitHub. In my session at DevNation I'll be talking in depth about how Git LFS works, how to adopt it, and how to use it in a software team. The talk is partly practical advice on Git LFS usage and partly a technical deep-dive into the Git data model, Git's powerful extension system of hooks and filters, and the inspired computer science behind Git LFS' storage design and client/server architecture.
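As a small taste of that storage design, here is a minimal sketch of the idea at its core: instead of committing a large file's bytes, Git LFS commits a tiny text "pointer" that records the content's SHA-256 hash and size, and the real content lives in a separate store. The three-line layout follows the published Git LFS pointer format; the function name `make_lfs_pointer` and the 5 MiB sample asset are hypothetical, chosen only for illustration. This is not the actual Git LFS implementation, which is a separate command-line tool wired into Git through clean and smudge filters.

```python
import hashlib

# Version line used by Git LFS pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_lfs_pointer(content: bytes) -> str:
    """Return the text of a Git LFS-style pointer for the given content.

    Hypothetical helper for illustration only: the real pointer is
    produced by the git-lfs client, not by this function.
    """
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_URL}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # Hypothetical 5 MiB binary asset standing in for a texture or video.
    asset = b"\x00" * (5 * 1024 * 1024)
    pointer = make_lfs_pointer(asset)
    print(pointer)
    print(f"{len(asset)} bytes of content, {len(pointer.encode())} byte pointer")
```

Because only the pointer is stored in the repository's history, cloning and fetching stay fast no matter how many revisions of the large file exist; the content itself can be downloaded on demand for the commits you actually check out.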

So, if you love Git, want to move to Git, or you're just curious about how the two largest competitors in the Git hosting space came to collaborate on the same open source project, come along to learn about the exciting Git LFS project! If you have any specific questions you’d like me to address in the presentation, hit me up on Twitter (I’m @kannonboy).

Last updated: February 6, 2024