Is there a way to limit Git to sparse checkout?

As a recent question stated, I am looking for a way to speed up operations on a Git repository with a very large number of files (~6 million). I would rather not use submodules. The problem is that operations are quite slow. Is it possible to have one large repository but tell Git to focus on only a portion of it? I thought a sparse checkout might do it, but the read-tree operation seems to delete the files not listed in the sparse-checkout file, and it takes a very long time. Is it possible for read-tree to leave all files where they are and do work proportional only to the number of files listed in the sparse-checkout file?



2 answers

Currently, no. Git only recently (1.7+) added any support for sparse checkouts, and it is still pretty bare-bones, mainly because Git was not designed to work with only a portion of a repository.

It was designed more as a version control system with one repository per project. Submodules are the method of choice for handling "projects" that have many large subcomponents.
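For reference, here is a minimal sketch of the sparse checkout setup the question describes, run against a throwaway repository. The directory names (docs/, src/), file contents, and commit identity are made up for illustration. It shows the behavior the asker ran into: after read-tree, paths not matched by the sparse-checkout file are removed from the working tree.

```shell
#!/bin/sh
set -eu

# Throwaway repository with two hypothetical top-level directories.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p docs src
echo "readme" > docs/readme.txt
echo "code" > src/main.c
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

# Enable sparse checkout and list the paths to keep in the working tree.
git config core.sparseCheckout true
mkdir -p .git/info
echo "docs/" > .git/info/sparse-checkout

# Re-read the tree; files outside the listed paths are removed from disk
# (they remain in the repository history).
git read-tree -mu HEAD

ls
```

The repository still contains every object, so this only narrows the working tree; it does not make Git ignore the rest of the history or index.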



First, I would suggest learning and using submodules.

You can script what you like with

git ls-tree sha1
git show sha1:path/to/some/file.txt


and other low-level commands. Also see bash commands such as



and pipelines.
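As a rough sketch of that scripting approach, the following pulls one subdirectory out of a commit using only ls-tree and show, without checking out the rest. The repository layout, the output directory, and the commit identity are hypothetical, and the loop assumes path names without unusual characters.

```shell
#!/bin/sh
set -eu

# Throwaway repository and a separate output directory (both hypothetical).
repo=$(mktemp -d)
out=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p docs src
echo "readme" > docs/readme.txt
echo "guide" > docs/guide.txt
echo "code" > src/main.c
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

# List the files recorded under docs/ in HEAD, then write each blob to $out.
# Nothing outside docs/ is read or written.
git ls-tree -r --name-only HEAD docs/ |
while IFS= read -r path; do
    mkdir -p "$out/$(dirname "$path")"
    git show "HEAD:$path" > "$out/$path"
done

ls -R "$out"
```

This runs one git show per file, so it suits a small subset of a large repository rather than a bulk export.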


