How does Git know that I renamed a file when I never actually renamed it?
I didn't actually rename it. I deleted hash.c with the Linux rm command and then copied in the new version of my hash table implementation, named hashdic.c, with cp from another directory. The deleted file and the new file are very similar but not identical, because I had been working on hashdic.c in a different directory for several hours.
Then I typed git rm hash.c (the file was already gone from the filesystem; this was to remove it from the repository) and then git add hashdic.c. Then git commit -am "update to hash table". And magic! Git says:
renamed: hash.h -> hashdic.h
But, Holmes, how? How does Git know that I effectively renamed a file, when technically I just deleted it and added a new one with a different name?
The whole process:
- copy/paste from ~/project/hash.c to ~/other/project/hashdic.c
- edit ~/other/project/hashdic.c
- rm hash.c
- cp ~/other/project/hashdic.c ~/project/hashdic.c
- git rm hash.c
- git commit -am descr
Try the following:
$ git diff --name-status -M HEAD^ HEAD
You should see that, between the two commits, the file has been renamed, with a "similarity index" of (say) 95:
R095 hash.c hashdic.c
(I typed this based on your post: one line calls both files .h, the others call them .c, so I went with .c here. It wasn't cut and pasted, so there may be minor glitches, and I made up the similarity index value. The result should still be recognizable, and I expect the similarity index to be below 100%. It is evidently at least 50%, since that is the default threshold.)
This shows that between the previous and current commits, the file was renamed and slightly modified.
Once you've done that, try this:
$ git diff --name-status -M100% HEAD^ HEAD
This time you will see that hash.c has been removed and hashdic.c has been added:
D hash.c
A hashdic.c
This shows that the change between the previous and the current commit contains no renames, only one file removed and another added.
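If you want to try both views without touching your real repository, here is a minimal sketch in a throwaway repository. The file contents, the commit messages and the identity are made up for illustration, so the similarity score you get will not be the 95 used above.

```shell
#!/bin/sh
# Reproduction sketch: contents and identity below are invented placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# First commit: a 20-line stand-in for hash.c.
seq 1 20 > hash.c
git add hash.c
git commit -q -m "add hash table"

# Delete the old file and add a similar one under a new name.
git rm -q hash.c
{ seq 1 20; echo "extra line"; } > hashdic.c
git add hashdic.c
git commit -q -m "update to hash table"

# Default threshold (50%): the pair is reported as a rename with a score.
with_renames=$(git diff --name-status -M HEAD^ HEAD)
# Exact matches only: the same change is reported as a delete plus an add.
exact_only=$(git diff --name-status -M100% HEAD^ HEAD)

printf 'with -M:\n%s\n' "$with_renames"
printf 'with -M100%%:\n%s\n' "$exact_only"
```

The first command should print a single R line for hash.c and hashdic.c, and the second a D line and an A line, even though both look at exactly the same two commits.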
So which is it? It's both: it's a floor wax and a dessert topping!
The point is that git dynamically computes the change between commits (or between a commit and the index or working directory, or any such pairing) every time you ask for it, whether you run git diff explicitly, or run git status (or git commit runs it for you). You can specify whether rename detection is enabled (--no-renames; see note 1) and, if so, at what similarity threshold (-M).
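One way to convince yourself that the rename is computed on demand rather than recorded is to look at what the commit actually holds. The sketch below uses a throwaway repository with invented contents and a placeholder identity; the point is only that the tree lists plain paths, while the rename appears or disappears depending on the diff options.

```shell
#!/bin/sh
# Sketch: contents and identity are invented placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

seq 1 20 > hash.c
git add hash.c
git commit -q -m "add hash table"
git rm -q hash.c
{ seq 1 20; echo "extra line"; } > hashdic.c
git add hashdic.c
git commit -q -m "update to hash table"

# The commit's tree lists only the paths that exist now; nothing says "renamed".
tree_paths=$(git ls-tree --name-only HEAD)

# Same pair of commits, two answers, depending on what the viewer asks for.
as_rename=$(git diff --name-status -M HEAD^ HEAD)
as_add_delete=$(git diff --name-status --no-renames HEAD^ HEAD)

printf 'tree: %s\n' "$tree_paths"
printf 'with -M:\n%s\n' "$as_rename"
printf 'with --no-renames:\n%s\n' "$as_add_delete"
```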
You can also request copy detection (-C and --find-copies-harder). There are limits on how many path names this is applied to, since comparing every file in one commit against every file in another can be very computationally expensive. By default git limits itself to rename detection, which is somewhat cheaper, because git only has to consider "file names that were in the starting commit but not in the destination, or file names that are in the destination commit but not in the starting one".
(In this case, those are hash.c and hashdic.c respectively, unless you removed and/or added other paths. So git only has to compare these two files against each other, not against any additional files, get one similarity index, and compare it with the -M setting.)
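Copy detection can be tried the same way. This is a sketch in a throwaway repository with invented contents and a placeholder identity. Here hash.c is left untouched and hashdic.c is an exact copy of it, which is the case where, per the git diff documentation, plain -C does not look at the unmodified file as a copy source but --find-copies-harder does.

```shell
#!/bin/sh
# Sketch: contents and identity are invented placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

seq 1 20 > hash.c
git add hash.c
git commit -q -m "add hash table"

# Add an exact copy under a new name; the original stays unmodified.
cp hash.c hashdic.c
git add hashdic.c
git commit -q -m "add copy of hash table"

# Plain -C only considers files modified in the same commit as copy sources.
copies_default=$(git diff --name-status -C HEAD^ HEAD)
# --find-copies-harder also inspects unmodified files (more expensive).
copies_harder=$(git diff --name-status -C --find-copies-harder HEAD^ HEAD)

printf 'with -C:\n%s\n' "$copies_default"
printf 'with -C --find-copies-harder:\n%s\n' "$copies_harder"
```

The first form should report hashdic.c as a plain addition, and the second as a copy of hash.c.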
1 Most of these control knobs are only available in git diff: git status, for example, historically had rename detection hard-wired to "on" at 50% (newer Git versions add a status.renames setting and a --find-renames option for it). The number of file names queued for rename detection is controlled by a git config setting, diff.renameLimit. Other git commands, such as git blame, run git's internal diff machinery with user-settable controls, but not all of them mean the same thing as in git diff. For example, git blame looks at only one file rather than entire trees, so its -C and -M mean something quite different: they track lines moved or copied within a file or between files, not renamed paths.
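For the rename limit, a minimal sketch of setting and reading back diff.renameLimit in a throwaway repository. The value 2000 is an arbitrary illustration, not a recommendation.

```shell
#!/bin/sh
# Sketch: 2000 is an arbitrary example value.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Per-repository setting; add --global to apply it to all your repositories.
git config diff.renameLimit 2000
limit=$(git config --get diff.renameLimit)
echo "diff.renameLimit = $limit"

# For a single command, git diff also accepts -l<num>, for example:
#   git diff -l2000 --name-status -M HEAD^ HEAD
```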