If you try to fsck a repository that isn't entirely empty, but that has no
inter-object references (ie all the objects are blobs, and don't refer to
anything else), git-fsck-objects currently fails.
This probably cannot happen in practice, but can be tested with something
like
git init-db
touch dummy
git add dummy
git fsck-objects
where the fsck will die by a divide-by-zero when it tries to look up the
references from the one object it found (hash_obj() will do a modulus by
refs_hash_size).
On some other archiectures (ppc, sparc) the divide-by-zero will go
unnoticed, and we'll instead SIGSEGV when we hit the "refs_hash[j]"
access.
So move the test that should protect against this from mark_reachable()
into lookup_object_refs(), which incidentally in the process also fixes
mark_reachable() itself (it used to not mark the one object that _was_
reachable, because it decided that it had no refs too early).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Earlier commit 3e4339e6f96e8c4f38a9c6607b98d3e96a2ed783 had a
thinko that did not check for collisions while repopulating the
objects in the new hash table.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently, we don't check refs_hash_size size and happily call
lookup_object_refs() even if refs_hash_size is zero which leads to
a division by zero in hash_obj().
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This shrinks "struct object" to the absolutely minimal size possible.
It now contains /only/ the object flags and the SHA1 hash name of the
object.
The "refs" field, which is really needed only for fsck, is maintained in
a separate hashed lookup-table, allowing all normal users to totally
ignore it.
This helps memory usage, although not as much as I hoped: it looks like
the allocation overhead of malloc (and the alignment constraints in
particular) means that while the structure size shrinks, the actual
allocation overhead mostly does not.
[ That said: memory usage is actually down, but not as much as it should
be: I suspect just one of the object types actually ended up shrinking
its effective allocation size.
To get to the next level, we probably need specialized allocators that
don't pad the allocation more than necessary. ]
The separation makes for some code cleanup, though, and makes the ref
tracking that fsck wants a clearly separate thing.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>