mirror of https://github.com/git/git.git synced 2024-10-06 07:31:21 +02:00
Git Source Code Mirror. Please follow Documentation/SubmittingPatches procedure for any of your improvements.
Nicolas Pitre 843366961c improve delta long block matching with big files
Martin Koegler noted that create_delta() performs a new hash lookup
after every block copy encoding, and each block copy is currently
limited to 64KB.

In case of larger identical blocks, the next hash lookup would normally
point to the next 64KB block in the reference buffer and multiple block
copy operations will be consecutively encoded.

It is however possible for the reference buffer to be sparsely indexed
if hash buckets have been trimmed down in create_delta_index() because
hashing of the reference buffer isn't well balanced.  In that case the
hash lookup following a block copy might fail to match anything, and the
fact that the reference buffer still matches beyond the previous 64KB
block will be missed.

Let's rework the code so that buffer comparison is no longer bounded to
64KB.  The match size should be as large as possible up front, and only
then should multiple block copies be encoded to cover it all.  Also,
fewer hash lookups will be performed in the end.
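
A minimal sketch of the idea described above, not the actual diff-delta.c
change.  It assumes the only per-operation constraint is the 64KB limit
mentioned in the text.  The names MAX_COPY_SIZE, struct copy_op,
match_length() and encode_copies() are hypothetical and used only for
illustration; git's real delta encoding and hash index are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-operation limit, taken from the 64KB figure in the text. */
#define MAX_COPY_SIZE ((size_t)0x10000)

/* Hypothetical record of one block copy; this is not git's delta encoding. */
struct copy_op {
	size_t offset;	/* start position in the reference buffer */
	size_t size;	/* bytes to copy, at most MAX_COPY_SIZE */
};

/*
 * Count how many leading bytes of ref and src are identical.
 * The comparison is bounded only by the buffer sizes, not by 64KB.
 */
size_t match_length(const unsigned char *ref, size_t ref_len,
		    const unsigned char *src, size_t src_len)
{
	size_t limit = ref_len < src_len ? ref_len : src_len;
	size_t n = 0;

	while (n < limit && ref[n] == src[n])
		n++;
	return n;
}

/*
 * Cover a match of len bytes starting at ref_offset with as many copy
 * operations as needed, with no hash lookup between them.  Returns the
 * number of operations the match requires; at most max_ops of them are
 * written to ops.
 */
size_t encode_copies(size_t ref_offset, size_t len,
		     struct copy_op *ops, size_t max_ops)
{
	size_t count = 0;

	assert(ops != NULL || max_ops == 0);

	while (len > 0) {
		size_t chunk = len < MAX_COPY_SIZE ? len : MAX_COPY_SIZE;

		if (count < max_ops) {
			ops[count].offset = ref_offset;
			ops[count].size = chunk;
		}
		count++;
		ref_offset += chunk;
		len -= chunk;
	}
	return count;
}
```

In this sketch the match is measured once, and the copy operations are
then derived from its length alone, so a sparsely indexed reference
buffer cannot cut a long match short at a 64KB boundary.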

According to Martin, this patch should reduce his 92MB pack down to 75MB
with the dataset he has.

Tests performed on the Linux kernel repo show a slightly smaller pack and
a slightly faster repack.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 20:28:13 -07:00
arm
compat
contrib Update bash completion for git-config options 2007-05-24 02:07:45 -04:00
Documentation Merge branch 'maint' 2007-05-26 18:53:22 -07:00
git-gui Merge branch 'master' of git://repo.or.cz/git-gui 2007-05-17 16:52:45 -07:00
gitweb gitweb.perl - Optionally send archives as .zip files 2007-05-23 15:09:49 -07:00
mozilla-sha1
perl
ppc
t Merge branch 'maint' 2007-05-26 18:53:22 -07:00
templates
xdiff
.gitignore
.mailmap
alloc.c
archive-tar.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
archive-zip.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
archive.h
attr.c
attr.h
base85.c
blob.c
blob.h
builtin-add.c
builtin-annotate.c
builtin-apply.c git-apply: Fix removal of new trailing blank lines. 2007-05-20 23:51:06 -07:00
builtin-archive.c connect: display connection progress 2007-05-16 12:48:18 -07:00
builtin-blame.c
builtin-branch.c Merge branch 'maint' 2007-05-20 19:58:03 -07:00
builtin-bundle.c
builtin-cat-file.c
builtin-check-attr.c
builtin-check-ref-format.c
builtin-checkout-index.c
builtin-commit-tree.c
builtin-config.c
builtin-count-objects.c
builtin-describe.c Teach git-describe how to run name-rev 2007-05-21 23:56:28 -07:00
builtin-diff-files.c
builtin-diff-index.c
builtin-diff-tree.c
builtin-diff.c
builtin-fetch--tool.c Merge branch 'sv/checkout' 2007-05-20 02:18:47 -07:00
builtin-fmt-merge-msg.c
builtin-for-each-ref.c
builtin-fsck.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
builtin-gc.c Make "git gc" pack all refs by default 2007-05-24 19:05:39 -07:00
builtin-grep.c
builtin-init-db.c
builtin-log.c
builtin-ls-files.c
builtin-ls-tree.c Merge branch 'jn/lstree' 2007-05-23 00:17:47 -07:00
builtin-mailinfo.c
builtin-mailsplit.c Teach mailsplit about Maildir's 2007-05-24 19:01:56 -07:00
builtin-merge-base.c
builtin-merge-file.c
builtin-mv.c
builtin-name-rev.c Merge branch 'maint' 2007-05-24 21:35:29 -07:00
builtin-pack-objects.c Merge branch 'dh/pack' 2007-05-20 02:19:19 -07:00
builtin-pack-refs.c Make the pack-refs interfaces usable from outside 2007-05-26 20:00:55 -07:00
builtin-prune-packed.c
builtin-prune.c
builtin-push.c
builtin-read-tree.c
builtin-reflog.c
builtin-rerere.c
builtin-rev-list.c
builtin-rev-parse.c
builtin-revert.c Fix command line parameter parser of revert/cherry-pick 2007-05-23 00:17:51 -07:00
builtin-rm.c
builtin-runstatus.c
builtin-shortlog.c
builtin-show-branch.c
builtin-show-ref.c
builtin-stripspace.c
builtin-symbolic-ref.c
builtin-tar-tree.c
builtin-unpack-objects.c
builtin-update-index.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
builtin-update-ref.c
builtin-upload-archive.c
builtin-verify-pack.c
builtin-write-tree.c
builtin.h Teach mailsplit about Maildir's 2007-05-24 19:01:56 -07:00
cache-tree.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
cache-tree.h
cache.h rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
check-builtins.sh
check-racy.c
color.c
color.h
combine-diff.c
commit.c Merge branch 'maint' 2007-05-16 12:43:05 -07:00
commit.h
config.c Merge branch 'dh/pack' 2007-05-20 02:19:19 -07:00
config.mak.in
configure.ac
connect.c connect: display connection progress 2007-05-16 12:48:18 -07:00
convert-objects.c
convert.c Fix mishandling of $Id$ expanded in the repository copy in convert.c 2007-05-26 01:12:43 -07:00
copy.c
COPYING
csum-file.c
csum-file.h
ctype.c
daemon.c git-daemon: don't ignore pid-file write failure 2007-05-21 18:34:14 -07:00
date.c
decorate.c
decorate.h
delta.h
diff-delta.c improve delta long block matching with big files 2007-05-26 20:28:13 -07:00
diff-lib.c
diff.c Merge branch 'maint' 2007-05-26 18:53:22 -07:00
diff.h
diffcore-break.c
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c
diffcore-rename.c
diffcore.h
dir.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
dir.h rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
dump-cache-tree.c
entry.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
environment.c Merge branch 'dh/pack' 2007-05-20 02:19:19 -07:00
exec_cmd.c
exec_cmd.h
fast-import.c Merge branch 'maint' 2007-05-23 22:37:23 -07:00
fetch-pack.c connect: display connection progress 2007-05-16 12:48:18 -07:00
fetch.c
fetch.h
generate-cmdlist.sh
git-add--interactive.perl
git-am.sh More echo "$user_message" fixes. 2007-05-26 00:33:03 -07:00
git-applymbox.sh
git-applypatch.sh
git-archimport.perl
git-bisect.sh
git-checkout.sh
git-clean.sh
git-clone.sh
git-commit.sh Merge branch 'maint-1.5.1' into maint 2007-05-26 01:30:40 -07:00
git-compat-util.h Merge branch 'maint' 2007-05-16 12:43:05 -07:00
git-cvsexportcommit.perl
git-cvsimport.perl Use git-for-each-ref to check whether the origin branch exists. 2007-05-23 11:06:38 -07:00
git-cvsserver.perl git-cvsserver: fix disabling service via per-method config 2007-05-21 18:42:57 -07:00
git-fetch.sh
git-instaweb.sh
git-lost-found.sh
git-ls-remote.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-ours.sh
git-merge-resolve.sh
git-merge-stupid.sh
git-merge.sh Merge branch 'maint' 2007-05-26 18:53:22 -07:00
git-mergetool.sh
git-p4import.py
git-parse-remote.sh
git-pull.sh
git-quiltimport.sh
git-rebase.sh
git-relink.perl
git-remote.perl
git-repack.sh
git-request-pull.sh
git-reset.sh
git-send-email.perl Merge branch 'maint' 2007-05-17 17:36:57 -07:00
git-sh-setup.sh
git-svn.perl Fix git-svn to handle svn not reporting the md5sum of a file, and test. 2007-05-26 01:17:58 -07:00
git-svnimport.perl
git-tag.sh More echo "$user_message" fixes. 2007-05-26 00:33:03 -07:00
git-verify-tag.sh
GIT-VERSION-GEN GIT 1.5.2 2007-05-20 00:30:39 -07:00
git.c Merge branch 'maint-1.5.1' into maint 2007-05-20 19:57:00 -07:00
git.spec.in
gitk
grep.c
grep.h
hash-object.c
help.c
http-fetch.c
http-push.c
http.c
http.h
ident.c
imap-send.c
index-pack.c
INSTALL
interpolate.c
interpolate.h
list-objects.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
list-objects.h
local-fetch.c
lockfile.c
log-tree.c
log-tree.h
mailmap.c
mailmap.h
Makefile git-gui: Gracefully handle bad TCL_PATH at compile time 2007-05-17 18:10:26 -04:00
match-trees.c
merge-file.c
merge-index.c
merge-recursive.c
merge-tree.c
mktag.c
mktree.c
object-refs.c
object.c Merge branch 'maint-1.5.1' into maint 2007-05-24 19:01:50 -07:00
object.h
pack-check.c fixes to output of git-verify-pack -v 2007-05-25 21:42:47 -07:00
pack-redundant.c
pack-write.c
pack.h
pager.c
patch-delta.c
patch-id.c
patch-ids.c
patch-ids.h
path-list.c
path-list.h
path.c
peek-remote.c connect: display connection progress 2007-05-16 12:48:18 -07:00
pkt-line.c
pkt-line.h
progress.c Fix the progress code to output LF only when it is really needed 2007-05-23 11:30:49 -07:00
progress.h Fix the progress code to output LF only when it is really needed 2007-05-23 11:30:49 -07:00
quote.c
quote.h
reachable.c
reachable.h
read-cache.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
README
receive-pack.c
reflog-walk.c
reflog-walk.h
refs.c
refs.h
RelNotes GIT 1.5.1.6 2007-05-20 00:15:53 -07:00
revision.c git-rev-list: Add regexp tuning options 2007-05-20 20:31:50 -07:00
revision.h
rsh.c
rsh.h
run-command.c
run-command.h
send-pack.c connect: display connection progress 2007-05-16 12:48:18 -07:00
server-info.c
setup.c
sha1_file.c Merge branch 'np/pack' 2007-05-20 02:18:43 -07:00
sha1_name.c
shallow.c
shell.c
show-index.c
sideband.c
sideband.h
ssh-fetch.c
ssh-pull.c
ssh-push.c
ssh-upload.c
strbuf.c
strbuf.h
symlinks.c
tag.c
tag.h
tar.h
test-chmtime.c
test-date.c
test-delta.c
test-genrandom.c
test-match-trees.c
test-sha1.c
test-sha1.sh
trace.c
tree-diff.c
tree-walk.c
tree-walk.h
tree.c rename dirlink to gitlink. 2007-05-21 23:34:54 -07:00
tree.h
unpack-file.c
unpack-trees.c Merge branch 'maint-1.5.1' into maint 2007-05-20 19:57:00 -07:00
unpack-trees.h
update-server-info.c
upload-pack.c
usage.c
utf8.c
utf8.h
var.c
write_or_die.c
wt-status.c Merge branch 'maint-1.5.1' into maint 2007-05-21 18:42:35 -07:00
wt-status.h
xdiff-interface.c
xdiff-interface.h

////////////////////////////////////////////////////////////////

	GIT - the stupid content tracker

////////////////////////////////////////////////////////////////

"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command.  The fact that it is a
   mispronunciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room.
 - "goddamn idiotic truckload of sh*t": when it breaks

Git is a fast, scalable, distributed revision control system with an
unusually rich command set that provides both high-level operations
and full access to internals.

Git is an Open Source project covered by the GNU General Public License.
It was originally written by Linus Torvalds with the help of a group of
hackers around the net. It is currently maintained by Junio C Hamano.

Please read the file INSTALL for installation instructions.
See Documentation/tutorial.txt to get started, then see
Documentation/everyday.txt for a useful minimum set of commands,
and "man git-commandname" for documentation of each command.
CVS users may also want to read Documentation/cvs-migration.txt.

Many Git online resources are accessible from http://git.or.cz/
including full documentation and Git-related tools.

The user discussion and development of Git take place on the Git
mailing list -- everyone is welcome to post bug reports, feature
requests, comments and patches to git@vger.kernel.org. To subscribe
to the list, send an email with just "subscribe git" in the body to
majordomo@vger.kernel.org. The mailing list archives are available at
http://marc.theaimsgroup.com/?l=git and other archival sites.

The messages titled "A note from the maintainer", "What's in
git.git (stable)" and "What's cooking in git.git (topics)" and
the discussion following them on the mailing list give a good
reference for project status, development direction and
remaining tasks.