mirror of https://github.com/git/git.git synced 2024-10-06 09:51:24 +02:00

9966 Commits

Author SHA1 Message Date
Nicolas Pitre
843366961c improve delta long block matching with big files
Martin Koegler noted that create_delta() performs a new hash lookup
after every block copy encoding, each of which is currently limited to 64KB.

In case of larger identical blocks, the next hash lookup would normally
point to the next 64KB block in the reference buffer and multiple block
copy operations will be consecutively encoded.

It is however possible that the reference buffer be sparsely indexed if
hash buckets have been trimmed down in create_delta_index() when hashing
of the reference buffer isn't well balanced.  In that case the hash
lookup following a block copy might fail to match anything and the fact
that the reference buffer still matches beyond the previous 64KB block
will be missed.

Let's rework the code so that buffer comparison isn't bounded to 64KB
anymore.  The match size should be as large as possible up front and
only then should multiple block copies be encoded to cover it all.
Also, fewer hash lookups will be performed in the end.

According to Martin, this patch should reduce his 92MB pack down to 75MB
with the dataset he has.

Tests performed on the Linux kernel repo show a slightly smaller pack and
a slightly faster repack.
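
The idea of finding the full match first and only then covering it with
several copy operations can be sketched as follows.  This is a minimal
illustration, not the code from create_delta(): MAX_COPY_SIZE,
struct copy_op and split_match() are hypothetical names, and the actual
byte encoding of a copy operation is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Per-operation limit taken from the commit message (64KB). */
#define MAX_COPY_SIZE 0x10000

/* Hypothetical record for one copy operation. */
struct copy_op {
	size_t offset;	/* start in the reference buffer */
	size_t size;	/* number of bytes to copy */
};

/*
 * Split one match of msize bytes starting at moff into copy operations
 * of at most MAX_COPY_SIZE bytes each.  Up to max_ops entries are
 * written into ops; the return value is the number of operations the
 * match requires, which may be larger than max_ops.
 */
size_t split_match(size_t moff, size_t msize,
		   struct copy_op *ops, size_t max_ops)
{
	size_t n = 0;

	assert(ops != NULL || max_ops == 0);
	while (msize) {
		size_t chunk = msize < MAX_COPY_SIZE ? msize : MAX_COPY_SIZE;

		if (n < max_ops) {
			ops[n].offset = moff;
			ops[n].size = chunk;
		}
		n++;
		moff += chunk;
		msize -= chunk;
	}
	return n;
}
```

With this shape, a single hash lookup that leads to a long match is
enough; the following chunks are derived from the match length rather
than from further lookups.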

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 20:28:13 -07:00
Linus Torvalds
99b5a79e13 Make the pack-refs interfaces usable from outside
This just basically creates a "pack_refs()" function that could be used by
anybody. You pass it in the flags you want as a bitmask (PACK_REFS_ALL and
PACK_REFS_PRUNE), and it will do all the heavy lifting.

Of course, it's still static, and it's all in the builtin-pack-refs.c
file, so it's not actually visible to the outside, but the next step would
be to just move it all to a library file (probably refs.c) and expose it.

Then we could easily make "git gc" do this too.

While I did it, I also made it check the return value of the fflush and
fsync stage, to make sure that we don't overwrite the old packed-refs file
with something that got truncated due to write errors!
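
The point about checking the flush and sync steps before replacing the
old file can be sketched like this.  It is only an illustration under
assumptions: replace_file_safely() and its parameters are hypothetical,
and the real code uses git's own lock file handling rather than a bare
rename().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for fileno() and fsync() */
#endif
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write data to tmp_path, make sure it reached the disk, and only then
 * rename it over final_path.  Returns 0 on success and -1 on any
 * failure, in which case final_path is left untouched.
 */
int replace_file_safely(const char *tmp_path, const char *final_path,
			const char *data)
{
	FILE *f;

	assert(tmp_path && final_path && data);
	f = fopen(tmp_path, "w");
	if (!f)
		return -1;
	/* Any failed step means the temporary file may be truncated. */
	if (fputs(data, f) == EOF ||
	    fflush(f) != 0 ||
	    fsync(fileno(f)) != 0) {
		fclose(f);
		remove(tmp_path);
		return -1;
	}
	if (fclose(f) != 0) {
		remove(tmp_path);
		return -1;
	}
	if (rename(tmp_path, final_path) != 0) {
		remove(tmp_path);
		return -1;
	}
	return 0;
}
```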

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 20:00:55 -07:00
Junio C Hamano
c56ed464b0 Merge branch 'maint'
* maint:
  Fix git-svn to handle svn not reporting the md5sum of a file, and test.
  Fix mishandling of $Id$ expanded in the repository copy in convert.c
  More echo "$user_message" fixes.
  Add tests for the last two fixes.
  git-commit: use printf '%s\n' instead of echo on user-supplied strings
  git-am: use printf instead of echo on user-supplied strings
  Documentation: Add definition of "evil merge" to GIT Glossary
  Replace the last 'dircache's by 'index'
  Documentation: Clean up links in GIT Glossary
2007-05-26 18:53:22 -07:00
Junio C Hamano
d1c7c27ea3 Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
  Fix git-svn to handle svn not reporting the md5sum of a file, and test.
  More echo "$user_message" fixes.
  Add tests for the last two fixes.
  git-commit: use printf '%s\n' instead of echo on user-supplied strings
  git-am: use printf instead of echo on user-supplied strings
  Documentation: Add definition of "evil merge" to GIT Glossary
  Replace the last 'dircache's by 'index'
  Documentation: Clean up links in GIT Glossary
2007-05-26 01:30:40 -07:00
James Y Knight
20b3d206ac Fix git-svn to handle svn not reporting the md5sum of a file, and test.
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 01:17:58 -07:00
Andy Parkins
c23290d528 Fix mishandling of $Id$ expanded in the repository copy in convert.c
If the repository contained an expanded ident keyword (i.e. $Id:XXXX$),
then the wrong bytes were discarded, and the Id keyword was not
expanded.  The fault was in convert.c:ident_to_worktree().

Previously, when a "$Id:" was found in the repository version,
ident_to_worktree() would search for the next "$" after this, and
discarded everything it found until then.  That was done with the loop:

    do {
        ch = *cp++;
        if (ch == '$')
            break;
        rem--;
    } while (rem);

The above loop left cp pointing one character _after_ the final "$"
(because of ch = *cp++).  This was different from the non-expanded case,
where cp is left pointing at the "$", and was different from the comment
which stated "discard up to but not including the closing $".  This
patch fixes that by making the loop:

    do {
        ch = *cp;
        if (ch == '$')
            break;
        cp++;
        rem--;
    } while (rem);

That is, cp is tested _then_ incremented.

This loop exits if it finds a "$" or if it runs out of bytes in the
source.  After this loop, if there was no closing "$" the expansion is
skipped, and the outer loop is allowed to continue leaving this
non-keyword as it was.  However, when the "$" is found, size is
corrected, before running the expansion:

    size -= (cp - src);

This is wrong; size is going to be corrected anyway after the expansion,
so there is no need to do it here.  This patch removes that redundant
correction.

To help find this bug, I heavily commented the routine; those comments
are included here as a bonus.
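
For reference, the "test, then increment" behaviour of the corrected
loop can be written as a small standalone function.  This is a sketch
only: find_closing_dollar() is a hypothetical helper, not a function in
convert.c, and it uses a while loop so that a zero-length input is also
handled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Scan at most rem bytes starting at cp for a closing '$'.
 * Returns a pointer to the '$' itself (not one past it), or NULL if
 * the bytes run out first.  The byte is tested, then cp is advanced.
 */
const char *find_closing_dollar(const char *cp, size_t rem)
{
	assert(cp != NULL);
	while (rem) {
		if (*cp == '$')
			return cp;
		cp++;
		rem--;
	}
	return NULL;
}
```

A caller that gets NULL back would leave the non-keyword text as it
was; otherwise everything before the returned pointer is discarded and
the "$" is kept.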

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 01:12:43 -07:00
Jeff King
a23bfaed7d More echo "$user_message" fixes.
Here are fixes to more uses of 'echo "$msg"' where $msg could contain
backslash sequences.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 00:33:03 -07:00
Junio C Hamano
816366e23d Add tests for the last two fixes.
This updates t4014 to check the two fixes for git-am and git-commit,
which address problems we observed with an "echo" that interprets
backslash escapes by default, without being asked to by the -e option.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 00:26:20 -07:00
Junio C Hamano
293623edbc git-commit: use printf '%s\n' instead of echo on user-supplied strings
This fixes the same issue git-am had, which was fixed by Jeff
King in the previous commit.  Cleverly enough, this commit's log
message is a good test case at the same time.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 00:23:23 -07:00
Jeff King
4b7cc26a74 git-am: use printf instead of echo on user-supplied strings
Under some implementations of echo (such as that provided by
dash), backslash escapes are recognized without any other
options. This means that echo-ing user-supplied strings may
cause any backslash sequences in them to be converted. Using
printf resolves the ambiguity.

This bug can be seen when using git-am to apply a patch
whose subject contains the character sequence "\n"; the
characters are converted to a literal newline. Noticed by
Szekeres Istvan.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-25 21:43:33 -07:00
Nicolas Pitre
ddcf786fd7 fixes to output of git-verify-pack -v
Now that the default delta depth is 50, it is a good idea to also bump
MAX_CHAIN to 50.

While at it, make the display a bit prettier by making the MAX_CHAIN
limit inclusive, and display the number of deltas that are above that
limit at the end instead of the beginning.
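
What "inclusive limit" means for the histogram can be sketched as
below.  This is an illustration, not the git-verify-pack code:
tally_chains() and its parameters are hypothetical, and only the value
50 for MAX_CHAIN comes from this message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHAIN 50	/* value taken from the commit message */

/*
 * Tally delta chain lengths.  hist[n] counts chains of length n for
 * 1 <= n <= MAX_CHAIN (inclusive); longer chains are counted in the
 * return value.  hist must have MAX_CHAIN + 1 elements; hist[0] is
 * unused.  A length of 0 means "not a delta" and is skipped.
 */
unsigned long tally_chains(const unsigned int *lengths, size_t count,
			   unsigned long hist[MAX_CHAIN + 1])
{
	unsigned long over = 0;
	size_t i;
	int n;

	assert(hist != NULL);
	for (n = 0; n <= MAX_CHAIN; n++)
		hist[n] = 0;
	for (i = 0; i < count; i++) {
		unsigned int len = lengths[i];

		if (len == 0)
			continue;
		if (len <= MAX_CHAIN)
			hist[len]++;
		else
			over++;
	}
	return over;
}
```

The caller would print hist[1] through hist[MAX_CHAIN] first and the
returned overflow count last.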

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-25 21:42:47 -07:00
Jakub Narebski
c1bab2889e Documentation: Add definition of "evil merge" to GIT Glossary
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-25 20:54:38 -07:00
Jakub Narebski
5adf317b31 Replace the last 'dircache's by 'index'
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-25 20:54:23 -07:00
Jakub Narebski
a58f3c01f7 Documentation: Clean up links in GIT Glossary
Ensure that the same link is not repeated in single glossary entry,
and that there is no self-link i.e. link to current entry.

Add links to other definitions in git glossary.

Remove inappropriate (nonsense) links, or change link to link to
correct definition (to correct term).

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-25 20:54:16 -07:00
Junio C Hamano
18bece4367 Merge branch 'maint'
* maint:
  fix memory leak in parse_object when check_sha1_signature fails
  name-rev: tolerate clock skew in committer dates
  Update bash completion for git-config options
  Teach bash completion about recent log long options
  Teach bash completion about 'git remote update'
  Update bash completion header documentation
  Remove a duplicate --not option in bash completion
  Teach bash completion about git-shortlog
  Hide the plumbing diff-{files,index,tree} from bash completion
  Update bash completion to ignore some more plumbing commands
2007-05-24 21:35:29 -07:00
Junio C Hamano
6d9d26d826 Merge branch 'master' of git://repo.or.cz/git/fastimport into maint
* 'master' of git://repo.or.cz/git/fastimport:
  Update bash completion for git-config options
  Teach bash completion about recent log long options
  Teach bash completion about 'git remote update'
  Update bash completion header documentation
  Remove a duplicate --not option in bash completion
  Teach bash completion about git-shortlog
  Hide the plumbing diff-{files,index,tree} from bash completion
  Update bash completion to ignore some more plumbing commands
2007-05-24 21:34:59 -07:00
Linus Torvalds
56752391a8 Make "git gc" pack all refs by default
I've taught myself to use "git gc" instead of doing the repack explicitly,
but it doesn't actually do what I think it should do.

We've had packed refs for a long time now, and I think it just makes sense
to pack normal branches too. So I end up having to do

	git pack-refs --all --prune

in order to get a nice git repo that doesn't have any unnecessary files.

So why not just do that in "git gc"? It's not as if there really is any
downside to packing branches, even if they end up changing later. Quite
often they don't, and even if they do, so what?

Also, make the default for refs packing just be an unambiguous "do it",
rather than "do it by default only for non-bare repositories". If you want
that behaviour, you can always just add a

	[gc]
		packrefs = notbare

in your ~/.gitconfig file, but I don't actually see why bare would be any
different (except for the broken reason that http-fetching used to be
totally broken, and not doing it just meant that it didn't even get
fixed in a timely manner!).

So here's a trivial patch to make "git gc" do a better job. Hmm?

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-24 19:05:39 -07:00
Fernando J. Pereda
d63bd9a217 Teach mailsplit about Maildir's
Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-24 19:01:56 -07:00
Junio C Hamano
76026200ee Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
  fix memory leak in parse_object when check_sha1_signature fails
  name-rev: tolerate clock skew in committer dates
2007-05-24 19:01:50 -07:00
Carlos Rica
0b1f113075 fix memory leak in parse_object when check_sha1_signature fails
When check_sha1_signature fails, the program is not terminated:
it prints an error message and returns NULL, so the
buffer returned by read_sha1_file should be freed before returning.
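
The general pattern, freeing a temporary buffer on an early error
return, can be sketched as follows.  All names here (struct parsed,
check_buffer(), parse_checked()) are hypothetical stand-ins and do not
reflect the actual parse_object() code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed object. */
struct parsed {
	size_t size;
};

/* Hypothetical integrity check: accepts only buffers starting with "ok". */
int check_buffer(const char *buf, size_t len)
{
	return (len >= 2 && memcmp(buf, "ok", 2) == 0) ? 0 : -1;
}

/*
 * Copy raw into a freshly allocated buffer, verify it, and build a
 * parsed record.  Every return path releases the temporary buffer.
 */
struct parsed *parse_checked(const char *raw, size_t len)
{
	struct parsed *result;
	char *buf;

	assert(raw != NULL);
	buf = malloc(len ? len : 1);
	if (!buf)
		return NULL;
	memcpy(buf, raw, len);

	if (check_buffer(buf, len) < 0) {
		free(buf);	/* without this, the error path leaks buf */
		return NULL;
	}

	result = malloc(sizeof(*result));
	if (result)
		result->size = len;
	free(buf);
	return result;
}
```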

Signed-off-by: Carlos Rica <jasampler@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-24 18:56:06 -07:00
Junio C Hamano
c075aea5da name-rev: tolerate clock skew in committer dates
In git.git repository, "git-name-rev v1.3.0~158" cannot name the
rev, while adjacent revs can be named.

This was because it gives up traversal from the tips of existing
refs as soon as it sees a commit that has older commit timestamp
than what is being named.  This is usually a good heuristic,
but v1.3.0~158 has a slightly older commit timestamp than
v1.3.0~157 (i.e. its child), as these two were made in
separate repositories (in fact, on different continents).

This adds a hardcoded slop value (1 day) to the cut-off
heuristic to work around this kind of problem.  The current
algorithm essentially runs around from the available tips down
to ancient commits and names every single available rev that is
newer than the cut-off date, so a single day of slop would not add that
much overhead in repositories with long enough history where the
performance of name-rev matters.
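
The cut-off test with slop amounts to a comparison like the one below.
This is a sketch under assumptions: past_cutoff() and the macro name
CUTOFF_DATE_SLOP are used here for illustration, and only the one-day
value comes from this message.

```c
#include <assert.h>

#define CUTOFF_DATE_SLOP 86400L	/* one day in seconds */

/*
 * Decide whether traversal may stop at a commit.  Returns 1 when the
 * commit is older than the cut-off by more than the slop, 0 otherwise,
 * so a parent that is slightly older than its child is still visited.
 */
int past_cutoff(long commit_date, long cutoff)
{
	assert(commit_date >= 0 && cutoff >= 0);
	return commit_date < cutoff - CUTOFF_DATE_SLOP;
}
```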

I think the algorithm could be made a bit smarter by deepening
the graph on demand as a new commit is asked to be named (this
would require rewriting of name_rev() function not to recurse
itself but use a traversal list like revision.c traverser does),
but that would be a separate issue.

Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-24 13:36:54 -07:00
Shawn O. Pearce
12977705b3 Update bash completion for git-config options
A few new configuration options grew out of the woodwork during the
1.5.2 series.  Most of these are pretty easy to support completion
for, so we do so.

I wanted to also add completion support for the <driver> part of
merge.<driver>.name but to do that we have to look at all of the
.gitattributes files and guess what the unique set of <driver>
strings would be.  Since this appears to be non-trivial I'm punting
on it at this time.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 02:07:45 -04:00
Shawn O. Pearce
8f87fae645 Teach bash completion about recent log long options
(Somewhat) recently git-log learned about --reverse (to show commits
in the opposite order) and a looong time ago I think it learned
about --raw (to show the raw diff, rather than a unified diff).
These are both useful options, so we should make them easy for the
user to complete.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 01:51:30 -04:00
Shawn O. Pearce
fb72759b7d Teach bash completion about 'git remote update'
Recently the git-remote command grew an update subcommand, which
can be used to execute git-fetch across multiple repositories
in a single step.  These can be configured with the 'remotes.*'
configuration options, so we can offer completion for any name that
matches and appears to be useful to git-remote update.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 01:46:49 -04:00
Junio C Hamano
98ee8187e4 Merge branch 'maint'
* maint:
  Fix possible coredump with fast-import --import-marks
  Refactor fast-import branch creation from existing commit
  fast-import: Fix crash when referencing already existing objects
  fast-import: Fix uninitialized variable
  Documentation: fix git-config.xml generation
2007-05-23 22:37:23 -07:00
Junio C Hamano
a21f0f0a22 Merge branch 'maint' of git://repo.or.cz/git/fastimport into maint
* 'maint' of git://repo.or.cz/git/fastimport:
  Fix possible coredump with fast-import --import-marks
  Refactor fast-import branch creation from existing commit
  fast-import: Fix crash when referencing already existing objects
  fast-import: Fix uninitialized variable
2007-05-23 22:37:03 -07:00
Shawn O. Pearce
c70680ce7c Update bash completion header documentation
 1) Added a note about supporting the long options for most commands,
    as we have been doing so for quite some time.

 2) Include a notice that these routines are covered by the GPL,
    as that may not be obvious, even though they are distributed
    as part of the core Git distribution.

 3) Added a short section on how to send patches to the routines,
    and to whom they should be sent.  Currently that is me,
    as I am the active maintainer.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 01:36:46 -04:00
Junio C Hamano
baf5597ae4 Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
  Documentation: fix git-config.xml generation
2007-05-23 22:34:11 -07:00
Shawn O. Pearce
bfbd131f52 Remove a duplicate --not option in bash completion
This was just me being silly; I put the --not option into the
completion list twice.  There are no duplicates shown in the shell
as the shell removes them before showing them to the user.  But we
really don't need the duplicates in the source script either.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 01:26:58 -04:00
Shawn O. Pearce
1fd6bec9bc Teach bash completion about git-shortlog
We've had completion for git-log for quite some time, but just
today I noticed we don't have it for the new builtin shortlog
that runs git-log internally.  This is indeed a handy thing to
have completion for, especially when your branch names are of
the Very-Very-Long-and-Hard/To-Type/Variety/That-Some-Use.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 01:25:34 -04:00
Shawn O. Pearce
5cfb4fe525 Hide the plumbing diff-{files,index,tree} from bash completion
The diff-* programs are meant to be plumbing for the diff frontend;
most end users aren't invoking these commands directly.  Consequently
we should avoid showing them as possible completions.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 01:10:15 -04:00
Shawn O. Pearce
aac65ed1bc Fix possible coredump with fast-import --import-marks
When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload
the marks table on subsequent runs of fast-import we really broke
things, as we set pack_id to MAX_PACK_ID for any objects we imported
into the marks table.  Creating a branch from that mark should fail
as we attempt to read the object through a nonexistent packed_git
pointer.  Instead we have to use the normal Git object system to
locate the older commit, as we ourselves do not have a reference
to the packed_git it resides in.

This bug only occurred because t9300 was not complete enough.
When we added the --import-marks feature we didn't actually test
its implementation enough to verify the function worked as intended.
I have corrected that, and included the changes as part of this fix.
Prior versions of fast-import fail the new test(s); this commit
allows them to pass.

Credit for this bug find goes to Simon Hausmann <simon@lst.de> as
he recently identified a similar bug in the tree lazy-loading path.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 00:50:19 -04:00
Shawn O. Pearce
654aaa37ab Refactor fast-import branch creation from existing commit
To resolve a corner case uncovered by Simon Hausmann I need to
reuse the logic for the SHA-1 expression version of the 'from '
command within the mark version of the 'from ' command.  This change
doesn't alter any functionality, but is merely breaking the common
code out to a function that I can reuse.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24 00:11:48 -04:00
Simon Hausmann
20f546a86c fast-import: Fix crash when referencing already existing objects
Commit a5c1780a0355a71b9fb70f1f1977ce726ee5b8d8 sets the pack_id of existing
objects to MAX_PACK_ID. When the same object is referenced later again it is
found in the local object hash. With such a pack_id fast-import should not try
to locate that object in the newly created pack(s).

Signed-off-by: Simon Hausmann <simon@lst.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-23 23:36:47 -04:00
Simon Hausmann
b259157f3c fast-import: Fix uninitialized variable
Fix uninitialized last_object->no_free variable that is accessed in
store_object.

Signed-off-by: Simon Hausmann <simon@lst.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-23 23:36:47 -04:00
James Bowes
5fdcf75c68 Documentation: fix git-config.xml generation
Signed-off-by: James Bowes <jbowes@dangerouslyinc.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-23 17:48:42 -07:00
Mark Levedahl
072570ee26 gitweb.perl - Optionally send archives as .zip files
git-archive already knows how to generate an archive as a tar or a zip
file, but gitweb did not. zip archives are much more usable in a Windows
environment due to native support and this patch allows a site admin the
option to deliver zip rather than tar files. The selection is done by
inserting

    $feature{'snapshot'}{'default'} = ['x-zip', 'zip', ''];

in gitweb_config.perl.

Tar files remain the default option.

Signed-off-by: Mark Levedahl <mdl123@verizon.net>
Acked-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-23 15:09:49 -07:00
Junio C Hamano
2720de4261 Merge branch 'fl/cvsserver'
* fl/cvsserver:
  t9400: Add some basic pserver tests
  t9400: Add some more cvs update tests
  t9400: Add test cases for config file handling
2007-05-23 14:54:18 -07:00
Junio C Hamano
ed82edc402 Merge branch 'ar/progress'
* ar/progress:
  Fix the progress code to output LF only when it is really needed
2007-05-23 11:39:58 -07:00
Junio C Hamano
1654a3ba0c Merge branch 'maint'
* maint:
  Use git-for-each-ref to check whether the origin branch exists.
2007-05-23 11:39:53 -07:00
Alex Riesen
421f9d1685 Fix the progress code to output LF only when it is really needed
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-23 11:30:49 -07:00
Stephan Springl
7ca055f75a Use git-for-each-ref to check whether the origin branch exists.
This works in repositories that have their refs packed by
"git-pack-refs --all --prune" whereas testing the file
$git_dir/refs/heads/$opt_o does not.

Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-23 11:06:38 -07:00
Junio C Hamano
32309f54ed Fix command line parameter parser of revert/cherry-pick
The parser was inconsistently done, in that it did not look at
the last command line parameter to see if it could be an unknown
option, although it was designed to notice unknown options if
they were given in positions the command expects to find them
(i.e. everything except the last parameter, which ought to be
<commit-ish>).  This prevented a very natural invocation

	$ git cherry-pick --usage

from issuing the usage help.
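
A parser that treats every parameter the same way, including the last
one, could look roughly like this.  It is a sketch, not the
revert/cherry-pick parser: first_unknown_option() is hypothetical and
the list of known options is only an example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first argument that looks like an option but
 * is not recognized, or -1 if there is none.  Every argument is
 * checked, including the last one.  argv[0] is the command name.
 */
int first_unknown_option(int argc, const char **argv)
{
	/* Example option list, for illustration only. */
	static const char *known[] = { "-n", "--no-commit", "-e", "--edit" };
	size_t k;
	int i;

	assert(argv != NULL);
	for (i = 1; i < argc; i++) {
		int found = 0;

		if (argv[i][0] != '-')
			continue;	/* not an option, e.g. a <commit-ish> */
		for (k = 0; k < sizeof(known) / sizeof(known[0]); k++)
			if (!strcmp(argv[i], known[k]))
				found = 1;
		if (!found)
			return i;
	}
	return -1;
}
```

A caller would print the usage help whenever the result is not -1,
which covers an invocation whose only parameter is an unknown option.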

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-23 00:17:51 -07:00
Junio C Hamano
2555699aa2 Merge branch 'jn/lstree'
* jn/lstree:
  Add an option to git-ls-tree to display also the size of blob
2007-05-23 00:17:47 -07:00
Junio C Hamano
e97593693e Merge branch 'maint'
* maint:
  Document branch.autosetupmerge.
2007-05-23 00:16:11 -07:00
Junio C Hamano
c80e07d495 Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
  Document branch.autosetupmerge.
2007-05-23 00:15:35 -07:00
Paolo Bonzini
9902387d20 Document branch.autosetupmerge.
This patch documents the branch.autosetupmerge config option, added
by commit 0746d19a.

Signed-off-by: Paolo Bonzini  <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-22 22:42:42 -07:00
Frank Lichtenheld
240ba7f235 t9400: Add some basic pserver tests
While we can easily test the cvs <-> git-cvsserver
communication with :fork: and git-cvsserver server
there are some pserver specifics we should test, too.

Currently these are two tests of the pserver authentication.

Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-22 00:00:42 -07:00
Frank Lichtenheld
1978659a74 t9400: Add some more cvs update tests
Add some cvs update tests that include various merge
situations. Also add a basic test for update -C
since it fits so well in there.

Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-22 00:00:42 -07:00
Frank Lichtenheld
1d431b2235 t9400: Add test cases for config file handling
Add a few test cases for the config file parsing
done by git-cvsserver.

Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-22 00:00:42 -07:00