1
0
mirror of https://github.com/git/git.git synced 2024-11-16 00:53:09 +01:00
git/Documentation/git-gc.txt

211 lines
7.8 KiB
Plaintext
Raw Normal View History

git-gc(1)
=========
NAME
----
git-gc - Cleanup unnecessary files and optimize the local repository
SYNOPSIS
--------
[verse]
'git gc' [--aggressive] [--auto] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
DESCRIPTION
-----------
Runs a number of housekeeping tasks within the current repository,
such as compressing file revisions (to reduce disk space and increase
performance), removing unreachable objects which may have been
created from prior invocations of 'git add', packing refs, pruning
reflog, rerere metadata or stale working trees. May also update ancillary
indexes such as the commit-graph.
When common porcelain operations that create objects are run, they
will check whether the repository has grown substantially since the
last maintenance, and if so run `git gc` automatically. See `gc.auto`
below for how to disable this behavior.
Running `git gc` manually should only be needed when adding objects to
a repository without regularly running such porcelain commands, to do
a one-off repository optimization, or e.g. to clean up a suboptimal
mass-import. See the "PACKFILE OPTIMIZATION" section in
linkgit:git-fast-import[1] for more details on the import case.
OPTIONS
-------
--aggressive::
Usually 'git gc' runs very quickly while providing good disk
space utilization and performance. This option will cause
'git gc' to more aggressively optimize the repository at the expense
of taking much more time. The effects of this optimization are
persistent, so this option only needs to be used occasionally; every
few hundred changesets or so.
--auto::
With this option, 'git gc' checks whether any housekeeping is
required; if not, it exits without performing any work.
Some git commands run `git gc --auto` after performing
operations that could create many loose objects. Housekeeping
is required if there are too many loose objects or too many
packs in the repository.
+
If the number of loose objects exceeds the value of the `gc.auto`
configuration variable, then all loose objects are combined into a
single pack. Setting the value of `gc.auto`
to 0 disables automatic packing of loose objects.
+
If the number of packs exceeds the value of `gc.autoPackLimit`,
then existing packs (except those marked with a `.keep` file
or over `gc.bigPackThreshold` limit)
are consolidated into a single pack.
If the amount of memory estimated for `git repack` to run smoothly is
not available and `gc.bigPackThreshold` is not set, the largest
pack will also be excluded (this is the equivalent of running `git gc`
with `--keep-base-pack`).
Setting `gc.autoPackLimit` to 0 disables automatic consolidation of
packs.
+
If houskeeping is required due to many loose objects or packs, all
other housekeeping tasks (e.g. rerere, working trees, reflog...) will
be performed as well.
--prune=<date>::
Prune loose objects older than date (default is 2 weeks ago,
overridable by the config variable `gc.pruneExpire`).
--prune=now prunes loose objects regardless of their age and
increases the risk of corruption if another process is writing to
the repository concurrently; see "NOTES" below. --prune is on by
default.
--no-prune::
Do not prune any loose objects.
--quiet::
Suppress all progress reports.
--force::
Force `git gc` to run even if there may be another `git gc`
instance running on this repository.
--keep-largest-pack::
All packs except the largest pack and those marked with a
`.keep` files are consolidated into a single pack. When this
option is used, `gc.bigPackThreshold` is ignored.
CONFIGURATION
-------------
The optional configuration variable `gc.reflogExpire` can be
set to indicate how long historical entries within each branch's
reflog should remain available in this repository. The setting is
expressed as a length of time, for example '90 days' or '3 months'.
It defaults to '90 days'.
The optional configuration variable `gc.reflogExpireUnreachable`
can be set to indicate how long historical reflog entries which
are not part of the current branch should remain available in
this repository. These types of entries are generally created as
docs: stop using asciidoc no-inline-literal In asciidoc 7, backticks like `foo` produced a typographic effect, but did not otherwise affect the syntax. In asciidoc 8, backticks introduce an "inline literal" inside which markup is not interpreted. To keep compatibility with existing documents, asciidoc 8 has a "no-inline-literal" attribute to keep the old behavior. We enabled this so that the documentation could be built on either version. It has been several years now, and asciidoc 7 is no longer in wide use. We can now decide whether or not we want inline literals on their own merits, which are: 1. The source is much easier to read when the literal contains punctuation. You can use `master~1` instead of `master{tilde}1`. 2. They are less error-prone. Because of point (1), we tend to make mistakes and forget the extra layer of quoting. This patch removes the no-inline-literal attribute from the Makefile and converts every use of backticks in the documentation to an inline literal (they must be cleaned up, or the example above would literally show "{tilde}" in the output). Problematic sites were found by grepping for '`.*[{\\]' and examined and fixed manually. The results were then verified by comparing the output of "html2text" on the set of generated html pages. Doing so revealed that in addition to making the source more readable, this patch fixes several formatting bugs: - HTML rendering used the ellipsis character instead of literal "..." in code examples (like "git log A...B") - some code examples used the right-arrow character instead of '->' because they failed to quote - api-config.txt did not quote tilde, and the resulting HTML contained a bogus snippet like: <tt><sub></tt> foo <tt></sub>bar</tt> which caused some parsers to choke and omit whole sections of the page. - git-commit.txt confused ``foo`` (backticks inside a literal) with ``foo'' (matched double-quotes) - mentions of `A U Thor <author@example.com>` used to erroneously auto-generate a mailto footnote for author@example.com - the description of --word-diff=plain incorrectly showed the output as "[-removed-] and {added}", not "{+added+}". - using "prime" notation like: commit `C` and its replacement `C'` confused asciidoc into thinking that everything between the first backtick and the final apostrophe were meant to be inside matched quotes - asciidoc got confused by the escaping of some of our asterisks. In particular, `credential.\*` and `credential.<url>.\*` properly escaped the asterisk in the first case, but literally passed through the backslash in the second case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-26 10:51:57 +02:00
a result of using `git commit --amend` or `git rebase` and are the
commits prior to the amend or rebase occurring. Since these changes
are not part of the current project most users will want to expire
them sooner. This option defaults to '30 days'.
The above two configuration variables can be given to a pattern. For
example, this sets non-default expiry values only to remote-tracking
branches:
------------
[gc "refs/remotes/*"]
reflogExpire = never
reflogExpireUnreachable = 3 days
------------
The optional configuration variable `gc.rerereResolved` indicates
how long records of conflicted merge you resolved earlier are
kept. This defaults to 60 days.
The optional configuration variable `gc.rerereUnresolved` indicates
how long records of conflicted merge you have not resolved are
kept. This defaults to 15 days.
The optional configuration variable `gc.packRefs` determines if
'git gc' runs 'git pack-refs'. This can be set to "notbare" to enable
it within all non-bare repos or it can be set to a boolean value.
This defaults to true.
The optional configuration variable `gc.writeCommitGraph` determines if
'git gc' should run 'git commit-graph write'. This can be set to a
boolean value. This defaults to false.
The optional configuration variable `gc.aggressiveWindow` controls how
much time is spent optimizing the delta compression of the objects in
the repository when the --aggressive option is specified. The larger
the value, the more time is spent optimizing the delta compression. See
the documentation for the --window option in linkgit:git-repack[1] for
more details. This defaults to 250.
gc --aggressive: make --depth configurable When 1c192f3 (gc --aggressive: make it really aggressive - 2007-12-06) made --depth=250 the default value, it didn't really explain the reason behind, especially the pros and cons of --depth=250. An old mail from Linus below explains it at length. Long story short, --depth=250 is a disk saver and a performance killer. Not everybody agrees on that aggressiveness. Let the user configure it. From: Linus Torvalds <torvalds@linux-foundation.org> Subject: Re: [PATCH] gc --aggressive: make it really aggressive Date: Thu, 6 Dec 2007 08:19:24 -0800 (PST) Message-ID: <alpine.LFD.0.9999.0712060803430.13796@woody.linux-foundation.org> Gmane-URL: http://article.gmane.org/gmane.comp.gcc.devel/94637 On Thu, 6 Dec 2007, Harvey Harrison wrote: > > 7:41:25elapsed 86%CPU Heh. And this is why you want to do it exactly *once*, and then just export the end result for others ;) > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46...pack But yeah, especially if you allow longer delta chains, the end result can be much smaller (and what makes the one-time repack more expensive is the window size, not the delta chain - you could make the delta chains longer with no cost overhead at packing time) HOWEVER. The longer delta chains do make it potentially much more expensive to then use old history. So there's a trade-off. And quite frankly, a delta depth of 250 is likely going to cause overflows in the delta cache (which is only 256 entries in size *and* it's a hash, so it's going to start having hash conflicts long before hitting the 250 depth limit). So when I said "--depth=250 --window=250", I chose those numbers more as an example of extremely aggressive packing, and I'm not at all sure that the end result is necessarily wonderfully usable. It's going to save disk space (and network bandwidth - the delta's will be re-used for the network protocol too!), but there are definitely downsides too, and using long delta chains may simply not be worth it in practice. (And some of it might just want to have git tuning, ie if people think that long deltas are worth it, we could easily just expand on the delta hash, at the cost of some more memory used!) That said, the good news is that working with *new* history will not be affected negatively, and if you want to be _really_ sneaky, there are ways to say "create a pack that contains the history up to a version one year ago, and be very aggressive about those old versions that we still want to have around, but do a separate pack for newer stuff using less aggressive parameters" So this is something that can be tweaked, although we don't really have any really nice interfaces for stuff like that (ie the git delta cache size is hardcoded in the sources and cannot be set in the config file, and the "pack old history more aggressively" involves some manual scripting and knowing how "git pack-objects" works rather than any nice simple command line switch). So the thing to take away from this is: - git is certainly flexible as hell - .. but to get the full power you may need to tweak things - .. happily you really only need to have one person to do the tweaking, and the tweaked end results will be available to others that do not need to know/care. And whether the difference between 320MB and 500MB is worth any really involved tweaking (considering the potential downsides), I really don't know. Only testing will tell. Linus Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-16 14:35:03 +01:00
Similarly, the optional configuration variable `gc.aggressiveDepth`
controls --depth option in linkgit:git-repack[1]. This defaults to 50.
The optional configuration variable `gc.pruneExpire` controls how old
gc: call "prune --expire 2.weeks.ago" by default The only reason we did not call "prune" in git-gc was that it is an inherently dangerous operation: if there is a commit going on, you will prune loose objects that were just created, and are, in fact, needed by the commit object just about to be created. Since it is dangerous, we told users so. That led to many users not even daring to run it when it was actually safe. Besides, they are users, and should not have to remember such details as when to call git-gc with --prune, or to call git-prune directly. Of course, the consequence was that "git gc --auto" gets triggered much more often than we would like, since unreferenced loose objects (such as left-overs from a rebase or a reset --hard) were never pruned. Alas, git-prune recently learnt the option --expire <minimum-age>, which makes it a much safer operation. This allows us to call prune from git-gc, with a grace period of 2 weeks for the unreferenced loose objects (this value was determined in a discussion on the git list as a safe one). If you want to override this grace period, just set the config variable gc.pruneExpire to a different value; an example would be [gc] pruneExpire = 6.months.ago or even "never", if you feel really paranoid. Note that this new behaviour makes "--prune" be a no-op. While adding a test to t5304-prune.sh (since it really tests the implicit call to "prune"), also the original test for "prune --expire" was moved there from t1410-reflog.sh, where it did not belong. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2008-03-12 21:55:47 +01:00
the unreferenced loose objects have to be before they are pruned. The
default is "2 weeks ago".
Optional configuration variable `gc.worktreePruneExpire` controls how
old a stale working tree should be before `git worktree prune` deletes
it. Default is "3 months ago".
NOTES
-----
'git gc' tries very hard not to delete objects that are referenced
anywhere in your repository. In
particular, it will keep not only objects referenced by your current set
of branches and tags, but also objects referenced by the index,
remote-tracking branches, refs saved by 'git filter-branch' in
refs/original/, or reflogs (which may reference commits in branches
that were later amended or rewound).
If you are expecting some objects to be deleted and they aren't, check
all of those locations and decide whether it makes sense in your case to
remove those references.
On the other hand, when 'git gc' runs concurrently with another process,
there is a risk of it deleting an object that the other process is using
but hasn't created a reference to. This may just cause the other process
to fail or may corrupt the repository if the other process later adds a
reference to the deleted object. Git has two features that significantly
mitigate this problem:
. Any object with modification time newer than the `--prune` date is kept,
along with everything reachable from it.
. Most operations that add an object to the database update the
modification time of the object if it is already present so that #1
applies.
However, these features fall short of a complete solution, so users who
run commands concurrently have to live with some risk of corruption (which
seems to be low in practice) unless they turn off automatic garbage
collection with 'git config gc.auto 0'.
HOOKS
-----
The 'git gc --auto' command will run the 'pre-auto-gc' hook. See
linkgit:githooks[5] for more information.
SEE ALSO
--------
linkgit:git-prune[1]
linkgit:git-reflog[1]
linkgit:git-repack[1]
linkgit:git-rerere[1]
GIT
---
Part of the linkgit:git[1] suite