mirror of https://github.com/git/git.git synced 2024-05-30 16:06:13 +02:00

Merge branch 'git:master' into master

This commit is contained in:
sulincix 2022-11-22 12:28:34 +00:00 committed by GitHub
commit 68059f1225
Signed by: GitHub
GPG Key ID: 4AEE18F83AFDEB23
34 changed files with 1806 additions and 134 deletions

@@ -32,6 +32,9 @@ UI, Workflows & Features
* Enable gc.cruftpacks by default for those who opt into
feature.experimental setting.
* "git repack" learns to send cruft objects out of the way into
packfiles outside the repository.
Performance, Internal Implementation, Development Support etc.
--------------------------------------------------------------
@@ -109,6 +112,18 @@ Performance, Internal Implementation, Development Support etc.
* Modernize test script to avoid "test -f" and friends.
* Avoid calling 'cache_tree_update()' when doing so would be
redundant.
* Update the credential-cache documentation to provide a more
realistic example.
* Makefile comments updates and reordering to clarify knobs used to
choose SHA implementations.
* A design document for sparse-checkout's future directions has been
added.
Fixes since v2.38
-----------------
@@ -250,6 +265,12 @@ Fixes since v2.38
* "git archive" mistakenly complained twice about a missing
executable, which has been corrected.
* Fix a bug where `git branch -d` did not work on an orphaned HEAD.
* `git rebase --update-refs` would delete references when all
`update-ref` commands in the sequencer were removed, which has been
corrected.
* Other code cleanup, docfix, build fix, etc.
(merge 413bc6d20a ds/cmd-main-reorder later to maint).
(merge 8d2863e4ed nw/t1002-cleanup later to maint).

@@ -69,10 +69,10 @@ $ git push http://example.com/repo.git
------------------------------------
You can provide options via the credential.helper configuration
variable (this example drops the cache time to 5 minutes):
variable (this example increases the cache time to 1 hour):
-------------------------------------------------------
$ git config credential.helper 'cache --timeout=300'
$ git config credential.helper 'cache --timeout=3600'
-------------------------------------------------------
GIT

@@ -160,6 +160,8 @@ empty string.
Components which are missing from the URL (e.g., there is no
username in the example above) will be left unset.
Unrecognised attributes are silently discarded.
GIT
---
Part of the linkgit:git[1] suite

@@ -74,6 +74,12 @@ to the new separate pack will be written.
immediately instead of waiting for the next `git gc` invocation.
Only useful with `--cruft -d`.
--expire-to=<dir>::
Write a cruft pack containing pruned objects (if any) to the
directory `<dir>`. This option is useful for keeping a copy of
any pruned objects in a separate directory as a backup. Only
useful with `--cruft -d`.
-l::
Pass the `--local` option to 'git pack-objects'. See
linkgit:git-pack-objects[1].

@@ -270,6 +270,7 @@ stdout in the same format (see linkgit:git-credential[1] for common
attributes). A helper is free to produce a subset, or even no values at
all if it has nothing useful to provide. Any provided attributes will
overwrite those already known about by Git's credential subsystem.
Unrecognised attributes are silently discarded.
While it is possible to override all attributes, well behaving helpers
should refrain from doing so for any attribute other than username and

@@ -0,0 +1,1103 @@
Table of contents:
* Terminology
* Purpose of sparse-checkouts
* Usecases of primary concern
* Oversimplified mental models ("Cliff Notes" for this document!)
* Desired behavior
* Behavior classes
* Subcommand-dependent defaults
* Sparse specification vs. sparsity patterns
* Implementation Questions
* Implementation Goals/Plans
* Known bugs
* Reference Emails
=== Terminology ===
cone mode: one of two modes for specifying the desired subset of files
in a sparse-checkout. In cone-mode, the user specifies
directories (getting both everything under that directory as
well as everything in leading directories), while in non-cone
mode, the user specifies gitignore-style patterns. Controlled
by the --[no-]cone option to sparse-checkout init|set.
SKIP_WORKTREE: When tracked files do not match the sparse specification and
are removed from the working tree, the file in the index is marked
with a SKIP_WORKTREE bit. Note that if a tracked file has the
SKIP_WORKTREE bit set but the file is later written by the user to
the working tree anyway, the SKIP_WORKTREE bit will be cleared at
the beginning of any subsequent Git operation.
Most sparse checkout users are unaware of this implementation
detail, and the term should generally be avoided in user-facing
descriptions and command flags. Unfortunately, prior to the
`sparse-checkout` subcommand this low-level detail was exposed,
and, as of the time of writing, is still exposed in various places.
sparse-checkout: a subcommand in git used to reduce the files present in
the working tree to a subset of all tracked files. Also, the
name of the file in the $GIT_DIR/info directory used to track
the sparsity patterns corresponding to the user's desired
subset.
sparse cone: see cone mode
sparse directory: An entry in the index corresponding to a directory, which
appears in the index instead of all the files under that directory
that would normally appear. See also sparse-index. Something that
can cause confusion is that the "sparse directory" does NOT match
the sparse specification, i.e. the directory is NOT present in the
working tree. May be renamed in the future (e.g. to "skipped
directory").
sparse index: A special mode for sparse-checkout that also makes the
index sparse by recording a directory entry in lieu of all the
files underneath that directory (thus making that a "skipped
directory" which unfortunately has also been called a "sparse
directory"), and does this for potentially multiple
directories. Controlled by the --[no-]sparse-index option to
init|set|reapply.
sparsity patterns: patterns from $GIT_DIR/info/sparse-checkout used to
define the set of files of interest. A warning: It is easy to
over-use this term (or the shortened "patterns" term), for two
reasons: (1) users in cone mode specify directories rather than
patterns (their directories are transformed into patterns, but
users may think you are talking about non-cone mode if you use the
word "patterns"), and (2) the sparse specification might
transiently differ in the working tree or index from the sparsity
patterns (see "Sparse specification vs. sparsity patterns").
sparse specification: The set of paths in the user's area of focus. This
is typically just the tracked files that match the sparsity
patterns, but the sparse specification can temporarily differ and
include additional files. (See also "Sparse specification
vs. sparsity patterns")
* When working with history, the sparse specification is exactly
the set of files matching the sparsity patterns.
* When interacting with the working tree, the sparse specification
is the set of tracked files with a clear SKIP_WORKTREE bit or
tracked files present in the working copy.
* When modifying or showing results from the index, the sparse
specification is the set of files with a clear SKIP_WORKTREE bit
or that differ in the index from HEAD.
* If working with the index and the working copy, the sparse
specification is the union of the paths from above.
vivifying: When a command restores a tracked file to the working tree (and
hopefully also clears the SKIP_WORKTREE bit in the index for that
file), this is referred to as "vivifying" the file.
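The four context-dependent definitions of the sparse specification given
above can be read as set expressions. The following is a minimal sketch of
that reading, not Git's implementation: the `Entry` class, its field
names, and the sample paths are all hypothetical and exist only for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical model of one tracked path; not a Git data structure."""
    path: str
    matches_patterns: bool   # path matches the sparsity patterns
    skip_worktree: bool      # SKIP_WORKTREE bit is set in the index
    in_worktree: bool        # file is present in the working tree
    differs_from_head: bool  # index entry differs from HEAD


def spec_for_history(entries):
    # Working with history: exactly the files matching the sparsity patterns.
    return {e.path for e in entries if e.matches_patterns}


def spec_for_worktree(entries):
    # Working tree: clear SKIP_WORKTREE bit, or present in the working copy.
    return {e.path for e in entries if not e.skip_worktree or e.in_worktree}


def spec_for_index(entries):
    # Index: clear SKIP_WORKTREE bit, or differs in the index from HEAD.
    return {e.path for e in entries if not e.skip_worktree or e.differs_from_head}


def spec_for_index_and_worktree(entries):
    # Both index and working copy: the union of the two sets above.
    return spec_for_worktree(entries) | spec_for_index(entries)


entries = [
    Entry("src/main.c", True, False, True, False),
    Entry("docs/guide.txt", False, True, False, False),
    Entry("lib/conflict.c", False, True, True, False),  # vivified file
    Entry("lib/staged.c", False, True, False, True),    # staged outside patterns
]

print(sorted(spec_for_history(entries)))
print(sorted(spec_for_index_and_worktree(entries)))
```

In this toy data, `lib/conflict.c` and `lib/staged.c` do not match the
sparsity patterns, yet each lands in one of the transient sparse
specifications, which is the divergence the definition warns about.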
=== Purpose of sparse-checkouts ===
sparse-checkouts exist to allow users to work with a subset of their
files.
You can think of sparse-checkouts as subdividing "tracked" files into two
categories -- a sparse subset, and all the rest. Implementationally, we
mark "all the rest" in the index with a SKIP_WORKTREE bit and leave them
out of the working tree. The SKIP_WORKTREE files are still tracked, just
not present in the working tree.
In the past, sparse-checkouts were defined by "SKIP_WORKTREE means the file
is missing from the working tree but pretend the file contents match HEAD".
That was not only bogus (it actually meant the file missing from the
working tree matched the index rather than HEAD), but it was also a
low-level detail which only provided decent behavior for a few commands.
There were a surprising number of ways in which that guiding principle gave
command results that violated user expectations, and as such was a bad
mental model. However, it persisted for many years and may still be found
in some corners of the code base.
Anyway, the idea of "working with a subset of files" is simple enough, but
there are multiple different high-level usecases which affect how some Git
subcommands should behave. Further, even if we only considered one of
those usecases, sparse-checkouts can modify different subcommands in over a
half dozen different ways. Let's start by considering the high level
usecases:
A) Users are _only_ interested in the sparse portion of the repo
A*) Users are _only_ interested in the sparse portion of the repo
that they have downloaded so far
B) Users want a sparse working tree, but are working in a larger whole
C) sparse-checkout is a behind-the-scenes implementation detail allowing
Git to work with a specially crafted in-house virtual file system;
users are actually working with a "full" working tree that is
lazily populated, and sparse-checkout helps with the lazy population
piece.
It may be worth explaining each of these in a bit more detail:
(Behavior A) Users are _only_ interested in the sparse portion of the repo
These folks might know there are other things in the repository, but
don't care. They are uninterested in other parts of the repository, and
only want to know about changes within their area of interest. Showing
them other files from history (e.g. from diff/log/grep/etc.) is a
usability annoyance, potentially a huge one since other changes in
history may dwarf the changes they are interested in.
Some of these users also arrive at this usecase from wanting to use partial
clones together with sparse checkouts (in a way where they have downloaded
blobs within the sparse specification) and do disconnected development.
Not only do these users generally not care about other parts of the
repository, but consider it a blocker for Git commands to try to operate on
those. If commands attempt to access paths in history outside the sparse
specification, then the partial clone will attempt to download additional
blobs on demand, fail, and then fail the user's command. (This may be
unavoidable in some cases, e.g. when `git merge` has non-trivial changes to
reconcile outside the sparse specification, but we should limit how often
users are forced to connect to the network.)
Also, even for users using partial clones that do not mind being
always connected to the network, the need to download blobs as
side-effects of various other commands (such as the printed diffstat
after a merge or pull) can lead to worries about local repository size
growing unnecessarily[10].
(Behavior A*) Users are _only_ interested in the sparse portion of the repo
that they have downloaded so far (a variant on the first usecase)
This variant is driven by folks who are using partial clones together with
sparse checkouts and doing disconnected development (so far sounding like a
subset of behavior A users), and who are doing so on very large repositories. The
reason for yet another variant is that downloading even just the blobs
through history within their sparse specification may be too much, so they
only download some. They would still like operations to succeed without
network connectivity, though, so things like `git log -S${SEARCH_TERM} -p`
or `git grep ${SEARCH_TERM} OLDREV` would need to be prepared to provide
partial results that depend on what happens to have been downloaded.
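To make "partial results" concrete, here is a rough sketch of a history
search that never touches the network. It is purely illustrative: the
function `offline_grep`, the shape of `history`, and the `local_blobs`
dictionary are assumptions made for this example and do not correspond to
any existing Git interface.

```python
def offline_grep(term, history, local_blobs, in_sparse_spec):
    """Search only blobs that are both in scope and already downloaded.

    history:        iterable of (revision, path, blob_id) tuples
    local_blobs:    dict mapping blob_id -> text for blobs present locally
    in_sparse_spec: predicate telling whether a path is in the user's focus
    Returns (hits, skipped), where `skipped` counts in-scope blobs that
    could not be searched because they were never downloaded.
    """
    hits, skipped = [], 0
    for rev, path, blob_id in history:
        if not in_sparse_spec(path):
            continue                  # outside the area of focus
        text = local_blobs.get(blob_id)
        if text is None:
            skipped += 1              # missing blob: do not fetch, just count
            continue
        if term in text:
            hits.append((rev, path))
    return hits, skipped


history = [
    ("r1", "src/a.c", "b1"),
    ("r2", "src/a.c", "b2"),      # blob b2 was never downloaded
    ("r2", "docs/x.txt", "b3"),   # outside the sparse specification
]
local_blobs = {"b1": "int foo;", "b3": "foo docs"}

hits, skipped = offline_grep("foo", history, local_blobs,
                             lambda p: p.startswith("src/"))
print(hits, skipped)
```

Returning the skipped count alongside the hits is one possible way to let
a command tell the user that its answer is incomplete.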
This variant could be viewed as Behavior A with the sparse specification
for history querying operations modified from "sparsity patterns" to
"sparsity patterns limited to the blobs we have already downloaded".
(Behavior B) Users want a sparse working tree, but are working in a
larger whole
Stolee described this usecase this way[11]:
"I'm also focused on users that know that they are a part of a larger
whole. They know they are operating on a large repository but focus on
what they need to contribute their part. I expect multiple "roles" to
use very different, almost disjoint parts of the codebase. Some other
"architect" users operate across the entire tree or hop between different
sections of the codebase as necessary. In this situation, I'm wary of
scoping too many features to the sparse-checkout definition, especially
"git log," as it can be too confusing to have their view of the codebase
depend on your "point of view.""
People might also end up wanting behavior B due to complex inter-project
dependencies. The initial attempts to use sparse-checkouts usually involve
the directories you are directly interested in plus what those directories
depend upon within your repository. But there's a monkey wrench here: if
you have integration tests, they invert the hierarchy: to run integration
tests, you need not only what you are interested in and its in-tree
dependencies, you also need everything that depends upon what you are
interested in or that depends upon one of your dependencies...AND you need
all the in-tree dependencies of that expanded group. That can easily
change your sparse-checkout into a nearly dense one.
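The growth described above is a pair of graph closures. The sketch below
uses a made-up dependency graph (the project names and the helper
functions `closure`, `invert`, and `integration_set` are hypothetical) to
show how a small area of interest expands once integration tests are
taken into account.

```python
def closure(start, edges):
    """All nodes reachable from `start` by following `edges` (dict of sets)."""
    seen, stack = set(start), list(start)
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def invert(deps):
    """Turn 'X depends on Y' edges into 'Y is depended upon by X' edges."""
    rev = {}
    for node, targets in deps.items():
        for target in targets:
            rev.setdefault(target, set()).add(node)
    return rev


def integration_set(interest, deps):
    base = closure(interest, deps)             # interest + its dependencies
    dependents = closure(base, invert(deps))   # + everything depending on those
    return closure(dependents, deps)           # + dependencies of that group


# Hypothetical in-repo projects: key depends on each member of its value.
deps = {
    "app": {"libnet", "libui"},
    "libnet": {"libcore"},
    "libui": {"libcore"},
    "tools": {"libcore"},
    "libcore": set(),
    "docs": set(),
}

build_only = closure({"libnet"}, deps)
with_tests = integration_set({"libnet"}, deps)
print(sorted(build_only))
print(sorted(with_tests))
```

With this graph, building `libnet` needs two projects, while running
integration tests pulls in five of the six.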
Naturally, that tends to kill the benefits of sparse-checkouts. There are
a couple solutions to this conundrum: either avoid grabbing in-repo
dependencies (maybe have built versions of your in-repo dependencies pulled
from a CI cache somewhere), or say that users shouldn't run integration
tests directly and instead do it on the CI server when they submit a code
review. Or do both. Regardless of whether you stub out your in-repo
dependencies or stub out the things that depend upon you, there is
certainly a reason to want to query and be aware of those other stubbed-out
parts of the repository, particularly when the dependencies are complex or
change relatively frequently. Thus, for such uses, sparse-checkouts can be
used to limit what you directly build and modify, but these users do not
necessarily want their sparse checkout paths to limit their queries of
versions in history.
Some people may also be interested in behavior B over behavior A simply as
a performance workaround: if they are using non-cone mode, then they have
to deal with its inherent quadratic performance problems. In that mode,
every operation that checks whether paths match the sparsity specification
can be expensive. As such, these users may only be willing to pay for
those expensive checks when interacting with the working copy, and may
prefer getting "unrelated" results from their history queries over having
slow commands.
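As a rough illustration of where that cost comes from, the sketch below
contrasts testing every path against every pattern with a directory-based
lookup. It is a simplification under stated assumptions: `fnmatchcase`
stands in for gitignore-style matching (real patterns also have negation
and ordering rules), the function names are hypothetical, and the cone
lookup assumes that files directly inside leading directories are
included.

```python
from fnmatch import fnmatchcase
from posixpath import dirname


def match_non_cone(paths, patterns):
    """Each path is tested against each pattern: len(paths) * len(patterns)."""
    return {p for p in paths if any(fnmatchcase(p, pat) for pat in patterns)}


def match_cone(paths, cone_dirs):
    """Set lookups per path; cost depends on path depth, not on len(cone_dirs)."""
    recursive = set(cone_dirs)
    parents = {""}                       # the top level is always a leading dir
    for d in cone_dirs:
        d = dirname(d)
        while d:
            parents.add(d)
            d = dirname(d)

    selected = set()
    for p in paths:
        d = dirname(p)
        if d in parents:
            selected.add(p)              # file directly inside a leading dir
            continue
        while d:
            if d in recursive:
                selected.add(p)          # file somewhere under a cone dir
                break
            d = dirname(d)
    return selected


paths = ["README.md", "src/x.c", "src/lib/a.c", "src/other/b.c", "docs/a.txt"]
print(sorted(match_cone(paths, ["src/lib"])))
print(sorted(match_non_cone(paths, ["src/lib/*", "*.md"])))
```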
(Behavior C) sparse-checkout is an implementation detail supporting a
special VFS.
This usecase goes slightly against the traditional definition of
sparse-checkout in that it actually tries to present a full or dense
checkout to the user. However, this usecase utilizes the same underlying
technical underpinnings in a new way which does provide some performance
advantages to users. The basic idea is that a company can have an in-house
Git-aware Virtual File System which pretends all files are present in the
working tree, by intercepting all file system accesses and using those to
fetch and write accessed files on demand via partial clones. The VFS uses
sparse-checkout to prevent Git from writing or paying attention to many
files, and manually updates the sparse checkout patterns itself based on
user access and modification of files in the working tree. See commit
ecc7c8841d ("repo_read_index: add config to expect files outside sparse
patterns", 2022-02-25) and the link at [17] for a more detailed description
of such a VFS.
The biggest difference here is that users are completely unaware that the
sparse-checkout machinery is even in use. The sparse patterns are not
specified by the user but rather are under the complete control of the VFS
(and the patterns are updated frequently and dynamically by it). The user
will perceive the checkout as dense, and commands should thus behave as if
all files are present.
=== Usecases of primary concern ===
Most of the rest of this document will focus on Behavior A and Behavior
B. Some notes about the other two cases and why we are not focusing on
them:
(Behavior A*)
Supporting this usecase is estimated to be difficult and a lot of work.
There are no plans to implement it currently, but it may be a potential
future alternative. Knowing about the existence of additional alternatives
may affect our choice of command line flags (e.g. if we need tri-state or
quad-state flags rather than just binary flags), so it was still important
to at least note.
Further, I believe the descriptions below for Behavior A are probably still
valid for this usecase, with the only exception being that it redefines the
sparse specification to restrict it to already-downloaded blobs. The hard
part is in making commands capable of respecting that modified definition.
(Behavior C)
This usecase violates some of the early sparse-checkout documented
assumptions (since files marked as SKIP_WORKTREE will be displayed to users
as present in the working tree). That violation may mean various
sparse-checkout related behaviors are not well suited to this usecase and
we may need tweaks -- to both documentation and code -- to handle it.
However, this usecase is also perhaps the simplest model to support in that
everything behaves like a dense checkout with a few exceptions (e.g. branch
checkouts and switches write fewer things, knowing the VFS will lazily
write the rest on an as-needed basis).
Since there is no publicly available VFS-related code for folks to try,
the number of folks who can test such a usecase is limited.
The primary reason to note the Behavior C usecase is that as we fix things
to better support Behaviors A and B, there may be additional places where
we need to make tweaks allowing folks in this usecase to get the original
non-sparse treatment. For an example, see ecc7c8841d ("repo_read_index:
add config to expect files outside sparse patterns", 2022-02-25). The
secondary reason to note Behavior C is so that folks taking advantage of
Behavior C do not assume they are part of the Behavior B camp and propose
patches that break things for the real Behavior B folks.
=== Oversimplified mental models ===
An oversimplification of the differences in the above behaviors is:
Behavior A: Restrict worktree and history operations to sparse specification
Behavior B: Restrict worktree operations to sparse specification; have any
history operations work across all files
Behavior C: Do not restrict either worktree or history operations to the
sparse specification...with the exception of branch checkouts or
switches which avoid writing files that will match the index so
they can later lazily be populated instead.
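The same oversimplification can be written as a small lookup table. This
is only a restatement of the three lines above in code form; the names
`RESTRICTS` and `is_restricted` are invented for the example.

```python
# True means "limit the operation to the sparse specification".
RESTRICTS = {
    "A": {"worktree": True, "history": True},
    "B": {"worktree": True, "history": False},
    # Behavior C still skips writing files on branch checkouts/switches,
    # an exception that this table deliberately does not model.
    "C": {"worktree": False, "history": False},
}


def is_restricted(behavior, operation):
    """operation is either "worktree" or "history"."""
    return RESTRICTS[behavior][operation]


for behavior in "ABC":
    print(behavior,
          is_restricted(behavior, "worktree"),
          is_restricted(behavior, "history"))
```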
=== Desired behavior ===
As noted previously, despite the simple idea of just working with a subset
of files, there are a range of different behavioral changes that need to be
made to different subcommands to work well with such a feature. See
[1,2,3,4,5,6,7,8,9,10] for various examples. In particular, at [2], we saw
that mere composition of other commands that individually worked correctly
in a sparse-checkout context did not imply that the higher level command
would work correctly; it sometimes requires further tweaks. So,
understanding these differences can be beneficial.
* Commands behaving the same regardless of high-level use-case
* commands that only look at files within the sparse specification
* diff (without --cached or REVISION arguments)
* grep (without --cached or REVISION arguments)
* diff-files
* commands that restore files to the working tree that match sparsity
patterns, and remove unmodified files that don't match those
patterns:
* switch
* checkout (the switch-like half)
* read-tree
* reset --hard
* commands that write conflicted files to the working tree, but otherwise
will omit writing files to the working tree that do not match the
sparsity patterns:
* merge
* rebase
* cherry-pick
* revert
* `am` and `apply --cached` should probably be in this section but
are buggy (see the "Known bugs" section below)
The behavior for these commands somewhat depends upon the merge
strategy being used:
* `ort` behaves as described above
* `recursive` tries to not vivify files unnecessarily, but does sometimes
vivify files without conflicts.
* `octopus` and `resolve` will always vivify any file changed in the merge
relative to the first parent, which is rather suboptimal.
It is also important to note that these commands WILL update the index
outside the sparse specification relative to when the operation began,
BUT these commands often make a commit just before or after such that
by the end of the operation there is no change to the index outside the
sparse specification. Of course, if the operation hits conflicts or
does not make a commit, then these operations clearly can modify the
index outside the sparse specification.
Finally, it is important to note that at least the first four of these
commands also try to remove differences between the sparse
specification and the sparsity patterns (much like the commands in the
previous section).
* commands that always ignore sparsity since commits must be full-tree
* archive
* bundle
* commit
* format-patch
* fast-export
* fast-import
* commit-tree
* commands that write any modified file to the working tree (conflicted
or not, and whether those paths match sparsity patterns or not):
* stash
* apply (without `--index` or `--cached`)
* Commands that may slightly differ for behavior A vs. behavior B:
Commands in this category behave mostly the same between the two
behaviors, but may differ in verbosity and types of warning and error
messages.
* commands that make modifications to which files are tracked:
* add
* rm
* mv
* update-index
The fact that files can move between the 'tracked' and 'untracked'
categories means some commands will have to treat untracked files
differently. But if we have to treat untracked files differently,
then additional commands may also need changes:
* status
* clean
In particular, `status` may need to report any untracked files outside
the sparsity specification as an erroneous condition (especially to
avoid the user trying to `git add` them, forcing `git add` to display
an error).
It's not clear to me exactly how (or even if) `clean` would change,
but it's the other command that also affects untracked files.
`update-index` may be slightly special. Its --[no-]skip-worktree flag
may need to ignore the sparse specification by its nature. Also, its
current --[no-]ignore-skip-worktree-entries default is totally bogus.
* commands for manually tweaking paths in both the index and the working tree
* `restore`
* the restore-like half of `checkout`
These commands should be similar to add/rm/mv in that they should
only operate on the sparse specification by default, and require a
special flag to operate on all files.
Also, note that these commands currently have a number of issues (see
the "Known bugs" section below)
* Commands that significantly differ for behavior A vs. behavior B:
* commands that query history
* diff (with --cached or REVISION arguments)
* grep (with --cached or REVISION arguments)
* show (when given commit arguments)
* blame (only matters when one or more -C flags are passed)
* and annotate
* log
* whatchanged
* ls-files
* diff-index
* diff-tree
* ls-tree
Note: for log and whatchanged, revision walking logic is unaffected
but displaying of patches is affected by scoping the command to the
sparse-checkout. (The fact that revision walking is unaffected is
why rev-list, shortlog, show-branch, and bisect are not in this
list.)
ls-files may be slightly special in that e.g. `git ls-files -t` is
often used to see what is sparse and what is not. Perhaps -t should
always work on the full tree?
* Commands I don't know how to classify
* range-diff
Is this like `log` or `format-patch`?
* cherry
See range-diff
* Commands unaffected by sparse-checkouts
* shortlog
* show-branch
* rev-list
* bisect
* branch
* describe
* fetch
* gc
* init
* maintenance
* notes
* pull (merge & rebase have the necessary changes)
* push
* submodule
* tag
* config
* filter-branch (works in separate checkout without sparse-checkout setup)
* pack-refs
* prune
* remote
* repack
* replace
* bugreport
* count-objects
* fsck
* gitweb
* help
* instaweb
* merge-tree (doesn't touch worktree or index, and merges always compute full-tree)
* rerere
* verify-commit
* verify-tag
* commit-graph
* hash-object
* index-pack
* mktag
* mktree
* multi-pack-index
* pack-objects
* prune-packed
* symbolic-ref
* unpack-objects
* update-ref
* write-tree (operates on index, possibly optimized to use sparse dir entries)
* for-each-ref
* get-tar-commit-id
* ls-remote
* merge-base (merges are computed full tree, so merge base should be too)
* name-rev
* pack-redundant
* rev-parse
* show-index
* show-ref
* unpack-file
* var
* verify-pack
* <Everything under 'Interacting with Others' in 'git help --all'>
* <Everything under 'Low-level...Syncing' in 'git help --all'>
* <Everything under 'Low-level...Internal Helpers' in 'git help --all'>
* <Everything under 'External commands' in 'git help --all'>
* Commands that might be affected, but who cares?
* merge-file
* merge-index
* gitk?
=== Behavior classes ===
From the above there are a few classes of behavior:
* "restrict"
Commands in this class only read or write files in the working tree
within the sparse specification.
When moving to a new commit (e.g. switch, reset --hard), these commands
may update index files outside the sparse specification as of the start
of the operation, but by the end of the operation those index files
will match HEAD again and thus those files will again be outside the
sparse specification.
When paths are explicitly specified, these paths are intersected with
the sparse specification and will only operate on such paths.
(e.g. `git restore [--staged] -- '*.png'`, `git reset -p -- '*.md'`)
Some of these commands may also attempt, at the end of their operation,
to cull transient differences between the sparse specification and the
sparsity patterns (see "Sparse specification vs. sparsity patterns" for
details, but this basically means either removing unmodified files not
matching the sparsity patterns and marking those files as
SKIP_WORKTREE, or vivifying files that match the sparsity patterns and
marking those files as !SKIP_WORKTREE).
* "restrict modulo conflicts"
Commands in this class generally behave like the "restrict" class,
except that:
(1) they will ignore the sparse specification and write files with
conflicts to the working tree (thus temporarily expanding the
sparse specification to include such files.)
(2) they are grouped with commands which move to a new commit, since
they often create a commit and then move to it, even though we
know there are many exceptions to moving to the new commit. (For
example, the user may rebase a commit that becomes empty, or have
a cherry-pick which conflicts, or a user could run `merge
--no-commit`, and we also view `apply --index` kind of like `am
--no-commit`.) As such, these commands can make changes to index
files outside the sparse specification, though they'll mark such
files with SKIP_WORKTREE.
* "restrict also specially applied to untracked files"
Commands in this class generally behave like the "restrict" class,
except that they have to handle untracked files differently too, often
because these commands are dealing with files changing state between
'tracked' and 'untracked'. Often, this may mean printing an error
message if the command had nothing to do, but the arguments may have
referred to files whose tracked-ness state could have changed were it
not for the sparsity patterns excluding them.
* "no restrict"
Commands in this class ignore the sparse specification entirely.
* "restrict or no restrict dependent upon behavior A vs. behavior B"
Commands in this class behave like "no restrict" for folks in the
behavior B camp, and like "restrict" for folks in the behavior A camp.
However, when behaving like "restrict" a warning of some sort might be
provided that history queries have been limited by the sparse-checkout
specification.
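A minimal sketch of how the first, second, and fourth classes differ in
which working tree paths they touch is given below. It is an illustrative
model only: `worktree_targets` and `expand_pathspec` are hypothetical
helpers, `fnmatchcase` is a stand-in for Git's pathspec matching, and the
two remaining classes (untracked-file handling and the behavior A vs. B
switch) are not modeled.

```python
from fnmatch import fnmatchcase


def expand_pathspec(pattern, tracked):
    """Stand-in for pathspec expansion over the tracked files."""
    return {p for p in tracked if fnmatchcase(p, pattern)}


def worktree_targets(behavior_class, candidates, sparse_spec,
                     conflicted=frozenset()):
    """Paths a command of the given class would operate on in the worktree."""
    candidates = set(candidates)
    if behavior_class == "no restrict":
        return candidates
    inside = candidates & set(sparse_spec)      # intersect with the spec
    if behavior_class == "restrict":
        return inside
    if behavior_class == "restrict modulo conflicts":
        # Conflicted files are written even when outside the spec.
        return inside | (candidates & set(conflicted))
    raise ValueError("class not modeled in this sketch: " + behavior_class)


tracked = {"src/logo.png", "src/main.c", "docs/diagram.png", "docs/guide.md"}
sparse_spec = {"src/logo.png", "src/main.c"}

pngs = expand_pathspec("*.png", tracked)
print(sorted(worktree_targets("restrict", pngs, sparse_spec)))
print(sorted(worktree_targets("restrict modulo conflicts", tracked,
                              sparse_spec, conflicted={"docs/guide.md"})))
```

The first call mirrors the `git restore -- '*.png'` example: the pathspec
matches two files, but only the one inside the sparse specification is
operated on.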
=== Subcommand-dependent defaults ===
Note that we have different defaults depending on the command for the
desired behavior:
* Commands defaulting to "restrict":
* diff-files
* diff (without --cached or REVISION arguments)
* grep (without --cached or REVISION arguments)
* switch
* checkout (the switch-like half)
* reset (<commit>)
* restore
* checkout (the restore-like half)
* checkout-index
* reset (with pathspec)
This behavior makes sense; these interact with the working tree.
* Commands defaulting to "restrict modulo conflicts":
* merge
* rebase
* cherry-pick
* revert
* am
* apply --index (which is kind of like an `am --no-commit`)
* read-tree (especially with -m or -u; is kind of like a --no-commit merge)
* reset (<tree-ish>, due to similarity to read-tree)
These also interact with the working tree, but require slightly
different behavior either so that (a) conflicts can be resolved or (b)
because they are kind of like a merge-without-commit operation.
(See also the "Known bugs" section below regarding `am` and `apply`)
* Commands defaulting to "no restrict":
* archive
* bundle
* commit
* format-patch
* fast-export
* fast-import
* commit-tree
* stash
* apply (without `--index`)
These have completely different defaults and perhaps deserve the most
detailed explanation:
In the case of commands in the first group (format-patch,
fast-export, bundle, archive, etc.), these are commands for
communicating history, which will be broken if they restrict to a
subset of the repository. As such, they operate on full paths and
have no `--restrict` option for overriding. Some of these commands may
take paths for manually restricting what is exported, but it needs to
be very explicit.
In the case of stash, it needs to vivify files to avoid losing the
user's changes.
In the case of apply without `--index`, that command needs to update
the working tree without the index (or the index without the working
tree if `--cached` is passed), and if we restrict those updates to the
sparse specification then we'll lose changes from the user.
* Commands defaulting to "restrict also specially applied to untracked files":
* add
* rm
* mv
* update-index
* status
* clean (?)
Our original implementation for the first three of these commands was
"no restrict", but it had some severe usability issues:
* `git add <somefile>`, if honored and outside the sparse
specification, can result in the file randomly disappearing later
when some subsequent command is run (since various commands
automatically clean up unmodified files outside the sparse
specification).
* `git rm '*.jpg'` could very negatively surprise users if it deletes
files outside the range of the user's interest.
* `git mv` has similar surprises when moving into or out of the cone,
so it is best to restrict by default.
So, we switched `add` and `rm` to default to "restrict", which made
usability problems much less severe and less frequent, but we still got
complaints because commands like:
git add <file-outside-sparse-specification>
git rm <file-outside-sparse-specification>
would silently do nothing. We should instead print an error in those
cases to get usability right.
update-index needs to be updated to match, and status and maybe clean
also need to be updated to specially handle untracked paths.
There may be a difference here between behavior A and behavior B in
terms of verbosity of errors or additional warnings.
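As a rough illustration of the "print an error instead of silently doing
nothing" behavior described above, here is a minimal Python sketch. It is
not Git's implementation (Git is written in C). `restricted_add`,
`OutsideSparseSpecification`, and the `allow_outside` parameter (standing
in for an override such as add's `--sparse`) are hypothetical names
invented for this example.

```python
class OutsideSparseSpecification(Exception):
    """Hypothetical error for paths outside the sparse specification."""


def restricted_add(paths, in_sparse_specification, allow_outside=False):
    """Return the paths that would be staged under a "restrict" default.

    `in_sparse_specification` is a caller-supplied predicate; how it is
    computed is outside the scope of this sketch.
    """
    outside = [p for p in paths if not in_sparse_specification(p)]
    if outside and not allow_outside:
        # Refuse loudly rather than skipping the paths without a word.
        raise OutsideSparseSpecification(
            "paths outside the sparse specification: " + ", ".join(outside)
        )
    return list(paths)
```

The point of the sketch is only the control flow: a request touching
out-of-specification paths either fails visibly or is explicitly allowed,
and it is never quietly ignored.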
* Commands falling under "restrict or no restrict dependent upon behavior
A vs. behavior B"
* diff (with --cached or REVISION arguments)
* grep (with --cached or REVISION arguments)
* show (when given commit arguments)
* blame (only matters when one or more -C flags passed)
* and annotate
* log
* and variants: shortlog, gitk, show-branch, whatchanged, rev-list
* ls-files
* diff-index
* diff-tree
* ls-tree
For now, we default to behavior B for these, which wants a default of
"no restrict".
Note that two of these commands -- diff and grep -- also appeared in a
different list with a default of "restrict", but only when limited to
searching the working tree. The working tree vs. history distinction
is fundamental in how behavior B operates, so this is expected. Note,
though, that for diff and grep with --cached, when doing "restrict"
behavior, the difference between sparse specification and sparsity
patterns is important to handle.
"restrict" may make more sense as the long term default for these[12].
Also, supporting "restrict" for these commands might be a fair amount
of work to implement, meaning it might be implemented over multiple
releases. If that behavior were the default in the commands that
supported it, that would force behavior B users to need to learn to
slowly add additional flags to their commands, depending on git
version, to get the behavior they want. That gradual switchover would
be painful, so we should avoid it at least until it's fully
implemented.
=== Sparse specification vs. sparsity patterns ===
In a well-behaved situation, the sparse specification is given directly
by the $GIT_DIR/info/sparse-checkout file. However, it can transiently
diverge for a few reasons:
* needing to resolve conflicts (merging will vivify conflicted files)
* running Git commands that implicitly vivify files (e.g. "git stash apply")
* running Git commands that explicitly vivify files (e.g. "git checkout
--ignore-skip-worktree-bits FILENAME")
* other commands that write to these files (perhaps a user copies it
from elsewhere)
For the last item, note that we do automatically clear the SKIP_WORKTREE
bit for files that are present in the working tree. This has been true
since 82386b4496 ("Merge branch 'en/present-despite-skipped'",
2022-03-09).
However, such a situation is transient because:
* Such transient differences can and will be automatically removed as
a side-effect of commands which call unpack_trees() (checkout,
merge, reset, etc.).
* Users can also request such transient differences be corrected via
running `git sparse-checkout reapply`. Various places recommend
running that command.
* Additional commands are also welcome to implicitly fix these
differences; we may add more in the future.
While we avoid dropping unstaged changes or files which have conflicts,
we otherwise aggressively try to fix these transient differences. If
users want these differences to persist, they should run the `set` or
`add` subcommands of `git sparse-checkout` to reflect their intended
sparse specification.
However, when we need to do a query on history restricted to the
"relevant subset of files" such a transiently expanded sparse
specification is ignored. There are a couple reasons for this:
* The behavior wanted when doing something like
git grep expression REVISION
is roughly what the users would expect from
git checkout REVISION && git grep expression
(modulo a "REVISION:" prefix), which has a couple ramifications:
* REVISION may have paths not in the current index, so there is no
path we can consult for a SKIP_WORKTREE setting for those paths.
* Since `checkout` is one of those commands that tries to remove
transient differences in the sparse specification, it makes sense
to use the corrected sparse specification
(i.e. $GIT_DIR/info/sparse-checkout) rather than attempting to
consult SKIP_WORKTREE anyway.
So, a transiently expanded (or restricted) sparse specification applies to
the working tree, but not to history queries where we always use the
sparsity patterns. (See [16] for an early discussion of this.)
Similar to a transiently expanded sparse specification of the working tree
based on additional files being present in the working tree, we also need
to consider additional files being modified in the index. In particular,
if the user has staged changes to files (relative to HEAD) that do not
match the sparsity patterns, and the file is not present in the working
tree, we still want to consider the file part of the sparse specification
if we are specifically performing a query related to the index (e.g. git
diff --cached [REVISION], git diff-index [REVISION], git restore --staged
--source=REVISION -- PATHS, etc.) Note that a transiently expanded sparse
specification for the index usually only matters under behavior A, since
under behavior B index operations are lumped with history and tend to
operate full-tree.
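The rules in this section can be summarized with a small Python model.
This is a sketch under simplifying assumptions, not Git's code: sparsity
patterns are reduced to plain directory prefixes, the "transiently
restricted" case is not modeled, and every name (`SparseState`,
`matches_patterns`, `in_sparse_specification`, the query kinds) is
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SparseState:
    """Toy snapshot of a sparse checkout (all fields are hypothetical)."""
    pattern_dirs: set                                        # directories named by the sparsity patterns
    present_in_worktree: set = field(default_factory=set)    # out-of-pattern paths that exist on disk
    staged_vs_head: set = field(default_factory=set)         # paths whose index entry differs from HEAD


def matches_patterns(state, path):
    """Simplified pattern match: the path lies under a listed directory."""
    return any(path.startswith(d.rstrip("/") + "/") for d in state.pattern_dirs)


def in_sparse_specification(state, path, query):
    """Decide membership for a 'worktree', 'index', or 'history' query."""
    if matches_patterns(state, path):
        return True
    if query == "history":
        # History queries consult only the sparsity patterns.
        return False
    if path in state.present_in_worktree:
        # Vivified files transiently expand the specification.
        return True
    # Staged-but-absent files count only for index-related queries.
    return query == "index" and path in state.staged_vs_head
```

For example, a conflicted file vivified outside the patterns is part of
the specification for working tree operations but not for something like
`git grep expression REVISION`.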
=== Implementation Questions ===
* Do the options --scope={sparse,all} sound good to others? Are there better
options?
* Names in use, or appearing in patches, or previously suggested:
* --sparse/--dense
* --ignore-skip-worktree-bits
* --ignore-skip-worktree-entries
* --ignore-sparsity
* --[no-]restrict-to-sparse-paths
* --full-tree/--sparse-tree
* --[no-]restrict
* --scope={sparse,all}
* --focus/--unfocus
* --limit/--unlimited
* Rationale making me lean slightly towards --scope={sparse,all}:
* We want a name that works for many commands, so we need a name that
does not conflict
* We know that we have more than two possible usecases, so it is best
to avoid a flag that appears to be binary.
* --scope={sparse,all} isn't overly long and seems relatively
explanatory
* `--sparse`, as used in add/rm/mv, is totally backwards for
grep/log/etc. Changing the meaning of `--sparse` for these
commands would fix the backwardness, but possibly break existing
scripts. Using a new name pairing would allow us to treat
`--sparse` in these commands as a deprecated alias.
* There is a different `--sparse`/`--dense` pair for commands using
revision machinery, so using that naming might cause confusion
* There is also a `--sparse` in both pack-objects and show-branch, which
don't conflict but do suggest that `--sparse` is overloaded
* The name --ignore-skip-worktree-bits is a double negative, is
quite a mouthful, refers to an implementation detail that many
users may not be familiar with, and we'd need a negation for it
which would probably be even more ridiculously long. (But we
can make --ignore-skip-worktree-bits a deprecated alias for
--no-restrict.)
* If a config option is added (sparse.scope?) what should the values and
description be? "sparse" (behavior A), "worktree-sparse-history-dense"
(behavior B), "dense" (behavior C)? There's a risk of confusion,
because even for Behaviors A and B we want some commands to be
full-tree and others to operate sparsely, so the wording may need to be
more tied to the usecases and somehow explain that. Also, right now,
the primary difference we are focusing on is just the history-querying
commands (log/diff/grep). Previous config suggestion here: [13]
* Is `--no-expand` a good alias for ls-files's `--sparse` option?
(`--sparse` does not map to either `--scope=sparse` or `--scope=all`,
because in non-cone mode it does nothing and in cone-mode it shows the
sparse directory entries which are technically outside the sparse
specification)
* Under Behavior A:
* Does ls-files' `--no-expand` override the default `--scope=all`, or
does it need an extra flag?
* Does ls-files' `-t` option imply `--scope=all`?
* Does update-index's `--[no-]skip-worktree` option imply `--scope=all`?
* sparse-checkout: once behavior A is fully implemented, should we take
an interim measure to ease people into switching the default? Namely,
if folks are not already in a sparse checkout, then require
`sparse-checkout init/set` to take a
`--set-scope=(sparse|worktree-sparse-history-dense|dense)` flag (which
would set sparse.scope according to the setting given), and throw an
error if the flag is not provided? That error would be a great place
to warn folks that the default may change in the future, and get them
used to specifying what they want so that the eventual default switch
is seamless for them.
=== Implementation Goals/Plans ===
* Get buy-in on this document in general.
* Figure out answers to the 'Implementation Questions' sections (above)
* Fix bugs in the 'Known bugs' section (below)
* Provide some kind of method for backfilling the blobs within the sparse
specification in a partial clone
[Below here is kind of spitballing since the first two haven't been resolved]
* update-index: flip the default to --no-ignore-skip-worktree-entries,
nuke this stupid "Oh, there's a bug? Let me add a flag to let users
request that they not trigger this bug." flag
* Flags & Config
* Make `--sparse` in add/rm/mv a deprecated alias for `--scope=all`
* Make `--ignore-skip-worktree-bits` in checkout-index/checkout/restore
a deprecated alias for `--scope=all`
* Create config option (sparse.scope?), tie it to the "Cliff notes"
overview
* Add --scope=sparse (and --scope=all) flag to each of the history querying
commands. IMPORTANT: make sure diff machinery changes don't mess with
format-patch, fast-export, etc.
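To make the "deprecated alias" idea in the plan above concrete, here is
a hedged Python sketch using `argparse`. It only illustrates how an old
`--sparse` flag could be folded into a proposed `--scope` option; the
`toy-add` program name and the `parse` helper are hypothetical, and
nothing here reflects how Git's own option parser is or will be written.

```python
import argparse
import warnings


def build_parser():
    """Build a toy parser with the proposed option and the legacy flag."""
    parser = argparse.ArgumentParser(prog="toy-add")
    parser.add_argument("--scope", choices=["sparse", "all"], default="sparse")
    # Legacy spelling, kept only so that existing scripts keep working.
    parser.add_argument("--sparse", dest="legacy_sparse", action="store_true",
                        help="deprecated alias for --scope=all")
    parser.add_argument("paths", nargs="*")
    return parser


def parse(argv):
    """Parse argv and translate the legacy flag into the new option."""
    args = build_parser().parse_args(argv)
    if args.legacy_sparse:
        warnings.warn("--sparse is deprecated; use --scope=all",
                      DeprecationWarning)
        args.scope = "all"
    return args
```

With this shape, `parse(["--sparse", "file"])` and
`parse(["--scope=all", "file"])` end up with the same `scope` value, so
the rest of a command would only ever need to look at `scope`.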
=== Known bugs ===
This list used to be a lot longer (see e.g. [1,2,3,4,5,6,7,8,9]), but we've
been working on it.
0. Behavior A is not well supported in Git. (Behavior B didn't use to
be either, but was the easier of the two to implement.)
1. am and apply:
apply, without `--index` or `--cached`, relies on files being present
in the working copy, and also writes to them unconditionally. As
such, it should first check for the files' presence, and if found to
be SKIP_WORKTREE, then clear the bit and vivify the paths, then do
its work. Currently, it just throws an error.
apply, with either `--cached` or `--index`, will not preserve the
SKIP_WORKTREE bit. This is fine if the file has conflicts, but
otherwise SKIP_WORKTREE bits should be preserved for --cached and
probably also for --index.
am, if there are no conflicts, will vivify files and fail to preserve
the SKIP_WORKTREE bit. If there are conflicts and `-3` is not
specified, it will vivify files and then complain the patch doesn't
apply. If there are conflicts and `-3` is specified, it will vivify
files and then complain that those vivified files would be
overwritten by merge.
2. reset --hard:
reset --hard provides a confusing error message (it works correctly, but
misleads the user into believing it didn't):
$ touch addme
$ git add addme
$ git ls-files -t
H addme
H tracked
S tracked-but-maybe-skipped
$ git reset --hard # usually works great
error: Path 'addme' not uptodate; will not remove from working tree.
HEAD is now at bdbbb6f third
$ git ls-files -t
H tracked
S tracked-but-maybe-skipped
$ ls -1
tracked
`git reset --hard` DID remove addme from the index and the working tree, contrary
to the error message, but in line with how reset --hard should behave.
3. read-tree
`read-tree` doesn't apply the 'SKIP_WORKTREE' bit to *any* of the
entries it reads into the index, resulting in all your files suddenly
appearing to be "deleted".
4. Checkout, restore:
These commands do not handle path & revision arguments appropriately:
$ ls
tracked
$ git ls-files -t
H tracked
S tracked-but-maybe-skipped
$ git status --porcelain
$ git checkout -- '*skipped'
error: pathspec '*skipped' did not match any file(s) known to git
$ git ls-files -- '*skipped'
tracked-but-maybe-skipped
$ git checkout HEAD -- '*skipped'
error: pathspec '*skipped' did not match any file(s) known to git
$ git ls-tree HEAD | grep skipped
100644 blob 276f5a64354b791b13840f02047738c77ad0584f tracked-but-maybe-skipped
$ git status --porcelain
$ git checkout HEAD~1 -- '*skipped'
$ git ls-files -t
H tracked
H tracked-but-maybe-skipped
$ git status --porcelain
M tracked-but-maybe-skipped
$ git checkout HEAD -- '*skipped'
$ git status --porcelain
$
Note that checkout without a revision (or restore --staged) fails to
find a file to restore from the index, even though ls-files shows
such a file certainly exists.
Similar issues occur with HEAD (--source=HEAD in restore's case),
but suddenly works when HEAD~1 is specified. And then after that it
will work with HEAD specified, even though it didn't before.
Directories are also an issue:
$ git sparse-checkout set nomatches
$ git status
On branch main
You are in a sparse checkout with 0% of tracked files present.
nothing to commit, working tree clean
$ git checkout .
error: pathspec '.' did not match any file(s) known to git
$ git checkout HEAD~1 .
Updated 1 path from 58916d9
$ git ls-files -t
S tracked
H tracked-but-maybe-skipped
5. checkout and restore --staged, continued:
These commands do not correctly scope operations to the sparse
specification, and make it worse by not setting important SKIP_WORKTREE
bits:
$ git restore --source OLDREV --staged outside-sparse-cone/
$ git status --porcelain
MD outside-sparse-cone/file1
MD outside-sparse-cone/file2
MD outside-sparse-cone/file3
We can add a --scope=all mode to `git restore` to let it operate outside
the sparse specification, but then it will be important to set the
SKIP_WORKTREE bits appropriately.
6. Performance issues; see:
https://lore.kernel.org/git/CABPp-BEkJQoKZsQGCYioyga_uoDQ6iBeW+FKr8JhyuuTMK1RDw@mail.gmail.com/
=== Reference Emails ===
Emails that detail various bugs we've had in sparse-checkout:
[1] (Original descriptions of behavior A & behavior B)
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[2] (Fix stash applications in sparse checkouts; bugs from behavioral differences)
https://lore.kernel.org/git/ccfedc7140dbf63ba26a15f93bd3885180b26517.1606861519.git.gitgitgadget@gmail.com/
[3] (Present-despite-skipped entries)
https://lore.kernel.org/git/11d46a399d26c913787b704d2b7169cafc28d639.1642175983.git.gitgitgadget@gmail.com/
[4] (Clone --no-checkout interaction)
https://lore.kernel.org/git/pull.801.v2.git.git.1591324899170.gitgitgadget@gmail.com/
[5] (The need for update_sparsity() and avoiding `read-tree -mu HEAD`)
https://lore.kernel.org/git/3a1f084641eb47515b5a41ed4409a36128913309.1585270142.git.gitgitgadget@gmail.com/
[6] (SKIP_WORKTREE is advisory, not mandatory)
https://lore.kernel.org/git/844306c3e86ef67591cc086decb2b760e7d710a3.1585270142.git.gitgitgadget@gmail.com/
[7] (`worktree add` should copy sparsity settings from current worktree)
https://lore.kernel.org/git/c51cb3714e7b1d2f8c9370fe87eca9984ff4859f.1644269584.git.gitgitgadget@gmail.com/
[8] (Avoid negative surprises in add, rm, and mv)
https://lore.kernel.org/git/cover.1617914011.git.matheus.bernardino@usp.br/
https://lore.kernel.org/git/pull.1018.v4.git.1632497954.gitgitgadget@gmail.com/
[9] (Move from out-of-cone to in-cone)
https://lore.kernel.org/git/20220630023737.473690-6-shaoxuan.yuan02@gmail.com/
https://lore.kernel.org/git/20220630023737.473690-4-shaoxuan.yuan02@gmail.com/
[10] (Unnecessarily downloading objects outside sparse specification)
https://lore.kernel.org/git/CAOLTT8QfwOi9yx_qZZgyGa8iL8kHWutEED7ok_jxwTcYT_hf9Q@mail.gmail.com/
[11] (Stolee's comments on high-level usecases)
https://lore.kernel.org/git/1a1e33f6-3514-9afc-0a28-5a6b85bd8014@gmail.com/
[12] Others commenting on eventually switching default to behavior A:
* https://lore.kernel.org/git/xmqqh719pcoo.fsf@gitster.g/
* https://lore.kernel.org/git/xmqqzgeqw0sy.fsf@gitster.g/
* https://lore.kernel.org/git/a86af661-cf58-a4e5-0214-a67d3a794d7e@github.com/
[13] Previous config name suggestion and description
* https://lore.kernel.org/git/CABPp-BE6zW0nJSStcVU=_DoDBnPgLqOR8pkTXK3dW11=T01OhA@mail.gmail.com/
[14] Tangential issue: switch to cone mode as default sparse specification mechanism:
https://lore.kernel.org/git/a1b68fd6126eb341ef3637bb93fedad4309b36d0.1650594746.git.gitgitgadget@gmail.com/
[15] Lengthy email on grep behavior, covering what should be searched:
* https://lore.kernel.org/git/CABPp-BGVO3QdbfE84uF_3QDF0-y2iHHh6G5FAFzNRfeRitkuHw@mail.gmail.com/
[16] Email explaining sparsity patterns vs. SKIP_WORKTREE and history operations,
search for the parenthetical comment starting "We do not check".
https://lore.kernel.org/git/CABPp-BFsCPPNOZ92JQRJeGyNd0e-TCW-LcLyr0i_+VSQJP+GCg@mail.gmail.com/
[17] https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/


@ -133,10 +133,6 @@ Issues of note:
you are using libcurl older than 7.34.0. Otherwise you can use
NO_OPENSSL without losing git-imap-send.
By default, git uses OpenSSL for SHA1 but it will use its own
library (inspired by Mozilla's) with either NO_OPENSSL or
BLK_SHA1.
- "libcurl" library is used for fetching and pushing
repositories over http:// or https://, as well as by
git-imap-send if the curl version is >= 7.34.0. If you do

Makefile

@ -4,8 +4,20 @@ all::
# Import tree-wide shared Makefile behavior and libraries
include shared.mak
# == Makefile defines ==
#
# These defines change the behavior of the Makefile itself, but have
# no impact on what it builds:
#
# Define V=1 to have a more verbose compile.
#
# == Portability and optional library defines ==
#
# These defines indicate what Git can expect from the OS, what
# libraries are available etc. Much of this is auto-detected in
# config.mak.uname, or in configure.ac when using the optional "make
# configure && ./configure" (see INSTALL).
#
# Define SHELL_PATH to a POSIX shell if your /bin/sh is broken.
#
# Define SANE_TOOL_PATH to a colon-separated list of paths to prepend
@ -30,68 +42,8 @@ include shared.mak
#
# Define NO_OPENSSL environment variable if you do not have OpenSSL.
#
# Define USE_LIBPCRE if you have and want to use libpcre. Various
# commands such as log and grep offer runtime options to use
# Perl-compatible regular expressions instead of standard or extended
# POSIX regular expressions.
#
# Only libpcre version 2 is supported. USE_LIBPCRE2 is a synonym for
# USE_LIBPCRE, support for the old USE_LIBPCRE1 has been removed.
#
# Define LIBPCREDIR=/foo/bar if your PCRE header and library files are
# in /foo/bar/include and /foo/bar/lib directories.
#
# Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
#
# Define NO_CURL if you do not have libcurl installed. git-http-fetch and
# git-http-push are not built, and you cannot use http:// and https://
# transports (neither smart nor dumb).
#
# Define CURLDIR=/foo/bar if your curl header and library files are in
# /foo/bar/include and /foo/bar/lib directories.
#
# Define CURL_CONFIG to curl's configuration program that prints information
# about the library (e.g., its version number). The default is 'curl-config'.
#
# Define CURL_LDFLAGS to specify flags that you need to link when using libcurl,
# if you do not want to rely on the libraries provided by CURL_CONFIG. The
# default value is a result of `curl-config --libs`. An example value for
# CURL_LDFLAGS is as follows:
#
# CURL_LDFLAGS=-lcurl
#
# Define NO_EXPAT if you do not have expat installed. git-http-push is
# not built, and you cannot push using http:// and https:// transports (dumb).
#
# Define EXPATDIR=/foo/bar if your expat header and library files are in
# /foo/bar/include and /foo/bar/lib directories.
#
# Define EXPAT_NEEDS_XMLPARSE_H if you have an old version of expat (e.g.,
# 1.1 or 1.2) that provides xmlparse.h instead of expat.h.
#
# Define NO_GETTEXT if you don't want Git output to be translated.
# A translated Git requires GNU libintl or another gettext implementation,
# plus libintl-perl at runtime.
#
# Define USE_GETTEXT_SCHEME and set it to 'fallthrough', if you don't trust
# the installed gettext translation of the shell scripts output.
#
# Define HAVE_LIBCHARSET_H if you haven't set NO_GETTEXT and you can't
# trust the langinfo.h's nl_langinfo(CODESET) function to return the
# current character set. GNU and Solaris have a nl_langinfo(CODESET),
# FreeBSD can use either, but MinGW and some others need to use
# libcharset.h's locale_charset() instead.
#
# Define CHARSET_LIB to the library you need to link with in order to
# use locale_charset() function. On some platforms this needs to be set to
# -lcharset, on others to -liconv .
#
# Define LIBC_CONTAINS_LIBINTL if your gettext implementation doesn't
# need -lintl when linking.
#
# Define NO_MSGFMT_EXTENDED_OPTIONS if your implementation of msgfmt
# doesn't support GNU extensions like --check and --statistics
#
# Define HAVE_PATHS_H if you have paths.h and want to use the default PATH
# it specifies.
#
@ -152,39 +104,6 @@ include shared.mak
# and do not want to use Apple's CommonCrypto library. This allows you
# to provide your own OpenSSL library, for example from MacPorts.
#
# Define BLK_SHA1 environment variable to make use of the bundled
# optimized C SHA1 routine.
#
# Define DC_SHA1 to unconditionally enable the collision-detecting sha1
# algorithm. This is slower, but may detect attempted collision attacks.
# Takes priority over other *_SHA1 knobs.
#
# Define DC_SHA1_EXTERNAL in addition to DC_SHA1 if you want to build / link
# git with the external SHA1 collision-detect library.
# Without this option, i.e. the default behavior is to build git with its
# own built-in code (or submodule).
#
# Define DC_SHA1_SUBMODULE in addition to DC_SHA1 to use the
# sha1collisiondetection shipped as a submodule instead of the
# non-submodule copy in sha1dc/. This is an experimental option used
# by the git project to migrate to using sha1collisiondetection as a
# submodule.
#
# Define OPENSSL_SHA1 environment variable when running make to link
# with the SHA1 routine from openssl library.
#
# Define SHA1_MAX_BLOCK_SIZE to limit the amount of data that will be hashed
# in one call to the platform's SHA1_Update(). e.g. APPLE_COMMON_CRYPTO
# wants 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined.
#
# Define BLK_SHA256 to use the built-in SHA-256 routines.
#
# Define NETTLE_SHA256 to use the SHA-256 routines in libnettle.
#
# Define GCRYPT_SHA256 to use the SHA-256 routines in libgcrypt.
#
# Define OPENSSL_SHA256 to use the SHA-256 routines in OpenSSL.
#
# Define NEEDS_CRYPTO_WITH_SSL if you need -lcrypto when using -lssl (Darwin).
#
# Define NEEDS_SSL_WITH_CRYPTO if you need -lssl when using -lcrypto (Darwin).
@ -490,6 +409,151 @@ include shared.mak
# to the "<name>" of the corresponding `compat/fsmonitor/fsm-settings-<name>.c`
# that implements the `fsm_os_settings__*()` routines.
#
# === Optional library: libintl ===
#
# Define NO_GETTEXT if you don't want Git output to be translated.
# A translated Git requires GNU libintl or another gettext implementation,
# plus libintl-perl at runtime.
#
# Define USE_GETTEXT_SCHEME and set it to 'fallthrough', if you don't trust
# the installed gettext translation of the shell scripts output.
#
# Define HAVE_LIBCHARSET_H if you haven't set NO_GETTEXT and you can't
# trust the langinfo.h's nl_langinfo(CODESET) function to return the
# current character set. GNU and Solaris have a nl_langinfo(CODESET),
# FreeBSD can use either, but MinGW and some others need to use
# libcharset.h's locale_charset() instead.
#
# Define CHARSET_LIB to the library you need to link with in order to
# use locale_charset() function. On some platforms this needs to be set to
# -lcharset, on others to -liconv .
#
# Define LIBC_CONTAINS_LIBINTL if your gettext implementation doesn't
# need -lintl when linking.
#
# Define NO_MSGFMT_EXTENDED_OPTIONS if your implementation of msgfmt
# doesn't support GNU extensions like --check and --statistics
#
# === Optional library: libexpat ===
#
# Define NO_EXPAT if you do not have expat installed. git-http-push is
# not built, and you cannot push using http:// and https:// transports (dumb).
#
# Define EXPATDIR=/foo/bar if your expat header and library files are in
# /foo/bar/include and /foo/bar/lib directories.
#
# Define EXPAT_NEEDS_XMLPARSE_H if you have an old version of expat (e.g.,
# 1.1 or 1.2) that provides xmlparse.h instead of expat.h.
#
# === Optional library: libcurl ===
#
# Define NO_CURL if you do not have libcurl installed. git-http-fetch and
# git-http-push are not built, and you cannot use http:// and https://
# transports (neither smart nor dumb).
#
# Define CURLDIR=/foo/bar if your curl header and library files are in
# /foo/bar/include and /foo/bar/lib directories.
#
# Define CURL_CONFIG to curl's configuration program that prints information
# about the library (e.g., its version number). The default is 'curl-config'.
#
# Define CURL_LDFLAGS to specify flags that you need to link when using libcurl,
# if you do not want to rely on the libraries provided by CURL_CONFIG. The
# default value is a result of `curl-config --libs`. An example value for
# CURL_LDFLAGS is as follows:
#
# CURL_LDFLAGS=-lcurl
#
# === Optional library: libpcre2 ===
#
# Define USE_LIBPCRE if you have and want to use libpcre. Various
# commands such as log and grep offer runtime options to use
# Perl-compatible regular expressions instead of standard or extended
# POSIX regular expressions.
#
# Only libpcre version 2 is supported. USE_LIBPCRE2 is a synonym for
# USE_LIBPCRE, support for the old USE_LIBPCRE1 has been removed.
#
# Define LIBPCREDIR=/foo/bar if your PCRE header and library files are
# in /foo/bar/include and /foo/bar/lib directories.
#
# == SHA-1 and SHA-256 defines ==
#
# === SHA-1 backend ===
#
# ==== Security ====
#
# Due to the SHAttered (https://shattered.io) attack vector on SHA-1
# it's strongly recommended to use the sha1collisiondetection
# counter-cryptanalysis library for SHA-1 hashing.
#
# If you know that you can trust the repository contents, or potential
# SHA-1 attacks are otherwise mitigated, the other backends listed in
# "SHA-1 implementations" are faster than sha1collisiondetection.
#
# ==== Default SHA-1 backend ====
#
# If no *_SHA1 backend is picked, the first supported one listed in
# "SHA-1 implementations" will be picked.
#
# ==== Options common to all SHA-1 implementations ====
#
# Define SHA1_MAX_BLOCK_SIZE to limit the amount of data that will be hashed
# in one call to the platform's SHA1_Update(). e.g. APPLE_COMMON_CRYPTO
# wants 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined.
#
# ==== SHA-1 implementations ====
#
# Define OPENSSL_SHA1 to link to the SHA-1 routines from the OpenSSL
# library.
#
# Define BLK_SHA1 to make use of optimized C SHA-1 routines bundled
# with git (in the block-sha1/ directory).
#
# Define NO_APPLE_COMMON_CRYPTO on OSX to opt-out of using the
# "APPLE_COMMON_CRYPTO" backend for SHA-1, which is currently the
# default on that OS. On macOS 10.4 (Tiger) or older,
# NO_APPLE_COMMON_CRYPTO is defined by default.
#
# If you don't enable any of the *_SHA1 settings in this section, Git will
# default to its built-in sha1collisiondetection library, which is a
# collision-detecting sha1. This is slower, but may detect attempted
# collision attacks.
#
# ==== Options for the sha1collisiondetection library ====
#
# Define DC_SHA1_EXTERNAL if you want to build / link
# git with the external SHA1 collision-detect library.
# Without this option (the default), git is built with its
# own built-in code (or submodule).
#
# Define DC_SHA1_SUBMODULE to use the
# sha1collisiondetection shipped as a submodule instead of the
# non-submodule copy in sha1dc/. This is an experimental option used
# by the git project to migrate to using sha1collisiondetection as a
# submodule.
#
# === SHA-256 backend ===
#
# ==== Security ====
#
# Unlike SHA-1, the SHA-256 algorithm does not suffer from any known
# vulnerabilities, so any implementation will do.
#
# ==== SHA-256 implementations ====
#
# Define OPENSSL_SHA256 to use the SHA-256 routines in OpenSSL.
#
# Define NETTLE_SHA256 to use the SHA-256 routines in libnettle.
#
# Define GCRYPT_SHA256 to use the SHA-256 routines in libgcrypt.
#
# If you don't enable any of the *_SHA256 settings in this section, Git
# will default to its built-in sha256 implementation.
#
# == DEVELOPER defines ==
#
# Define DEVELOPER to enable more compiler warnings. Compiler version
# and family are auto detected, but could be overridden by defining
# COMPILER_FEATURES (see config.mak.dev). You can still set
@ -723,6 +787,7 @@ TEST_BUILTINS_OBJS += test-advise.o
TEST_BUILTINS_OBJS += test-bitmap.o
TEST_BUILTINS_OBJS += test-bloom.o
TEST_BUILTINS_OBJS += test-bundle-uri.o
TEST_BUILTINS_OBJS += test-cache-tree.o
TEST_BUILTINS_OBJS += test-chmtime.o
TEST_BUILTINS_OBJS += test-config.o
TEST_BUILTINS_OBJS += test-crontab.o
@ -1826,7 +1891,6 @@ ifdef APPLE_COMMON_CRYPTO
COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
BASIC_CFLAGS += -DSHA1_APPLE
else
DC_SHA1 := YesPlease
BASIC_CFLAGS += -DSHA1_DC
LIB_OBJS += sha1dc_git.o
ifdef DC_SHA1_EXTERNAL
@ -2989,7 +3053,6 @@ GIT-BUILD-OPTIONS: FORCE
@echo NO_REGEX=\''$(subst ','\'',$(subst ','\'',$(NO_REGEX)))'\' >>$@+
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
@echo SANITIZE_ADDRESS=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_ADDRESS)))'\' >>$@+
@echo X=\'$(X)\' >>$@+


@ -150,7 +150,7 @@ static int branch_merged(int kind, const char *name,
if (!reference_rev)
reference_rev = head_rev;
merged = in_merge_bases(rev, reference_rev);
merged = reference_rev ? in_merge_bases(rev, reference_rev) : 0;
/*
* After the safety valve is fully redefined to "check with
@ -160,7 +160,7 @@ static int branch_merged(int kind, const char *name,
* a gentle reminder is in order.
*/
if ((head_rev != reference_rev) &&
in_merge_bases(rev, head_rev) != merged) {
(head_rev ? in_merge_bases(rev, head_rev) : 0) != merged) {
if (merged)
warning(_("deleting branch '%s' that has been merged to\n"
" '%s', but not yet merged to HEAD."),
@ -235,11 +235,8 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
}
branch_name_pos = strcspn(fmt, "%");
if (!force) {
if (!force)
head_rev = lookup_commit_reference(the_repository, &head_oid);
if (!head_rev)
die(_("Couldn't look up commit object for HEAD"));
}
for (i = 0; i < argc; i++, strbuf_reset(&bname)) {
char *target = NULL;

View File

@ -249,6 +249,10 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
if (opts.debug_unpack)
opts.fn = debug_merge;
/* If we're going to prime_cache_tree later, skip cache tree update */
if (nr_trees == 1 && !opts.prefix)
opts.skip_cache_tree_update = 1;
cache_tree_free(&active_cache_tree);
for (i = 0; i < nr_trees; i++) {
struct tree *tree = trees[i];

View File

@ -32,7 +32,6 @@ static int write_bitmaps = -1;
static int use_delta_islands;
static int run_update_server_info = 1;
static char *packdir, *packtmp_name, *packtmp;
static char *cruft_expiration;
static const char *const git_repack_usage[] = {
N_("git repack [<options>]"),
@ -150,7 +149,8 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name)
}
static void prepare_pack_objects(struct child_process *cmd,
const struct pack_objects_args *args)
const struct pack_objects_args *args,
const char *out)
{
strvec_push(&cmd->args, "pack-objects");
if (args->window)
@ -173,7 +173,7 @@ static void prepare_pack_objects(struct child_process *cmd,
strvec_push(&cmd->args, "--quiet");
if (delta_base_offset)
strvec_push(&cmd->args, "--delta-base-offset");
strvec_push(&cmd->args, packtmp);
strvec_push(&cmd->args, out);
cmd->git_cmd = 1;
cmd->out = -1;
}
@ -241,7 +241,7 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
FILE *out;
struct strbuf line = STRBUF_INIT;
prepare_pack_objects(&cmd, args);
prepare_pack_objects(&cmd, args, packtmp);
cmd.in = -1;
/*
@ -657,7 +657,9 @@ static void remove_redundant_bitmaps(struct string_list *include,
}
static int write_cruft_pack(const struct pack_objects_args *args,
const char *destination,
const char *pack_prefix,
const char *cruft_expiration,
struct string_list *names,
struct string_list *existing_packs,
struct string_list *existing_kept_packs)
@ -667,8 +669,10 @@ static int write_cruft_pack(const struct pack_objects_args *args,
struct string_list_item *item;
FILE *in, *out;
int ret;
const char *scratch;
int local = skip_prefix(destination, packdir, &scratch);
prepare_pack_objects(&cmd, args);
prepare_pack_objects(&cmd, args, destination);
strvec_push(&cmd.args, "--cruft");
if (cruft_expiration)
@ -693,6 +697,10 @@ static int write_cruft_pack(const struct pack_objects_args *args,
* By the time it is read here, it contains only the pack(s)
* that were just written, which is exactly the set of packs we
* want to consider kept.
*
* If `--expire-to` is given, the double-use served by `names`
* ensures that the pack written to `--expire-to` excludes any
* objects contained in the cruft pack.
*/
in = xfdopen(cmd.in, "w");
for_each_string_list_item(item, names)
@ -710,9 +718,14 @@ static int write_cruft_pack(const struct pack_objects_args *args,
if (line.len != the_hash_algo->hexsz)
die(_("repack: Expecting full hex object ID lines only "
"from pack-objects."));
item = string_list_append(names, line.buf);
item->util = populate_pack_exts(line.buf);
/*
* avoid putting packs written outside of the repository in the
* list of names
*/
if (local) {
item = string_list_append(names, line.buf);
item->util = populate_pack_exts(line.buf);
}
}
fclose(out);
@ -744,6 +757,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
struct pack_objects_args cruft_po_args = {NULL};
int geometric_factor = 0;
int write_midx = 0;
const char *cruft_expiration = NULL;
const char *expire_to = NULL;
struct option builtin_repack_options[] = {
OPT_BIT('a', NULL, &pack_everything,
@ -793,6 +808,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
N_("find a geometric progression with factor <N>")),
OPT_BOOL('m', "write-midx", &write_midx,
N_("write a multi-pack index of the resulting packs")),
OPT_STRING(0, "expire-to", &expire_to, N_("dir"),
N_("pack prefix to store a pack containing pruned objects")),
OPT_END()
};
@ -858,7 +875,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
split_pack_geometry(geometry, geometric_factor);
}
prepare_pack_objects(&cmd, &po_args);
prepare_pack_objects(&cmd, &po_args, packtmp);
show_progress = !po_args.quiet && isatty(2);
@ -984,11 +1001,45 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
cruft_po_args.local = po_args.local;
cruft_po_args.quiet = po_args.quiet;
ret = write_cruft_pack(&cruft_po_args, pack_prefix, &names,
ret = write_cruft_pack(&cruft_po_args, packtmp, pack_prefix,
cruft_expiration, &names,
&existing_nonkept_packs,
&existing_kept_packs);
if (ret)
return ret;
if (delete_redundant && expire_to) {
/*
* If `--expire-to` is given with `-d`, it's possible
* that we're about to prune some objects. With cruft
* packs, pruning is implicit: any objects from existing
* packs that weren't picked up by new packs are removed
* when their packs are deleted.
*
* Generate an additional cruft pack, with one twist:
* `names` now includes the name of the cruft pack
* written in the previous step. So the contents of
* _this_ cruft pack exclude everything contained in the
* existing cruft pack (that is, all of the unreachable
* objects which are no older than
* `--cruft-expiration`).
*
* To make this work, cruft_expiration must become NULL
* so that this cruft pack doesn't actually prune any
* objects. If it were non-NULL, this call would always
* generate an empty pack (since every object not in the
* cruft pack generated above will have an mtime older
* than the expiration).
*/
ret = write_cruft_pack(&cruft_po_args, expire_to,
pack_prefix,
NULL,
&names,
&existing_nonkept_packs,
&existing_kept_packs);
if (ret)
return ret;
}
}
string_list_sort(&names);
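
Reviewer note: the `local` flag in write_cruft_pack() decides whether a freshly written pack is recorded in `names`, based on whether `destination` starts with `packdir`. A minimal sketch of that check follows. It assumes a plain string-prefix comparison; `starts_with_prefix` and `pack_is_local` are hypothetical names for illustration (git's own skip_prefix() additionally reports the remainder of the string).

```c
#include <string.h>

/*
 * Hypothetical stand-in for git's skip_prefix(): returns 1 when `str`
 * begins with `prefix`, 0 otherwise.
 */
static int starts_with_prefix(const char *str, const char *prefix)
{
	return !strncmp(str, prefix, strlen(prefix));
}

/*
 * Hypothetical helper: a pack counts as "local" (and so would be
 * recorded in the list of names) only when its destination lies under
 * the repository's pack directory.
 */
static int pack_is_local(const char *destination, const char *packdir)
{
	return starts_with_prefix(destination, packdir);
}
```

Under this sketch, a destination such as "expired.git/objects/pack/pack" is not local to ".git/objects/pack", so the pack written for `--expire-to` would stay out of `names`.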


@ -73,9 +73,11 @@ static int reset_index(const char *ref, const struct object_id *oid, int reset_t
case HARD:
opts.update = 1;
opts.reset = UNPACK_RESET_OVERWRITE_UNTRACKED;
opts.skip_cache_tree_update = 1;
break;
case MIXED:
opts.reset = UNPACK_RESET_PROTECT_UNTRACKED;
opts.skip_cache_tree_update = 1;
/* but opts.update=0, so working tree not updated */
break;
default:


@ -260,7 +260,7 @@ macos-latest)
else
MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python2)"
MAKEFLAGS="$MAKEFLAGS NO_APPLE_COMMON_CRYPTO=NoThanks"
MAKEFLAGS="$MAKEFLAGS DC_SHA1=YesPlease NO_OPENSSL=NoThanks"
MAKEFLAGS="$MAKEFLAGS NO_OPENSSL=NoThanks"
fi
;;
esac


@ -1025,7 +1025,6 @@ set(NO_PERL )
set(NO_PTHREADS )
set(NO_PYTHON )
set(PAGER_ENV "LESS=FRX LV=-c")
set(DC_SHA1 YesPlease)
set(RUNTIME_PREFIX true)
set(NO_GETTEXT )
@ -1061,7 +1060,6 @@ file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "NO_PERL='${NO_PERL}'\n")
file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "NO_PTHREADS='${NO_PTHREADS}'\n")
file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "NO_UNIX_SOCKETS='${NO_UNIX_SOCKETS}'\n")
file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "PAGER_ENV='${PAGER_ENV}'\n")
file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "DC_SHA1='${DC_SHA1}'\n")
file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "X='${EXE_EXTENSION}'\n")
file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "NO_GETTEXT='${NO_GETTEXT}'\n")
file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "RUNTIME_PREFIX='${RUNTIME_PREFIX}'\n")


@ -128,6 +128,7 @@ int reset_head(struct repository *r, const struct reset_head_opts *opts)
unpack_tree_opts.update = 1;
unpack_tree_opts.merge = 1;
unpack_tree_opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
unpack_tree_opts.skip_cache_tree_update = 1;
init_checkout_metadata(&unpack_tree_opts.meta, switch_to_branch, oid, NULL);
if (reset_hard)
unpack_tree_opts.reset = UNPACK_RESET_PROTECT_UNTRACKED;


@ -3748,6 +3748,7 @@ static int do_reset(struct repository *r,
unpack_tree_opts.merge = 1;
unpack_tree_opts.update = 1;
unpack_tree_opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
unpack_tree_opts.skip_cache_tree_update = 1;
init_checkout_metadata(&unpack_tree_opts.meta, name, &oid, NULL);
if (repo_read_index_unmerged(r)) {
@ -4128,11 +4129,14 @@ static int write_update_refs_state(struct string_list *refs_to_oids)
struct string_list_item *item;
char *path;
if (!refs_to_oids->nr)
return 0;
path = rebase_path_update_refs(the_repository->gitdir);
if (!refs_to_oids->nr) {
if (unlink(path) && errno != ENOENT)
result = error_errno(_("could not unlink: %s"), path);
goto cleanup;
}
if (safe_create_leading_directories(path)) {
result = error(_("unable to create leading directories of %s"),
path);
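
Reviewer note: the hunk above replaces an early return with removal of the state file when there are no refs left to record. A minimal sketch of the "unlink, but tolerate a missing file" pattern follows, assuming a POSIX system; `remove_state_file` is a hypothetical name, and the plain fprintf() stands in for git's error_errno().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not git's API: remove `path` if it exists.
 * Returns 0 when the file is gone afterwards (including the case where
 * it never existed) and -1 on any other unlink(2) failure.
 */
static int remove_state_file(const char *path)
{
	if (unlink(path) && errno != ENOENT) {
		fprintf(stderr, "could not unlink: %s: %s\n",
			path, strerror(errno));
		return -1;
	}
	return 0;
}
```

Treating ENOENT as success keeps the operation idempotent, which matters when the todo list is edited more than once during a rebase.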


@ -17,6 +17,7 @@ void git_SHA1DCInit(SHA1_CTX *);
void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
#define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */
#define platform_SHA_CTX SHA1_CTX
#define platform_SHA1_Init git_SHA1DCInit
#define platform_SHA1_Update git_SHA1DCUpdate


@ -0,0 +1,64 @@
#include "test-tool.h"
#include "cache.h"
#include "tree.h"
#include "cache-tree.h"
#include "parse-options.h"
static char const * const test_cache_tree_usage[] = {
N_("test-tool cache-tree <options> (control|prime|update)"),
NULL
};
int cmd__cache_tree(int argc, const char **argv)
{
struct object_id oid;
struct tree *tree;
int empty = 0;
int invalidate_qty = 0;
int i;
struct option options[] = {
OPT_BOOL(0, "empty", &empty,
N_("clear the cache tree before each iteration")),
OPT_INTEGER_F(0, "invalidate", &invalidate_qty,
N_("number of entries in the cache tree to invalidate (default 0)"),
PARSE_OPT_NONEG),
OPT_END()
};
setup_git_directory();
argc = parse_options(argc, argv, NULL, options, test_cache_tree_usage, 0);
if (read_cache() < 0)
die(_("unable to read index file"));
oidcpy(&oid, &the_index.cache_tree->oid);
tree = parse_tree_indirect(&oid);
if (!tree)
die(_("not a tree object: %s"), oid_to_hex(&oid));
if (empty) {
/* clear the cache tree & allocate a new one */
cache_tree_free(&the_index.cache_tree);
the_index.cache_tree = cache_tree();
} else if (invalidate_qty) {
/* invalidate the specified number of unique paths */
float f_interval = (float)the_index.cache_nr / invalidate_qty;
int interval = f_interval < 1.0 ? 1 : (int)f_interval;
for (i = 0; i < invalidate_qty && i * interval < the_index.cache_nr; i++)
cache_tree_invalidate_path(&the_index, the_index.cache[i * interval]->name);
}
if (argc != 1)
usage_with_options(test_cache_tree_usage, options);
else if (!strcmp(argv[0], "prime"))
prime_cache_tree(the_repository, &the_index, tree);
else if (!strcmp(argv[0], "update"))
cache_tree_update(&the_index, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);
/* use "control" subcommand to specify no-op */
else if (!!strcmp(argv[0], "control"))
die(_("Unhandled subcommand '%s'"), argv[0]);
return 0;
}
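
Reviewer note: the `--invalidate` branch spreads the requested number of invalidations evenly across the index. A minimal sketch of that arithmetic in isolation follows. The helper names are hypothetical, and the sketch uses `double` and `size_t` where the helper above uses `float` and `int`, so results may differ for very large indexes.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of git: the stride between invalidated
 * index entries, following the arithmetic in cmd__cache_tree().
 */
static size_t invalidation_stride(size_t cache_nr, size_t qty)
{
	double f_interval;

	if (!qty)
		return 0; /* nothing requested; the original skips this branch */
	f_interval = (double)cache_nr / (double)qty;
	return f_interval < 1.0 ? 1 : (size_t)f_interval;
}

/*
 * Hypothetical helper: how many entries the invalidation loop would
 * visit. The loop stops at whichever comes first: `qty` iterations or
 * the end of the index.
 */
static size_t invalidation_count(size_t cache_nr, size_t qty)
{
	size_t stride = invalidation_stride(cache_nr, qty);
	size_t i, visited = 0;

	for (i = 0; i < qty && i * stride < cache_nr; i++)
		visited++;
	return visited;
}
```

With 100 index entries, asking for 2 invalidations gives a stride of 50, and asking for 50 gives a stride of 2. Asking for more invalidations than there are entries clamps the stride to 1, so at most every entry is visited once.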


@ -5,3 +5,11 @@ int cmd__sha1(int ac, const char **av)
{
return cmd_hash_impl(ac, av, GIT_HASH_SHA1);
}
int cmd__sha1_is_sha1dc(int argc UNUSED, const char **argv UNUSED)
{
#ifdef platform_SHA_IS_SHA1DC
return 0;
#endif
return 1;
}
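
Reviewer note: the new helper turns a compile-time fact into a process exit status that the test suite can probe. A minimal sketch of the same pattern follows. The `#define` is placed in the sketch itself as an assumption (in the patch it comes from the SHA-1 backend header), and `sha1_is_sha1dc_status` is a hypothetical name.

```c
/* Assumption for this sketch: the selected hash backend defines this marker. */
#define platform_SHA_IS_SHA1DC

/*
 * Hypothetical helper: map the presence of the marker onto a shell-style
 * status, 0 for "yes" and 1 for "no".
 */
static int sha1_is_sha1dc_status(void)
{
#ifdef platform_SHA_IS_SHA1DC
	return 0;
#else
	return 1;
#endif
}
```

Probing the built binary this way avoids relying on a build variable such as DC_SHA1 being exported to the test environment.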


@ -14,6 +14,7 @@ static struct test_cmd cmds[] = {
{ "bitmap", cmd__bitmap },
{ "bloom", cmd__bloom },
{ "bundle-uri", cmd__bundle_uri },
{ "cache-tree", cmd__cache_tree },
{ "chmtime", cmd__chmtime },
{ "config", cmd__config },
{ "crontab", cmd__crontab },
@ -73,6 +74,7 @@ static struct test_cmd cmds[] = {
{ "scrap-cache-tree", cmd__scrap_cache_tree },
{ "serve-v2", cmd__serve_v2 },
{ "sha1", cmd__sha1 },
{ "sha1-is-sha1dc", cmd__sha1_is_sha1dc },
{ "sha256", cmd__sha256 },
{ "sigchain", cmd__sigchain },
{ "simple-ipc", cmd__simple_ipc },


@ -8,6 +8,7 @@ int cmd__advise_if_enabled(int argc, const char **argv);
int cmd__bitmap(int argc, const char **argv);
int cmd__bloom(int argc, const char **argv);
int cmd__bundle_uri(int argc, const char **argv);
int cmd__cache_tree(int argc, const char **argv);
int cmd__chmtime(int argc, const char **argv);
int cmd__config(int argc, const char **argv);
int cmd__crontab(int argc, const char **argv);
@ -66,6 +67,7 @@ int cmd__run_command(int argc, const char **argv);
int cmd__scrap_cache_tree(int argc, const char **argv);
int cmd__serve_v2(int argc, const char **argv);
int cmd__sha1(int argc, const char **argv);
int cmd__sha1_is_sha1dc(int argc, const char **argv);
int cmd__oid_array(int argc, const char **argv);
int cmd__sha256(int argc, const char **argv);
int cmd__sigchain(int argc, const char **argv);

View File

@ -49,6 +49,14 @@ test_perf "read-tree br_base br_ballast ($nr_files)" '
git read-tree -n -m br_base br_ballast
'
test_perf "read-tree br_ballast_plus_1 ($nr_files)" '
# Run read-tree 100 times for clearer performance results & comparisons
for i in $(test_seq 100)
do
git read-tree -n -m br_ballast_plus_1 || return 1
done
'
test_perf "switch between br_base br_ballast ($nr_files)" '
git checkout -q br_base &&
git checkout -q br_ballast

t/perf/p0090-cache-tree.sh Executable file

@ -0,0 +1,36 @@
#!/bin/sh
test_description="Tests performance of cache tree update operations"
. ./perf-lib.sh
test_perf_large_repo
test_checkout_worktree
count=100
test_expect_success 'setup cache tree' '
git write-tree
'
test_cache_tree () {
test_perf "$1, $3" "
for i in \$(test_seq $count)
do
test-tool cache-tree $4 $2
done
"
}
test_cache_tree_update_functions () {
test_cache_tree 'no-op' 'control' "$1" "$2"
test_cache_tree 'prime_cache_tree' 'prime' "$1" "$2"
test_cache_tree 'cache_tree_update' 'update' "$1" "$2"
}
test_cache_tree_update_functions "clean" ""
test_cache_tree_update_functions "invalidate 2" "--invalidate 2"
test_cache_tree_update_functions "invalidate 50" "--invalidate 50"
test_cache_tree_update_functions "empty" "--empty"
test_done

t/perf/p7102-reset.sh Executable file

@ -0,0 +1,21 @@
#!/bin/sh
test_description='performance of reset'
. ./perf-lib.sh
test_perf_default_repo
test_checkout_worktree
test_perf 'reset --hard with change in tree' '
base=$(git rev-parse HEAD) &&
test_commit --no-tag A &&
new=$(git rev-parse HEAD) &&
for i in $(test_seq 10)
do
git reset --hard $new &&
git reset --hard $base || return $?
done
'
test_done


@ -6,9 +6,11 @@ TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
TEST_DATA="$TEST_DIRECTORY/t0013"
if test -z "$DC_SHA1"
test_lazy_prereq SHA1_IS_SHA1DC 'test-tool sha1-is-sha1dc'
if ! test_have_prereq SHA1_IS_SHA1DC
then
skip_all='skipping sha1 collision tests, DC_SHA1 not set'
skip_all='skipping sha1 collision tests, not using sha1collisiondetection'
test_done
fi


@ -130,7 +130,8 @@ World
EOF
test_expect_success 'run_command runs in parallel with more jobs available than tasks' '
test-tool run-command run-command-parallel 5 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
test-tool run-command run-command-parallel 5 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>actual &&
test_must_be_empty out &&
test_cmp expect actual
'
@ -141,7 +142,8 @@ test_expect_success 'run_command runs ungrouped in parallel with more jobs avail
'
test_expect_success 'run_command runs in parallel with as many jobs as tasks' '
test-tool run-command run-command-parallel 4 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
test-tool run-command run-command-parallel 4 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>actual &&
test_must_be_empty out &&
test_cmp expect actual
'
@ -152,7 +154,8 @@ test_expect_success 'run_command runs ungrouped in parallel with as many jobs as
'
test_expect_success 'run_command runs in parallel with more tasks than jobs available' '
test-tool run-command run-command-parallel 3 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
test-tool run-command run-command-parallel 3 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>actual &&
test_must_be_empty out &&
test_cmp expect actual
'
@ -172,7 +175,8 @@ asking for a quick stop
EOF
test_expect_success 'run_command is asked to abort gracefully' '
test-tool run-command run-command-abort 3 false 2>actual &&
test-tool run-command run-command-abort 3 false >out 2>actual &&
test_must_be_empty out &&
test_cmp expect actual
'
@ -187,7 +191,8 @@ no further jobs available
EOF
test_expect_success 'run_command outputs ' '
test-tool run-command run-command-no-jobs 3 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
test-tool run-command run-command-no-jobs 3 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>actual &&
test_must_be_empty out &&
test_cmp expect actual
'

View File

@ -19,7 +19,7 @@ test_expect_success 'read-tree in partial clone prefetches in one batch' '
git -C server config uploadpack.allowfilter 1 &&
git -C server config uploadpack.allowanysha1inwant 1 &&
git clone --bare --filter=blob:none "file://$(pwd)/server" client &&
GIT_TRACE_PACKET="$(pwd)/trace" git -C client read-tree $TREE &&
GIT_TRACE_PACKET="$(pwd)/trace" git -C client read-tree $TREE $TREE &&
# "done" marks the end of negotiation (once per fetch). Expect that
# only one fetch occurs.


@ -95,7 +95,7 @@ test_expect_success 'git hook run -- out-of-repo runs excluded' '
test_expect_success 'git -c core.hooksPath=<PATH> hook run' '
mkdir my-hooks &&
write_script my-hooks/test-hook <<-\EOF &&
echo Hook ran $1 >>actual
echo Hook ran $1
EOF
cat >expect <<-\EOF &&


@ -279,6 +279,42 @@ test_expect_success 'git branch -M and -C fail on detached HEAD' '
test_cmp expect err
'
test_expect_success 'git branch -d on orphan HEAD (merged, graph)' '
test_when_finished git checkout main &&
git checkout --orphan orphan &&
test_when_finished "rm -rf .git/objects/commit-graph*" &&
git commit-graph write --reachable &&
git branch --track to-delete main &&
git branch -d to-delete
'
test_expect_success 'git branch -d on orphan HEAD (merged)' '
test_when_finished git checkout main &&
git checkout --orphan orphan &&
git branch --track to-delete main &&
git branch -d to-delete
'
test_expect_success 'git branch -d on orphan HEAD (unmerged)' '
test_when_finished git checkout main &&
git checkout --orphan orphan &&
test_when_finished "git branch -D to-delete" &&
git branch to-delete main &&
test_must_fail git branch -d to-delete 2>err &&
grep "not fully merged" err
'
test_expect_success 'git branch -d on orphan HEAD (unmerged, graph)' '
test_when_finished git checkout main &&
git checkout --orphan orphan &&
test_when_finished "git branch -D to-delete" &&
git branch to-delete main &&
test_when_finished "rm -rf .git/objects/commit-graph*" &&
git commit-graph write --reachable &&
test_must_fail git branch -d to-delete 2>err &&
grep "not fully merged" err
'
test_expect_success 'git branch -v -d t should work' '
git branch t &&
git rev-parse --verify refs/heads/t &&


@ -1964,6 +1964,113 @@ test_expect_success 'respect user edits to update-ref steps' '
test_cmp_rev HEAD refs/heads/no-conflict-branch
'
test_expect_success '--update-refs: all update-ref lines removed' '
git checkout -b test-refs-not-removed no-conflict-branch &&
git branch -f base HEAD~4 &&
git branch -f first HEAD~3 &&
git branch -f second HEAD~3 &&
git branch -f third HEAD~1 &&
git branch -f tip &&
test_commit test-refs-not-removed &&
git commit --amend --fixup first &&
git rev-parse first second third tip no-conflict-branch >expect-oids &&
(
set_cat_todo_editor &&
test_must_fail git rebase -i --update-refs base >todo.raw &&
sed -e "/^update-ref/d" <todo.raw >todo
) &&
(
set_replace_editor todo &&
git rebase -i --update-refs base
) &&
# Ensure refs are not deleted and their OIDs have not changed
git rev-parse first second third tip no-conflict-branch >actual-oids &&
test_cmp expect-oids actual-oids
'
test_expect_success '--update-refs: all update-ref lines removed, then some re-added' '
git checkout -b test-refs-not-removed2 no-conflict-branch &&
git branch -f base HEAD~4 &&
git branch -f first HEAD~3 &&
git branch -f second HEAD~3 &&
git branch -f third HEAD~1 &&
git branch -f tip &&
test_commit test-refs-not-removed2 &&
git commit --amend --fixup first &&
git rev-parse first second third >expect-oids &&
(
set_cat_todo_editor &&
test_must_fail git rebase -i \
--autosquash --update-refs \
base >todo.raw &&
sed -e "/^update-ref/d" <todo.raw >todo
) &&
# Add a break to the end of the todo so we can edit later
echo "break" >>todo &&
(
set_replace_editor todo &&
git rebase -i --autosquash --update-refs base &&
echo "update-ref refs/heads/tip" >todo &&
git rebase --edit-todo &&
git rebase --continue
) &&
# Ensure first/second/third are unchanged, but tip is updated
git rev-parse first second third >actual-oids &&
test_cmp expect-oids actual-oids &&
test_cmp_rev HEAD tip
'
test_expect_success '--update-refs: --edit-todo with no update-ref lines' '
git checkout -b test-refs-not-removed3 no-conflict-branch &&
git branch -f base HEAD~4 &&
git branch -f first HEAD~3 &&
git branch -f second HEAD~3 &&
git branch -f third HEAD~1 &&
git branch -f tip &&
test_commit test-refs-not-removed3 &&
git commit --amend --fixup first &&
git rev-parse first second third tip no-conflict-branch >expect-oids &&
(
set_cat_todo_editor &&
test_must_fail git rebase -i \
--autosquash --update-refs \
base >todo.raw &&
sed -e "/^update-ref/d" <todo.raw >todo
) &&
# Add a break to the beginning of the todo so we can resume with no
# update-ref lines
echo "break" >todo.new &&
cat todo >>todo.new &&
(
set_replace_editor todo.new &&
git rebase -i --autosquash --update-refs base &&
# Make no changes when editing so update-refs is still empty
cat todo >todo.new &&
git rebase --edit-todo &&
git rebase --continue
) &&
# Ensure refs are not deleted and their OIDs have not changed
git rev-parse first second third tip no-conflict-branch >actual-oids &&
test_cmp expect-oids actual-oids
'
test_expect_success '--update-refs: check failed ref update' '
git checkout -B update-refs-error no-conflict-branch &&
git branch -f base HEAD~4 &&


@ -178,6 +178,7 @@ test_expect_success "submodule.recurse option triggers recursive fetch" '
'
test_expect_success "fetch --recurse-submodules -j2 has the same output behaviour" '
test_when_finished "rm -f trace.out" &&
add_submodule_commits &&
(
cd downstream &&
@ -705,15 +706,22 @@ test_expect_success "'fetch.recurseSubmodules=on-demand' works also without .git
test_expect_success 'fetching submodules respects parallel settings' '
git config fetch.recurseSubmodules true &&
test_when_finished "rm -f downstream/trace.out" &&
(
cd downstream &&
GIT_TRACE=$(pwd)/trace.out git fetch &&
grep "1 tasks" trace.out &&
>trace.out &&
GIT_TRACE=$(pwd)/trace.out git fetch --jobs 7 &&
grep "7 tasks" trace.out &&
>trace.out &&
git config submodule.fetchJobs 8 &&
GIT_TRACE=$(pwd)/trace.out git fetch &&
grep "8 tasks" trace.out &&
>trace.out &&
GIT_TRACE=$(pwd)/trace.out git fetch --jobs 9 &&
grep "9 tasks" trace.out &&
>trace.out &&


@ -543,4 +543,125 @@ test_expect_success '-n overrides repack.updateServerInfo=true' '
test_server_info_missing
'
test_expect_success '--expire-to stores pruned objects (now)' '
git init expire-to-now &&
(
cd expire-to-now &&
git branch -M main &&
test_commit base &&
git checkout -b cruft &&
test_commit --no-tag cruft &&
git rev-list --objects --no-object-names main..cruft >moved.raw &&
sort moved.raw >moved.want &&
git rev-list --all --objects --no-object-names >expect.raw &&
sort expect.raw >expect &&
git checkout main &&
git branch -D cruft &&
git reflog expire --all --expire=all &&
git init --bare expired.git &&
git repack -d \
--cruft --cruft-expiration="now" \
--expire-to="expired.git/objects/pack/pack" &&
expired="$(ls expired.git/objects/pack/pack-*.idx)" &&
test_path_is_file "${expired%.idx}.mtimes" &&
# Since the `--cruft-expiration` is "now", the effective
# behavior is to move _all_ unreachable objects out to
# the location in `--expire-to`.
git show-index <$expired >expired.raw &&
cut -d" " -f2 expired.raw | sort >expired.objects &&
git rev-list --all --objects --no-object-names \
>remaining.objects &&
# ...in other words, the combined contents of this
# repository and expired.git should be the same as the
# set of objects we started with.
cat expired.objects remaining.objects | sort >actual &&
test_cmp expect actual &&
# The "moved" objects (i.e., those in expired.git)
# should be the same as the cruft objects which were
# expired in the previous step.
test_cmp moved.want expired.objects
)
'
test_expect_success '--expire-to stores pruned objects (5.minutes.ago)' '
git init expire-to-5.minutes.ago &&
(
cd expire-to-5.minutes.ago &&
git branch -M main &&
test_commit base &&
# Create two classes of unreachable objects, one which
# is older than 5 minutes (stale), and another which is
# newer (recent).
for kind in stale recent
do
git checkout -b $kind main &&
test_commit --no-tag $kind || return 1
done &&
git rev-list --objects --no-object-names main..stale >in &&
stale="$(git pack-objects $objdir/pack/pack <in)" &&
mtime="$(test-tool chmtime --get =-600 $objdir/pack/pack-$stale.pack)" &&
# expect holds the set of objects we expect to find in
# this repository after repacking
git rev-list --objects --no-object-names recent >expect.raw &&
sort expect.raw >expect &&
# moved.want holds the set of objects we expect to find
# in expired.git
git rev-list --objects --no-object-names main..stale >out &&
sort out >moved.want &&
git checkout main &&
git branch -D stale recent &&
git reflog expire --all --expire=all &&
git prune-packed &&
git init --bare expired.git &&
git repack -d \
--cruft --cruft-expiration=5.minutes.ago \
--expire-to="expired.git/objects/pack/pack" &&
# Some of the remaining objects in this repository are
# unreachable, so use `cat-file --batch-all-objects`
# instead of `rev-list` to get their names
git cat-file --batch-all-objects --batch-check="%(objectname)" \
>remaining.objects &&
sort remaining.objects >actual &&
test_cmp expect actual &&
(
cd expired.git &&
expired="$(ls objects/pack/pack-*.mtimes)" &&
test-tool pack-mtimes $(basename $expired) >out &&
cut -d" " -f1 out | sort >../moved.got &&
# Ensure that there are as many objects with the
# expected mtime as were moved to expired.git.
#
# In other words, ensure that the recorded
# mtimes of any moved objects were written
# correctly.
grep " $mtime$" out >matching &&
test_line_count = $(wc -l <../moved.want) matching
) &&
test_cmp moved.want moved.got
)
'
test_done


@ -2043,7 +2043,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
if (!ret) {
if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
cache_tree_verify(the_repository, &o->result);
if (!cache_tree_fully_valid(o->result.cache_tree))
if (!o->skip_cache_tree_update &&
!cache_tree_fully_valid(o->result.cache_tree))
cache_tree_update(&o->result,
WRITE_TREE_SILENT |
WRITE_TREE_REPAIR);
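
Reviewer note: the new condition lets callers that will prime the cache tree themselves opt out of the update at the end of unpack_trees(). A minimal sketch of the decision in isolation follows. The struct and function names are hypothetical, and only the one flag from `struct unpack_trees_options` is modeled.

```c
/* Hypothetical, reduced stand-in for struct unpack_trees_options. */
struct sketch_unpack_opts {
	unsigned int skip_cache_tree_update : 1;
};

/*
 * Hypothetical helper: returns 1 when the cache tree should be updated
 * after a successful unpack. That is the case only when the caller has
 * not opted out and the resulting cache tree is not already fully valid.
 */
static int should_update_cache_tree(const struct sketch_unpack_opts *o,
				    int cache_tree_fully_valid)
{
	return !o->skip_cache_tree_update && !cache_tree_fully_valid;
}
```

Checking the flag first means the validity walk over the cache tree is skipped entirely for callers that opted out.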


@ -71,7 +71,8 @@ struct unpack_trees_options {
quiet,
exiting_early,
show_all_errors,
dry_run;
dry_run,
skip_cache_tree_update;
enum unpack_trees_reset_type reset;
const char *prefix;
int cache_bottom;