1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-06-09 14:06:12 +02:00
Commit Graph

436 Commits

Author SHA1 Message Date
Junio C Hamano ac162a606b Merge branch 'jk/clone-unborn-head-in-bare'
"git clone" from a repository whose HEAD is unborn into a bare
repository didn't follow the branch name the other side used, which
is corrected.

* jk/clone-unborn-head-in-bare:
  clone: handle unborn branch in bare repos
2021-10-03 21:49:17 -07:00
Elijah Newren 1b5f37334a Remove ignored files by default when they are in the way
Change several commands to remove ignored files by default when they are
in the way.  Since some commands (checkout, merge) take a
--no-overwrite-ignore option to allow the user to configure this, and it
may make sense to add that option to more commands (and in the case of
merge, actually plumb that configuration option through to more of the
backends than just the fast-forwarding special case), add little
comments about where such flags would be used.

Incidentally, this fixes a test failure in t7112.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
Elijah Newren 04988c8d18 unpack-trees: introduce preserve_ignored to unpack_trees_options
Currently, every caller of unpack_trees() that wants to ensure ignored
files are overwritten by default needs to:
   * allocate unpack_trees_options.dir
   * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags
   * call setup_standard_excludes
AND then after the call to unpack_trees() needs to
   * call dir_clear()
   * deallocate unpack_trees_options.dir
That's a fair amount of boilerplate, and every caller uses identical
code.  Make this easier by instead introducing a new boolean value where
the default value (0) does what we want so that new callers of
unpack_trees() automatically get the appropriate behavior.  And move all
the handling of unpack_trees_options.dir into unpack_trees() itself.

While preserve_ignored = 0 is the behavior we feel is the appropriate
default, we defer fixing commands to use the appropriate default until a
later commit.  So, this commit introduces several locations where we
manually set preserve_ignored=1.  This makes it clear where code paths
were previously preserving ignored files when they should not have been;
a future commit will flip these to instead use a value of 0 to get the
behavior we want.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 13:38:37 -07:00
Junio C Hamano bbeca063cf Merge branch 'ar/submodule-add-more'
More parts of "git submodule add" has been rewritten in C.

* ar/submodule-add-more:
  submodule--helper: rename compute_submodule_clone_url()
  submodule--helper: remove resolve-relative-url subcommand
  submodule--helper: remove add-config subcommand
  submodule--helper: remove add-clone subcommand
  submodule--helper: convert the bulk of cmd_add() to C
  dir: libify and export helper functions from clone.c
  submodule--helper: remove repeated code in sync_submodule()
  submodule--helper: refactor resolve_relative_url() helper
  submodule--helper: add options for compute_submodule_clone_url()
2021-09-20 15:20:43 -07:00
Junio C Hamano deec8aa2d0 Merge branch 'ps/fetch-optim'
Optimize code that handles large number of refs in the "git fetch"
code path.

* ps/fetch-optim:
  fetch: avoid second connectivity check if we already have all objects
  fetch: merge fetching and consuming refs
  fetch: refactor fetch refs to be more extendable
  fetch-pack: optimize loading of refs via commit graph
  connected: refactor iterator to return next object ID directly
  fetch: avoid unpacking headers in object existence check
  fetch: speed up lookup of want refs via commit-graph
2021-09-20 15:20:39 -07:00
Jeff King 6b58df54cf clone: handle unborn branch in bare repos
When cloning a repository with an unborn HEAD, we'll set the local HEAD
to match it only if the local repository is non-bare. This is
inconsistent with all other combinations:

  remote HEAD       | local repo | local HEAD
  -----------------------------------------------
  points to commit  | non-bare   | same as remote
  points to commit  | bare       | same as remote
  unborn            | non-bare   | same as remote
  unborn            | bare       | local default

So I don't think this is some clever or subtle behavior, but just a bug
in 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05). And it's
easy to see how we ended up there. Before that commit, the code to set
up the HEAD for an empty repo was guarded by "if (!option_bare)". That's
because the only thing it did was call install_branch_config(), and we
don't want to do so for a bare repository (unborn HEAD or not).

That commit put the handling of unborn HEADs into the same block, since
those also need to call install_branch_config(). But the unborn case has
an additional side effect of calling create_symref(), and we want that
to happen whether we are bare or not.

This patch just pulls all of the "figure out the default branch" code
out of the "!option_bare" block. Only the actual config installation is
kept there.

Note that this does mean we might allocate "ref" and not use it (if the
remote is empty but did not advertise an unborn HEAD). But that's not
really a big deal since this isn't a hot code path, and it keeps the
code simple. The alternative would be handling unborn_head_target
separately, but that gets confusing since its memory ownership is
tangled up with the "ref" variable.

There's just one new test, for the case we're fixing. The other ones in
the table are handled elsewhere (the unborn non-bare case just above,
and the actually-born cases in t5601, t5606, and t5609, as they do not
require v2's "unborn" protocol extension).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 14:05:36 -07:00
Junio C Hamano fd0d7036e0 Merge branch 'ab/retire-advice-config'
Code clean up to migrate callers from older advice_config[] based
API to newer advice_if_enabled() and advice_enabled() API.

* ab/retire-advice-config:
  advice: move advice.graftFileDeprecated squashing to commit.[ch]
  advice: remove use of global advice_add_embedded_repo
  advice: remove read uses of most global `advice_` variables
  advice: add enum variants for missing advice variables
2021-09-10 11:46:29 -07:00
Patrick Steinhardt 9fec7b2130 connected: refactor iterator to return next object ID directly
The object ID iterator used by the connectivity checks returns the next
object ID via an out-parameter and then uses a return code to indicate
whether an item was found. This is a bit roundabout: instead of a
separate error code, we can just return the next object ID directly and
use `NULL` pointers as indicator that the iterator got no items left.
Furthermore, this avoids a copy of the object ID.

Refactor the iterator and all its implementations to return object IDs
directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs:

    Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch
      Time (mean ± σ):     30.110 s ±  0.148 s    [User: 27.161 s, System: 5.075 s]
      Range (min … max):   29.934 s … 30.406 s    10 runs

    Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch
      Time (mean ± σ):     29.899 s ±  0.109 s    [User: 26.916 s, System: 5.104 s]
      Range (min … max):   29.696 s … 29.996 s    10 runs

    Summary
      '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran
        1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch'

While this 1% speedup could be labelled as statistically insignificant,
the speedup is consistent on my machine. Furthermore, this is an end to
end test, so it is expected that the improvement in the connectivity
check itself is more significant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 12:43:56 -07:00
Mahi Kolla 48072e3d68 clone: set submodule.recurse=true if submodule.stickyRecursiveClone enabled
Based on current experience, when running git clone --recurse-submodules,
developers do not expect other commands such as pull or checkout to run
recursively into active submodules. However, setting submodule.recurse=true
at this step could make for a simpler workflow by eliminating the need for
the --recurse-submodules option in subsequent commands. To collect more
data on developers' preference in regards to making submodule.recurse=true
a default config value in the future, deploy this feature under the opt in
submodule.stickyRecursiveClone flag.

Signed-off-by: Mahi Kolla <mkolla2@illinois.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30 14:23:17 -07:00
Ben Boeckel ed9bff0817 advice: remove read uses of most global `advice_` variables
In c4a09cc9cc (Merge branch 'hw/advise-ng', 2020-03-25), a new API for
accessing advice variables was introduced and deprecated `advice_config`
in favor of a new array, `advice_setting`.

This patch ports all but two uses which read the status of the global
`advice_` variables over to the new `advice_enabled` API. We'll deal
with advice_add_embedded_repo and advice_graft_file_deprecated
separately.

Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 12:07:52 -07:00
Atharva Raykar ed86301f68 dir: libify and export helper functions from clone.c
These functions can be useful to other parts of Git. Let's move them to
dir.c, while renaming them to be make their functionality more explicit.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
Junio C Hamano f4f7304b44 Merge branch 'jk/clone-clean-upon-transport-error'
Recent "git clone" left a temporary directory behind when the
transport layer returned an failure.

* jk/clone-clean-upon-transport-error:
  clone: clean up directory after transport_fetch_refs() failure
2021-06-14 13:33:26 +09:00
Jeff King 6aacb7d861 clone: clean up directory after transport_fetch_refs() failure
git-clone started respecting errors from the transport subsystem in
aab179d937 (builtin/clone.c: don't ignore transport_fetch_refs() errors,
2020-12-03). However, that commit didn't handle the cleanup of the
filesystem quite right.

The cleanup of the directory that cmd_clone() creates is done by an
atexit() handler, which we control with a flag. It starts as
JUNK_LEAVE_NONE ("clean up everything"), then progresses to
JUNK_LEAVE_REPO when we know we have a valid repo but not working tree,
and then finally JUNK_LEAVE_ALL when we have a successful checkout.

Most errors cause us to die(), which then triggers the handler to do the
right thing based on how far into cmd_clone() we got. But the checks
added by aab179d937 instead set the "err" variable and then jump to a
new "cleanup" label, which then returns our non-zero status. However,
the code after the cleanup label includes setting the flag to
JUNK_LEAVE_ALL, and so we accidentally leave the repository and working
tree in place.

One obvious option to fix this is to reorder the end of the function to
set the flag first, before cleanup code, and put the label between them.

But we can observe another small bug: the error return from
transport_fetch_refs() is generally "-1", and we propagate that to the
return value of cmd_clone(), which ultimately becomes the exit code of
the process. And we try to avoid transmitting negative values via exit
codes (only the low 8 bits are passed along as an unsigned value, though
in practice for "-1" this at least retains the property that it's
non-zero).

Instead, let's just die(). That makes us consistent with rest of the
code in the function. It does add a new "fatal:" line to the output, but
I'd argue that's a good thing:

  - in the rare case that the transport code didn't say anything, now
    the user gets _some_ error message

  - even if the transport code said something like "error: ssh died of
    signal 9", it's nice to also say "fatal" to indicate that we
    considered that to be a show-stopper.

Triggering this in the test suite turns out to be surprisingly
difficult. Almost every error we'd encounter, including ones deep inside
the transport code, cause us to just die() right there! However, one way
is to put a fake wrapper around git-upload-pack that sends the complete
packfile but exits with a failure code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 21:14:59 +09:00
brian m. carlson 14228447c9 hash: provide per-algorithm null OIDs
Up until recently, object IDs did not have an algorithm member, only a
hash.  Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms.  Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
Junio C Hamano 22eee7f455 Merge branch 'll/clone-reject-shallow'
"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.

* ll/clone-reject-shallow:
  builtin/clone.c: add --reject-shallow option
2021-04-08 13:23:25 -07:00
Li Linchao 4fe788b1b0 builtin/clone.c: add --reject-shallow option
In some scenarios, users may want more history than the repository
offered for cloning, which happens to be a shallow repository, can
give them. But because users don't know it is a shallow repository
until they download it to local, we may want to refuse to clone
this kind of repository, without creating any unnecessary files.

The '--depth=x' option cannot be used as a solution; the source may
be deep enough to give us 'x' commits when cloned, but the user may
later need to deepen the history to arbitrary depth.

Teach '--reject-shallow' option to "git clone" to abort as soon as
we find out that we are cloning from a shallow repository.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 12:58:58 -07:00
Andrzej Hunt 0c4542738e clone: free or UNLEAK further pointers when finished
Most of these pointers can safely be freed when cmd_clone() completes,
therefore we make sure to free them. The one exception is that we
have to UNLEAK(repo) because it can point either to argv[0], or a
malloc'd string returned by absolute_pathdup().

We also have to free(path) in the middle of cmd_clone(): later during
cmd_clone(), path is unconditionally overwritten with a different path,
triggering a leak. Freeing the first path immediately after use (but
only in the case where it contains data) seems like the cleanest
solution, as opposed to freeing it unconditionally before path is reused
for another path. This leak appears to have been introduced in:
  f38aa83f9a (use local cloning if insteadOf makes a local URL, 2014-07-17)

These leaks were found when running t0001 with LSAN, see also an excerpt
of the LSAN output below (the full list is omitted because it's far too
long, and mostly consists of indirect leakage of members of the refs we
are freeing).

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21
    #5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
    #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
    #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
    #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
    #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
    #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
    #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
    #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
    #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
    #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 105 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3
    #4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4
    #5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2
    #6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10
    #7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
Junio C Hamano 69571dfe21 Merge branch 'jt/clone-unborn-head'
"git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository.  The protocol v2 learned how to do so.

* jt/clone-unborn-head:
  clone: respect remote unborn HEAD
  connect, transport: encapsulate arg in struct
  ls-refs: report unborn targets of symrefs
2021-02-17 17:21:40 -08:00
Jonathan Tan 4f37d45706 clone: respect remote unborn HEAD
Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 13:49:55 -08:00
Jonathan Tan 39835409d1 connect, transport: encapsulate arg in struct
In a future patch we plan to return the name of an unborn current branch
from deep in the callchain to a caller via a new pointer parameter that
points at a variable in the caller when the caller calls
get_remote_refs() and transport_get_remote_refs().

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 13:49:54 -08:00
Junio C Hamano 772bdcd429 Merge branch 'js/init-defaultbranch-advice'
Our users are going to be trained to prepare for future change of
init.defaultBranch configuration variable.

* js/init-defaultbranch-advice:
  init: provide useful advice about init.defaultBranch
  get_default_branch_name(): prepare for showing some advice
  branch -m: allow renaming a yet-unborn branch
  init: document `init.defaultBranch` better
2020-12-18 15:15:17 -08:00
Johannes Schindelin cc0f13c57d get_default_branch_name(): prepare for showing some advice
We are about to introduce a message giving users running `git init` some
advice about `init.defaultBranch`. This will necessarily be done in
`repo_default_branch_name()`.

Not all code paths want to show that advice, though. In particular, the
`git clone` codepath _specifically_ asks for `init_db()` to be quiet,
via the `INIT_DB_QUIET` flag.

In preparation for showing users above-mentioned advice, let's change
the function signature of `get_default_branch_name()` to accept the
parameter `quiet`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 15:53:50 -08:00
Taylor Blau aab179d937 builtin/clone.c: don't ignore transport_fetch_refs() errors
If 'git clone' couldn't execute 'transport_fetch_refs()' (e.g., because
of an error on the remote's side in 'git upload-pack'), then it will
silently ignore it.

Even though this has been the case at least since clone was ported to C
(way back in 8434c2f1af (Build in clone, 2008-04-27)), 'git fetch'
doesn't ignore these and reports any failures it sees.

That suggests that ignoring the return value in 'git clone' is simply an
oversight that should be corrected. That's exactly what this patch does.
(Noticing and fixing this is no coincidence, we'll want it in the next
patch in order to demonstrate a regression in 'git upload-pack' via a
'git clone'.)

There's no additional logging here, but that matches how 'git fetch'
handles the same case. An assumption there is that whichever part of
transport_fetch_refs() fails will complain loudly, so any additional
logging here is redundant.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-03 12:42:29 -08:00
Junio C Hamano 40696c6727 Merge branch 'sb/clone-origin'
"git clone" learned clone.defaultremotename configuration variable
to customize what nickname to use to call the remote the repository
was cloned from.

* sb/clone-origin:
  clone: allow configurable default for `-o`/`--origin`
  clone: read new remote name from remote_name instead of option_origin
  clone: validate --origin option before use
  refs: consolidate remote name validation
  remote: add tests for add and rename with invalid names
  clone: use more conventional config/option layering
  clone: add tests for --template and some disallowed option pairs
2020-10-27 15:09:50 -07:00
Sean Barag de9ed3ef37 clone: allow configurable default for `-o`/`--origin`
While the default remote name of "origin" can be changed at clone-time
with `git clone`'s `--origin` option, it was previously not possible
to specify a default value for the name of that remote.  Add support for
a new `clone.defaultRemoteName` config, with the newly-created remote
name resolved in priority order:

1. (Highest priority) A remote name passed directly to `git clone -o`
2. A `clone.defaultRemoteName=new_name` in config `git clone -c`
3. A `clone.defaultRemoteName` value set in `/path/to/template/config`,
   where `--template=/path/to/template` is provided
4. A `clone.defaultRemoteName` value set in a non-template config file
5. The default value of `origin`

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Sean Barag 75ca3906b1 clone: read new remote name from remote_name instead of option_origin
In a future patch, the name of the remote created by `git clone` may
come from multiple sources.  To avoid confusion, convert most uses of
option_origin to remote_name, leaving option_origin to exclusively
represent the -o/--origin option.

Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Sean Barag ebe7e28a36 clone: validate --origin option before use
Providing a bad origin name to `git clone` currently reports an
'invalid refspec' error instead of a more explicit message explaining
that the `--origin` option was malformed.  This behavior dates back to
since 8434c2f1 (Build in clone, 2008-04-27).  Reintroduce
validation for the provided `--origin` option, but notably _don't_
include a multi-level check (e.g. "foo/bar") that was present in the
original `git-clone.sh`.  `git remote` allows multi-level remote names
since at least 46220ca100 (remote.c: Fix overtight refspec validation,
2008-03-20), so that appears to be the desired behavior.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Sean Barag 552955ed7f clone: use more conventional config/option layering
Parsing command-line options before reading from config required careful
handling to ensure CLI options were treated with higher priority.  Read
config first to let parsed CLI naively overwrite matching config values.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
brian m. carlson 47ac970309 builtin/clone: avoid failure with GIT_DEFAULT_HASH
If a user is cloning a SHA-1 repository with GIT_DEFAULT_HASH set to
"sha256", then we can end up with a repository where the repository
format version is 0 but the extensions.objectformat key is set to
"sha256".  This is both wrong (the user has a SHA-1 repository) and
nonfunctional (because the extension cannot be used in a v0 repository).

This happens because in a clone, we initially set up the repository, and
then change its algorithm based on what the remote side tells us it's
using.  We've initially set up the repository as SHA-256 in this case,
and then later on reset the repository version without clearing the
extension.

We could just always set the extension in this case, but that would mean
that our SHA-1 repositories weren't compatible with older Git versions,
even though there's no reason why they shouldn't be.  And we also don't
want to initialize the repository as SHA-1 initially, since that means
if we're cloning an empty repository, we'll have failed to honor the
GIT_DEFAULT_HASH variable and will end up with a SHA-1 repository, not a
SHA-256 repository.

Neither of those are appealing, so let's tell the repository
initialization code if we're doing a reinit like this, and if so, to
clear the extension if we're using SHA-1.  This makes sure we produce a
valid and functional repository and doesn't break any of our other use
cases.

Reported-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-22 09:22:32 -07:00
René Scharfe 1af8b8c0a5 refspec: add and use refspec_appendf()
Add a function for building a refspec using printf-style formatting.  It
frees callers from managing their own buffer.  Use it throughout the
tree to shorten and simplify its callers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-06 13:15:46 -07:00
Junio C Hamano 46b225f153 Merge branch 'jk/strvec'
The argv_array API is useful for not just managing argv but any
"vector" (NULL-terminated array) of strings, and has seen adoption
to a certain degree.  It has been renamed to "strvec" to reduce the
barrier to adoption.

* jk/strvec:
  strvec: rename struct fields
  strvec: drop argv_array compatibility layer
  strvec: update documention to avoid argv_array
  strvec: fix indentation in renamed calls
  strvec: convert remaining callers away from argv_array name
  strvec: convert more callers away from argv_array name
  strvec: convert builtin/ callers away from argv_array name
  quote: rename sq_dequote_to_argv_array to mention strvec
  strvec: rename files from argv-array to strvec
  argv-array: rename to strvec
  argv-array: use size_t for count and alloc
2020-08-10 10:23:57 -07:00
Jeff King d70a9eb611 strvec: rename struct fields
The "argc" and "argv" names made sense when the struct was argv_array,
but now they're just confusing. Let's rename them to "nr" (which we use
for counts elsewhere) and "v" (which is rather terse, but reads well
when combined with typical variable names like "args.v").

Note that we have to update all of the callers immediately. Playing
tricks with the preprocessor is hard here, because we wouldn't want to
rewrite unrelated tokens.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30 19:18:06 -07:00
Junio C Hamano f175e9b845 Merge branch 'bw/fail-cloning-into-non-empty' into master
"git clone --separate-git-dir=$elsewhere" used to stomp on the
contents of the existing directory $elsewhere, which has been
taught to fail when $elsewhere is not an empty directory.

* bw/fail-cloning-into-non-empty:
  git clone: don't clone into non-empty directory
2020-07-30 13:20:32 -07:00
Jeff King 22f9b7f3f5 strvec: convert builtin/ callers away from argv_array name
We eventually want to drop the argv_array name and just use strvec
consistently. There's no particular reason we have to do it all at once,
or care about interactions between converted and unconverted bits.
Because of our preprocessor compat layer, the names are interchangeable
to the compiler (so even a definition and declaration using different
names is OK).

This patch converts all of the files in builtin/ to keep the diff to a
manageable size.

The conversion was done purely mechanically with:

  git ls-files '*.c' '*.h' |
  xargs perl -i -pe '
    s/ARGV_ARRAY/STRVEC/g;
    s/argv_array/strvec/g;
  '

and then selectively staging files with "git add builtin/". We'll deal
with any indentation/style fallouts separately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28 15:02:18 -07:00
Ben Wijen dfaa209a79 git clone: don't clone into non-empty directory
When using git clone with --separate-git-dir realgitdir and
realgitdir already exists, it's content is destroyed.

So, make sure we don't clone into an existing non-empty directory.

When d45420c1 (clone: do not clean up directories we didn't create,
2018-01-02) tightened the clean-up procedure after a failed cloning
into an empty directory, it assumed that the existing directory
given is an empty one so it is OK to keep that directory, while
running the clean-up procedure that is designed to remove everything
in it (since there won't be any, anyway).  Check and make sure that
the $GIT_DIR is empty even cloning into an existing repository.

Signed-off-by: Ben Wijen <ben@wijen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-10 11:43:29 -07:00
Junio C Hamano 11cbda2add Merge branch 'js/default-branch-name'
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.

* js/default-branch-name:
  contrib: subtree: adjust test to change in fmt-merge-msg
  testsvn: respect `init.defaultBranch`
  remote: use the configured default branch name when appropriate
  clone: use configured default branch name when appropriate
  init: allow setting the default for the initial branch name via the config
  init: allow specifying the initial branch name for the new repository
  docs: add missing diamond brackets
  submodule: fall back to remote's HEAD for missing remote.<name>.branch
  send-pack/transport-helper: avoid mentioning a particular branch
  fmt-merge-msg: stop treating `master` specially
2020-07-06 22:09:17 -07:00
Junio C Hamano 12210859da Merge branch 'bc/sha-256-part-2'
SHA-256 migration work continues.

* bc/sha-256-part-2: (44 commits)
  remote-testgit: adapt for object-format
  bundle: detect hash algorithm when reading refs
  t5300: pass --object-format to git index-pack
  t5704: send object-format capability with SHA-256
  t5703: use object-format serve option
  t5702: offer an object-format capability in the test
  t/helper: initialize the repository for test-sha1-array
  remote-curl: avoid truncating refs with ls-remote
  t1050: pass algorithm to index-pack when outside repo
  builtin/index-pack: add option to specify hash algorithm
  remote-curl: detect algorithm for dumb HTTP by size
  builtin/ls-remote: initialize repository based on fetch
  t5500: make hash independent
  serve: advertise object-format capability for protocol v2
  connect: parse v2 refs with correct hash algorithm
  connect: pass full packet reader when parsing v2 refs
  Documentation/technical: document object-format for protocol v2
  t1302: expect repo format version 1 for SHA-256
  builtin/show-index: provide options to determine hash algo
  t5302: modernize test formatting
  ...
2020-07-06 22:09:13 -07:00
Johannes Schindelin 0cc1b475bb clone: use configured default branch name when appropriate
When cloning a repository without any branches, Git chooses a default
branch name for the as-yet unborn branch.

As part of the implicit initialization of the local repository, Git just
learned to respect `init.defaultBranch` to choose a different initial
branch name. We now really want that branch name to be used as a
fall-back.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-24 09:14:21 -07:00
Johannes Schindelin 32ba12dab2 init: allow specifying the initial branch name for the new repository
There is a growing number of projects and companies desiring to change
the main branch name of their repositories (see e.g.
https://twitter.com/mislav/status/1270388510684598272 for background on
this).

To change that branch name for new repositories, currently the only way
to do that automatically is by copying all of Git's template directory,
then hard-coding the desired default branch name into the `.git/HEAD`
file, and then configuring `init.templateDir` to point to those copied
template files.

To make this process much less cumbersome, let's introduce a new option:
`--initial-branch=<branch-name>`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-24 09:14:21 -07:00
Junio C Hamano 524caf8035 Merge branch 'js/reflog-anonymize-for-clone-and-fetch'
The reflog entries for "git clone" and "git fetch" did not
anonymize the URL they operated on.

* js/reflog-anonymize-for-clone-and-fetch:
  clone/fetch: anonymize URLs in the reflog
2020-06-17 21:54:01 -07:00
Johannes Schindelin 46da295a77 clone/fetch: anonymize URLs in the reflog
Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.

Let's scrub them before writing them to the reflog.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-04 13:20:21 -07:00
brian m. carlson b65dc2cebd builtin/clone: initialize hash algorithm properly
When performing a clone, we don't know what hash algorithm the other end
will support.  Currently, we don't support fetching data belonging to a
different algorithm, so we must know what algorithm the remote side is
using in order to properly initialize the repository.  We can know that
only after fetching the refs, so if the remote side has any references,
use that information to reinitialize the repository with the correct
hash algorithm information.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27 10:07:06 -07:00
Junio C Hamano 9404128b34 Merge branch 'jc/log-no-mailmap'
"git log" learns "--[no-]mailmap" as a synonym to "--[no-]use-mailmap"

* jc/log-no-mailmap:
  log: give --[no-]use-mailmap a more sensible synonym --[no-]mailmap
  clone: reorder --recursive/--recurse-submodules
  parse-options: teach "git cmd -h" to show alias as alias
2020-04-28 15:50:00 -07:00
Junio C Hamano 3ea2b46628 Merge branch 'jk/use-quick-lookup-in-clone-for-tag-following'
The logic to auto-follow tags by "git clone --single-branch" was
not careful to avoid lazy-fetching unnecessary tags, which has been
corrected.

* jk/use-quick-lookup-in-clone-for-tag-following:
  clone: use "quick" lookup while following tags
2020-04-22 13:42:51 -07:00
Junio C Hamano 0c601052a5 Merge branch 'jt/connectivity-check-optim-in-partial-clone'
Simplify the commit ancestry connectedness check in a partial clone
repository in which "promised" objects are assumed to be obtainable
lazily on-demand from promisor remote repositories.

* jt/connectivity-check-optim-in-partial-clone:
  connected: always use partial clone optimization
2020-04-22 13:42:43 -07:00
Jeff King 167a575e2d clone: use "quick" lookup while following tags
When cloning with --single-branch, we implement git-fetch's usual
tag-following behavior, grabbing any tag objects that point to objects
we have locally.

When we're a partial clone, though, our has_object_file() check will
actually lazy-fetch each tag. That not only defeats the purpose of
--single-branch, but it does it incredibly slowly, potentially kicking
off a new fetch for each tag. This is even worse for a shallow clone,
which implies --single-branch, because even tags which are supersets of
each other will be fetched individually.

We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call,
which is what git-fetch does in this case.

Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch
does. The rationale is discussed in 5827a03545 (fetch: use "quick"
has_sha1_file for tag following, 2016-10-13), but here the tradeoff
would apply even more so because clone is very unlikely to be racing
with another process repacking our newly-created repository.

This may provide a very small speedup even in the non-partial case case,
as we'd avoid calling reprepare_packed_git() for each tag (though in
practice, we'd only have a single packfile, so that reprepare should be
quite cheap).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-01 09:56:41 -07:00
Jonathan Tan 2b98478c6f connected: always use partial clone optimization
With 50033772d5 ("connected: verify promisor-ness of partial clone",
2020-01-30), the fast path (checking promisor packs) in
check_connected() now passes a subset of the slow path (rev-list) - if
all objects to be checked are found in promisor packs, both the fast
path and the slow path will pass; otherwise, the fast path will
definitely not pass. This means that we can always attempt the fast path
whenever we need to do the slow path.

The fast path is currently guarded by a flag; therefore, remove that
flag. Also, make the fast path fallback to the slow path - if the fast
path fails, the failing OID and all remaining OIDs will be passed to
rev-list.

The main user-visible benefit is the performance of fetch from a partial
clone - specifically, the speedup of the connectivity check done before
the fetch. In particular, a no-op fetch into a partial clone on my
computer was sped up from 7 seconds to 0.01 seconds. This is a
complement to the work in 2df1aa239c ("fetch: forgo full
connectivity check if --filter", 2020-01-30), which is the child of the
aforementioned 50033772d5. In that commit, the connectivity check
*after* the fetch was sped up.

The addition of the fast path might cause performance reductions in
these cases:

 - If a partial clone or a fetch into a partial clone fails, Git will
   fruitlessly run rev-list (it is expected that everything fetched
   would go into promisor packs, so if that didn't happen, it is most
   likely that rev-list will fail too).

 - Any connectivity checks done by receive-pack, in the (in my opinion,
   unlikely) event that a partial clone serves receive-pack.

I think that these cases are rare enough, and the performance reduction
in this case minor enough (additional object DB access), that the
benefit of avoiding a flag outweighs these.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-29 10:37:44 -07:00
Junio C Hamano 4e4baee3f4 Merge branch 'bc/filter-process'
Provide more information (e.g. the object of the tree-ish in which
the blob being converted appears, in addition to its path, which
has already been given) to smudge/clean conversion filters.

* bc/filter-process:
  t0021: test filter metadata for additional cases
  builtin/reset: compute checkout metadata for reset
  builtin/rebase: compute checkout metadata for rebases
  builtin/clone: compute checkout metadata for clones
  builtin/checkout: compute checkout metadata for checkouts
  convert: provide additional metadata to filters
  convert: permit passing additional metadata to filter processes
  builtin/checkout: pass branch info down to checkout_worktree
2020-03-26 17:11:20 -07:00
Junio C Hamano f8cb64e3d4 Merge branch 'bc/sha-256-part-1-of-4'
SHA-256 transition continues.

* bc/sha-256-part-1-of-4: (22 commits)
  fast-import: add options for rewriting submodules
  fast-import: add a generic function to iterate over marks
  fast-import: make find_marks work on any mark set
  fast-import: add helper function for inserting mark object entries
  fast-import: permit reading multiple marks files
  commit: use expected signature header for SHA-256
  worktree: allow repository version 1
  init-db: move writing repo version into a function
  builtin/init-db: add environment variable for new repo hash
  builtin/init-db: allow specifying hash algorithm on command line
  setup: allow check_repository_format to read repository format
  t/helper: make repository tests hash independent
  t/helper: initialize repository if necessary
  t/helper/test-dump-split-index: initialize git repository
  t6300: make hash algorithm independent
  t6300: abstract away SHA-1-specific constants
  t: use hash-specific lookup tables to define test constants
  repository: require a build flag to use SHA-256
  hex: add functions to parse hex object IDs in any algorithm
  hex: introduce parsing variants taking hash algorithms
  ...
2020-03-26 17:11:20 -07:00
Junio C Hamano c28b036fe3 clone: reorder --recursive/--recurse-submodules
The previous step made an option that is an alias to another option
identify itself as an alias to the latter.  Because it is easier to
scan the list when a pointer goes backward to what a reader already
has seen, mention "recurse-submodules" first with its true short
help string, and then "recurse" with the statement that it is a
synonym to "recurse-submodules".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-16 14:27:07 -07:00
brian m. carlson dfc8cdc677 builtin/clone: compute checkout metadata for clones
When checking out a commit, provide metadata to the filter process
including the ref we're using.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-16 11:37:02 -07:00
Alexandr Miloslavskiy 3d7747e318 real_path: remove unsafe API
Returning a shared buffer invites very subtle bugs due to reentrancy or
multi-threading, as demonstrated by the previous patch.

There was an unfinished effort to abolish this [1].

Let's finally rid of `real_path()`, using `strbuf_realpath()` instead.

This patch uses a local `strbuf` for most places where `real_path()` was
previously called.

However, two places return the value of `real_path()` to the caller. For
them, a `static` local `strbuf` was added, effectively pushing the
problem one level higher:
    read_gitfile_gently()
    get_superproject_working_tree()

[1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/

Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-10 11:41:40 -07:00
Junio C Hamano b22db265d6 Merge branch 'es/recursive-single-branch-clone'
"git clone --recurse-submodules --single-branch" now uses the same
single-branch option when cloning the submodules.

* es/recursive-single-branch-clone:
  clone: pass --single-branch during --recurse-submodules
  submodule--helper: use C99 named initializer
2020-03-05 10:43:03 -08:00
Emily Shaffer 132f600b06 clone: pass --single-branch during --recurse-submodules
Previously, performing "git clone --recurse-submodules --single-branch"
resulted in submodules cloning all branches even though the superproject
cloned only one branch. Pipe --single-branch through the submodule
helper framework to make it to 'clone' later on.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-25 10:00:38 -08:00
brian m. carlson 8b8f7189df builtin/init-db: allow specifying hash algorithm on command line
Allow the user to specify the hash algorithm on the command line by
using the --object-format option to git init.  Validate that the user is
not attempting to reinitialize a repository with a different hash
algorithm.  Ensure that if we are writing a non-SHA-1 repository that we
set the repository version to 1 and write the objectFormat extension.

Restrict this option to work only when ENABLE_SHA256 is set until the
codebase is in a situation to fully support this.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-24 09:33:27 -08:00
Junio C Hamano 433b8aac2e Merge branch 'ds/sparse-checkout-harden'
Some rough edges in the sparse-checkout feature, especially around
the cone mode, have been cleaned up.

* ds/sparse-checkout-harden:
  sparse-checkout: fix cone mode behavior mismatch
  sparse-checkout: improve docs around 'set' in cone mode
  sparse-checkout: escape all glob characters on write
  sparse-checkout: use C-style quotes in 'list' subcommand
  sparse-checkout: unquote C-style strings over --stdin
  sparse-checkout: write escaped patterns in cone mode
  sparse-checkout: properly match escaped characters
  sparse-checkout: warn on globs in cone patterns
  sparse-checkout: detect short patterns
  sparse-checkout: cone mode does not recognize "**"
  sparse-checkout: fix documentation typo for core.sparseCheckoutCone
  clone: fix --sparse option with URLs
  sparse-checkout: create leading directories
  t1091: improve here-docs
  t1091: use check_files to reduce boilerplate
2020-02-14 12:54:22 -08:00
Jonathan Tan 50033772d5 connected: verify promisor-ness of partial clone
Commit dfa33a298d ("clone: do faster object check for partial clones",
2019-04-21) optimized the connectivity check done when cloning with
--filter to check only the existence of objects directly pointed to by
refs. But this is not sufficient: they also need to be promisor objects.
Make this check more robust by instead checking that these objects are
promisor objects, that is, they appear in a promisor pack.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-30 10:55:31 -08:00
Derrick Stolee 47dbf10d8a clone: fix --sparse option with URLs
The --sparse option was added to the clone builtin in d89f09c (clone:
add --sparse mode, 2019-11-21) and was tested with a local path clone
in t1091-sparse-checkout-builtin.sh. However, due to a difference in
how local paths are handled versus URLs, this mechanism does not work
with URLs.

Modify the test to use a "file://" URL, which would output this error
before the code change:

  Cloning into 'clone'...
  fatal: cannot change to 'file://.../repo': No such file or directory
  error: failed to initialize sparse-checkout

These errors are due to using a "-C <path>" option to call 'git -C
<path> sparse-checkout init' but the URL is being given instead of
the target directory.

Update that target directory to evaluate this correctly. I have also
manually tested that https:// URLs are handled correctly as well.

Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-24 13:26:54 -08:00
Junio C Hamano bd72a08d6c Merge branch 'ds/sparse-cone'
Management of sparsely checked-out working tree has gained a
dedicated "sparse-checkout" command.

* ds/sparse-cone: (21 commits)
  sparse-checkout: improve OS ls compatibility
  sparse-checkout: respect core.ignoreCase in cone mode
  sparse-checkout: check for dirty status
  sparse-checkout: update working directory in-process for 'init'
  sparse-checkout: cone mode should not interact with .gitignore
  sparse-checkout: write using lockfile
  sparse-checkout: use in-process update for disable subcommand
  sparse-checkout: update working directory in-process
  sparse-checkout: sanitize for nested folders
  unpack-trees: add progress to clear_ce_flags()
  unpack-trees: hash less in cone mode
  sparse-checkout: init and set in cone mode
  sparse-checkout: use hashmaps for cone patterns
  sparse-checkout: add 'cone' mode
  trace2: add region in clear_ce_flags
  sparse-checkout: create 'disable' subcommand
  sparse-checkout: add '--stdin' option to set subcommand
  sparse-checkout: 'set' subcommand
  clone: add --sparse mode
  sparse-checkout: create 'init' subcommand
  ...
2019-12-25 11:21:58 -08:00
Junio C Hamano 7034cd094b Sync with Git 2.24.1 2019-12-09 22:17:55 -08:00
Johannes Schindelin 67af91c47a Sync with 2.23.1
* maint-2.23: (44 commits)
  Git 2.23.1
  Git 2.22.2
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  ...
2019-12-06 16:31:39 +01:00
Johannes Schindelin 7fd9fd94fb Sync with 2.22.2
* maint-2.22: (43 commits)
  Git 2.22.2
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  ...
2019-12-06 16:31:30 +01:00
Johannes Schindelin 5421ddd8d0 Sync with 2.21.1
* maint-2.21: (42 commits)
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  ...
2019-12-06 16:31:23 +01:00
Johannes Schindelin fc346cb292 Sync with 2.20.2
* maint-2.20: (36 commits)
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  ...
2019-12-06 16:31:12 +01:00
Johannes Schindelin d851d94151 Sync with 2.19.3
* maint-2.19: (34 commits)
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  ...
2019-12-06 16:30:49 +01:00
Johannes Schindelin 7c9fbda6e2 Sync with 2.18.2
* maint-2.18: (33 commits)
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  ...
2019-12-06 16:30:38 +01:00
Johannes Schindelin 14af7ed5a9 Sync with 2.17.3
* maint-2.17: (32 commits)
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  ...
2019-12-06 16:29:15 +01:00
Johannes Schindelin bdfef0492c Sync with 2.16.6
* maint-2.16: (31 commits)
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  path: safeguard `.git` against NTFS Alternate Streams Accesses
  ...
2019-12-06 16:27:36 +01:00
Johannes Schindelin 9ac92fed5b Sync with 2.15.4
* maint-2.15: (29 commits)
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  path: safeguard `.git` against NTFS Alternate Streams Accesses
  clone --recurse-submodules: prevent name squatting on Windows
  is_ntfs_dotgit(): only verify the leading segment
  ...
2019-12-06 16:27:18 +01:00
Johannes Schindelin d3ac8c3f27 Sync with 2.14.6
* maint-2.14: (28 commits)
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  path: safeguard `.git` against NTFS Alternate Streams Accesses
  clone --recurse-submodules: prevent name squatting on Windows
  is_ntfs_dotgit(): only verify the leading segment
  test-path-utils: offer to run a protectNTFS/protectHFS benchmark
  ...
2019-12-06 16:26:55 +01:00
Johannes Schindelin 0060fd1511 clone --recurse-submodules: prevent name squatting on Windows
In addition to preventing `.git` from being tracked by Git, on Windows
we also have to prevent `git~1` from being tracked, as the default NTFS
short name (also known as the "8.3 filename") for the file name `.git`
is `git~1`, otherwise it would be possible for malicious repositories to
write directly into the `.git/` directory, e.g. a `post-checkout` hook
that would then be executed _during_ a recursive clone.

When we implemented appropriate protections in 2b4c6efc82 (read-cache:
optionally disallow NTFS .git variants, 2014-12-16), we had analyzed
carefully that the `.git` directory or file would be guaranteed to be
the first directory entry to be written. Otherwise it would be possible
e.g. for a file named `..git` to be assigned the short name `git~1` and
subsequently, the short name generated for `.git` would be `git~2`. Or
`git~3`. Or even `~9999999` (for a detailed explanation of the lengths
we have to go to protect `.gitmodules`, see the commit message of
e7cb0b4455 (is_ntfs_dotgit: match other .git files, 2018-05-11)).

However, by exploiting two issues (that will be addressed in a related
patch series close by), it is currently possible to clone a submodule
into a non-empty directory:

- On Windows, file names cannot end in a space or a period (for
  historical reasons: the period separating the base name from the file
  extension was not actually written to disk, and the base name/file
  extension was space-padded to the full 8/3 characters, respectively).
  Helpfully, when creating a directory under the name, say, `sub.`, that
  trailing period is trimmed automatically and the actual name on disk
  is `sub`.

  This means that while Git thinks that the submodule names `sub` and
  `sub.` are different, they both access `.git/modules/sub/`.

- While the backslash character is a valid file name character on Linux,
  it is not so on Windows. As Git tries to be cross-platform, it
  therefore allows backslash characters in the file names stored in tree
  objects.

  Which means that it is totally possible that a submodule `c` sits next
  to a file `c\..git`, and on Windows, during recursive clone a file
  called `..git` will be written into `c/`, of course _before_ the
  submodule is cloned.

Note that the actual exploit is not quite as simple as having a
submodule `c` next to a file `c\..git`, as we have to make sure that the
directory `.git/modules/b` already exists when the submodule is checked
out, otherwise a different code path is taken in `module_clone()` that
does _not_ allow a non-empty submodule directory to exist already.

Even if we will address both issues nearby (the next commit will
disallow backslash characters in tree entries' file names on Windows,
and another patch will disallow creating directories/files with trailing
spaces or periods), it is a wise idea to defend in depth against this
sort of attack vector: when submodules are cloned recursively, we now
_require_ the directory to be empty, addressing CVE-2019-1349.

Note: the code path we patch is shared with the code path of `git
submodule update --init`, which must not expect, in general, that the
directory is empty. Hence we have to introduce the new option
`--force-init` and hand it all the way down from `git submodule` to the
actual `git submodule--helper` process that performs the initial clone.

Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-04 13:20:05 +01:00
Junio C Hamano fce9e836d3 Merge branch 'jt/fetch-remove-lazy-fetch-plugging'
"git fetch" codepath had a big "do not lazily fetch missing objects
when I ask if something exists" switch.  This has been corrected by
marking the "does this thing exist?" calls with "if not please do not
lazily fetch it" flag.

* jt/fetch-remove-lazy-fetch-plugging:
  promisor-remote: remove fetch_if_missing=0
  clone: remove fetch_if_missing=0
  fetch: remove fetch_if_missing=0
2019-12-01 09:04:38 -08:00
Junio C Hamano dfc03e48ec Merge branch 'mr/clone-dir-exists-to-path-exists'
Code cleanup.

* mr/clone-dir-exists-to-path-exists:
  clone: rename static function `dir_exists()`.
2019-12-01 09:04:30 -08:00
Derrick Stolee d89f09c828 clone: add --sparse mode
When someone wants to clone a large repository, but plans to work
using a sparse-checkout file, they either need to do a full
checkout first and then reduce the patterns they included, or
clone with --no-checkout, set up their patterns, and then run
a checkout manually. This requires knowing a lot about the repo
shape and how sparse-checkout works.

Add a new '--sparse' option to 'git clone' that initializes the
sparse-checkout file to include the following patterns:

	/*
	!/*/

These patterns include every file in the root directory, but
no directories. This allows a repo to include files like a
README or a bootstrapping script to grow enlistments from that
point.

During the 'git sparse-checkout init' call, we must first look
to see if HEAD is valid, since 'git clone' does not have a valid
HEAD at the point where it initializes the sparse-checkout. The
following checkout within the clone command will create the HEAD
ref and update the working directory correctly.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22 16:11:43 +09:00
Jonathan Tan e362fadcd0 clone: remove fetch_if_missing=0
Commit 6462d5eb9a ("fetch: remove fetch_if_missing=0", 2019-11-08)
strove to remove the need for fetch_if_missing=0 from the fetching
mechanism, so it is plausible to attempt removing fetch_if_missing=0
from clone as well. But doing so reveals a bug - when the server does
not send an object directly pointed to by a ref, this should be an
error, not a trigger for a lazy fetch. (This case in the fetching
mechanism was covered by a test using "git clone", not "git fetch",
which is why the aforementioned commit didn't uncover the bug.)

The bug can be fixed by suppressing lazy-fetching during the
connectivity check. Fix this bug, and remove fetch_if_missing from
clone.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-13 11:48:47 +09:00
Miriam Rubio 6c02042139 clone: rename static function `dir_exists()`.
builtin/clone.c has a static function dir_exists() that
checks if a given path exists on the filesystem.  It returns
true (and it is correct for it to return true) when the
given path exists as a non-directory (e.g. a regular file).

This is confusing.  What the caller wants to check, and what
this function wants to return, is if the path exists, so
rename it to path_exists().

Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-29 11:54:23 +09:00
Junio C Hamano a4c5d9f66e Merge branch 'rs/dedup-includes'
Code cleanup.

* rs/dedup-includes:
  treewide: remove duplicate #include directives
2019-10-11 14:24:48 +09:00
Junio C Hamano 676278f8ea Merge branch 'bc/object-id-part17'
Preparation for SHA-256 upgrade continues.

* bc/object-id-part17: (26 commits)
  midx: switch to using the_hash_algo
  builtin/show-index: replace sha1_to_hex
  rerere: replace sha1_to_hex
  builtin/receive-pack: replace sha1_to_hex
  builtin/index-pack: replace sha1_to_hex
  packfile: replace sha1_to_hex
  wt-status: convert struct wt_status to object_id
  cache: remove null_sha1
  builtin/worktree: switch null_sha1 to null_oid
  builtin/repack: write object IDs of the proper length
  pack-write: use hash_to_hex when writing checksums
  sequencer: convert to use the_hash_algo
  bisect: switch to using the_hash_algo
  sha1-lookup: switch hard-coded constants to the_hash_algo
  config: use the_hash_algo in abbrev comparison
  combine-diff: replace GIT_SHA1_HEXSZ with the_hash_algo
  bundle: switch to use the_hash_algo
  connected: switch GIT_SHA1_HEXSZ to the_hash_algo
  show-index: switch hard-coded constants to the_hash_algo
  blame: remove needless comparison with GIT_SHA1_HEXSZ
  ...
2019-10-11 14:24:46 +09:00
René Scharfe 2fe44394c8 treewide: remove duplicate #include directives
Found with "git grep '^#include ' '*.c' | sort | uniq -d".

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-04 08:16:00 +09:00
Junio C Hamano 627b826834 Merge branch 'md/list-objects-filter-combo'
The list-objects-filter API (used to create a sparse/lazy clone)
learned to take a combined filter specification.

* md/list-objects-filter-combo:
  list-objects-filter-options: make parser void
  list-objects-filter-options: clean up use of ALLOC_GROW
  list-objects-filter-options: allow mult. --filter
  strbuf: give URL-encoding API a char predicate fn
  list-objects-filter-options: make filter_spec a string_list
  list-objects-filter-options: move error check up
  list-objects-filter: implement composite filters
  list-objects-filter-options: always supply *errbuf
  list-objects-filter: put omits set in filter struct
  list-objects-filter: encapsulate filter components
2019-09-18 11:50:09 -07:00
brian m. carlson 8d4d86b0f0 cache: remove null_sha1
All of the existing uses of null_sha1 can be converted into uses of
null_oid, so do so.  Remove null_sha1 and is_null_sha1, and define
is_null_oid in terms of null_oid.  This also has the additional benefit
of removing several uses of sha1_to_hex.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 15:04:59 -07:00
Junio C Hamano dea6737bb7 Merge branch 'ds/close-object-store' into maint
The commit-graph file is now part of the "files that the runtime
may keep open file descriptors on, all of which would need to be
closed when done with the object store", and the file descriptor to
an existing commit-graph file now is closed before "gc" finalizes a
new instance to replace it.

* ds/close-object-store:
  packfile: rename close_all_packs to close_object_store
  packfile: close commit-graph in close_all_packs
  commit-graph: use raw_object_store when closing
  commit-graph: extract write_commit_graph_file()
  commit-graph: extract copy_oids_to_commits()
  commit-graph: extract count_distinct_commits()
  commit-graph: extract fill_oids_from_all_packs()
  commit-graph: extract fill_oids_from_commit_hex()
  commit-graph: extract fill_oids_from_packs()
  commit-graph: create write_commit_graph_context
  commit-graph: remove Future Work section
  commit-graph: collapse parameters into flags
  commit-graph: return with errors during write
  commit-graph: fix the_repository reference
2019-07-29 12:38:22 -07:00
Junio C Hamano 080af915a3 Merge branch 'mt/dir-iterator-updates'
Adjust the dir-iterator API and apply it to the local clone
optimization codepath.

* mt/dir-iterator-updates:
  clone: replace strcmp by fspathcmp
  clone: use dir-iterator to avoid explicit dir traversal
  clone: extract function from copy_or_link_directory
  clone: copy hidden paths at local clone
  dir-iterator: add flags parameter to dir_iterator_begin
  dir-iterator: refactor state machine model
  dir-iterator: use warning_errno when possible
  dir-iterator: add tests for dir-iterator API
  clone: better handle symlinked files at .git/objects/
  clone: test for our behavior on odd objects/* content
2019-07-25 13:59:22 -07:00
Matheus Tavares c4d9c506f7 clone: replace strcmp by fspathcmp
Replace the use of strcmp by fspathcmp at copy_or_link_directory, which
is more permissive/friendly to case-insensitive file systems.

Suggested-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-11 13:52:16 -07:00
Matheus Tavares ff7ccc8c9a clone: use dir-iterator to avoid explicit dir traversal
Replace usage of opendir/readdir/closedir API to traverse directories
recursively, at copy_or_link_directory function, by the dir-iterator
API. This simplifies the code and avoids recursive calls to
copy_or_link_directory.

This process also makes copy_or_link_directory call die() in case of an
error on readdir or stat inside dir_iterator_advance. Previously it
would just print a warning for errors on stat and ignore errors on
readdir, which isn't nice because a local git clone could succeed even
though the .git/objects copy didn't fully succeed.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-11 13:52:16 -07:00
Matheus Tavares 14954b799f clone: extract function from copy_or_link_directory
Extract dir creation code snippet from copy_or_link_directory to its own
function named mkdir_if_missing. This change will help to remove
copy_or_link_directory's explicit recursion, which will be done in a
following patch. Also makes the code more readable.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-11 13:52:15 -07:00
Matheus Tavares 68c7c59cf2 clone: copy hidden paths at local clone
Make the copy_or_link_directory function no longer skip hidden
directories. This function, used to copy .git/objects, currently skips
all hidden directories but not hidden files, which is an odd behaviour.
The reason for that could be unintentional: probably the intention was
to skip '.' and '..' only but it ended up accidentally skipping all
directories starting with '.'. Besides being more natural, the new
behaviour is more permissive to the user.

Also adjust tests to reflect this behaviour change.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-11 13:52:15 -07:00
Matheus Tavares 36596fd2df clone: better handle symlinked files at .git/objects/
There is currently an odd behaviour when locally cloning a repository
with symlinks at .git/objects: using --no-hardlinks all symlinks are
dereferenced but without it, Git will try to hardlink the files with the
link() function, which has an OS-specific behaviour on symlinks. On OSX
and NetBSD, it creates a hardlink to the file pointed by the symlink
whilst on GNU/Linux, it creates a hardlink to the symlink itself.

On Manjaro GNU/Linux:
    $ touch a
    $ ln -s a b
    $ link b c
    $ ls -li a b c
    155 [...] a
    156 [...] b -> a
    156 [...] c -> a

But on NetBSD:
    $ ls -li a b c
    2609160 [...] a
    2609164 [...] b -> a
    2609160 [...] c

It's not good to have the result of a local clone to be OS-dependent and
besides that, the current behaviour on GNU/Linux may result in broken
symlinks. So let's standardize this by making the hardlinks always point
to dereferenced paths, instead of the symlinks themselves. Also, add
tests for symlinked files at .git/objects/.

Note: Git won't create symlinks at .git/objects itself, but it's better
to handle this case and be friendly with users who manually create them.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-11 13:52:15 -07:00
Junio C Hamano f496b064fc Merge branch 'nd/switch-and-restore'
Two new commands "git switch" and "git restore" are introduced to
split "checking out a branch to work on advancing its history" and
"checking out paths out of the index and/or a tree-ish to work on
advancing the current history" out of the single "git checkout"
command.

* nd/switch-and-restore: (46 commits)
  completion: disable dwim on "git switch -d"
  switch: allow to switch in the middle of bisect
  t2027: use test_must_be_empty
  Declare both git-switch and git-restore experimental
  help: move git-diff and git-reset to different groups
  doc: promote "git restore"
  user-manual.txt: prefer 'merge --abort' over 'reset --hard'
  completion: support restore
  t: add tests for restore
  restore: support --patch
  restore: replace --force with --ignore-unmerged
  restore: default to --source=HEAD when only --staged is specified
  restore: reject invalid combinations with --staged
  restore: add --worktree and --staged
  checkout: factor out worktree checkout code
  restore: disable overlay mode by default
  restore: make pathspec mandatory
  restore: take tree-ish from --source option instead
  checkout: split part of it to new command 'restore'
  doc: promote "git switch"
  ...
2019-07-09 15:25:44 -07:00
Junio C Hamano 5cb7c73589 Merge branch 'ds/close-object-store'
The commit-graph file is now part of the "files that the runtime
may keep open file descriptors on, all of which would need to be
closed when done with the object store", and the file descriptor to
an existing commit-graph file now is closed before "gc" finalizes a
new instance to replace it.

* ds/close-object-store:
  packfile: rename close_all_packs to close_object_store
  packfile: close commit-graph in close_all_packs
  commit-graph: use raw_object_store when closing
2019-07-09 15:25:37 -07:00
Matthew DeVore cf9ceb5a12 list-objects-filter-options: make filter_spec a string_list
Make the filter_spec string a string_list rather than a raw C string.
The list of strings must be concatted together to make a complete
filter_spec. A future patch will use this capability to build "combine:"
filter specs gradually.

A strbuf would seem to be a more natural choice for this object, but it
unfortunately requires initialization besides just zero'ing out the
memory.  This results in all container structs, and all containers of
those structs, etc., to also require initialization. Initializing them
all would be more cumbersome that simply using a string_list, which
behaves properly when its contents are zero'd.

For the purposes of code simplification, change behavior in how filter
specs are conveyed over the protocol: do not normalize the tree:<depth>
filter specs since there should be no server in existence that supports
tree:# but not tree:#k etc.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-28 08:41:53 -07:00
Junio C Hamano 14f49b2058 Merge branch 'xl/record-partial-clone-origin'
When creating a partial clone, the object filtering criteria is
recorded for the origin of the clone, but this incorrectly used a
hardcoded name "origin" to name that remote; it has been corrected
to honor the "--origin <name>" option.

* xl/record-partial-clone-origin:
  clone: respect user supplied origin name when setting up partial clone
2019-06-17 10:15:20 -07:00
Junio C Hamano 94760948f1 Merge branch 'ba/clone-remote-submodules'
"git clone --recurse-submodules" learned to set up the submodules
to ignore commit object names recorded in the superproject gitlink
and instead use the commits that happen to be at the tip of the
remote-tracking branches from the get-go, by passing the new
"--remote-submodules" option.

* ba/clone-remote-submodules:
  clone: add `--remote-submodules` flag
2019-06-17 10:15:17 -07:00
Junio C Hamano c0e78f7e46 Merge branch 'jk/unused-params-final-batch'
* jk/unused-params-final-batch:
  verify-commit: simplify parameters to run_gpg_verify()
  show-branch: drop unused parameter from show_independent()
  rev-list: drop unused void pointer from finish_commit()
  remove_all_fetch_refspecs(): drop unused "remote" parameter
  receive-pack: drop unused "commands" from prepare_shallow_update()
  pack-objects: drop unused rev_info parameters
  name-rev: drop unused parameters from is_better_name()
  mktree: drop unused length parameter
  wt-status: drop unused status parameter
  read-cache: drop unused parameter from threaded load
  clone: drop dest parameter from copy_alternates()
  submodule: drop unused prefix parameter from some functions
  builtin: consistently pass cmd_* prefix to parse_options
  cmd_{read,write}_tree: rename "unused" variable that is used
2019-06-13 13:19:34 -07:00
Derrick Stolee 2d511cfc0b packfile: rename close_all_packs to close_object_store
The close_all_packs() method is now responsible for more than just pack-files.
It also closes the commit-graph and the multi-pack-index. Rename the function
to be more descriptive of its larger role. The name also fits because the
input parameter is a raw_object_store.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-12 11:33:54 -07:00
Xin Li 1c4a9f9114 clone: respect user supplied origin name when setting up partial clone
Signed-off-by: Xin Li <delphij@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-29 15:13:18 -07:00
Ben Avison 4c6910163a clone: add `--remote-submodules` flag
When using `git clone --recurse-submodules` there was previously no way to
pass a `--remote` switch to the implicit `git submodule update` command for
any use case where you want the submodules to be checked out on their
remote-tracking branch rather than with the SHA-1 recorded in the superproject.

This patch rectifies this situation. It actually passes `--no-fetch` to
`git submodule update` as well on the grounds they the submodule has only just
been cloned, so fetching from the remote again only serves to slow things down.

Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-28 09:22:02 -07:00
Junio C Hamano 2cfab60877 Merge branch 'nd/parse-options-aliases'
Attempt to use an abbreviated option in "git clone --recurs" is
responded by a request to disambiguate between --recursive and
--recurse-submodules, which is bad because these two are synonyms.
The parse-options API has been extended to define such synonyms
more easily and not produce an unnecessary failure.

* nd/parse-options-aliases:
  parse-options: don't emit "ambiguous option" for aliases
2019-05-19 16:45:28 +09:00
Junio C Hamano 5b51f0d38d Merge branch 'js/partial-clone-connectivity-check'
During an initial "git clone --depth=..." partial clone, it is
pointless to spend cycles for a large portion of the connectivity
check that enumerates and skips promisor objects (which by
definition is all objects fetched from the other side).  This has
been optimized out.

* js/partial-clone-connectivity-check:
  t/perf: add perf script for partial clones
  clone: do faster object check for partial clones
2019-05-13 23:50:32 +09:00
Jeff King 3c1dce8835 clone: drop dest parameter from copy_alternates()
Ever since the inception of this function in e6baf4a1ae (clone: clone
from a repository with relative alternates, 2011-08-22), the "dest"
parameter has been unused. Instead, we use add_to_alternates_file(),
which relies on git_pathdup() to find the right file. That in turn works
because we will have initialized and entered the destination repo by
this point.

It's a bit subtle, but this is how it has always worked. And if our
assumptions change, the test in t5601 from e6baf4a1ae should let us
know.

In the meantime, let's drop this unused and confusing parameter from
copy_alternates().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-13 14:22:54 +09:00