1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-06-01 22:56:10 +02:00
Commit Graph

268 Commits

Author SHA1 Message Date
René Scharfe ddbb47fde9 replace and remove run_command_v_opt()
Replace the remaining calls of run_command_v_opt() with run_command()
calls and explict struct child_process variables.  This is more verbose,
but not by much overall.  The code becomes more flexible, e.g. it's easy
to extend to conditionally add a new argument.

Then remove the now unused function and its own flag names, simplifying
the run-command API.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:51 -04:00
René Scharfe ef249b398e replace and remove run_command_v_opt_cd_env_tr2()
The convenience function run_command_v_opt_cd_env_tr2() has no external
callers left.  Inline it and remove it from the API.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:50 -04:00
René Scharfe d82dbbd849 replace and remove run_command_v_opt_tr2()
The convenience function run_command_v_opt_tr2() is only used by a
single caller.  Use struct child_process and run_command() directly
instead and remove the underused function.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:48 -04:00
René Scharfe eb5b6b57d0 replace and remove run_command_v_opt_cd_env()
run_command_v_opt_cd_env() is only used in an example in a comment.  Use
the struct child_process member "env" and run_command() directly instead
and then remove the unused convenience function.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:47 -04:00
Jeff King 716c1f649e pipe_command(): mark stdin descriptor as non-blocking
Our pipe_command() helper lets you both write to and read from a child
process on its stdin/stdout. It's supposed to work without deadlocks
because we use poll() to check when descriptors are ready for reading or
writing. But there's a bug: if both the data to be written and the data
to be read back exceed the pipe buffer, we'll deadlock.

The issue is that the code assumes that if you have, say, a 2MB buffer
to write and poll() tells you that the pipe descriptor is ready for
writing, that calling:

  write(cmd->in, buf, 2*1024*1024);

will do a partial write, filling the pipe buffer and then returning what
it did write. And that is what it would do on a socket, but not for a
pipe. When writing to a pipe, at least on Linux, it will block waiting
for the child process to read() more. And now we have a potential
deadlock, because the child may be writing back to us, waiting for us to
read() ourselves.

An easy way to trigger this is:

  git -c add.interactive.useBuiltin=true \
      -c interactive.diffFilter=cat \
      checkout -p HEAD~200

The diff against HEAD~200 will be big, and the filter wants to write all
of it back to us (obviously this is a dummy filter, but in the real
world something like diff-highlight would similarly stream back a big
output).

If you set add.interactive.useBuiltin to false, the problem goes away,
because now we're not using pipe_command() anymore (instead, that part
happens in perl). But this isn't a bug in the interactive code at all.
It's the underlying pipe_command() code which is broken, and has been
all along.

We presumably didn't notice because most calls only do input _or_
output, not both. And the few that do both, like gpg calls, may have
large inputs or outputs, but never both at the same time (e.g., consider
signing, which has a large payload but a small signature comes back).

The obvious fix is to put the descriptor into non-blocking mode, and
indeed, that makes the problem go away. Callers shouldn't need to
care, because they never see the descriptor (they hand us a buffer to
feed into it).

The included test fails reliably on Linux without this patch. Curiously,
it doesn't fail in our Windows CI environment, but has been reported to
do so for individual developers. It should pass in any environment after
this patch (courtesy of the compat/ layers added in the last few
commits).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:41 -07:00
Jeff King c6d3cce6f3 pipe_command(): handle ENOSPC when writing to a pipe
When write() to a non-blocking pipe fails because the buffer is full,
POSIX says we should see EAGAIN. But our mingw_write() compat layer on
Windows actually returns ENOSPC for this case. This is probably
something we want to correct, but given that we don't plan to use
non-blocking descriptors in a lot of places, we can work around it by
just catching ENOSPC alongside EAGAIN. If we ever do fix mingw_write(),
then this patch can be reverted.

We don't actually use a non-blocking pipe yet, so this is still just
preparation.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:41 -07:00
Jeff King 14eab817e4 pipe_command(): avoid xwrite() for writing to pipe
If xwrite() sees an EAGAIN response, it will loop forever until the
write succeeds (or encounters a real error). This is due to ef1cf0167a
(xwrite: poll on non-blocking FDs, 2016-06-26), with the idea that we
won't be surprised by a descriptor unexpectedly set as non-blocking.

But that will make things awkward when we do want a non-blocking
descriptor, and a future patch will switch pipe_command() to using one.
In that case, looping on EAGAIN is bad, because the process on the other
end of the pipe may be waiting on us before doing another read() on the
pipe, which would mean we deadlock.

In practice we're not supposed to ever see EAGAIN here, since poll()
will have just told us the descriptor is ready for writing. But our
Windows emulation of poll() will always return "ready" for writing to a
pipe descriptor! This is due to 94f4d01932 (mingw: workaround for hangs
when sending STDIN, 2020-02-17).

Our best bet in that case is to keep handling other descriptors, as any
read() we do may allow the child command to make forward progress (i.e.,
its write() finishes, and then it read()s from its stdin, freeing up
space in the pipe buffer). This means we might busy-loop between poll()
and write() on Windows if the child command is slow to read our input,
but it's much better than the alternative of deadlocking.

In practice, this busy-looping should be rare:

  - for small inputs, we'll just write the whole thing in a single
    write() anyway, non-blocking or not

  - for larger inputs where the child reads input and then processes it
    before writing (e.g., gpg verifying a signature), we may make a few
    extra write() calls that get EAGAIN during the initial write, but
    once it has taken in the whole input, we'll correctly block waiting
    to read back the data.

  - for larger inputs where the child process is streaming output back
    (like a diff filter), we'll likewise see some extra EAGAINs, but
    most of them will be followed immediately by a read(), which will
    let the child command make forward progress.

Of course it won't happen at all for now, since we don't yet use a
non-blocking pipe. This is just preparation for when we do.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-17 09:21:40 -07:00
Junio C Hamano da4827056a Merge branch 'js/wait-or-whine-can-fail'
We used to log an error return from wait_or_whine() as process
termination of the waited child, which was incorrect.

* js/wait-or-whine-can-fail:
  run-command: don't spam trace2_child_exit()
2022-06-13 15:53:44 -07:00
Junio C Hamano 1a7f6be5b1 Merge branch 'ab/hooks-regression-fix'
In Git 2.36 we revamped the way how hooks are invoked.  One change
that is end-user visible is that the output of a hook is no longer
directly connected to the standard output of "git" that spawns the
hook, which was noticed post release.  This is getting corrected.

* ab/hooks-regression-fix:
  hook API: fix v2.36.0 regression: hooks should be connected to a TTY
  run-command: add an "ungroup" option to run_process_parallel()
2022-06-13 15:53:41 -07:00
Josh Steadmon ce3986bb22 run-command: don't spam trace2_child_exit()
In rare cases[1], wait_or_whine() cannot determine a child process's
status (and will return -1 in this case). This can cause Git to issue
trace2 child_exit events despite the fact that the child may still be
running. In pathological cases, we've seen > 80 million exit events in
our trace logs for a single child process.

Fix this by only issuing trace2 events in finish_command_in_signal() if
we get a value other than -1 from wait_or_whine(). This can lead to
missing child_exit events in such a case, but that is preferable to
duplicating events on a scale that threatens to fill the user's
filesystem with invalid trace logs.

[1]: This can happen when:

* waitpid() returns -1 and errno != EINTR
* waitpid() returns an invalid PID
* the status set by waitpid() has neither the WIFEXITED() nor
  WIFSIGNALED() flags

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-07 12:48:19 -07:00
Ævar Arnfjörð Bjarmason fd3aaf53f7 run-command: add an "ungroup" option to run_process_parallel()
Extend the parallel execution API added in c553c72eed (run-command:
add an asynchronous parallel child processor, 2015-12-15) to support a
mode where the stdout and stderr of the processes isn't captured and
output in a deterministic order, instead we'll leave it to the kernel
and stdio to sort it out.

This gives the API same functionality as GNU parallel's --ungroup
option. As we'll see in a subsequent commit the main reason to want
this is to support stdout and stderr being connected to the TTY in the
case of jobs=1, demonstrated here with GNU parallel:

	$ parallel --ungroup 'test -t {} && echo TTY || echo NTTY' ::: 1 2
	TTY
	TTY
	$ parallel 'test -t {} && echo TTY || echo NTTY' ::: 1 2
	NTTY
	NTTY

Another is as GNU parallel's documentation notes a potential for
optimization. As demonstrated in next commit our results with "git
hook run" will be similar, but generally speaking this shows that if
you want to run processes in parallel where the exact order isn't
important this can be a lot faster:

	$ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 >/dev/null '
	Benchmark 1: parallel  seq ::: 10000000 >/dev/null
	  Time (mean ± σ):     220.2 ms ±   9.3 ms    [User: 124.9 ms, System: 96.1 ms]
	  Range (min … max):   212.3 ms … 230.5 ms    3 runs

	Benchmark 2: parallel --ungroup seq ::: 10000000 >/dev/null
	  Time (mean ± σ):     154.7 ms ±   0.9 ms    [User: 136.2 ms, System: 25.1 ms]
	  Range (min … max):   153.9 ms … 155.7 ms    3 runs

	Summary
	  'parallel --ungroup seq ::: 10000000 >/dev/null ' ran
	    1.42 ± 0.06 times faster than 'parallel  seq ::: 10000000 >/dev/null '

A large part of the juggling in the API is to make the API safer for
its maintenance and consumers alike.

For the maintenance of the API we e.g. avoid malloc()-ing the
"pp->pfd", ensuring that SANITIZE=address and other similar tools will
catch any unexpected misuse.

For API consumers we take pains to never pass the non-NULL "out"
buffer to an API user that provided the "ungroup" option. The
resulting code in t/helper/test-run-command.c isn't typical of such a
user, i.e. they'd typically use one mode or the other, and would know
whether they'd provided "ungroup" or not.

We could also avoid the strbuf_init() for "buffered_output" by having
"struct parallel_processes" use a static PARALLEL_PROCESSES_INIT
initializer, but let's leave that cleanup for later.

Using a global "run_processes_parallel_ungroup" variable to enable
this option is rather nasty, but is being done here to produce as
minimal of a change as possible for a subsequent regression fix. This
change is extracted from a larger initial version[1] which ends up
with a better end-state for the API, but in doing so needed to modify
all existing callers of the API. Let's defer that for now, and
narrowly focus on what we need for fixing the regression in the
subsequent commit.

It's safe to do this with a global variable because:

 A) hook.c is the only user of it that sets it to non-zero, and before
    we'll get any other API users we'll refactor away this method of
    passing in the option, i.e. re-roll [1].

 B) Even if hook.c wasn't the only user we don't have callers of this
    API that concurrently invoke this parallel process starting API
    itself in parallel.

As noted above "A" && "B" are rather nasty, and we don't want to live
with those caveats long-term, but for now they should be an acceptable
compromise.

1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-07 10:01:41 -07:00
Ævar Arnfjörð Bjarmason b3193252c4 run-command API users: use "env" not "env_array" in comments & names
Follow-up on a preceding commit which changed all references to the
"env_array" when referring to the "struct child_process" member. These
changes are all unnecessary for the compiler, but help the code's
human readers.

All the comments that referred to "env_array" have now been updated,
as well as function names and variables that had "env_array" in their
name, they now refer to "env".

In addition the "out" name for the submodule.h prototype was
inconsistent with the function definition's use of "env_array" in
submodule.c. Both of them use "env" now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-02 14:31:27 -07:00
Ævar Arnfjörð Bjarmason 29fda24dd1 run-command API: rename "env_array" to "env"
Start following-up on the rename mentioned in c7c4bdeccf (run-command
API: remove "env" member, always use "env_array", 2021-11-25) of
"env_array" to "env".

The "env_array" name was picked in 19a583dc39 (run-command: add
env_array, an optional argv_array for env, 2014-10-19) because "env"
was taken. Let's not forever keep the oddity of "*_array" for this
"struct strvec", but not for its "args" sibling.

This commit is almost entirely made with a coccinelle rule[1]. The
only manual change here is in run-command.h to rename the struct
member itself and to change "env_array" to "env" in the
CHILD_PROCESS_INIT initializer.

The rest of this is all a result of applying [1]:

 * make contrib/coccinelle/run_command.cocci.patch
 * patch -p1 <contrib/coccinelle/run_command.cocci.patch
 * git add -u

1. cat contrib/coccinelle/run_command.pending.cocci
   @@
   struct child_process E;
   @@
   - E.env_array
   + E.env

   @@
   struct child_process *E;
   @@
   - E->env_array
   + E->env

I've avoided changing any comments and derived variable names here,
that will all be done in the next commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-02 14:31:16 -07:00
Junio C Hamano c70bc338e9 Merge branch 'ab/config-based-hooks-2'
More "config-based hooks".

* ab/config-based-hooks-2:
  run-command: remove old run_hook_{le,ve}() hook API
  receive-pack: convert push-to-checkout hook to hook.h
  read-cache: convert post-index-change to use hook.h
  commit: convert {pre-commit,prepare-commit-msg} hook to hook.h
  git-p4: use 'git hook' to run hooks
  send-email: use 'git hook run' for 'sendemail-validate'
  git hook run: add an --ignore-missing flag
  hooks: convert worktree 'post-checkout' hook to hook library
  hooks: convert non-worktree 'post-checkout' hook to hook library
  merge: convert post-merge to use hook.h
  am: convert applypatch-msg to use hook.h
  rebase: convert pre-rebase to use hook.h
  hook API: add a run_hooks_l() wrapper
  am: convert {pre,post}-applypatch to use hook.h
  gc: use hook library for pre-auto-gc hook
  hook API: add a run_hooks() wrapper
  hook: add 'run' subcommand
2022-02-09 14:21:00 -08:00
Junio C Hamano 4b51386bbf Merge branch 'ab/usage-die-message'
Code clean-up to hide vreportf() from public API.

* ab/usage-die-message:
  config API: use get_error_routine(), not vreportf()
  usage.c + gc: add and use a die_message_errno()
  gc: return from cmd_gc(), don't call exit()
  usage.c API users: use die_message() for error() + exit 128
  usage.c API users: use die_message() for "fatal :" + exit 128
  usage.c: add a die_message() routine
2022-01-10 11:52:53 -08:00
Emily Shaffer 95ba86a203 run-command: remove old run_hook_{le,ve}() hook API
The new hook.h library has replaced all run-command.h hook-related
functionality. So let's delete this dead code.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-07 15:19:35 -08:00
Emily Shaffer dbb1c61365 read-cache: convert post-index-change to use hook.h
Move the post-index-change hook away from run-command.h to and over to
the new hook.h library.

This removes the last direct user of "run_hook_ve()" outside of
run-command.c ("run_hook_le()" still uses it). So we can make the
function static now. A subsequent commit will remove this code
entirely when "run_hook_le()" itself goes away.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-07 15:19:35 -08:00
Junio C Hamano 832ec72c3e Merge branch 'ab/run-command'
API clean-up.

* ab/run-command:
  run-command API: remove "env" member, always use "env_array"
  difftool: use "env_array" to simplify memory management
  run-command API: remove "argv" member, always use "args"
  run-command API users: use strvec_push(), not argv construction
  run-command API users: use strvec_pushl(), not argv construction
  run-command tests: use strvec_pushv(), not argv assignment
  run-command API users: use strvec_pushv(), not argv assignment
  upload-archive: use regular "struct child_process" pattern
  worktree: stop being overly intimate with run_command() internals
2021-12-15 09:39:47 -08:00
Junio C Hamano 97991dfab7 Merge branch 'jk/t7006-sigpipe-tests-fix'
The function to cull a child process and determine the exit status
had two separate code paths for normal callers and callers in a
signal handler, and the latter did not yield correct value when the
child has caught a signal.  The handling of the exit status has
been unified for these two code paths.  An existing test with
flakiness has also been corrected.

* jk/t7006-sigpipe-tests-fix:
  t7006: simplify exit-code checks for sigpipe tests
  t7006: clean up SIGPIPE handling in trace2 tests
  run-command: unify signal and regular logic for wait_or_whine()
2021-12-10 14:35:13 -08:00
Ævar Arnfjörð Bjarmason e081a7c3b7 usage.c API users: use die_message() for "fatal :" + exit 128
Change code that printed its own "fatal: " message and exited with a
status code of 128 to use the die_message() function added in a
preceding commit.

This change also demonstrates why the return value of
die_message_routine() needed to be that of "report_fn". We have
callers such as the run-command.c::child_err_spew() which would like
to replace its error routine with the return value of
"get_die_message_routine()".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:25:15 -08:00
Ævar Arnfjörð Bjarmason c7c4bdeccf run-command API: remove "env" member, always use "env_array"
Remove the "env" member from "struct child_process" in favor of always
using the "env_array". As with the preceding removal of "argv" in
favor of "args" this gets rid of current and future oddities around
memory management at the API boundary (see the amended API docs).

For some of the conversions we can replace patterns like:

    child.env = env->v;

With:

    strvec_pushv(&child.env_array, env->v);

But for others we need to guard the strvec_pushv() with a NULL check,
since we're not passing in the "v" member of a "struct strvec",
e.g. in the case of tmp_objdir_env()'s return value.

Ideally we'd rename the "env_array" member to simply "env" as a
follow-up, since it and "args" are now inconsistent in not having an
"_array" suffix, and seemingly without any good reason, unless we look
at the history of how they came to be.

But as we've currently got 122 in-tree hits for a "git grep env_array"
let's leave that for now (and possibly forever). Doing that rename
would be too disruptive.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25 22:15:08 -08:00
Ævar Arnfjörð Bjarmason d3b2159712 run-command API: remove "argv" member, always use "args"
Remove the "argv" member from the run-command API, ever since "args"
was added in c460c0ecdc (run-command: store an optional argv_array,
2014-05-15) being able to provide either "argv" or "args" has led to
some confusion and bugs.

If we hadn't gone in that direction and only had an "argv" our
problems wouldn't have been solved either, as noted in [1] (and in the
documentation amended here) it comes with inherent memory management
issues: The caller would have to hang on to the "argv" until the
run-command API was finished. If the "argv" was an argument to main()
this wasn't an issue, but if it it was manually constructed using the
API might be painful.

We also have a recent report[2] of a user of the API segfaulting,
which is a direct result of it being complex to use. This commit
addresses the root cause of that bug.

This change is larger than I'd like, but there's no easy way to avoid
it that wouldn't involve even more verbose intermediate steps. We use
the "argv" as the source of truth over the "args", so we need to
change all parts of run-command.[ch] itself, as well as the trace2
logging at the same time.

The resulting Windows-specific code in start_command() is a bit nasty,
as we're now assigning to a strvec's "v" member, instead of to our own
"argv". There was a suggestion of some alternate approaches in reply
to an earlier version of this commit[3], but let's leave larger a
larger and needless refactoring of this code for now.

1. http://lore.kernel.org/git/YT6BnnXeAWn8BycF@coredump.intra.peff.net
2. https://lore.kernel.org/git/20211120194048.12125-1-ematsumiya@suse.de/
3. https://lore.kernel.org/git/patch-5.5-ea1011f7473-20211122T153605Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25 22:15:07 -08:00
Ævar Arnfjörð Bjarmason 6def0ff878 run-command API users: use strvec_pushv(), not argv assignment
Migrate those run-command API users that assign directly to the "argv"
member to use a strvec_pushv() of "args" instead.

In these cases it did not make sense to further refactor these
callers, e.g. daemon.c could be made to construct the arguments closer
to handle(), but that would require moving the construction from its
cmd_main() and pass "argv" through two intermediate functions.

It would be possible for a change like this to introduce a regression
if we were doing:

      cp.argv = argv;
      argv[1] = "foo";

And changed the code, as is being done here, to:

      strvec_pushv(&cp.args, argv);
      argv[1] = "foo";

But as viewing this change with the "-W" flag reveals none of these
functions modify variable that's being pushed afterwards in a way that
would introduce such a logic error.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25 22:15:07 -08:00
Jeff King 96bfb2d8ce run-command: unify signal and regular logic for wait_or_whine()
Since 507d7804c0 (pager: don't use unsafe functions in signal handlers,
2015-09-04), we have a separate code path in wait_or_whine() for the
case that we're in a signal handler. But that code path misses some of
the cases handled by the main logic.

This was improved in be8fc53e36 (pager: properly log pager exit code
when signalled, 2021-02-02), but that covered only case: actually
returning the correct error code. But there are some other cases:

  - if waitpid() returns failure, we wouldn't notice and would look at
    uninitialized garbage in the status variable; it's not clear if it's
    possible to trigger this or not

  - if the process exited by signal, then we would still report "-1"
    rather than the correct signal code

This latter case even had a test added in be8fc53e36, but it doesn't
work reliably. It sets the pager command to:

  >pager-used; test-tool sigchain

The latter command will die by signal, but because there are multiple
commands, there will be a shell in between. And it's the shell whose
waitpid() call will see the signal death, and it will then exit with
code 143, which is what Git will see.

To make matters even more confusing, some shells (such as bash) will
realize that there's nothing for the shell to do after test-tool
finishes, and will turn it into an exec. So the test was only checking
what it thought when /bin/sh points to a shell like bash (we're relying
on the shell used internally by Git to spawn sub-commands here, so even
running the test under bash would not be enough).

This patch adjusts the tests to explicitly call "exec" in the pager
command, which produces a consistent outcome regardless of shell. Note
that without the code change in this patch it _should_ fail reliably,
but doesn't. That test, like its siblings, tries to trigger SIGPIPE in
the git-log process writing to the pager, but only do so racily. That
will be fixed in a follow-on patch.

For the code change here, we have two options:

  - we can teach the in_signal code to handle WIFSIGNALED()

  - we can stop returning early when in_signal is set, and instead
    annotate individual calls that we need to skip in this case

The former is a simpler patch, but means we're essentially duplicating
all of the logic. So instead I went with the latter. The result is a
bigger patch, and we do run the risk of new code being added but
forgetting to handle in_signal. But in the long run it seems more
maintainable.

I've skipped any non-trivial calls for the in_signal case, like calling
error(). We'll also skip the call to clear_child_for_cleanup(), as we
were before. This is arguably the wrong thing to do, since we wouldn't
want to try to clean it up again. But:

  - we can't call it as-is, because it calls free(), which we must avoid
    in a signal handler (we'd have to pass in_signal so it can skip the
    free() call)

  - we'll only go through the list of children to clean once, since our
    cleanup_children_on_signal() handler pops itself after running (and
    then re-raises, so eventually we'd just exit). So this cleanup only
    matters if a process is on the cleanup list _and_ it has a separate
    handler to clean itself up. Which is questionable in the first place
    (and AFAIK we do not do).

  - double-cleanup isn't actually that bad anyway. waitpid() will just
    return an error, which we won't even report because of in_signal.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-22 15:43:18 -08:00
Junio C Hamano a73934c320 Merge branch 'vd/pthread-setspecific-g11-fix'
One CI task based on Fedora image noticed a not-quite-kosher
consturct recently, which has been corrected.

* vd/pthread-setspecific-g11-fix:
  async_die_is_recursing: work around GCC v11.x issue on Fedora
2021-11-04 12:07:47 -07:00
Victoria Dye 4b540cf913 async_die_is_recursing: work around GCC v11.x issue on Fedora
This fix corrects an issue found in the `dockerized(pedantic, fedora)` CI
build, first appearing after the introduction of a new version of the Fedora
docker image version. This image includes a version of `glibc` with the
attribute `__attr_access_none` added to `pthread_setspecific` [1], the
implementation of which only exists for GCC 11.X - the version included in
the Fedora image. The attribute requires that the pointer provided in the
second argument of `pthread_getspecific` must, if not NULL, be a pointer to
a valid object. In the usage in `async_die_is_recursing`, `(void *)1` is not
valid, causing the error.

This fix imitates a workaround added in SELinux [2] by using the pointer to
the static `async_die_counter` itself as the second argument to
`pthread_setspecific`. This guaranteed non-NULL, valid pointer matches the
intent of the current usage while not triggering the build error.

[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=a1561c3bbe8
[2] https://lore.kernel.org/all/20211021140519.6593-1-cgzones@googlemail.com/

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03 23:12:10 -07:00
Junio C Hamano af303ee392 Merge branch 'jh/builtin-fsmonitor-part1'
Built-in fsmonitor (part 1).

* jh/builtin-fsmonitor-part1:
  t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command
  run-command: create start_bg_command
  simple-ipc/ipc-win32: add Windows ACL to named pipe
  simple-ipc/ipc-win32: add trace2 debugging
  simple-ipc: move definition of ipc_active_state outside of ifdef
  simple-ipc: preparations for supporting binary messages.
  trace2: add trace2_child_ready() to report on background children
2021-10-13 15:15:58 -07:00
Ævar Arnfjörð Bjarmason 5e3aba33da hook.[ch]: move find_hook() from run-command.c to hook.c
Move the find_hook() function from run-command.c to a new hook.c
library. This change establishes a stub library that's pretty
pointless right now, but will see much wider use with Emily Shaffer's
upcoming "configuration-based hooks" series.

Eventually all the hook related code will live in hook.[ch]. Let's
start that process by moving the simple find_hook() function over
as-is.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 09:44:54 -07:00
Junio C Hamano 0a4cb1f1f2 Merge branch 'mr/bisect-in-c-4'
Rewrite of "git bisect" in C continues.

* mr/bisect-in-c-4:
  bisect--helper: retire `--bisect-next-check` subcommand
  bisect--helper: reimplement `bisect_run` shell function in C
  bisect--helper: reimplement `bisect_visualize()` shell function in C
  run-command: make `exists_in_PATH()` non-static
  t6030-bisect-porcelain: add test for bisect visualize
  t6030-bisect-porcelain: add tests to control bisect run exit cases
2021-09-23 13:44:48 -07:00
Junio C Hamano c042ad5ad5 Merge branch 'js/run-command-close-packs'
The run-command API has been updated so that the callers can easily
ask the file descriptors open for packfiles to be closed immediately
before spawning commands that may trigger auto-gc.

* js/run-command-close-packs:
  Close object store closer to spawning child processes
  run_auto_maintenance(): implicitly close the object store
  run-command: offer to close the object store before running
  run-command: prettify the `RUN_COMMAND_*` flags
  pull: release packs before fetching
  commit-graph: when closing the graph, also release the slab
2021-09-20 15:20:45 -07:00
Jeff Hostetler fdb1322651 run-command: create start_bg_command
Create a variation of `run_command()` and `start_command()` to launch a command
into the background and optionally wait for it to become "ready" before returning.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 08:57:58 -07:00
Pranit Bauva 3f36e6f30c run-command: make `exists_in_PATH()` non-static
Remove the `static` keyword from `exists_in_PATH()` function
and declare the function in `run-command.h` file.
The function will be used in bisect_visualize() in a later
commit.

Mentored by: Christian Couder <chriscool@tuxfamily.org>
Mentored by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 13:37:37 -07:00
Junio C Hamano fd0d7036e0 Merge branch 'ab/retire-advice-config'
Code clean up to migrate callers from older advice_config[] based
API to newer advice_if_enabled() and advice_enabled() API.

* ab/retire-advice-config:
  advice: move advice.graftFileDeprecated squashing to commit.[ch]
  advice: remove use of global advice_add_embedded_repo
  advice: remove read uses of most global `advice_` variables
  advice: add enum variants for missing advice variables
2021-09-10 11:46:29 -07:00
Johannes Schindelin 5a22a334cb run_auto_maintenance(): implicitly close the object store
Before spawning the auto maintenance, we need to make sure that we
release all open file handles to all the `.pack` files (and MIDX files
and commit-graph files and...) so that the maintenance process has the
freedom to delete those files.

So far, we did this manually every time before calling
`run_auto_maintenance()`. With the new `close_object_store` flag, we can
do that implicitly in that function, which is more robust because future
callers won't be able to forget to close the object store.

Note: this changes behavior slightly, as we previously _always_ closed
the object store, but now we only close the object store when actually
running the auto maintenance. In practice, this should not matter (if
anything, it might speed up operations where auto maintenance is
disabled).

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 12:56:11 -07:00
Johannes Schindelin 28d04e1ec1 run-command: offer to close the object store before running
Especially on Windows, where files cannot be deleted if _any_ process
holds an open file handle to them, it is important to close the object
store (releasing all handles to all `.pack` files) before running a
command that might spawn a garbage collection.

This scenario is so common that we frequently see the pattern of closing
the object store before running auto maintenance or another Git command.

Let's make this much more convenient by teaching the `run_command()`
machinery a new flag to release the object store before spawning the
process.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09 12:56:11 -07:00
René Scharfe 66e905b7dd use xopen() to handle fatal open(2) failures
Add and apply a semantic patch for using xopen() instead of calling
open(2) and die() or die_errno() explicitly.  This makes the error
messages more consistent and shortens the code.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 14:39:08 -07:00
Ben Boeckel ed9bff0817 advice: remove read uses of most global `advice_` variables
In c4a09cc9cc (Merge branch 'hw/advise-ng', 2020-03-25), a new API for
accessing advice variables was introduced and deprecated `advice_config`
in favor of a new array, `advice_setting`.

This patch ports all but two uses which read the status of the global
`advice_` variables over to the new `advice_enabled` API. We'll deal
with advice_add_embedded_repo and advice_graft_file_deprecated
separately.

Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25 12:07:52 -07:00
Junio C Hamano 8721e2eaed Merge branch 'jt/partial-clone-submodule-1'
Prepare the internals for lazily fetching objects in submodules
from their promisor remotes.

* jt/partial-clone-submodule-1:
  promisor-remote: teach lazy-fetch in any repo
  run-command: refactor subprocess env preparation
  submodule: refrain from filtering GIT_CONFIG_COUNT
  promisor-remote: support per-repository config
  repository: move global r_f_p_c to repo struct
2021-07-16 17:42:53 -07:00
Ævar Arnfjörð Bjarmason 5726a6b401 *.c *_init(): define in terms of corresponding *_INIT macro
Change the common patter in the codebase of duplicating the
initialization logic between an *_INIT macro and a
corresponding *_init() function to use the macro as the canonical
source of truth.

Now we no longer need to keep the function up-to-date with the macro
version. This implements a suggestion by Jeff King who found that
under -O2 [1] modern compilers will init new version in place without
the extra copy[1]. The performance of a single *_init() won't matter
in most cases, but even if it does we're going to be producing
efficient machine code to perform these operations.

1. https://lore.kernel.org/git/YNyrDxUO1PlGJvCn@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:32:22 -07:00
Jonathan Tan d1fa94356d run-command: refactor subprocess env preparation
submodule.c has functionality that prepares the environment for running
a subprocess in a new repo. The lazy-fetching code (used in partial
clones) will need this in a subsequent commit, so move it to a more
central location.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:58:01 -07:00
René Scharfe ca56dadb4b use CALLOC_ARRAY
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 16:00:09 -08:00
Junio C Hamano 56a57652ef Sync with Git 2.30.2 for CVE-2021-21300
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 16:09:07 -08:00
Johannes Schindelin e4e68081bb Sync with 2.29.3
* maint-2.29:
  Git 2.29.3
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:51:12 +01:00
Johannes Schindelin d7bdabe52f Sync with 2.28.1
* maint-2.28:
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:14 +01:00
Johannes Schindelin 3f01e56686 Sync with 2.27.1
* maint-2.27:
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:09 +01:00
Johannes Schindelin 2d1142a3e8 Sync with 2.26.3
* maint-2.26:
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:04 +01:00
Johannes Schindelin 97d1dcb1ef Sync with 2.24.4
* maint-2.24:
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:55 +01:00
Johannes Schindelin bcf08f33d8 Sync with 2.21.4
* maint-2.21:
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:41 +01:00
Johannes Schindelin 804963848e Sync with 2.19.6
* maint-2.19:
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:17 +01:00
Johannes Schindelin fb049fd85b Sync with 2.18.5
* maint-2.18:
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:47:47 +01:00