Fix the way recently added tests interpolate variables defined
outside them, and document the best practice to help future
developers.
* ps/t7800-variable-interpolation-fix:
t/README: document how to loop around test cases
t7800: use single quotes for test bodies
t7800: improve test descriptions with empty arguments
A unit test for reftable code tried to enumerate all files in a
directory after reftable operations and expected to see nothing but
the files it wanted to leave there, but was fooled by .nfs* cruft
files left, which has been corrected.
* ps/reftable-unit-test-nfs-workaround:
reftable: fix tests being broken by NFS' delete-after-close semantics
Hints that suggest what to do after resolving conflicts can now be
squelched by disabling advice.mergeConflict.
Acked-by: Phillip Wood <phillip.wood123@gmail.com>
cf. <e040c631-42d9-4501-a7b8-046f8dac6309@gmail.com>
* pb/advice-merge-conflict:
builtin/am: allow disabling conflict advice
sequencer: allow disabling conflict advice
"git config" corrupted literal HT characters written in the
configuration file as part of a value, which has been corrected.
* ds/config-internal-whitespace-fix:
config.txt: describe handling of whitespace further
t1300: add more tests for whitespace and inline comments
config: really keep value-internal whitespace verbatim
config: minor addition of whitespace
Code clean-up in the "git log" machinery that implements custom log
message formatting.
* jk/pretty-subject-cleanup:
format-patch: fix leak of empty header string
format-patch: simplify after-subject MIME header handling
format-patch: return an allocated string from log_write_email_headers()
log: do not set up extra_headers for non-email formats
pretty: drop print_email_subject flag
pretty: split oneline and email subject printing
shortlog: stop setting pp.print_email_subject
"git checkout --conflict=bad" reported a bad conflictStyle as if it
were given to a configuration variable; it has been corrected to
report that the command line option is bad.
* pw/checkout-conflict-errorfix:
checkout: fix interaction between --conflict and --merge
checkout: cleanup --conflict=<style> parsing
merge options: add a conflict style member
merge-ll: introduce LL_MERGE_OPTIONS_INIT
xdiff-interface: refactor parsing of merge.conflictstyle
The log_write_email_headers() function recently learned to return the
"extra_headers_p" variable to the caller as an allocated string. We
start by copying rev_info.extra_headers into a strbuf, and then detach
the strbuf at the end of the function. If there are no extra headers, we
leave the strbuf empty. Likewise, if there are no headers to return, we
pass back NULL.
This misses a corner case which can cause a leak. The "do we have any
headers to copy" check is done by looking for a NULL opt->extra_headers.
But the "do we have a non-empty string to return" check is done by
checking the length of the strbuf. That means if opt->extra_headers is
the empty string, we'll "copy" it into the strbuf, triggering an
allocation, but then leak the buffer when we return NULL from the
function.
We can solve this in one of two ways:
1. Rather than checking headers->len at the end, we could check
headers->alloc to see if we allocated anything. That retains the
original behavior before the recent change, where an empty
extra_headers string is "passed through" to the caller. In practice
this doesn't matter, though (the code which eventually looks at the
result treats NULL or the empty string the same).
2. Only bother copying a non-empty string into the strbuf. This has
the added bonus of avoiding a pointless allocation.
Arguably strbuf_addstr() could do this optimization itself, though
it may be slightly dangerous to do so (some existing callers may
not get a fresh allocation when they expect to). In theory callers
are all supposed to use strbuf_detach() in such a case, but there's
no guarantee that this is the case.
This patch uses option 2. Without it, building with SANITIZE=leak shows
many errors in t4021 and elsewhere.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases it makes sense to loop around test cases so that we can
execute the same test with slightly different arguments. There are some
gotchas around quoting here though that are easy to miss and that may
lead to easy-to-miss errors and portability issues.
Document the proper way to do this in "t/README".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In eb84c8b6ce (git-difftool--helper: honor `--trust-exit-code` with
`--dir-diff`, 2024-02-20) we have started to loop around some of the
tests in t7800 so that they are reexecuted with slightly different
arguments. As part of that refactoring the quoting of test bodies was
changed from single quotes (') to double quotes (") so that the value of
the loop variable is accessible to the body.
As the test body is later on passed to eval this change was not required
though. Let's revert it back to use single quotes as usual in our tests.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the tests in t7800 are executed repeatedly in a loop with
different arguments. To distinguish these tests, the value of that
variable is rendered into the test title. But given that one of the
values is the empty string, it results in a somewhat awkward test name:
difftool ignores exit code
Improve this by printing "without options" in case the value is empty.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it more clear what the whitespace characters are in the context of git
configuration files, and significantly improve the description of the leading
and trailing whitespace handling, especially how it works out together with
the presence of inline comments.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a handful of additional tests, to improve the coverage of the handling
of configuration file entries whose values contain internal whitespace,
leading and/or trailing whitespace, which may or may not be enclosed within
quotation marks, or which contain an additional inline comment.
At the same time, rework one already existing whitespace-related test a bit,
to ensure its consistency with the newly added tests. This change introduced
no functional changes to the already existing test.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in function parse_value() that prevented whitespace characters
(i.e. spaces and horizontal tabs) found inside configuration option values
from being parsed and returned in their original form. The bug caused any
number of consecutive whitespace characters to be wrongly "squashed" into
the same number of space characters.
This bug was introduced back in July 2009, in commit ebdaae372b ("config:
Keep inner whitespace verbatim").
Further investigation showed that setting a configuration value, by invoking
git-config(1), converts value-internal horizontal tabs into "\t" escape
sequences, which the buggy value-parsing logic in function parse_value()
didn't "squash" into spaces. That's why the test included in the ebdaae37
commit passed, which presumably made the bug remain undetected for this long.
On the other hand, value-internal literal horizontal tab characters, found in
a configuration file edited by hand, do get "squashed" by the value-parsing
logic, so the right choice was to fix this bug by making the value-internal
whitespace characters preserved verbatim.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In general, binary operators should be enclosed in a pair of leading and
trailing space (SP) characters. Thus, clean up one spotted expression that
for some reason had a "bunched up" operator.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The status.showUntrackedFiles configuration variable had a name
that tempts users to set a Boolean value expressed in our usual
"false", "off", and "0", but it only took "no". This has been
corrected so "true" and its synonyms are taken as "normal", while
"false" and its synonyms are taken as "no".
* jc/show-untracked-false:
status: allow --untracked=false and friends
status: unify parsing of --untracked= and status.showUntrackedFiles
"git diff" and friends learned two extra configuration variables.
* ph/diff-src-dst-prefix-config:
diff.*Prefix: use camelCase in the doc and test titles
diff: add diff.srcPrefix and diff.dstPrefix configuration variables
The output format for dates "iso-strict" has been tweaked to show
a time in the Zulu timezone with "Z" suffix, instead of "+00:00".
* bb/iso-strict-utc:
date: make "iso-strict" conforming for the UTC timezone
The status.showUntrackedFiles configuration variable was
incorrectly documented to accept "false", which has been corrected.
* jw/doc-show-untracked-files-fix:
doc: status.showUntrackedFiles does not take "false"
Mark-ups used in the documentation has been improved for
consistency.
* ja/doc-markup-fixes:
doc: git-clone: format placeholders
doc: git-clone: format verbatim words
doc: git-init: rework config item init.templateDir
doc: git-init: rework definition lists
doc: git-init: format placeholders
doc: git-init: format verbatim parts
The code to iterate over reflogs in the reftable has been optimized
to reduce memory allocation and deallocation.
Reviewed-by: Josh Steadmon <steadmon@google.com>
cf. <Ze9eX-aaWoVaqsPP@google.com>
* ps/reftable-reflog-iteration-perf:
refs/reftable: track last log record name via strbuf
reftable/record: use scratch buffer when decoding records
reftable/record: reuse message when decoding log records
reftable/record: reuse refnames when decoding log records
reftable/record: avoid copying author info
reftable/record: convert old and new object IDs to arrays
refs/reftable: reload correct stack when creating reflog iter
Users with safe.bareRepository=explicit can still work from within
$GIT_DIR of a seconary worktree (which resides at .git/worktrees/$name/)
of the primary worktree without explicitly specifying the $GIT_DIR
environment variable or the --git-dir=<path> option.
* jc/safe-implicit-bare:
setup: notice more types of implicit bare repositories
The code to find the effective end of log message can fall into an
endless loop, which has been corrected.
* fs/find-end-of-log-message-fix:
wt-status: don't find scissors line beyond buf len
The reftable code has its own custom binary search function whose
comparison callback has an unusual interface, which caused the
binary search to degenerate into a linear search, which has been
corrected.
* ps/reftable-block-search-fix:
reftable/block: fix binary search over restart counter
reftable/record: fix memory leak when decoding object records
The code in reftable backend that creates new table files works
better with the tempfile framework to avoid leaving cruft after a
failure.
* ps/reftable-stack-tempfile:
reftable/stack: register compacted tables as tempfiles
reftable/stack: register lockfiles during compaction
reftable/stack: register new tables as tempfiles
lockfile: report when rollback fails
The parse-options code that deals with abbreviated long option
names have been cleaned up.
Reviewed-by: Josh Steadmon <steadmon@google.com>
cf. <ZfDM5Or3EKw7Q9SA@google.com>
* rs/opt-parse-long-fixups:
parse-options: rearrange long_name matching code
parse-options: normalize arg and long_name before comparison
parse-options: detect ambiguous self-negation
parse-options: factor out register_abbrev() and struct parsed_option
parse-options: set arg of abbreviated option lazily
parse-options: recognize abbreviated negated option with arg
It was reported that the reftable unit tests in t0032 fail with the
following assertion when running on top of NFS:
running test_reftable_stack_compaction_concurrent_clean
reftable/stack_test.c: 1063: failed assertion count_dir_entries(dir) == 2
Aborted
Setting a breakpoint immediately before the assertion in fact shows the
following list of files:
./stack_test-1027.QJBpnd
./stack_test-1027.QJBpnd/0x000000000001-0x000000000003-dad7ac80.ref
./stack_test-1027.QJBpnd/.nfs000000000001729f00001e11
./stack_test-1027.QJBpnd/tables.list
Note the weird ".nfs*" file? This file is maintained by NFS clients in
order to emulate delete-after-last-close semantics that we rely on in
the reftable code [1]. Instead of unlinking the file right away and
keeping it open in the client, the NFS client will rename it to ".nfs*"
and then delete that temporary file when the last reference to it gets
dropped. Quoting the NFS FAQ:
> D2. What is a "silly rename"? Why do these .nfsXXXXX files keep
> showing up?
>
> A. Unix applications often open a scratch file and then unlink it.
> They do this so that the file is not visible in the file system name
> space to any other applications, and so that the system will
> automatically clean up (delete) the file when the application exits.
> This is known as "delete on last close", and is a tradition among
> Unix applications.
>
> Because of the design of the NFS protocol, there is no way for a
> file to be deleted from the name space but still remain in use by an
> application. Thus NFS clients have to emulate this using what
> already exists in the protocol. If an open file is unlinked, an NFS
> client renames it to a special name that looks like ".nfsXXXXX".
> This "hides" the file while it remains in use. This is known as a
> "silly rename." Note that NFS servers have nothing to do with this
> behavior.
This of course throws off the assertion that we got exactly two files in
that directory.
The test in question triggers this behaviour by holding two open file
descriptors to the "tables.list" file. One of the references is because
we are about to append to the stack, whereas the other reference is
because we want to compact it. As the compaction has just finished we
already rewrote "tables.list" to point to the new contents, but the
other file descriptor pointing to the old version is still open. Thus we
trigger the delete-after-last-close emulation.
Furthermore, it was reported that this behaviour only triggers with
4f36b8597c (reftable/stack: fix race in up-to-date check, 2024-01-18).
This is expected as well because it is the first point in time where we
actually keep the "tables.list" file descriptor open for the stat cache.
Fix this bug by skipping over any files that start with a leading dot
when counting files. While we could explicitly check for a prefix of
".nfs", other network file systems like SMB for example do the same
trickery but with a ".smb" prefix. In any case though, this loosening of
the assertion should be fine given that the reftable library would never
write files with leading dots by itself.
[1]: https://nfs.sourceforge.net/#faq_d2
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hg-to-git script is full of command injection vulnerabilities
against malicious branch and tag names. It's also old and largely
unmaintained; the last commit was over 4 years ago, and the last code
change before that was from 2013. Users are better off with a modern
remote-helper tool like cinnabar or remote-hg.
So rather than spending time to fix it, let's just get rid of it.
Reported-by: Matthew Rollings <admin@stealthcopter.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few cases left in gitremote-helpers.txt that are missing a
closing quote, so you end up with:
'option deepen-since <timestamp>
with a stray opening quote instead of rendering correctly in italics.
These should have been part of 51d41dc243 (doc/gitremote-helpers: fix
missing single-quote, 2024-03-07), but apparently my eyesight is not
what it once was. Hopefully this is now all of them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In log_write_email_headers(), we append our MIME headers to the set of
extra headers by creating a new strbuf, adding the existing headers, and
then adding our new ones. We had to do it this way when our output
buffer might point to the constant opt->extra_headers variable.
But since the previous commit, we always make a local copy of that
variable. Let's turn that into a strbuf, which lets the MIME code simply
append to it. That simplifies the function and avoids a pointless extra
copy of the headers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pretty-printing a commit in the email format, we have to fill in
the "after subject" field of the pretty_print_context with any extra
headers the user provided (e.g., from "--to" or "--cc" options) plus any
special MIME headers.
We return an out-pointer that sometimes points to a newly heap-allocated
string and sometimes not. To avoid leaking, we store the allocated
version in a buffer with static lifetime, which is ugly. Worse, as we
extend the header feature, we'll end up having to repeat this ugly
pattern.
Instead, let's have our out-pointer pass ownership back to the caller,
and duplicate the string when necessary. This does mean one extra
allocation per commit when you use extra headers, but in the context of
format-patch which is showing diffs, I don't think that's even
measurable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit pretty-printer code has an "after_subject" parameter which it
uses to insert extra headers into the email format. In show_log() we set
this by calling log_write_email_headers() if we are using an email
format, but otherwise default the variable to the rev_info.extra_headers
variable.
Since the pretty-printer code will ignore after_subject unless we are
using an email format, this default is pointless. We can just set
after_subject directly, eliminating an extra variable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With one exception, the print_email_subject flag is set if and only if
the commit format is email based:
- in make_cover_letter() we set it along with CMIT_FMT_EMAIL
explicitly
- in show_log(), we set it if cmit_fmt_is_mail() is true. That covers
format-patch as well as "git log --format=email" (or mboxrd).
The one exception is "rev-list --format=email", which somewhat
nonsensically prints the author and date as email headers, but no
subject, like:
$ git rev-list --format=email HEAD
commit 64fc4c2cdd4db2645eaabb47aa4bac820b03cdba
From: Jeff King <peff@peff.net>
Date: Tue, 19 Mar 2024 19:39:26 -0400
this is the subject
this is the body
It's doubtful that this is a useful format at all (the "commit" lines
replace the "From" lines that would make it work as an actual mbox).
But I think that printing the subject as a header (like this patch does)
is the least surprising thing to do.
So let's drop this field, making the code a little simpler and easier to
reason about. Note that we do need to set the "rev" field of the
pretty_print_context in rev-list, since that is used to check for
subject_prefix, etc. It's not possible to set those fields via rev-list,
so we'll always just print "Subject: ". But unless we pass in our
rev_info, fmt_output_email_subject() would segfault trying to figure it
out.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pp_title_line() function is used for two formats: the oneline format
and the subject line of the email format. But most of the logic in the
function does not make any sense for oneline; it is about special
formatting of email headers.
Lumping the two formats together made sense long ago in 4234a76167
(Extend --pretty=oneline to cover the first paragraph, 2007-06-11), when
there was a lot of manual logic to paste lines together. But later,
88c44735ab (pretty: factor out format_subject(), 2008-12-27) pulled that
logic into its own function.
We can implement the oneline format by just calling that one function.
This makes the intention of the code much more clear, as we know we only
need to worry about those extra email options when dealing with actual
email.
While the intent here is cleanup, it is possible to trigger these cases
in practice by running format-patch with an explicit --oneline option.
But if you did, the results are basically nonsense. For example, with
the preserve_subject flag:
$ printf "%s\n" one two three | git commit --allow-empty -F -
$ git format-patch -1 --stdout -k | grep ^Subject
Subject: =?UTF-8?q?one=0Atwo=0Athree?=
$ git format-patch -1 --stdout -k --oneline --no-signature
2af7fbe one
two
three
Or with extra headers:
$ git format-patch -1 --stdout --cc=me --oneline --no-signature
2af7fbe one two three
Cc: me
So I'd actually consider this to be an improvement, though you are
probably crazy to use other formats with format-patch in the first place
(arguably it should forbid non-email formats entirely, but that's a
bigger change).
As a bonus, it eliminates some pointless extra allocations for the
oneline output. The email code, since it has to deal with wrapping,
formats into an extra auxiliary buffer. The speedup is tiny, though like
"rev-list --no-abbrev --format=oneline" seems to improve by a consistent
1-2% for me.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When shortlog processes a commit using its internal traversal, it may
pretty-print the subject line for the summary view. When we do so, we
set the "print_email_subject" flag in the pretty-print context. But this
flag does nothing! Since we are using CMIT_FMT_USERFORMAT, we skip most
of the usual formatting code entirely.
This flag is there due to commit 6d167fd7cc (pretty: use
fmt_output_email_subject(), 2017-03-01). But that just switched us away
from setting an empty "subject" header field, which was similarly
useless. That was added by dd2e794a21 (Refactor pretty_print_commit
arguments into a struct, 2009-10-19). Before using the struct, we had to
pass _something_ as the argument, so we passed the empty string (a NULL
would have worked equally well).
So this setting has never done anything, and we can drop the line. That
shortens the code, but more importantly, makes it easier to reason about
and refactor the other users of this flag.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for "%(trailers[:options])" placeholder in the
"--pretty" option of commands in the "git log" family has been
updated.
* bl/doc-key-val-sep-fix:
docs: adjust trailer `separator` and `key_value_separator` language
docs: correct trailer `key_value_separator` description
A few typoes in "git config --help" have been corrected.
* bl/doc-config-fixes:
docs: fix typo in git-config `--default`
docs: clarify file options in git-config `--edit`
"git bugreport --no-suffix" was not supported and instead
segfaulted, which has been corrected.
* js/bugreport-no-suffix-fix:
bugreport.c: fix a crash in `git bugreport` with `--no-suffix` option