1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-05-07 15:06:08 +02:00
Commit Graph

93 Commits

Author SHA1 Message Date
Junio C Hamano 5cd3f68add Merge branch 'kh/range-diff-notes'
"git range-diff --notes=foo" compared "log --notes=foo --notes" of
the two ranges, instead of using just the specified notes tree.

* kh/range-diff-notes:
  range-diff: treat notes like `log`
2023-09-29 09:04:15 -07:00
Kristoffer Haugsbakk 2e0d30d928 range-diff: treat notes like `log`
Currently, `range-diff` shows the default notes if no notes-related
arguments are given. This is also how `log` behaves. But unlike
`range-diff`, `log` does *not* show the default notes if
`--notes=<custom>` are given. In other words, this:

    git log --notes=custom

is equivalent to this:

    git log --no-notes --notes=custom

While:

    git range-diff --notes=custom

acts like this:

    git log --notes --notes-custom

This can’t be how the user expects `range-diff` to behave given that the
man page for `range-diff` under `--[no-]notes[=<ref>]` says:

> This flag is passed to the `git log` program (see git-log(1)) that
> generates the patches.

This behavior also affects `format-patch` since it uses `range-diff` for
the cover letter. Unlike `log`, though, `format-patch` is not supposed
to show the default notes if no notes-related arguments are given.[1]
But this promise is broken when the range-diff happens to have something
to say about the changes to the default notes, since that will be shown
in the cover letter.

Remedy this by introducing `--show-notes-by-default` that `range-diff` can
use to tell the `log` subprocess what to do.

§ Authors

• Fix by Johannes
• Tests by Kristoffer

† 1: See e.g. 66b2ed09c2 (Fix "log" family not to be too agressive about
    showing notes, 2010-01-20).

Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-19 14:40:19 -07:00
Jeff King beaa1d952b hashmap: use expected signatures for comparison functions
We prefer for callback functions to match the signature with which
they'll be called, rather than casting them to the correct type when
assigning function pointers. Even though casting often works in the real
world, it is a violation of the standard.

We did a mass conversion in 939af16eac (hashmap_cmp_fn takes
hashmap_entry params, 2019-10-06), but have grown a few new cases since
then. Because of the cast, the compiler does not complain. However, as
of clang-18, UBSan will catch these at run-time, and the case in
range-diff.c triggers when running t3206.

After seeing that one, I scanned the results of:

  git grep '_fn)[^(]' '*.c' | grep -v typedef

and found a similar case in compat/terminal.c (which presumably isn't
called in the test suite, since it doesn't trigger UBSan). There might
be other cases lurking if the cast is done using a typedef that doesn't
end in "_fn", but loosening it finds too many false positives. I also
looked for:

  git grep ' = ([a-z_]*) *[a-z]' '*.c'

to find assignments that cast, but nothing looked like a function.

The resulting code is unfortunately a little longer, but the bonus of
using container_of() is that we are no longer restricted to the
hashmap_entry being at the start of the struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-19 21:17:53 -07:00
Elijah Newren df6e874496 diff.h: remove unnecessary include of oidset.h
This also made it clear that several .c files depended upon various
things that oidset included, but had omitted the direct #include for
those headers.  Add those now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 5e3f94dfe3 treewide: remove cache.h inclusion due to previous changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
Elijah Newren ca4eed708d pager.h: move declarations for pager.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Junio C Hamano e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Ævar Arnfjörð Bjarmason 035c7de9e9 cocci: apply the "revision.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"revision.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
Ævar Arnfjörð Bjarmason 085390328f cocci: apply the "diff.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"diff.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason d850b7a545 cocci: apply the "cache.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"cache.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
Elijah Newren 32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
René Scharfe d9165bef58 range-diff: avoid compiler warning when char is unsigned
Since 2b15969f61 (range-diff: let '--abbrev' option takes effect,
2023-02-20), GCC 11.3 on Ubuntu 22.04 on aarch64 warns (and errors
out if the make variable DEVELOPER is set):

range-diff.c: In function ‘output_pair_header’:
range-diff.c:388:20: error: comparison is always false due to limited range of data type [-Werror=type-limits]
  388 |         if (abbrev < 0)
      |                    ^
cc1: all warnings being treated as errors

That's because char is unsigned on that platform.  Use int instead, just
like in struct diff_options, to copy the value faithfully.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-28 14:43:05 -08:00
Teng Long 2b15969f61 range-diff: let '--abbrev' option takes effect
As mentioned in 'git-range-diff.txt': "`git range-diff` also accepts the
regular diff options (see linkgit:git-diff[1])...", but '--abbrev' is not
in the "regular" scope.

In Git, the "abbrev" of an object may not be a fixed value in different
repositories, depending on the needs of the them(Linus mentioned in
e6c587c7 in 2016: "the Linux kernel project needs 11 to 12 hexdigits"
at that time ), that's why a user may want to display abbrev according
to a specified length.

Although a similar effect can be achieved through configuration (like:
git -c core.abbrev=<abbrev>), but based on ease of use (many users may not
know that the -c option can be specified) and the description in existing
document, supporting users to directly use '--abbrev', could be a good way.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21 14:02:05 -08:00
Jeff King 61bdc7c5d8 diff: mark unused parameters in callbacks
The diff code provides a format_callback interface, but not every
callback needs each parameter (e.g., the "opt" and "data" parameters are
frequently left unused). Likewise for the output_prefix callback, the
low-level change/add_remove interfaces, the callbacks used by
xdi_diff(), etc.

Mark unused arguments in the callback implementations to quiet
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:23 +09:00
Junio C Hamano dd407f1c7c Merge branch 'ab/unused-annotation'
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.

* ab/unused-annotation:
  git-compat-util.h: use "deprecated" for UNUSED variables
  git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14 12:56:39 -07:00
Junio C Hamano a6b42ec0c6 Merge branch 'jk/unused-annotation'
Annotate function parameters that are not used (but cannot be
removed for structural reasons), to prepare us to later compile
with -Wunused warning turned on.

* jk/unused-annotation:
  is_path_owned_by_current_uid(): mark "report" parameter as unused
  run-command: mark unused async callback parameters
  mark unused read_tree_recursive() callback parameters
  hashmap: mark unused callback parameters
  config: mark unused callback parameters
  streaming: mark unused virtual method parameters
  transport: mark bundle transport_options as unused
  refs: mark unused virtual method parameters
  refs: mark unused reflog callback parameters
  refs: mark unused each_ref_fn parameters
  git-compat-util: add UNUSED macro
2022-09-14 12:56:39 -07:00
Ævar Arnfjörð Bjarmason 5cf88fd8b0 git-compat-util.h: use "UNUSED", not "UNUSED(var)"
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75d (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.

Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.

This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b24034754 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.

1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:49:48 -07:00
Johannes Schindelin b75747829f range-diff: optionally accept pathspecs
The `git range-diff` command can be quite expensive, which is not a
surprise given that the underlying algorithm to match up pairs of
commits between the provided two commit ranges has a cubic runtime.

Therefore it makes sense to restrict the commit ranges as much as
possible, to reduce the amount of input to that O(N^3) algorithm.

In chatty repositories with wide trees, this is not necessarily
possible merely by choosing commit ranges wisely.

Let's give users another option to restrict the commit ranges: by
providing a pathspec. That helps in repositories with wide trees because
it is likely that the user has a good idea which subset of the tree they
are actually interested in.

Example:

	git range-diff upstream/main upstream/seen HEAD -- range-diff.c

This shows commits that are either in the local branch or in `seen`, but
not in `main`, skipping all commits that do not touch `range-diff.c`.

Note: Since we piggy-back the pathspecs onto the `other_arg` mechanism
that was introduced to be able to pass through the `--notes` option to
the revision machinery, we must now ensure that the `other_arg` array is
appended at the end (the revision range must come before the pathspecs,
if any).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-26 09:49:26 -07:00
Jeff King 02c3c59e62 hashmap: mark unused callback parameters
Hashmap comparison functions must conform to a particular callback
interface, but many don't use all of their parameters. Especially the
void cmp_data pointer, but some do not use keydata either (because they
can easily form a full struct to pass when doing lookups). Let's mark
these to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:55 -07:00
Junio C Hamano ecbd60ae99 Merge branch 'pb/range-diff-with-submodule'
"git -c diff.submodule=log range-diff" did not show anything for
submodules that changed in the ranges being compared, and
"git -c diff.submodule=diff range-diff" did not work correctly.
Fix this by including the "--submodule=short" output
unconditionally to be compared.

* pb/range-diff-with-submodule:
  range-diff: show submodule changes irrespective of diff.submodule
2022-06-13 15:53:41 -07:00
Philippe Blain 04b1f1fd9d range-diff: show submodule changes irrespective of diff.submodule
After generating diffs for each range to be compared using a 'git log'
invocation, range-diff.c::read_patches looks for the "diff --git" header
in those diffs to recognize the beginning of a new change.

In a project with submodules, and with 'diff.submodule=log' set in the
config, this header is missing for the diff of a changed submodule, so
any submodule changes are quietly ignored in the range-diff.

When 'diff.submodule=diff' is set in the config, the "diff --git" header
is also missing for the submodule itself, but is shown for submodule
content changes, which can easily confuse 'git range-diff' and lead to
errors such as:

    error: git apply: bad git-diff - inconsistent old filename on line 1
    error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
    '
    error: could not parse log for '@{u}..@{1}'

Force the submodule diff format to its default ("short") when invoking
'git log' to generate the patches for each range, such that submodule
changes are always detected.

Add a test, including an invocation with '--creation-factor=100' to
force the second commit in the range not to be considered a complete
rewrite, in order to verify we do indeed get the "short" format.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 15:47:01 -07:00
Ævar Arnfjörð Bjarmason 1878b5edc0 revision.[ch]: provide and start using a release_revisions()
The users of the revision.[ch] API's "struct rev_info" are a major
source of memory leaks in the test suite under SANITIZE=leak, which in
turn adds a lot of noise when trying to mark up tests with
"TEST_PASSES_SANITIZE_LEAK=true".

The users of that API are largely one-shot, e.g. "git rev-list" or
"git log", or the "git checkout" and "git stash" being modified here

For these callers freeing the memory is arguably a waste of time, but
in many cases they've actually been trying to free the memory, and
just doing that in a buggy manner.

Let's provide a release_revisions() function for these users, and
start migrating them over per the plan outlined in [1]. Right now this
only handles the "pending" member of the struct, but more will be
added in subsequent commits.

Even though we only clear the "pending" member now, let's not leave a
trap in code like the pre-image of index_differs_from(), where we'd
start doing the wrong thing as soon as the release_revisions() learned
to clear its "diffopt". I.e. we need to call release_revisions() after
we've inspected any state in "struct rev_info".

This leaves in place e.g. clear_pathspec(&rev.prune_data) in
stash_working_tree() in builtin/stash.c, subsequent commits will teach
release_revisions() to free "prune_data" and other members that in
some cases are individually cleared by users of "struct rev_info" by
reaching into its members. Those subsequent commits will remove the
relevant calls to e.g. clear_pathspec().

We avoid amending code in index_differs_from() in diff-lib.c as well
as wt_status_collect_changes_index(), has_unstaged_changes() and
has_uncommitted_changes() in wt-status.c in a way that assumes that we
are already clearing the "diffopt" member. That will be handled in a
subsequent commit.

1. https://lore.kernel.org/git/87a6k8daeu.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 23:56:08 -07:00
Ævar Arnfjörð Bjarmason 2d102c2bca range-diff: plug memory leak in read_patches()
Amend code added in d9c66f0b5b (range-diff: first rudimentary
implementation, 2018-08-13) to use a "goto cleanup" pattern. This
makes for less code, and frees memory that we'd previously leak.

The reason for changing free(util) to FREE_AND_NULL(util) is because
at the end of the function we append the contents of "util" to a
"struct string_list" if it's non-NULL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:19 -08:00
Ævar Arnfjörð Bjarmason 4998e93fa6 range-diff: plug memory leak in common invocation
Create a public release_patch() version of the private free_patch()
function added in 13b5af22f3 (apply: move libified code from
builtin/apply.c to apply.{c,h}, 2016-04-22). Unlike the existing
function this one doesn't free() the "struct patch" itself, so we can
use it for variables on the stack.

Use it in range-diff.c to fix a memory leak in common range-diff
invocations, e.g.:

    git -P range-diff origin/master origin/next origin/seen

Would emit several errors when compiled with SANITIZE=leak, but now
runs cleanly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:19 -08:00
Jean-Noël Avila 43ea635c35 i18n: refactor "foo and bar are mutually exclusive"
Use static strings for constant parts of the sentences. They are all
turned into "cannot be used together".

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-05 13:29:23 -08:00
Junio C Hamano 3d141d8789 Merge branch 'rs/range-diff-avoid-segfault-with-I'
"git range-diff -I... <range> <range>" segfaulted, which has been
corrected.

* rs/range-diff-avoid-segfault-with-I:
  range-diff: avoid segfault with -I
2021-09-15 13:15:27 -07:00
René Scharfe 709b3f32d3 range-diff: avoid segfault with -I
output() reuses the same struct diff_options for multiple calls of
diff_flush().  Set the option no_free to instruct it to keep the
ignore regexes between calls and release them explicitly at the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 13:03:13 -07:00
Junio C Hamano fb0b14df65 Merge branch 'jk/range-diff-fixes'
"git range-diff" code clean-up.

* jk/range-diff-fixes:
  range-diff: use ssize_t for parsed "len" in read_patches()
  range-diff: handle unterminated lines in read_patches()
  range-diff: drop useless "offset" variable from read_patches()
2021-08-30 16:06:05 -07:00
Jeff King c4d5907324 range-diff: use ssize_t for parsed "len" in read_patches()
As we iterate through the buffer containing git-log output, parsing
lines, we use an "int" to store the size of an individual line. This
should be a size_t, as we have no guarantee that there is not a
malicious 2GB+ commit-message line in the output.

Overflowing this integer probably doesn't do anything _too_ terrible. We
are not using the value to size a buffer, so the worst case is probably
an out-of-bounds read from before the array. But it's easy enough to
fix.

Note that we have to use ssize_t here, since we also store the length
result from parse_git_diff_header(), which may return a negative value
for error. That function actually returns an int itself, which has a
similar overflow problem, but I'll leave that for another day. Much
of the apply.c code uses ints and should be converted as a whole; in the
meantime, a negative return from parse_git_diff_header() will be
interpreted as an error, and we'll bail (so we can't handle such a case,
but given that it's likely to be malicious anyway, the important thing
is we don't have any memory errors).

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:37:36 -07:00
Jeff King 7c86d365da range-diff: handle unterminated lines in read_patches()
When parsing our buffer of output from git-log, we have a
find_end_of_line() helper that finds the next newline, and gives us the
number of bytes to move past it, or the size of the whole remaining
buffer if there is no newline.

But trying to handle both those cases leads to some oddities:

  - we try to overwrite the newline with NUL in the caller, by writing
    over line[len-1]. This is at best redundant, since the helper will
    already have done so if it saw a newline. But if it didn't see a
    newline, it's actively wrong; we'll overwrite the byte at the end of
    the (unterminated) line.

    We could solve this just dropping the extra NUL assignment in the
    caller and just letting the helper do the right thing. But...

  - if we see a "diff --git" line, we'll restore the newline on top of
    the NUL byte, so we can pass the string to parse_git_diff_header().
    But if there was no newline in the first place, we can't do this.
    There's no place to put it (the current code writes a newline
    over whatever byte we obliterated earlier). The best we can do is
    feed the complete remainder of the buffer to the function (which is,
    in fact, a string, by virtue of being a strbuf).

To solve this, the caller needs to know whether we actually found a
newline or not. We could modify find_end_of_line() to return that
information, but we can further observe that it has only one caller.
So let's just inline it in that caller.

Nobody seems to have noticed this case, probably because git-log would
never produce input that doesn't end with a newline. Arguably we could
just return an error as soon as we see that the output does not end in a
newline. But the code to do so actually ends up _longer_, mostly because
of the cleanup we have to do in handling the error.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:37:36 -07:00
Jeff King 47ac23d314 range-diff: drop useless "offset" variable from read_patches()
The "offset" variable was was introduced in 44b67cb62b (range-diff:
split lines manually, 2019-07-11), but it has never done anything
useful. We use it to count up the number of bytes we've consumed, but we
never look at the result. It was probably copied accidentally from an
almost-identical loop in apply.c:find_header() (and the point of that
commit was to make use of the parse_git_diff_header() function which
underlies both).

Because the variable was set but not used, most compilers didn't seem to
notice, but the upcoming clang-14 does complain about it, via its
-Wunused-but-set-variable warning.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:37:36 -07:00
Junio C Hamano 4da281e84d Merge branch 'ab/pickaxe-pcre2'
Rewrite the backend for "diff -G/-S" to use pcre2 engine when
available.

* ab/pickaxe-pcre2: (22 commits)
  xdiff-interface: replace discard_hunk_line() with a flag
  xdiff users: use designated initializers for out_line
  pickaxe -G: don't special-case create/delete
  pickaxe -G: terminate early on matching lines
  xdiff-interface: allow early return from xdiff_emit_line_fn
  xdiff-interface: prepare for allowing early return
  pickaxe -S: slightly optimize contains()
  pickaxe: rename variables in has_changes() for brevity
  pickaxe -S: support content with NULs under --pickaxe-regex
  pickaxe: assert that we must have a needle under -G or -S
  pickaxe: refactor function selection in diffcore-pickaxe()
  perf: add performance test for pickaxe
  pickaxe/style: consolidate declarations and assignments
  diff.h: move pickaxe fields together again
  pickaxe: die when --find-object and --pickaxe-all are combined
  pickaxe: die when -G and --pickaxe-regex are combined
  pickaxe tests: add missing test for --no-pickaxe-regex being an error
  pickaxe tests: test for -G, -S and --find-object incompatibility
  pickaxe tests: add test for "log -S" not being a regex
  pickaxe tests: add test for diffgrep_consume() internals
  ...
2021-07-13 16:52:50 -07:00
Ævar Arnfjörð Bjarmason a8d5eb6dc0 xdiff-interface: prepare for allowing early return
Change the function prototype of xdiff_emit_line_fn to return an "int"
instead of "void". Change all of those functions to "return 0",
nothing checks those return values yet, and no behavior is being
changed.

In subsequent commits the interface will be changed to allow early
return via this new return value.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
brian m. carlson 14228447c9 hash: provide per-algorithm null OIDs
Up until recently, object IDs did not have an algorithm member, only a
hash.  Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms.  Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
René Scharfe ca56dadb4b use CALLOC_ARRAY
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 16:00:09 -08:00
Junio C Hamano dadc91ff0c Merge branch 'js/range-diff-one-side-only'
The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.

* js/range-diff-one-side-only:
  range-diff: offer --left-only/--right-only options
  range-diff: move the diffopt initialization down one layer
  range-diff: combine all options in a single data structure
  range-diff: simplify code spawning `git log`
  range-diff: libify the read_patches() function again
  range-diff: avoid leaking memory in two error code paths
2021-02-17 17:21:41 -08:00
Johannes Schindelin 359f0d754a range-diff/format-patch: handle commit ranges other than A..B
In the `SPECIFYING RANGES` section of gitrevisions[7], two ways are
described to specify commit ranges that `range-diff` does not yet
accept: "<commit>^!" and "<commit>^-<n>".

Let's accept them, by parsing them via the revision machinery and
looking for at least one interesting and one uninteresting revision in
the resulting `pending` array.

This also finally lets us reject arguments that _do_ contain `..` but
are not actually ranges, e.g. `HEAD^{/do.. match this}`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:24:55 -08:00
Johannes Schindelin 1e79f97326 range-diff: offer --left-only/--right-only options
When comparing commit ranges, one is frequently interested only in one
side, such as asking the question "Has this patch that I submitted to
the Git mailing list been applied?": one would only care about the part
of the output that corresponds to the commits in a local branch.

To make that possible, imitate the `git rev-list` options `--left-only`
and `--right-only`.

This addresses https://github.com/gitgitgadget/git/issues/206

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
Johannes Schindelin 3e6046edad range-diff: move the diffopt initialization down one layer
It is actually only the `output()` function that uses those diffopts. By
moving the diffopt initialization down into that function, it is
encapsulated better.

Incidentally, it will also make it easier to implement the `--left-only`
and `--right-only` options in `git range-diff` because the `output()`
function is now receiving all range-diff options as a parameter, not
just the diffopts.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
Johannes Schindelin f1ce6c191e range-diff: combine all options in a single data structure
This will make it easier to implement the `--left-only` and
`--right-only` options.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
Johannes Schindelin 5189bb8724 range-diff: simplify code spawning `git log`
Previously, we waited for the child process to be finished in every
failing code path as well as at the end of the function
`show_range_diff()`.

However, we do not need to wait that long. Directly after reading the
output of the child process, we can wrap up the child process.

This also has the advantage that we don't do a bunch of unnecessary work
in case `finish_command()` returns with an error anyway.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-04 17:16:42 -08:00
Johannes Schindelin a2d474adf3 range-diff: libify the read_patches() function again
In library functions, we do want to avoid the (simple, but rather final)
`die()` calls, instead returning with a value indicating an error.

Let's do exactly that in the code introduced in b66885a30c
(range-diff: add section header instead of diff header, 2019-07-11) that
wants to error out if a diff header could not be parsed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-04 17:16:42 -08:00
Johannes Schindelin 8c29b49794 range-diff: avoid leaking memory in two error code paths
In the code paths in question, we already release a lot of memory, but
the `current_filename` variable was missed. Fix that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-04 17:16:42 -08:00
Johannes Schindelin 679b5916cd range-diff/format-patch: refactor check for commit range
Currently, when called with exactly two arguments, `git range-diff`
tests for a literal `..` in each of the two. Likewise, the argument
provided via `--range-diff` to `git format-patch` is checked in the same
manner.

However, `<commit>^!` is a perfectly valid commit range, equivalent to
`<commit>^..<commit>` according to the `SPECIFYING RANGES` section of
gitrevisions[7].

In preparation for allowing more sophisticated ways to specify commit
ranges, let's refactor the check into its own function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-27 22:01:49 -08:00
Elijah Newren b19315d8ab Use new HASHMAP_INIT macro to simplify hashmap initialization
Now that hashamp has lazy initialization and a HASHMAP_INIT macro,
hashmaps allocated on the stack can be initialized without a call to
hashmap_init() and in some cases makes the code a bit shorter.  Convert
some callsites over to take advantage of this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:55:27 -08:00
Elijah Newren 6da1a25814 hashmap: provide deallocation function names
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed
for a while, but aren't necessarily the clearest names, especially with
hashmap_partial_clear() being added to the mix and lazy-initialization
now being supported.  Peff suggested we adopt the following names[1]:

  - hashmap_clear() - remove all entries and de-allocate any
    hashmap-specific data, but be ready for reuse

  - hashmap_clear_and_free() - ditto, but free the entries themselves

  - hashmap_partial_clear() - remove all entries but don't deallocate
    table

  - hashmap_partial_clear_and_free() - ditto, but free the entries

This patch provides the new names and converts all existing callers over
to the new naming scheme.

[1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02 12:15:50 -08:00
Jeff King d70a9eb611 strvec: rename struct fields
The "argc" and "argv" names made sense when the struct was argv_array,
but now they're just confusing. Let's rename them to "nr" (which we use
for counts elsewhere) and "v" (which is rather terse, but reads well
when combined with typical variable names like "args.v").

Note that we have to update all of the callers immediately. Playing
tricks with the preprocessor is hard here, because we wouldn't want to
rewrite unrelated tokens.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30 19:18:06 -07:00
Jeff King f6d8942b1f strvec: fix indentation in renamed calls
Code which split an argv_array call across multiple lines, like:

  argv_array_pushl(&args, "one argument",
                   "another argument", "and more",
		   NULL);

was recently mechanically renamed to use strvec, which results in
mis-matched indentation like:

  strvec_pushl(&args, "one argument",
                   "another argument", "and more",
		   NULL);

Let's fix these up to align the arguments with the opening paren. I did
this manually by sifting through the results of:

  git jump grep 'strvec_.*,$'

and liberally applying my editor's auto-format. Most of the changes are
of the form shown above, though I also normalized a few that had
originally used a single-tab indentation (rather than our usual style of
aligning with the open paren). I also rewrapped a couple of obvious
cases (e.g., where previously too-long lines became short enough to fit
on one), but I wasn't aggressive about it. In cases broken to three or
more lines, the grouping of arguments is sometimes meaningful, and it
wasn't worth my time or reviewer time to ponder each case individually.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28 15:02:18 -07:00