1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-05-19 14:06:37 +02:00
Commit Graph

382 Commits

Author SHA1 Message Date
Junio C Hamano 9f8e894685 Merge branch 'ps/undecided-is-not-necessarily-sha1' into next
Before discovering the repository details, We used to assume SHA-1
as the "default" hash function, which has been corrected. Hopefully
this will smoke out codepaths that rely on such an unwarranted
assumptions.

* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes
2024-05-08 10:20:30 -07:00
Patrick Steinhardt 2e5c4758b7 cocci: apply rules to rewrite callers of "refs" interfaces
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
Patrick Steinhardt ce992ce29a builtin/blame: don't access potentially unitialized `the_hash_algo`
We access `the_hash_algo` in git-blame(1) before we have executed
`parse_options_start()`, which may not be properly set up in case we
have no repository. This is fine for most of the part because all the
call paths that lead to it (git-blame(1), git-annotate(1) as well as
git-pick-axe(1)) specify `RUN_SETUP` and thus require a repository.

There is one exception though, namely when passing `-h` to print the
help. Here we will access `the_hash_algo` even if there is no repo.
This works fine right now because `the_hash_algo` gets sets up to point
to the SHA1 algorithm via `initialize_repository()`. But we're about to
stop doing this, and thus the code would lead to a `NULL` pointer
exception.

Prepare the code for this and only access `the_hash_algo` after we are
sure that there is a proper repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
René Scharfe 9720d23e8c date: make DATE_MODE thread-safe
date_mode_from_type() modifies a static variable and returns a pointer
to it.  This is not thread-safe.  Most callers of date_mode_from_type()
use it via the macro DATE_MODE and pass its result on to functions like
show_date(), which take a const pointer and don't modify the struct.

Avoid the static storage by putting the variable on the stack and
returning the whole struct date_mode.  Change functions that take a
constant pointer to expect the whole struct instead.

Reduce the cost of passing struct date_mode around on 64-bit systems
by reordering its members to close the hole between the 32-bit wide
.type and the 64-bit aligned .strftime_fmt as well as the alignment
hole at the end.  sizeof reports 24 before and 16 with this change
on x64.  Keep .type at the top to still allow initialization without
designator -- though that's only done in a single location, in
builtin/blame.c.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 15:21:14 -07:00
Junio C Hamano 492ee03f60 Merge branch 'en/header-cleanup'
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-01-08 14:05:15 -08:00
Elijah Newren eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
Jeff King ba176db511 config: handle NULL value when parsing non-bools
When the config parser sees an "implicit" bool like:

  [core]
  someVariable

it passes NULL to the config callback. Any callback code which expects a
string must check for NULL. This usually happens via helpers like
git_config_string(), etc, but some custom code forgets to do so and will
segfault.

These are all fairly vanilla cases where the solution is just the usual
pattern of:

  if (!value)
        return config_error_nonbool(var);

though note that in a few cases we have to split initializers like:

  int some_var = initializer();

into:

  int some_var;
  if (!value)
        return config_error_nonbool(var);
  some_var = initializer();

There are still some broken instances after this patch, which I'll
address on their own in individual patches after this one.

Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:39 +09:00
Junio C Hamano ce481ac8b3 Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.

* cw/compat-util-header-cleanup:
  git-compat-util: move alloc macros to git-compat-util.h
  treewide: remove unnecessary includes for wrapper.h
  kwset: move translation table from ctype
  sane-ctype.h: create header for sane-ctype macros
  git-compat-util: move wrapper.c funcs to its header
  git-compat-util: move strbuf.c funcs to its header
2023-07-17 11:30:42 -07:00
Junio C Hamano b3d1c85d48 Merge branch 'gc/config-context'
Reduce reliance on a global state in the config reading API.

* gc/config-context:
  config: pass source to config_parser_event_fn_t
  config: add kvi.path, use it to evaluate includes
  config.c: remove config_reader from configsets
  config: pass kvi to die_bad_number()
  trace2: plumb config kvi
  config.c: pass ctx with CLI config
  config: pass ctx with config files
  config.c: pass ctx in configsets
  config: add ctx arg to config_fn_t
  urlmatch.h: use config_fn_t type
  config: inline git_color_default_config
2023-07-06 11:54:48 -07:00
Calvin Wan 91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Glen Choo a4e7e317f8 config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Elijah Newren a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren ca4eed708d pager.h: move declarations for pager.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Junio C Hamano e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Ævar Arnfjörð Bjarmason ecb5091fd4 cocci: apply the "commit.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"commit.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason d850b7a545 cocci: apply the "cache.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"cache.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
Elijah Newren d48be35ca6 write-or-die.h: move declarations for write-or-die.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren e38da487cc setup.h: move declarations for setup.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren 32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Elijah Newren 41771fa435 cache.h: remove dependence on hex.h; make other files include it explicitly
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
Elijah Newren 36bf195890 alloc.h: move ALLOC_GROW() functions from cache.h
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places.  It does mean that we also need to add
includes of alloc.h in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
Ævar Arnfjörð Bjarmason df8738116f doc txt & -h consistency: make "annotate" consistent
The cmd_blame() already detected whether it was processing "blame" or
"annotate", but it didn't adjust its usage output accordingly. Let's
do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 09:32:57 -07:00
SZEDER Gábor fa83cc834d parse-options: add support for parsing subcommands
Several Git commands have subcommands to implement mutually exclusive
"operation modes", and they usually parse their subcommand argument
with a bunch of if-else if statements.

Teach parse-options to handle subcommands as well, which will result
in shorter and simpler code with consistent error handling and error
messages on unknown or missing subcommand, and it will also make
possible for our Bash completion script to handle subcommands
programmatically.

The approach is guided by the following observations:

  - Most subcommands [1] are implemented in dedicated functions, and
    most of those functions [2] either have a signature matching the
    'int cmd_foo(int argc, const char **argc, const char *prefix)'
    signature of builtin commands or can be trivially converted to
    that signature, because they miss only that last prefix parameter
    or have no parameters at all.

  - Subcommand arguments only have long form, and they have no double
    dash prefix, no negated form, and no description, and they don't
    take any arguments, and can't be abbreviated.

  - There must be exactly one subcommand among the arguments, or zero
    if the command has a default operation mode.

  - All arguments following the subcommand are considered to be
    arguments of the subcommand, and, conversely, arguments meant for
    the subcommand may not preceed the subcommand.

So in the end subcommand declaration and parsing would look something
like this:

    parse_opt_subcommand_fn *fn = NULL;
    struct option builtin_commit_graph_options[] = {
        OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"),
                   N_("the object directory to store the graph")),
        OPT_SUBCOMMAND("verify", &fn, graph_verify),
        OPT_SUBCOMMAND("write", &fn, graph_write),
        OPT_END(),
    };
    argc = parse_options(argc, argv, prefix, options,
                         builtin_commit_graph_usage, 0);
    return fn(argc, argv, prefix);

Here each OPT_SUBCOMMAND specifies the name of the subcommand and the
function implementing it, and the address of the same 'fn' subcommand
function pointer.  parse_options() then processes the arguments until
it finds the first argument matching one of the subcommands, sets 'fn'
to the function associated with that subcommand, and returns, leaving
the rest of the arguments unprocessed.  If none of the listed
subcommands is found among the arguments, parse_options() will show
usage and abort.

If a command has a default operation mode, 'fn' should be initialized
to the function implementing that mode, and parse_options() should be
invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag.  In this case
parse_options() won't error out when not finding any subcommands, but
will return leaving 'fn' unchanged.  Note that if that default
operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT
flag is necessary as well (otherwise parse_options() would error out
upon seeing the unknown option meant to the default operation mode).

Some thoughts about the implementation:

  - The same pointer to 'fn' must be specified as 'value' for each
    OPT_SUBCOMMAND, because there can be only one set of mutually
    exclusive subcommands; parse_options() will BUG() otherwise.

    There are other ways to tell parse_options() where to put the
    function associated with the subcommand given on the command line,
    but I didn't like them:

      - Change parse_options()'s signature by adding a pointer to
        subcommand function to be set to the function associated with
        the given subcommand, affecting all callsites, even those that
        don't have subcommands.

      - Introduce a specific parse_options_and_subcommand() variant
        with that extra funcion parameter.

  - I decided against automatically calling the subcommand function
    from within parse_options(), because:

      - There are commands that have to perform additional actions
        after option parsing but before calling the function
        implementing the specified subcommand.

      - The return code of the subcommand is usually the return code
        of the git command, but preserving the return code of the
        automatically called subcommand function would have made the
        API awkward.

  - Also add a OPT_SUBCOMMAND_F() variant to allow specifying an
    option flag: we have two subcommands that are purposefully
    excluded from completion ('git remote rm' and 'git stash save'),
    so they'll have to be specified with the PARSE_OPT_NOCOMPLETE
    flag.

  - Some of the 'parse_opt_flags' don't make sense with subcommands,
    and using them is probably just an oversight or misunderstanding.
    Therefore parse_options() will BUG() when invoked with any of the
    following flags while the options array contains at least one
    OPT_SUBCOMMAND:

      - PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing
        arguments when encountering a "--" argument, so it doesn't
        make sense to expect and keep one before a subcommand, because
        it would prevent the parsing of the subcommand.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash
        might be meaningful for the command's default operation mode,
        e.g. to disambiguate refs and pathspecs.

      - PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag
        tells parse_options() to stop as soon as it encouners a
        non-option argument, but subcommands are by definition not
        options...  so how could they be parsed, then?!

      - PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any
        unknown --options and then pass them to a different command or
        subsystem.  Surely if a command has subcommands, then this
        functionality should rather be delegated to one of those
        subcommands, and not performed by the command itself.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass
        --options to the default operation mode.

  - If the command with subcommands has a default operation mode, then
    all arguments to the command must preceed the arguments of the
    subcommand.

    AFAICT we don't have any commands where this makes a difference,
    because in those commands either only the command accepts any
    arguments ('notes' and 'remote'), or only the default subcommand
    ('reflog' and 'stash'), but never both.

  - The 'argv' array passed to subcommand functions currently starts
    with the name of the subcommand.  Keep this behavior.  AFAICT no
    subcommand functions depend on the actual content of 'argv[0]',
    but the parse_options() call handling their options expects that
    the options start at argv[1].

  - To support handling subcommands programmatically in our Bash
    completion script, 'git cmd --git-completion-helper' will now list
    both subcommands and regular --options, if any.  This means that
    the completion script will have to separate subcommands (i.e.
    words without a double dash prefix) from --options on its own, but
    that's rather easy to do, and it's not much work either, because
    the number of subcommands a command might have is rather low, and
    those commands accept only a single --option or none at all.  An
    alternative would be to introduce a separate option that lists
    only subcommands, but then the completion script would need not
    one but two git invocations and command substitutions for commands
    with subcommands.

    Note that this change doesn't affect the behavior of our Bash
    completion script, because when completing the --option of a
    command with subcommands, e.g. for 'git notes --<TAB>', then all
    subcommands will be filtered out anyway, as none of them will
    match the word to be completed starting with that double dash
    prefix.

[1] Except 'git rerere', because many of its subcommands are
    implemented in the bodies of the if-else if statements parsing the
    command's subcommand argument.

[2] Except 'credential', 'credential-store' and 'fsmonitor--daemon',
    because some of the functions implementing their subcommands take
    special parameters.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
Junio C Hamano 2da81d1efb Merge branch 'ab/plug-leak-in-revisions'
Plug the memory leaks from the trickiest API of all, the revision
walker.

* ab/plug-leak-in-revisions: (27 commits)
  revisions API: add a TODO for diff_free(&revs->diffopt)
  revisions API: have release_revisions() release "topo_walk_info"
  revisions API: have release_revisions() release "date_mode"
  revisions API: call diff_free(&revs->pruning) in revisions_release()
  revisions API: release "reflog_info" in release revisions()
  revisions API: clear "boundary_commits" in release_revisions()
  revisions API: have release_revisions() release "prune_data"
  revisions API: have release_revisions() release "grep_filter"
  revisions API: have release_revisions() release "filter"
  revisions API: have release_revisions() release "cmdline"
  revisions API: have release_revisions() release "mailmap"
  revisions API: have release_revisions() release "commits"
  revisions API users: use release_revisions() for "prune_data" users
  revisions API users: use release_revisions() with UNLEAK()
  revisions API users: use release_revisions() in builtin/log.c
  revisions API users: use release_revisions() in http-push.c
  revisions API users: add "goto cleanup" for release_revisions()
  stash: always have the owner of "stash_info" free it
  revisions API users: use release_revisions() needing REV_INFO_INIT
  revision.[ch]: document and move code declared around "init"
  ...
2022-06-07 14:10:56 -07:00
Junio C Hamano bcccafbef0 Merge branch 'ea/progress-partial-blame'
The progress meter of "git blame" was showing incorrect numbers
when processing only parts of the file.

* ea/progress-partial-blame:
  blame: report correct number of lines in progress when using ranges
2022-05-10 17:41:11 -07:00
Ævar Arnfjörð Bjarmason 2108fe4a19 revisions API users: add straightforward release_revisions()
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 23:56:08 -07:00
Ævar Arnfjörð Bjarmason 4b59b2db97 blame: use "goto cleanup" for cleanup_scoreboard()
Amend a freeing pattern added in 0906ac2b54 (blame: use changed-path
Bloom filters, 2020-04-16) to use a "goto cleanup", so that we can be
sure that we call cleanup_scoreboard().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 23:56:08 -07:00
Edmundo Carmona Antoranz e5f5d7d42e blame: report correct number of lines in progress when using ranges
When using ranges, use the range sizes as the limit for progress
instead of the size of the full file.

Before:
$ git blame --progress builtin/blame.c > /dev/null
Blaming lines: 100% (1210/1210), done.
$ git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
Blaming lines:  10% (122/1210), done.
$

After:
$ ./git blame --progress builtin/blame.c > /dev/null
Blaming lines: 100% (1210/1210), done.
$ ./git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
Blaming lines: 100% (122/122), done.
$

Signed-off-by: Edmundo Carmona Antoranz <eantoranz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 13:29:59 -07:00
Junio C Hamano d21d5ddfe6 Merge branch 'ja/i18n-common-messages'
Unify more messages to help l10n.

* ja/i18n-common-messages:
  i18n: fix some misformated placeholders in command synopsis
  i18n: remove from i18n strings that do not hold translatable parts
  i18n: factorize "invalid value" messages
  i18n: factorize more 'incompatible options' messages
2022-02-25 15:47:35 -08:00
Alex Henrie 087c745833 log: add a --no-graph option
It's useful to be able to countermand a previous --graph option, for
example if `git log --graph` is run via an alias.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-11 10:06:41 -08:00
Jean-Noël Avila 1a8aea857e i18n: factorize "invalid value" messages
Use the same message when an invalid value is passed to a command line
option or a configuration variable.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-04 13:58:28 -08:00
Junio C Hamano 8d2c37320b Merge branch 'ld/sparse-diff-blame'
Teach diff and blame to work well with sparse index.

* ld/sparse-diff-blame:
  blame: enable and test the sparse index
  diff: enable and test the sparse index
  diff: replace --staged with --cached in t1092 tests
  repo-settings: prepare_repo_settings only in git repos
  test-read-cache: set up repo after git directory
  commit-graph: return if there is no git directory
  git: ensure correct git directory setup with -h
2021-12-21 15:03:17 -08:00
Lessley Dennington add4c864b6 blame: enable and test the sparse index
Enable the sparse index for the 'git blame' command. The index was already
not expanded with this command, so the most interesting thing to do is to
add tests that verify that 'git blame' behaves correctly when the sparse
index is enabled and that its performance improves. More specifically, these
cases are:

1. The index is not expanded for 'blame' when given paths in the sparse
checkout cone at multiple levels.

2. Performance measurably improves for 'blame' with sparse index when given
paths in the sparse checkout cone at multiple levels.

The `p2000` tests demonstrate a ~60% execution time reduction when running
'blame' for a file two levels deep and and a ~30% execution time reduction
for a file three levels deep.

Test                                         before  after
----------------------------------------------------------------
2000.62: git blame f2/f4/a (full-v3)         0.31    0.32 +3.2%
2000.63: git blame f2/f4/a (full-v4)         0.29    0.31 +6.9%
2000.64: git blame f2/f4/a (sparse-v3)       0.55    0.23 -58.2%
2000.65: git blame f2/f4/a (sparse-v4)       0.57    0.23 -59.6%
2000.66: git blame f2/f4/f3/a (full-v3)      0.77    0.85 +10.4%
2000.67: git blame f2/f4/f3/a (full-v4)      0.78    0.81 +3.8%
2000.68: git blame f2/f4/f3/a (sparse-v3)    1.07    0.72 -32.7%
2000.99: git blame f2/f4/f3/a (sparse-v4)    1.05    0.73 -30.5%

We do not include paths outside the sparse checkout cone because blame
does not support blaming files that are not present in the working
directory. This is true in both sparse and full checkouts.

Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-06 09:55:06 -08:00
Junio C Hamano 65ca3245f9 Merge branch 'ab/parse-options-cleanup'
Random changes to parse-options implementation.

* ab/parse-options-cleanup:
  parse-options: change OPT_{SHORT,UNSET} to an enum
  parse-options tests: test optname() output
  parse-options.[ch]: make opt{bug,name}() "static"
  commit-graph: stop using optname()
  parse-options.c: move optname() earlier in the file
  parse-options.h: make the "flags" in "struct option" an enum
  parse-options.c: use exhaustive "case" arms for "enum parse_opt_result"
  parse-options.[ch]: consistently use "enum parse_opt_result"
  parse-options.[ch]: consistently use "enum parse_opt_flags"
  parse-options.h: move PARSE_OPT_SHELL_EVAL between enums
2021-10-25 16:06:59 -07:00
Ævar Arnfjörð Bjarmason 352e761388 parse-options.[ch]: consistently use "enum parse_opt_result"
Use the "enum parse_opt_result" instead of an "int flags" as the
return value of the applicable functions in parse-options.c.

This will help catch future bugs, such as the missing "case" arms in
the two existing users of the API in "blame.c" and "shortlog.c". A
third caller in 309be813c9 (update-index: migrate to parse-options
API, 2010-12-01) was already checking for these.

As can be seen when trying to sort through the deluge of warnings
produced when compiling this with CC=g++ (mostly unrelated to this
change) we're not consistently using "enum parse_opt_result" even now,
i.e. we'll return error() and "return 0;". See f41179f16b
(parse-options: avoid magic return codes, 2019-01-27) for a commit
which started changing some of that.

I'm not doing any more of that exhaustive migration here, and it's
probably not worthwhile past the point of being able to check "enum
parse_opt_result" in switch().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 14:13:11 -07:00
Ævar Arnfjörð Bjarmason 4eb2bfdc92 builtin/blame.c: refactor commit_info_init() to COMMIT_INFO_INIT macro
Remove the commit_info_init() function addded in ea02ffa385 (mailmap:
simplify map_user() interface, 2013-01-05) and instead initialize the
"struct commit_info" with a macro.

This is the more idiomatic pattern in the codebase, and doesn't leave
us wondering when we see the *_init() function if this struct needs
more complex initialization than a macro can provide.

The get_commit_info() function is only called by the three callers
being changed here immediately after initializing the struct with the
macros, so by moving the initialization to the callers we don't need
to do it in get_commit_info() anymore.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 15:02:32 -07:00
Junio C Hamano 18decfd11d Merge branch 'rs/blame-optim'
Optimization in "git blame"

* rs/blame-optim:
  blame: remove unnecessary use of get_commit_info()
2021-02-25 16:43:29 -08:00
Rafael Silva 8e16effe97 blame: remove unnecessary use of get_commit_info()
When `git blame --color-by-age`, the determine_line_heat() is called to
select how to color the output based on the commit's author date.  It
uses the get_commit_info() to parse the information into a `commit_info`
structure, however, this is actually unnecessary because the
determine_line_heat() caller also does the same.

Instead, let's change the determine_line_heat() to take a `commit_info`
structure and remove the internal call to get_commit_info() thus
cleaning up and optimizing the code path.

Enabling Git's trace2 API in order to record the execution time for
every call to determine_line_heat() function:

   + trace2_region_enter("blame", "determine_line_heat", the_repository);
     determine_line_heat(ent, &default_color);
   + trace2_region_enter("blame", "determine_line_heat", the_repository);

Then, running `git blame` for "kernel/fork.c" in linux.git and summing
all the execution time for every call (around 1.3k calls) resulted in
2.6x faster execution (best out 3):

   git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms
   git built from 328c109303 + this change                  = 16ms

Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 11:04:17 -08:00
Junio C Hamano 42342b3ee6 Merge branch 'ab/mailmap'
Clean-up docs, codepaths and tests around mailmap.

* ab/mailmap: (22 commits)
  shortlog: remove unused(?) "repo-abbrev" feature
  mailmap doc + tests: document and test for case-insensitivity
  mailmap tests: add tests for empty "<>" syntax
  mailmap tests: add tests for whitespace syntax
  mailmap tests: add a test for comment syntax
  mailmap doc + tests: add better examples & test them
  tests: refactor a few tests to use "test_commit --append"
  test-lib functions: add an --append option to test_commit
  test-lib functions: add --author support to test_commit
  test-lib functions: document arguments to test_commit
  test-lib functions: expand "test_commit" comment template
  mailmap: test for silent exiting on missing file/blob
  mailmap tests: get rid of overly complex blame fuzzing
  mailmap tests: add a test for "not a blob" error
  mailmap tests: remove redundant entry in test
  mailmap tests: improve --stdin tests
  mailmap tests: modernize syntax & test idioms
  mailmap tests: use our preferred whitespace syntax
  mailmap doc: start by mentioning the comment syntax
  check-mailmap doc: note config options
  ...
2021-01-25 14:19:19 -08:00
Ævar Arnfjörð Bjarmason 4e168333a8 shortlog: remove unused(?) "repo-abbrev" feature
Remove support for the magical "repo-abbrev" comment in .mailmap
files. This was added to .mailmap parsing in [1], as a generalized
feature of the git-shortlog Perl script added earlier in [2].

There was no documentation or tests for this feature, and I don't
think it's used in practice anymore.

What it did was to allow you to specify a single string to be
search-replaced with "/.../" in the .mailmap file. E.g. for
linux.git's current .mailmap:

    git archive --remote=git@gitlab.com:linux-kernel/linux.git \
        HEAD -- .mailmap | grep -a repo-abbrev
    # repo-abbrev: /pub/scm/linux/kernel/git/

Then when running e.g.:

    git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 | grep Merge

We'd emit (the [...] is mine):

      Merge tag [...]git://git.kernel.org/.../tip/tip

But will now emit:

      Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip

I think at this point this is just a historical artifact we can get
rid of. It was initially meant for Linus's own use when we integrated
the Perl script[2], but since then it seems he's stopped using it.

Digging through Linus's release announcements on the LKML[3] the last
release I can find that made use of this output is Linux 2.6.25-rc6
back in March 2008[4]. Later on Linus started using --no-merges[5],
and nowadays seems to prefer some custom not-quite-shortlog format of
merges from lieutenants[6].

You will still see it on linux.git if you run "git shortlog" manually
yourself with --merges, with this removed you can still get the same
output with:

    git log --pretty=fuller v5.10-rc7..v5.10 |
    sed 's!/pub/scm/linux/kernel/git/!/.../!g' |
    git shortlog

Arguably we should do the same for the search-replacing of "[PATCH]"
at the beginning with "". That seems to be another relic of a bygone
era when linux.git patches would have their E-Mail subject lines
applied as-is by "git am" or whatever. But we documented that feature
in "git-shortlog(1)", and it seems more widely applicable than
something purely kernel-specific.

1. 7595e2ee6e (git-shortlog: make common repository prefix
   configurable with .mailmap, 2006-11-25)
2. fa375c7f1b (Add git-shortlog perl script, 2005-06-04)
3. https://lore.kernel.org/lkml/
4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/
5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/
6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
ZheNing Hu e73fe3dd02 builtin/*: update usage format
According to the guidelines in parse-options.h,
we should not end in a full stop or start with
a capital letter. Fix old error and usage
messages to match this expectation.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:10:49 -08:00
Philippe Blain 3af31e8786 blame: simplify 'setup_blame_bloom_data' interface
The penultimate commit moved the initialization of 'sb.path' in
'builtin/blame.c::cmd_blame' before the call to
'blame.c::setup_blame_bloom_data'. Since 'cmd_blame' is the only caller
of 'setup_blame_bloom_data', it is now unnecessary for
'setup_blame_bloom_data' to receive 'path' as a separate argument, as
'sb.path' is already initialized.

Remove this argument from setup_blame_bloom_data's interface and use the
'path' field of the 'sb' 'struct blame_scoreboard' instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain 88894aaeea blame: simplify 'setup_scoreboard' interface
The previous commit moved the initialization of 'sb.path' in
'builtin/blame.c::cmd_blame' before the call to
'blame.c::setup_scoreboard'. Since 'cmd_blame' is the only caller of
'setup_scoreboard', it is now unnecessary for 'setup_scoreboard' to
receive 'path' as a separate argument, as 'sb.path' is already
initialized.

Remove this argument from setup_scoreboard's interface and use the
'path' field of the 'sb' 'struct blame_scoreboard' instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain 9466e3809d blame: enable funcname blaming with userdiff driver
In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct
blame_scoreboard' as the 'path' argument to
'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set
to the local variable 'path' a few lines later at line 1137.

This 'path' argument is only used in 'parse_range_arg' if we are blaming
a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it
is sent to 'parse_range_funcname', where it is used to determine if a
userdiff driver should be used for said <path> to match the given
funcname.

Since 'path' is yet unset, the userdiff driver is never used, so we fall
back to the default funcname regex, which is usually not appropriate for
paths that are set to use a specific userdiff driver, and thus either we
match some unrelated lines, or we die with

    fatal: -L parameter '<funcname>' starting at line 1: no match

This has been the case ever since `git blame` learned to blame a
funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by
funcname, 2013-03-28).

Enable funcname blaming for paths using specific userdiff drivers by
initializing 'sb.path' earlier in 'cmd_blame', when some of its other
fields are initialized, so that it is set when passed to
'parse_range_arg'.

Add a regression test in 'annotate-tests.sh', which is sourced in
t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used
to test the userdiff patterns in t4018-diff-funcname.

Also, use 'sb.path' instead of 'path' when constructing the error
message at line 1114, for consistency.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain 180d641d7d line-log: mention both modes in 'blame' and 'log' short help
'git blame -h' and 'git log -h' both show '-L <n,m>' and describe this
option as "Process only line range n,m, counting from 1". No hint is
given that a function name regex can also be used.

Use <range> instead, and expand the description of the option to mention
both modes. Remove "counting from 1" as it's uneeded; it's uncommon to
refer to the first line of a file as "line 0".

Also, for 'git log', improve the wording to better reflect the long help.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:14 -08:00
René Scharfe db7d07f610 blame: handle deref_tag() returning NULL
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-12 12:25:14 -07:00
Junio C Hamano 230ff3e997 Merge branch 'jc/blame-ignore-fix'
"git blame --ignore-rev/--ignore-revs-file" failed to validate
their input are valid revision, and failed to take into account
that the user may want to give an annotated tag instead of a
commit, which has been corrected.

* jc/blame-ignore-fix:
  blame: validate and peel the object names on the ignore list
  t8013: minimum preparatory clean-up
2020-10-04 12:49:07 -07:00
Junio C Hamano 610e2b9240 blame: validate and peel the object names on the ignore list
The command reads list of object names to place on the ignore list
either from the command line or from a file, but they are not
checked with their object type (those read from the file are not
even checked for object existence).

Extend the oidset_parse_file() API and allow it to take a callback
that can be used to die (e.g. when an inappropriate input is read)
or modify the object name read (e.g. when a tag pointing at a commit
is read, and the caller wants a commit object name), and use it in
the code that handles ignore list.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 22:20:58 -07:00