mirror of https://github.com/git/git.git synced 2024-10-20 08:48:12 +02:00
Commit Graph

66175 Commits

Author SHA1 Message Date
Neeraj Singh
c0f4752ed2 core.fsyncmethod: batched disk flushes for loose-objects
When adding many objects to a repo with `core.fsync=loose-object`,
the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. This commit introduces
a new `core.fsyncMethod=batch` option that batches up hardware flushes.
It hooks into the bulk-checkin odb-transaction functionality, takes
advantage of tmp-objdir, and uses the writeout-only support code.

When the new mode is enabled, we do the following for each new object:
1a. Create the object in a tmp-objdir.
2a. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction when unplugging bulk checkin:
1b. Issue an fsync against a dummy file to flush the log and hardware
   writeback cache, which should by now have seen the tmp-objdir writes.
2b. Rename all of the tmp-objdir files to their final names.
3b. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the default
   today, but the user now has the option of syncing the index and there
   is a separate patch series to implement syncing of refs.

On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS,
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns. This sequence also ensures that no object files
appear in the main object store unless they are fsync-durable.

Batch mode is only enabled if core.fsync includes loose-object. If
the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does
not include loose-object, we will use file-by-file fsyncing.

In step (1a) of the sequence, the tmp-objdir is created lazily to avoid
work if no loose objects are ever added to the ODB. We use a tmp-objdir
to maintain the invariant that no loose-objects are visible in the main
ODB unless they are properly fsync-durable. This is important since
future ODB operations that try to create an object with specific
contents will silently drop the new data if an object with the target
hash exists without checking that the loose-object contents match the
hash. Only a full git-fsck would restore the ODB to a functional state
where dataloss doesn't occur.

In step (1b) of the sequence, we issue an fsync against a dummy file
created specifically for the purpose. This method has a slightly higher
cost than using one of the input object files, but makes adding new
callers of this mechanism easier, since we don't need to figure out
which object file is "last" or risk sharing violations by caching the fd
of the last object file.
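
The sequence above can be illustrated with a toy model that only counts
simulated hardware flushes. This is a sketch, not Git's code: the struct
and every model_* function are hypothetical names, and no real I/O is
performed.

```c
/* Toy model of the two syncing strategies; all names are hypothetical. */
struct odb_model {
    int staged;            /* objects written to the temporary object dir */
    int visible;           /* objects renamed into the main object store */
    int hardware_flushes;  /* simulated disk cache flushes */
};

/* Steps 1a/2a: stage the object and request writeback, no hardware flush. */
static void model_add_batched(struct odb_model *m)
{
    m->staged++;
}

/* Steps 1b/2b: one flush covers every staged object, then publish them. */
static void model_end_transaction(struct odb_model *m)
{
    if (!m->staged)
        return;            /* nothing was added, so no flush is needed */
    m->hardware_flushes++;
    m->visible += m->staged;
    m->staged = 0;
}

/* File-by-file mode: every object pays for its own hardware flush. */
static void model_add_fsync(struct odb_model *m)
{
    m->hardware_flushes++;
    m->visible++;
}
```

In this model, adding 500 objects costs 500 flushes in file-by-file mode
and a single flush in batch mode, and no batched object becomes visible
before that flush.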

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.

Adding 500 files to the repo with 'git add'. Times reported in seconds.

object file syncing | Linux | Mac   | Windows
--------------------|-------|-------|--------
           disabled | 0.06  |  0.35 | 0.61
              fsync | 1.88  | 11.18 | 2.47
              batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 13:13:01 -07:00
Neeraj Singh
2c23d1b477 bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
Make it clearer in the naming and documentation of the plug_bulk_checkin
and unplug_bulk_checkin APIs that they can be thought of as
a "transaction" to optimize operations on the object database. These
transactions may be nested so that subsystems like the cache-tree
writing code can optimize their operations without caring whether the
top-level code has a transaction active.

Add a flush_odb_transaction API that will be used in update-index to
make objects visible even if a transaction is active. The flush call may
also be useful in future cases if we hold a transaction active around
calling hooks.
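
A minimal sketch of how nested transactions with an explicit flush could
behave, assuming a simple nesting counter. The struct and the txn_*
functions are hypothetical illustrations, not the bulk-checkin API itself.

```c
/* Hypothetical nesting-counter model of an object database transaction. */
struct odb_txn {
    int nesting;   /* number of begin calls that are still open */
    int pending;   /* objects added but not yet visible */
    int visible;   /* objects other code can already read */
};

static void txn_flush(struct odb_txn *t)
{
    /* Make everything added so far visible; the transaction stays open. */
    t->visible += t->pending;
    t->pending = 0;
}

static void txn_begin(struct odb_txn *t)
{
    t->nesting++;
}

static void txn_add(struct odb_txn *t)
{
    if (t->nesting)
        t->pending++;   /* deferred until a flush or the outermost end */
    else
        t->visible++;   /* no transaction active: visible immediately */
}

static void txn_end(struct odb_txn *t)
{
    /* Only the outermost end publishes the pending objects. */
    if (t->nesting > 0 && --t->nesting == 0)
        txn_flush(t);
}
```

An inner subsystem can call txn_begin/txn_end without knowing whether an
outer transaction exists; ending the inner level alone publishes nothing.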

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 13:02:09 -07:00
Neeraj Singh
897c9e2575 bulk-checkin: rename 'state' variable and separate 'plugged' boolean
This commit prepares for adding batch-fsync to the bulk-checkin
infrastructure.

The bulk-checkin infrastructure is currently used to batch up addition
of large blobs to a packfile. When a blob is larger than
big_file_threshold, we unconditionally add it to a pack. If bulk
checkins are 'plugged', we allow multiple large blobs to be added to a
single pack until we reach the packfile size limit; otherwise, we simply
make a new packfile for each large blob. The 'unplug' call tells us when
the series of blob additions is done so that we can finish the packfiles
and make their objects available to subsequent operations.

Stated another way, bulk-checkin allows callers to define a transaction
that adds multiple objects to the object database, where the object
database can optimize its internal operations within the transaction
boundary.

Batched fsync will fit into bulk-checkin by taking advantage of the
plug/unplug functionality to determine the appropriate time to fsync
and make newly-added objects available in the primary object database.

* Rename 'state' variable to 'bulk_checkin_packfile', since we will
  later be adding 'bulk_fsync_objdir'. This also makes the variable
  easier to find in the debugger, since the name is more unique.

* Rename finish_bulk_checkin to flush_bulk_checkin_packfile and call it
  unconditionally from unplug_bulk_checkin. Internally it will
  conditionally do a flush if there's any work to do.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
  static variable. Doing this avoids resetting the variable in
  finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
  seem to unintentionally disable the plugging functionality the first
  time a new packfile must be created due to packfile size limits. While
  disabling the plugging state only results in suboptimal behavior for
  the current code, it would be fatal for the bulk-fsync functionality
  later in this patch series.

The net effect of these changes is to make a clear separation between
the portion of the bulk-checkin infrastructure that is related to the
packfile (nearly all of it at present) and the part that is related to
other future optimizations of the ODB.
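
The problem described in the third bullet can be reproduced in miniature.
The following sketch uses hypothetical struct and function names; it only
shows why zeroing a state struct also clears a flag stored inside it, and
how a separate static variable avoids that.

```c
#include <string.h>

/* Before: the flag lives inside the state that gets zeroed on finish. */
struct demo_state_old {
    int plugged;
    int objects_in_pack;
};

static void demo_finish_old(struct demo_state_old *s)
{
    memset(s, 0, sizeof(*s));   /* also resets 'plugged' by accident */
}

/* After: the flag is a separate static variable, untouched by the reset. */
struct demo_state_new {
    int objects_in_pack;
};

static int demo_plugged;

static void demo_finish_new(struct demo_state_new *s)
{
    memset(s, 0, sizeof(*s));   /* only packfile bookkeeping is cleared */
}
```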

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 13:02:09 -07:00
Junio C Hamano
fca85986bb Merge branch 'ns/core-fsyncmethod' into ns/batch-fsync
* ns/core-fsyncmethod:
  configure.ac: fix HAVE_SYNC_FILE_RANGE definition
  core.fsyncmethod: correctly camel-case warning message
  core.fsync: fix incorrect expression for default configuration
  core.fsync: documentation and user-friendly aggregate options
  core.fsync: new option to harden the index
  core.fsync: add configuration parsing
  core.fsync: introduce granular fsync control infrastructure
  core.fsyncmethod: add writeout-only mode
  wrapper: make inclusion of Windows csprng header tightly scoped
2022-04-06 13:01:54 -07:00
Adam Dinwoodie
2e37594797 configure.ac: fix HAVE_SYNC_FILE_RANGE definition
If sync_file_range is not available when building the configure script,
there is a cosmetic bug when running that script reporting
"HAVE_SYNC_FILE_RANGE: command not found".  Remove that error message by
defining HAVE_SYNC_FILE_RANGE to an empty string, rather than generating
a script where that appears as a bare command.

Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 09:36:19 -07:00
Neeraj Singh
f12f3b9807 core.fsyncmethod: correctly camel-case warning message
The warning for an unrecognized fsyncMethod was not
camel-cased.

Reported-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-30 14:46:08 -07:00
Neeraj Singh
e5ec440c98 core.fsync: fix incorrect expression for default configuration
Commit b9f5d035 (core.fsync: documentation and user-friendly
aggregate options, 2022-03-15) introduced an incorrect value for
FSYNC_COMPONENTS_DEFAULT. We need an AND-NOT rather than OR-NOT.
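
The difference between the two expressions is easy to see with a few
made-up flag bits. The DEMO_* values below are assumptions chosen for
illustration and are not the real FSYNC_COMPONENT_* constants.

```c
/* Hypothetical component bits, for illustration only. */
#define DEMO_LOOSE_OBJECT (1u << 0)
#define DEMO_PACK         (1u << 1)
#define DEMO_COMMIT_GRAPH (1u << 2)
#define DEMO_INDEX        (1u << 3)

#define DEMO_BASE (DEMO_LOOSE_OBJECT | DEMO_PACK | DEMO_COMMIT_GRAPH)

/* AND-NOT: start from the base set and clear exactly one bit. */
static unsigned demo_default_and_not(void)
{
    return DEMO_BASE & ~DEMO_LOOSE_OBJECT;
}

/* OR-NOT: sets every bit except the named one, then ORs the base back in,
 * so the named bit survives and unrelated bits are switched on too. */
static unsigned demo_default_or_not(void)
{
    return DEMO_BASE | ~DEMO_LOOSE_OBJECT;
}
```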

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-29 16:04:16 -07:00
Neeraj Singh
b9f5d0358d core.fsync: documentation and user-friendly aggregate options
This commit adds aggregate options for the core.fsync setting that are
more user-friendly. These options are specified in terms of 'levels of
safety', indicating which Git operations are considered to be sync
points for durability.

The new documentation is also included here in its entirety for ease of
review.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-15 12:32:55 -07:00
Junio C Hamano
b896f729e2 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-13 22:56:18 +00:00
Junio C Hamano
ccafbbfb4e Merge branch 'ab/plug-random-leaks'
Plug random memory leaks.

* ab/plug-random-leaks:
  repository.c: free the "path cache" in repo_clear()
  range-diff: plug memory leak in read_patches()
  range-diff: plug memory leak in common invocation
  lockfile API users: simplify and don't leak "path"
  commit-graph: stop fill_oids_from_packs() progress on error and free()
  commit-graph: fix memory leak in misused string_list API
  submodule--helper: fix trivial leak in module_add()
  transport: stop needlessly copying bundle header references
  bundle: call strvec_clear() on allocated strvec
  remote-curl.c: free memory in cmd_main()
  urlmatch.c: add and use a *_release() function
  diff.c: free "buf" in diff_words_flush()
  merge-base: free() allocated "struct commit **" list
  index-pack: fix memory leaks
2022-03-13 22:56:18 +00:00
Junio C Hamano
4eb845ac0a Merge branch 'nj/read-tree-doc-reffix'
Documentation mark-up fix.

* nj/read-tree-doc-reffix:
  Documentation: git-read-tree: separate links using commas
2022-03-13 22:56:18 +00:00
Junio C Hamano
386f806c7d Merge branch 'ps/fetch-atomic-fixup'
Test simplification.

* ps/fetch-atomic-fixup:
  t5503: simplify setup of test which exercises failure of backfill
2022-03-13 22:56:17 +00:00
Junio C Hamano
21b839e606 Merge branch 'fs/gpgsm-update'
Newer version of GPGSM changed its output in a backward
incompatible way to break our code that parses its output.  It also
added more processes our tests need to kill when cleaning up.
Adjustments have been made to accommodate these changes.

* fs/gpgsm-update:
  t/lib-gpg: kill all gpg components, not just gpg-agent
  t/lib-gpg: reload gpg components after updating trustlist
  gpg-interface/gpgsm: fix for v2.3
2022-03-13 22:56:17 +00:00
Junio C Hamano
bde1e3e80a Merge branch 'gc/parse-tree-indirect-errors'
Check the return value from parse_tree_indirect() to turn segfaults
into calls to die().

* gc/parse-tree-indirect-errors:
  checkout, clone: die if tree cannot be parsed
2022-03-13 22:56:17 +00:00
Junio C Hamano
8b44e05abf Merge branch 'en/merge-ort-align-verbosity-with-recursive'
Align the level of verbose output from the ort backend during inner
merge to that of the recursive backend.

* en/merge-ort-align-verbosity-with-recursive:
  merge-ort: exclude messages from inner merges by default
2022-03-13 22:56:17 +00:00
Junio C Hamano
f62106d750 Merge branch 'ab/make-optim-noop'
Makefile refactoring with a bit of suffixes rule stripping to
optimize the runtime overhead.

* ab/make-optim-noop:
  Makefiles: add and use wildcard "mkdir -p" template
  Makefile: add "$(QUIET)" boilerplate to shared.mak
  Makefile: move $(comma), $(empty) and $(space) to shared.mak
  Makefile: move ".SUFFIXES" rule to shared.mak
  Makefile: define $(LIB_H) in terms of $(FIND_SOURCE_FILES)
  Makefile: disable GNU make built-in wildcard rules
  Makefiles: add "shared.mak", move ".DELETE_ON_ERROR" to it
  scalar Makefile: use "The default target of..." pattern
2022-03-13 22:56:17 +00:00
Junio C Hamano
851d2f0ab1 Merge branch 'ps/fetch-atomic'
"git fetch" can make two separate fetches, but ref updates coming
from them were in two separate ref transactions under "--atomic",
which has been corrected.

* ps/fetch-atomic:
  fetch: make `--atomic` flag cover pruning of refs
  fetch: make `--atomic` flag cover backfilling of tags
  refs: add interface to iterate over queued transactional updates
  fetch: report errors when backfilling tags fails
  fetch: control lifecycle of FETCH_HEAD in a single place
  fetch: backfill tags before setting upstream
  fetch: increase test coverage of fetches
2022-03-13 22:56:16 +00:00
Neeraj Singh
ba95e96d4c core.fsync: new option to harden the index
This commit introduces the new ability for the user to harden
the index. In the event of a system crash, the index must be
durable for the user to actually find a file that has been added
to the repo and then deleted from the working tree.

We use the presence of the COMMIT_LOCK flag and absence of the
alternate_index_output as a proxy for determining whether we're
updating the persistent index of the repo or some temporary
index. We don't sync these temporary indexes.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Neeraj Singh
844a8ad4f8 core.fsync: add configuration parsing
This change introduces code to parse the core.fsync setting and
configure the fsync_components variable.

core.fsync is configured as a comma-separated list of component names to
sync. Each time a core.fsync variable is encountered in the
configuration hierarchy, we start off with a clean state with the
platform default value. Passing 'none' resets the value to indicate
nothing will be synced. We gather all negative and positive entries from
the comma separated list and then compute the new value by removing all
the negative entries and adding all of the positive entries.

We issue a warning for components that are not recognized so that the
configuration code is compatible with configs from future versions of
Git with more repo components.
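
The parsing rules described above can be sketched roughly as follows. The
component table, the bit values, and the function name are assumptions
made for this example; it is not the parser added by this commit.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical component bits and names. */
#define DEMO_COMP_LOOSE_OBJECT (1u << 0)
#define DEMO_COMP_PACK         (1u << 1)
#define DEMO_COMP_INDEX        (1u << 2)

static const struct {
    const char *name;
    unsigned bit;
} demo_components[] = {
    { "loose-object", DEMO_COMP_LOOSE_OBJECT },
    { "pack",         DEMO_COMP_PACK },
    { "index",        DEMO_COMP_INDEX },
};

/* Parse one comma-separated value, starting from the platform default. */
static unsigned demo_parse_fsync(const char *value, unsigned platform_default)
{
    unsigned current = platform_default, positive = 0, negative = 0;
    const char *p = value;

    while (*p) {
        size_t len = strcspn(p, ",");   /* length of this list entry */
        const char *name = p;
        size_t name_len = len, i;
        int negate = 0, found = 0;

        if (name_len && *name == '-') { /* leading '-' marks a removal */
            negate = 1;
            name++;
            name_len--;
        }
        if (!negate && name_len == 4 && !strncmp(name, "none", 4)) {
            current = 0;                /* 'none' resets the base value */
            found = 1;
        }
        for (i = 0; !found && i < sizeof(demo_components) / sizeof(demo_components[0]); i++) {
            if (strlen(demo_components[i].name) == name_len &&
                !strncmp(demo_components[i].name, name, name_len)) {
                if (negate)
                    negative |= demo_components[i].bit;
                else
                    positive |= demo_components[i].bit;
                found = 1;
            }
        }
        if (!found && name_len)         /* warn, but keep parsing */
            fprintf(stderr, "warning: unknown component '%.*s'\n",
                    (int)name_len, name);
        p += len;
        if (*p == ',')
            p++;
    }
    /* Remove all negative entries, then add all positive entries. */
    return (current & ~negative) | positive;
}
```

With a default of all three bits set, "-index" yields loose-object plus
pack, while "none,pack" yields pack alone.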

Complete documentation for the new setting is included in a later patch
in the series so that it can be reviewed once in final form.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Neeraj Singh
020406eaa5 core.fsync: introduce granular fsync control infrastructure
This commit introduces the infrastructure for the core.fsync
configuration knob. The repository components we want to sync
are identified by flags so that we can turn on or off syncing
for specific components.

If core.fsyncObjectFiles is set and the core.fsync configuration
also includes FSYNC_COMPONENT_LOOSE_OBJECT, we will fsync any
loose objects. This picks the strictest data integrity behavior
if core.fsync and core.fsyncObjectFiles are set to conflicting values.

This change introduces the currently unused fsync_component
helper, which will be used by a later patch that adds fsyncing to
the refs backend.

Actual configuration and documentation of the fsync components
list are in other patches in the series to separate review of
the underlying mechanism from the policy of how it's configured.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Neeraj Singh
abf38abec2 core.fsyncmethod: add writeout-only mode
This commit introduces the `core.fsyncMethod` configuration
knob, which can currently be set to `fsync` or `writeout-only`.

The new writeout-only mode attempts to tell the operating system to
flush its in-memory page cache to the storage hardware without issuing a
CACHE_FLUSH command to the storage controller.

Writeout-only fsync is significantly faster than a vanilla fsync on
common hardware, since data is written to a disk-side cache rather than
all the way to a durable medium. Later changes in this patch series will
take advantage of this primitive to implement batching of hardware
flushes.

When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the
caller is expected to do an ordinary fsync as needed.
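
The fallback expected of callers might look like the sketch below. The
function-pointer type and all demo_* names are hypothetical stand-ins so
the example runs without touching a real file descriptor; they are not
the git_fsync interface.

```c
/* A flush primitive: returns 0 on success, -1 on failure. */
typedef int (*demo_flush_fn)(int fd);

static int demo_full_calls;   /* how often the full fsync stub ran */

/* Stub: writeout-only succeeded. */
static int demo_writeout_ok(int fd)
{
    (void)fd;
    return 0;
}

/* Stub: the platform or filesystem cannot do writeout-only. */
static int demo_writeout_unsupported(int fd)
{
    (void)fd;
    return -1;
}

/* Stub standing in for an ordinary, fully durable fsync. */
static int demo_full_fsync(int fd)
{
    (void)fd;
    demo_full_calls++;
    return 0;
}

/* Try the cheap writeout first; fall back to a full fsync if it fails. */
static int demo_flush_prefer_writeout(int fd, demo_flush_fn writeout_only,
                                      demo_flush_fn full_fsync)
{
    if (writeout_only && writeout_only(fd) == 0)
        return 0;
    return full_fsync(fd);
}
```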

On Apple platforms, the fsync system call does not issue a CACHE_FLUSH
directive to the storage controller. This change updates fsync to do
fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity
with existing behavior on Apple platforms by setting the default value
of the new core.fsyncMethod option.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Neeraj Singh
19d3f228c8 wrapper: make inclusion of Windows csprng header tightly scoped
Including NTSecAPI.h in git-compat-util.h causes build errors in any
other file that includes winternl.h. NTSecAPI.h was included in order to
get access to the RtlGenRandom cryptographically secure PRNG. This
change scopes the inclusion of ntsecapi.h to wrapper.c, which is the only
place that it's actually needed.

The build breakage is due to the definition of UNICODE_STRING in
NtSecApi.h:
    #ifndef _NTDEF_
    typedef LSA_UNICODE_STRING UNICODE_STRING, *PUNICODE_STRING;
    typedef LSA_STRING STRING, *PSTRING ;
    #endif

LsaLookup.h:
    typedef struct _LSA_UNICODE_STRING {
        USHORT Length;
        USHORT MaximumLength;
    #ifdef MIDL_PASS
        [size_is(MaximumLength/2), length_is(Length/2)]
    #endif // MIDL_PASS
        PWSTR  Buffer;
    } LSA_UNICODE_STRING, *PLSA_UNICODE_STRING;

winternl.h also defines UNICODE_STRING:
    typedef struct _UNICODE_STRING {
        USHORT Length;
        USHORT MaximumLength;
        PWSTR  Buffer;
    } UNICODE_STRING;
    typedef UNICODE_STRING *PUNICODE_STRING;

Both definitions have equivalent layouts. Apparently these internal
Windows headers aren't designed to be included together. This is
an oversight in the headers and does not represent an incompatibility
between the APIs.
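
The layout claim can be checked at compile time. This sketch uses
stand-in typedefs (demo_ushort, demo_pwstr) and renamed structs so that
it builds without the Windows headers; those names are assumptions made
for the example.

```c
#include <stddef.h>
#include <wchar.h>

/* Stand-ins for the Windows USHORT and PWSTR types. */
typedef unsigned short demo_ushort;
typedef wchar_t *demo_pwstr;

/* Mirrors the LsaLookup.h definition quoted above. */
struct demo_lsa_unicode_string {
    demo_ushort Length;
    demo_ushort MaximumLength;
    demo_pwstr  Buffer;
};

/* Mirrors the winternl.h definition quoted above. */
struct demo_unicode_string {
    demo_ushort Length;
    demo_ushort MaximumLength;
    demo_pwstr  Buffer;
};

_Static_assert(sizeof(struct demo_lsa_unicode_string) ==
               sizeof(struct demo_unicode_string), "sizes differ");
_Static_assert(offsetof(struct demo_lsa_unicode_string, Buffer) ==
               offsetof(struct demo_unicode_string, Buffer), "offsets differ");

/* Runtime form of the same check, returning 1 when the layouts agree. */
static int demo_layouts_match(void)
{
    return sizeof(struct demo_lsa_unicode_string) ==
               sizeof(struct demo_unicode_string) &&
           offsetof(struct demo_lsa_unicode_string, MaximumLength) ==
               offsetof(struct demo_unicode_string, MaximumLength) &&
           offsetof(struct demo_lsa_unicode_string, Buffer) ==
               offsetof(struct demo_unicode_string, Buffer);
}
```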

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Junio C Hamano
1a4874565f The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-09 13:38:46 -08:00
Junio C Hamano
1f3c5f39e0 Merge branch 'ab/help-fixes'
Updates to how command line options to "git help" are handled.

* ab/help-fixes:
  help: don't print "\n" before single-section output
  help: add --no-[external-commands|aliases] for use with --all
  help: error if [-a|-g|-c] and [-i|-m|-w] are combined
  help: correct usage & behavior of "git help --all"
  help: note the option name on option incompatibility
  help.c: split up list_all_cmds_help() function
  help tests: test "git" and "git help [-a|-g] spacing
  help.c: use puts() instead of printf{,_ln}() for consistency
  help doc: add missing "]" to "[-a|--all]"
2022-03-09 13:38:24 -08:00
Junio C Hamano
69a3b75fa6 Merge branch 'ab/c99-variadic-macros'
Remove the escape hatch we added when we introduced the weather
balloon to use variadic macros unconditionally, to make it official
that we now have a hard dependency on the feature.

* ab/c99-variadic-macros:
  C99: remove hardcoded-out !HAVE_VARIADIC_MACROS code
  git-compat-util.h: clarify GCC v.s. C99-specific in comment
2022-03-09 13:38:24 -08:00
Junio C Hamano
4763ccd7f4 Merge branch 'hn/reftable-no-empty-keys'
General clean-up in reftable implementation, including
clarification of the API documentation, tightening the code to
honor documented length limit, etc.

* hn/reftable-no-empty-keys:
  reftable: rename writer_stats to reftable_writer_stats
  reftable: add test for length of disambiguating prefix
  reftable: ensure that obj_id_len is >= 2 on writing
  reftable: avoid writing empty keys at the block layer
  reftable: add a test that verifies that writing empty keys fails
  reftable: reject 0 object_id_len
  Documentation: object_id_len goes up to 31
2022-03-09 13:38:24 -08:00
Junio C Hamano
d169d51504 Merge branch 'jc/cat-file-batch-commands'
"git cat-file" learns "--batch-command" mode, which is a more
flexible interface than the existing "--batch" or "--batch-check"
modes, to allow different kinds of inquiries made.

* jc/cat-file-batch-commands:
  cat-file: add --batch-command mode
  cat-file: add remove_timestamp helper
  cat-file: introduce batch_mode enum to replace print_contents
  cat-file: rename cmdmode to transform_mode
2022-03-09 13:38:24 -08:00
Junio C Hamano
47be28e51e Merge branch 'pw/xdiff-alloc-fail'
Improve failure case behaviour of xdiff library when memory
allocation fails.

* pw/xdiff-alloc-fail:
  xdiff: handle allocation failure when merging
  xdiff: refactor a function
  xdiff: handle allocation failure in patience diff
  xdiff: fix a memory leak
2022-03-09 13:38:23 -08:00
Junio C Hamano
82386b4496 Merge branch 'en/present-despite-skipped'
In sparse-checkouts, files mis-marked as missing from the working tree
could lead to later problems.  Such files were hard to discover, and
harder to correct.  Automatically detecting and correcting the marking
of such files has been added to avoid these problems.

* en/present-despite-skipped:
  repo_read_index: add config to expect files outside sparse patterns
  Accelerate clear_skip_worktree_from_present_files() by caching
  Update documentation related to sparsity and the skip-worktree bit
  repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
  unpack-trees: fix accidental loss of user changes
  t1011: add testcase demonstrating accidental loss of user modifications
2022-03-09 13:38:23 -08:00
Junio C Hamano
c2162907e9 The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-06 21:25:33 -08:00
Junio C Hamano
7a4e06c42a Merge branch 'jt/ls-files-stage-recurse'
Many output modes of "ls-files" do not work with its
"--recurse-submodules" option, but the "-s" mode has been taught to
work with it.

* jt/ls-files-stage-recurse:
  ls-files: support --recurse-submodules --stage
2022-03-06 21:25:33 -08:00
Junio C Hamano
11da0a5580 Merge branch 'gc/stash-on-branch-with-multi-level-name'
"git checkout -b branch/with/multi/level/name && git stash" only
recorded the last level component of the branch name, which has
been corrected.

* gc/stash-on-branch-with-multi-level-name:
  stash: strip "refs/heads/" with skip_prefix
2022-03-06 21:25:33 -08:00
Junio C Hamano
061fd5727d Merge branch 'ah/advice-switch-requires-detach-to-detach'
The error message given by "git switch HEAD~4" has been clarified
to suggest the "--detach" option that is required.

* ah/advice-switch-requires-detach-to-detach:
  switch: mention the --detach option when dying due to lack of a branch
2022-03-06 21:25:32 -08:00
Junio C Hamano
20d34c07ea Merge branch 'ab/c99-designated-initializers'
Use designated initializers we started using in mid 2017 in more
parts of the codebase that are relatively quiescent.

* ab/c99-designated-initializers:
  fast-import.c: use designated initializers for "partial" struct assignments
  refspec.c: use designated initializers for "struct refspec_item"
  convert.c: use designated initializers for "struct stream_filter*"
  userdiff.c: use designated initializers for "struct userdiff_driver"
  archive-*.c: use designated initializers for "struct archiver"
  object-file: use designated initializers for "struct git_hash_algo"
  trace2: use designated initializers for "struct tr2_dst"
  trace2: use designated initializers for "struct tr2_tgt"
  imap-send.c: use designated initializers for "struct imap_server_conf"
2022-03-06 21:25:32 -08:00
Junio C Hamano
283e4e7cd3 Merge branch 'mc/index-pack-report-max-size'
When "index-pack" dies due to incoming data exceeding the maximum
allowed input size, include the value of the limit in the error
message.

* mc/index-pack-report-max-size:
  index-pack: clarify the breached limit
2022-03-06 21:25:32 -08:00
Junio C Hamano
6d8d81ec36 Merge branch 'ac/usage-string-fixups'
Usage-string normalization.

* ac/usage-string-fixups:
  amend remaining usage strings according to style guide
2022-03-06 21:25:32 -08:00
Junio C Hamano
a281069e77 Merge branch 'ab/test-leak-diag'
Random test-framework clean-up.

* ab/test-leak-diag:
  test-lib: add "fast_unwind_on_malloc=0" to LSAN_OPTIONS
  test-lib: make $GIT_BUILD_DIR an absolute path
  test-lib: correct and assert TEST_DIRECTORY overriding
  test-lib: add GIT_SAN_OPTIONS, inherit [AL]SAN_OPTIONS
2022-03-06 21:25:31 -08:00
Junio C Hamano
6878ea6f14 Merge branch 'ab/hook-tests'
Test modernization.

* ab/hook-tests:
  hook tests: use a modern style for "pre-push" tests
  hook tests: test for exact "pre-push" hook input
2022-03-06 21:25:31 -08:00
Junio C Hamano
ae59346f09 Merge branch 'en/merge-ort-plug-leaks'
Leakfix.

* en/merge-ort-plug-leaks:
  merge-ort: fix small memory leak in unique_path()
  merge-ort: fix small memory leak in detect_and_process_renames()
2022-03-06 21:25:31 -08:00
Junio C Hamano
aae90a156d Merge branch 'ds/worktree-docs'
Tighten the language around "working tree" and "worktree" in the
docs.

* ds/worktree-docs:
  worktree: use 'worktree' over 'working tree'
  worktree: use 'worktree' over 'working tree'
  worktree: use 'worktree' over 'working tree'
  worktree: use 'worktree' over 'working tree'
  worktree: use 'worktree' over 'working tree'
  worktree: use 'worktree' over 'working tree'
  worktree: use 'worktree' over 'working tree'
  worktree: extract checkout_worktree()
  worktree: extract copy_sparse_checkout()
  worktree: extract copy_filtered_worktree_config()
  worktree: combine two translatable messages
2022-03-06 21:25:31 -08:00
Junio C Hamano
50e0dd8fee Merge branch 'jc/rerere-train-modernise'
Small modernization of the rerere-train script (in contrib/).

* jc/rerere-train-modernise:
  rerere-train: two fixes to the use of "git show -s"
2022-03-06 21:25:30 -08:00
Junio C Hamano
e828747001 Merge branch 'rs/bisect-executable-not-found'
A not-so-common mistake is to write a script to feed "git bisect
run" without making it executable, in which case all tests will
exit with 126 or 127 error codes, even on revisions that are marked
as good.  Try to recognize this situation and stop iteration early.

* rs/bisect-executable-not-found:
  bisect--helper: double-check run command on exit code 126 and 127
  bisect: document run behavior with exit codes 126 and 127
  bisect--helper: release strbuf and strvec on run error
  bisect--helper: report actual bisect_state() argument on error
2022-03-06 21:25:30 -08:00
Junio C Hamano
967176465a Merge branch 'en/sparse-checkout-fixes'
Further polishing of "git sparse-checkout".

* en/sparse-checkout-fixes:
  sparse-checkout: reject arguments in cone-mode that look like patterns
  sparse-checkout: error or warn when given individual files
  sparse-checkout: pay attention to prefix for {set, add}
  sparse-checkout: correctly set non-cone mode when expected
  sparse-checkout: correct reapply's handling of options
2022-03-06 21:25:30 -08:00
Junio C Hamano
b6c596fd01 Merge branch 'cg/t3903-modernize'
Test modernization.

* cg/t3903-modernize:
  tests: make the code more readable
  tests: allow testing if a path is truly a file or a directory
  t/t3903-stash.sh: replace test [-d|-f] with test_path_is_*
2022-03-06 21:25:30 -08:00
Ævar Arnfjörð Bjarmason
759f340738 repository.c: free the "path cache" in repo_clear()
The "struct path_cache" added in 102de880d24 (path.c: migrate global
git_path_* to take a repository argument, 2018-05-17) is only used
directly by code in repository.[ch] (but populated in path.[ch]).

Let's move this code to repository.[ch], and stop leaking this memory
when we run repo_clear(). To avoid the cast, change it from a "const
char *" to a "char *".

This also removes the "PATH_CACHE_INIT" macro, which has never been
used for anything. For the "struct repository" we already make a hard
assumption that it (and "the_repository") can be identically
initialized by making it a "static" variable, so making use of a
"PATH_CACHE_INIT" somewhere would have been confusing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:19 -08:00
Ævar Arnfjörð Bjarmason
2d102c2bca range-diff: plug memory leak in read_patches()
Amend code added in d9c66f0b5bf (range-diff: first rudimentary
implementation, 2018-08-13) to use a "goto cleanup" pattern. This
makes for less code, and frees memory that we'd previously leak.
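
A generic sketch of the "goto cleanup" pattern, assuming a cleanup label
that hands over a pointer only when it is still non-NULL. The function
and macro names are hypothetical, and the macro is a local re-creation
written for this example rather than a copy of Git's header.

```c
#include <stdlib.h>
#include <string.h>

/* Free a pointer and reset it so later NULL checks see it as gone. */
#define DEMO_FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Copies 'text' into *out on success; returns -1 and leaves *out alone
 * when 'fail' simulates a parse error. */
static int demo_build(const char *text, int fail, char **out)
{
    char *scratch = malloc(16);   /* temporary buffer, always released */
    char *util = NULL;
    int ret = 0;

    if (!scratch)
        return -1;
    util = malloc(strlen(text) + 1);
    if (!util) {
        ret = -1;
        goto cleanup;
    }
    strcpy(util, text);
    if (fail) {
        /* Null the pointer so the cleanup code does not hand it over. */
        DEMO_FREE_AND_NULL(util);
        ret = -1;
        goto cleanup;
    }
cleanup:
    if (util)
        *out = util;              /* ownership moves to the caller */
    free(scratch);
    return ret;
}
```

Every exit after the first allocation goes through the same label, so
the scratch buffer is released on both the success and the error path.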

The reason for changing free(util) to FREE_AND_NULL(util) is that
at the end of the function we append the contents of "util" to a
"struct string_list" if it's non-NULL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:19 -08:00
Ævar Arnfjörð Bjarmason
4998e93fa6 range-diff: plug memory leak in common invocation
Create a public release_patch() version of the private free_patch()
function added in 13b5af22f39 (apply: move libified code from
builtin/apply.c to apply.{c,h}, 2016-04-22). Unlike the existing
function this one doesn't free() the "struct patch" itself, so we can
use it for variables on the stack.
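
The release/free split can be sketched with a small hypothetical struct.
The demo_* names are illustrative only and do not reflect the fields of
the real "struct patch".

```c
#include <stdlib.h>
#include <string.h>

struct demo_patch {
    char *name;   /* heap-allocated member owned by the struct */
};

/* Heap copy of a string; returns NULL if allocation fails. */
static char *demo_copy(const char *s)
{
    char *c = malloc(strlen(s) + 1);
    if (c)
        strcpy(c, s);
    return c;
}

/* Release the members only; safe for a struct that lives on the stack. */
static void demo_release_patch(struct demo_patch *p)
{
    free(p->name);
    p->name = NULL;
}

/* Release the members and the struct itself; heap-allocated structs only. */
static void demo_free_patch(struct demo_patch *p)
{
    demo_release_patch(p);
    free(p);
}
```

A stack variable is cleaned up with demo_release_patch(&p); calling
demo_free_patch(&p) on it would pass a non-heap pointer to free().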

Use it in range-diff.c to fix a memory leak in common range-diff
invocations, e.g.:

    git -P range-diff origin/master origin/next origin/seen

Would emit several errors when compiled with SANITIZE=leak, but now
runs cleanly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:19 -08:00
Ævar Arnfjörð Bjarmason
ef3fe21448 lockfile API users: simplify and don't leak "path"
Fix a memory leak in code added in 6c622f9f0bb (commit-graph: write
commit-graph chains, 2019-06-18). We needed to free the "lock_name" if
we encounter errors, and the "graph_name" after we'd run unlink() on
it.

For the case of write_commit_graph_file() refactoring the code to free
the "lock_name" after we were done using the "struct lock_file lk"
would have made the control flow more complex. Luckily we can free the
"lock_file" right after the hold_lock_file_for_update() call, if it
makes use of "path" at all it'll have copied its contents to a "struct
strbuf" of its own.

While I'm at it let's fix code added in fb10ca5b543 (sparse-checkout:
write using lockfile, 2019-11-21) in write_patterns_and_update() to
avoid the same complexity that I thought I needed when I wrote the
initial fix for write_commit_graph_file(). We can free the
"sparse_filename" right after calling hold_lock_file_for_update(), we
don't need to wait until we're exiting the function to do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:19 -08:00
Ævar Arnfjörð Bjarmason
51a94d8ffe commit-graph: stop fill_oids_from_packs() progress on error and free()
Fix a bug in fill_oids_from_packs(): we should always stop_progress(),
but did not do so if we returned an error here. This also plugs a
memory leak in those cases by releasing the two "struct strbuf"
variables the function uses.

While I'm at it stop hardcoding "-1" here and just use the return
value of error() instead, which happens to be "-1".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:19 -08:00
Ævar Arnfjörð Bjarmason
4a0479086a commit-graph: fix memory leak in misused string_list API
When this code was migrated to the string_list API in
d88b14b3fd6 (commit-graph: use string-list API for input, 2018-06-27)
it was made to use both STRING_LIST_INIT_NODUP and a
strbuf_detach() pattern.

Those should not be used together if string_list_clear() is expected
to free the memory, instead we need to either use STRING_LIST_INIT_DUP
with a string_list_append_nodup(), or a STRING_LIST_INIT_NODUP and
manually fiddle with the "strdup_strings" member before calling
string_list_clear(). Let's do the former.

Since "strdup_strings = 1" is set now other code might be broken by
relying on "pack_indexes" not to duplicate its strings, but that
doesn't happen. When we pass this down to write_commit_graph() that
code uses the "struct string_list" without modifying it. Let's add a
"const" to the variable to have the compiler enforce that assumption.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:18 -08:00