1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-05-25 14:46:09 +02:00
Commit Graph

35 Commits

Author SHA1 Message Date
Jeff King 9ccf3e9b22 config: add core.commentString
The core.commentChar code recently learned to accept more than a
single ASCII character. But using it is annoying with multiple versions
of Git, since older ones will reject it outright:

    $ git.v2.44.0 -c core.commentchar=foo stripspace -s
    error: core.commentChar should only be one ASCII character
    fatal: unable to parse 'core.commentchar' from command-line config

Let's add an alias core.commentString. That's arguably a better name
anyway, since we now can handle strings, and it makes it possible to
have a config that works reasonably with both old and new versions of
Git (see the example in the documentation).

This is strictly an alias, so there's not much point in adding duplicate
tests; I added a single one to t0030 that exercises the alias code.

Note also that the error messages for invalid values will now show the
variable the config parser handed us, and thus will be normalized to
lowercase (rather than camelcase). A few tests in t0030 are adjusted to
match.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-27 08:48:54 -07:00
Jeff King 8b311478ad config: allow multi-byte core.commentChar
Now that all of the code handles multi-byte comment characters, it's
safe to allow users to set them.

There is one special case I kept: we still will not allow an empty
string for the commentChar. While it might make sense in some contexts
(e.g., output where you don't want any comment prefix), there are plenty
where it will behave badly (e.g., all of our starts_with() checks will
indicate that every line is a comment!). It might be reasonable to
assign some meaningful semantics, but it would probably involve checking
how each site behaves. In the interim let's forbid it and we can loosen
things later.

Likewise, the "commentChar cannot be a newline" rule is now extended to
"it cannot contain a newline" (for the same reason: it can confuse our
parsing loops).

Since comment_line_str is used in many parts of the code, it's hard to
cover all possibilities with tests. We can convert the existing
double-semicolon prefix test to show that "git status" works. And we'll
give it a more challenging case in t7507, where we confirm that
git-commit strips out the commit template along with any --verbose text
when reading the edited commit message back in. That covers the basics,
though it's possible there could be issues in more exotic spots (e.g.,
the sequencer todo list uses its own code).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12 13:28:11 -07:00
Jeff King be20128bfa add core.maxTreeDepth config
Most of our tree traversal algorithms use recursion to visit sub-trees.
For pathologically large trees, this can cause us to run out of stack
space and abort in an uncontrolled way. Let's put our own limit here so
that we can fail gracefully rather than segfaulting.

In similar cases where we recursed along the commit graph, we rewrote
the algorithms to avoid recursion and keep any stack data on the heap.
But the commit graph is meant to grow without bound, whereas it's not an
imposition to put a limit on the maximum size of tree we'll handle.

And this has a bonus side effect: coupled with a limit on individual
tree entry names, this limits the total size of a path we may encounter.
This gives us an extra protection against code handling long path names
which may suffer from integer overflows in the size (which could then be
exploited by malicious trees).

The default of 4096 is set to be much longer than anybody would care
about in the real world. Even with single-letter interior tree names
(like "a/b/c"), such a path is at least 8191 bytes. While most operating
systems will let you create such a path incrementally, trying to
reference the whole thing in a system call (as Git would do when
actually trying to access it) will result in ENAMETOOLONG. Coupled with
the recent fsck.largePathname warning, the maximum total pathname Git
will handle is (by default) 16MB.

This config option doesn't do anything yet; future patches will convert
various algorithms to respect the limit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31 15:51:07 -07:00
Philip Oakley 776ba91a5e doc: use "commit-graph" hyphenation consistently
Note, historical release notes have not been updated.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 19:58:40 -04:00
Junio C Hamano 6a591a3173 Merge branch 'ma/sparse-checkout-cone-doc-fix'
Docfix.

* ma/sparse-checkout-cone-doc-fix:
  config/core.txt: fix minor issues for `core.sparseCheckoutCone`
2022-07-27 09:16:53 -07:00
Martin Ågren ae436f283c config/core.txt: fix minor issues for `core.sparseCheckoutCone`
The sparse checkout feature can be used in "cone mode" or "non-cone
mode". In this one instance in the documentation, we refer to the latter
as "non cone mode" with whitespace rather than a hyphen. Align this with
the rest of our documentation.

A few words later in the same paragraph, there's mention of "a more
flexible patterns". Drop that leading "a" to fix the grammar.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 09:39:20 -07:00
Han Xin aaf81223f4 unpack-objects: use stream_loose_object() to unpack large objects
Make use of the stream_loose_object() function introduced in the
preceding commit to unpack large objects. Before this we'd need to
malloc() the size of the blob before unpacking it, which could cause
OOM with very large blobs.

We could use the new streaming interface to unpack all blobs, but
doing so would be much slower, as demonstrated e.g. with this
benchmark using git-hyperfine[0]:

	rm -rf /tmp/scalar.git &&
	git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
	mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
	git hyperfine \
		-r 2 --warmup 1 \
		-L rev origin/master,HEAD -L v "10,512,1k,1m" \
		-s 'make' \
		-p 'git init --bare dest.git' \
		-c 'rm -rf dest.git' \
		'./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'

Here we'll perform worse with lower core.bigFileThreshold settings
with this change in terms of speed, but we're getting lower memory use
in return:

	Summary
	  './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
	    1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
	    1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
	    1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
	    1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
	    1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
	    1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
	    1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'

A better benchmark to demonstrate the benefits of that this one, which
creates an artificial repo with a 1, 25, 50, 75 and 100MB blob:

	rm -rf /tmp/repo &&
	git init /tmp/repo &&
	(
		cd /tmp/repo &&
		for i in 1 25 50 75 100
		do
			dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
		done &&
		git add blob.* &&
		git commit -mblobs &&
		git gc &&
		PACK=$(echo .git/objects/pack/pack-*.pack) &&
		cp "$PACK" my.pack
	) &&
	git hyperfine \
		--show-output \
		-L rev origin/master,HEAD -L v "512,50m,100m" \
		-s 'make' \
		-p 'git init --bare dest.git' \
		-c 'rm -rf dest.git' \
		'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'

Using this test we'll always use >100MB of memory on
origin/master (around ~105MB), but max out at e.g. ~55MB if we set
core.bigFileThreshold=50m.

The relevant "Maximum resident set size" lines were manually added
below the relevant benchmark:

  '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
        Maximum resident set size (kbytes): 107080
    1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
        Maximum resident set size (kbytes): 106968
    1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
        Maximum resident set size (kbytes): 107032
    1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
        Maximum resident set size (kbytes): 107072
    1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
        Maximum resident set size (kbytes): 55704
    2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
        Maximum resident set size (kbytes): 4564

This shows that if you have enough memory this new streaming method is
slower the lower you set the streaming threshold, but the benefit is
more bounded memory use.

An earlier version of this patch introduced a new
"core.bigFileStreamingThreshold" instead of re-using the existing
"core.bigFileThreshold" variable[1]. As noted in a detailed overview
of its users in [2] using it has several different meanings.

Still, we consider it good enough to simply re-use it. While it's
possible that someone might want to e.g. consider objects "small" for
the purposes of diffing but "big" for the purposes of writing them
such use-cases are probably too obscure to worry about. We can always
split up "core.bigFileThreshold" in the future if there's a need for
that.

0. https://github.com/avar/git-hyperfine/
1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:36 -07:00
Ævar Arnfjörð Bjarmason 3c3ca0b0c1 core doc: modernize core.bigFileThreshold documentation
The core.bigFileThreshold documentation has been largely unchanged
since 5eef828bc0 (fast-import: Stream very large blobs directly to
pack, 2010-02-01).

But since then this setting has been expanded to affect a lot more
than that description indicated. Most notably in how "git diff" treats
them, see 6bf3b81348 (diff --stat: mark any file larger than
core.bigfilethreshold binary, 2014-08-16).

In addition to that, numerous commands and APIs make use of a
streaming mode for files above this threshold.

So let's attempt to summarize 12 years of changes in behavior, which
can be seen with:

    git log --oneline -Gbig_file_thre 5eef828bc03.. -- '*.c'

To do that turn this into a bullet-point list. The summary Han Xin
produced in [1] helped a lot, but is a bit too detailed for
documentation aimed at users. Let's instead summarize how
user-observable behavior differs, and generally describe how we tend
to stream these files in various commands.

1. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/

Helped-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13 10:22:35 -07:00
Junio C Hamano 83937e9592 Merge branch 'ns/batch-fsync'
Introduce a filesystem-dependent mechanism to optimize the way the
bits for many loose object files are ensured to hit the disk
platter.

* ns/batch-fsync:
  core.fsyncmethod: performance tests for batch mode
  t/perf: add iteration setup mechanism to perf-lib
  core.fsyncmethod: tests for batch mode
  test-lib-functions: add parsing helpers for ls-files and ls-tree
  core.fsync: use batch mode and sync loose objects by default on Windows
  unpack-objects: use the bulk-checkin infrastructure
  update-index: use the bulk-checkin infrastructure
  builtin/add: add ODB transaction around add_files_to_cache
  cache-tree: use ODB transaction around writing a tree
  core.fsyncmethod: batched disk flushes for loose-objects
  bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  bulk-checkin: rename 'state' variable and separate 'plugged' boolean
2022-06-03 14:30:34 -07:00
Junio C Hamano 377d347eb3 Merge branch 'en/sparse-cone-becomes-default'
Deprecate non-cone mode of the sparse-checkout feature.

* en/sparse-cone-becomes-default:
  Documentation: some sparsity wording clarifications
  git-sparse-checkout.txt: mark non-cone mode as deprecated
  git-sparse-checkout.txt: flesh out pattern set sections a bit
  git-sparse-checkout.txt: add a new EXAMPLES section
  git-sparse-checkout.txt: shuffle some sections and mark as internal
  git-sparse-checkout.txt: update docs for deprecation of 'init'
  git-sparse-checkout.txt: wording updates for the cone mode default
  sparse-checkout: make --cone the default
  tests: stop assuming --no-cone is the default mode for sparse-checkout
2022-06-03 14:30:33 -07:00
Elijah Newren 2d95707a02 sparse-checkout: make --cone the default
Make cone mode the default, and update the documentation accordingly.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-21 23:12:38 -07:00
Todd Zullinger f3ea4bed2a doc: replace "--" with {litdd} in credential-cache/fsmonitor
Asciidoc renders `--` as em-dash.  This is not appropriate for command
names.  It also breaks linkgit links to these commands.

Fix git-credential-cache--daemon and git-fsmonitor--daemon.  The latter
was added 3248486920 (fsmonitor: document builtin fsmonitor, 2022-03-25)
and included several links.  A check for broken links in the HTML docs
turned this up.

Manually inspecting the other Documentation/git-*--*.txt files turned up
the issue in git-credential-cache--daemon.

While here, quote `git credential-cache--daemon` in the synopsis to
match the vast majority of our other documentation.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 16:06:06 -07:00
Neeraj Singh c0f4752ed2 core.fsyncmethod: batched disk flushes for loose-objects
When adding many objects to a repo with `core.fsync=loose-object`,
the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. This commit introduces
a new `core.fsyncMethod=batch` option that batches up hardware flushes.
It hooks into the bulk-checkin odb-transaction functionality, takes
advantage of tmp-objdir, and uses the writeout-only support code.

When the new mode is enabled, we do the following for each new object:
1a. Create the object in a tmp-objdir.
2a. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction when unplugging bulk checkin:
1b. Issue an fsync against a dummy file to flush the log and hardware
   writeback cache, which should by now have seen the tmp-objdir writes.
2b. Rename all of the tmp-objdir files to their final names.
3b. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the default
   today, but the user now has the option of syncing the index and there
   is a separate patch series to implement syncing of refs.

On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns. This sequence also ensures that no object files
appear in the main object store unless they are fsync-durable.

Batch mode is only enabled if core.fsync includes loose-objects. If
the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does
not include loose-objects, we will use file-by-file fsyncing.

In step (1a) of the sequence, the tmp-objdir is created lazily to avoid
work if no loose objects are ever added to the ODB. We use a tmp-objdir
to maintain the invariant that no loose-objects are visible in the main
ODB unless they are properly fsync-durable. This is important since
future ODB operations that try to create an object with specific
contents will silently drop the new data if an object with the target
hash exists without checking that the loose-object contents match the
hash. Only a full git-fsck would restore the ODB to a functional state
where dataloss doesn't occur.

In step (1b) of the sequence, we issue a fsync against a dummy file
created specifically for the purpose. This method has a little higher
cost than using one of the input object files, but makes adding new
callers of this mechanism easier, since we don't need to figure out
which object file is "last" or risk sharing violations by caching the fd
of the last object file.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.

Adding 500 files to the repo with 'git add' Times reported in seconds.

object file syncing | Linux | Mac   | Windows
--------------------|-------|-------|--------
           disabled | 0.06  |  0.35 | 0.61
              fsync | 1.88  | 11.18 | 2.47
              batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 13:13:01 -07:00
Junio C Hamano 439c1e6d5d Merge branch 'jh/builtin-fsmonitor-part2'
Built-in fsmonitor (part 2).

* jh/builtin-fsmonitor-part2: (30 commits)
  t7527: test status with untracked-cache and fsmonitor--daemon
  fsmonitor: force update index after large responses
  fsmonitor--daemon: use a cookie file to sync with file system
  fsmonitor--daemon: periodically truncate list of modified files
  t/perf/p7519: add fsmonitor--daemon test cases
  t/perf/p7519: speed up test on Windows
  t/perf/p7519: fix coding style
  t/helper/test-chmtime: skip directories on Windows
  t/perf: avoid copying builtin fsmonitor files into test repo
  t7527: create test for fsmonitor--daemon
  t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
  help: include fsmonitor--daemon feature flag in version info
  fsmonitor--daemon: implement handle_client callback
  compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
  compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent
  compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
  fsmonitor--daemon: create token-based changed path cache
  fsmonitor--daemon: define token-ids
  fsmonitor--daemon: add pathname classification
  fsmonitor--daemon: implement 'start' command
  ...
2022-04-04 10:56:24 -07:00
Jeff Hostetler 3248486920 fsmonitor: document builtin fsmonitor
Document how `core.fsmonitor` can be set to a boolean to enable
or disable the builtin FSMonitor.

Update references to `core.fsmonitor` and `core.fsmonitorHookVersion` and
pointers to `Watchman` to refer to it.

Create `git-fsmonitor--daemon` manual page and describe its features.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-25 16:04:15 -07:00
Patrick Steinhardt bc22d845c4 core.fsync: new option to harden references
When writing both loose and packed references to disk we first create a
lockfile, write the updated values into that lockfile, and on commit we
rename the file into place. According to filesystem developers, this
behaviour is broken because applications should always sync data to disk
before doing the final rename to ensure data consistency [1][2][3]. If
applications fail to do this correctly, a hard crash of the machine can
easily result in corrupted on-disk data.

This kind of corruption can in fact be easily observed with Git when the
machine hard-resets shortly after writing references to disk. On
machines with ext4, this will likely lead to the "empty files" problem:
the file has been renamed, but its data has not been synced to disk. The
result is that the reference is corrupt, and in the worst case this can
lead to data loss.

Implement a new option to harden references so that users and admins can
avoid this scenario by syncing locked loose and packed references to
disk before we rename them into place.

[1]: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
[2]: https://btrfs.wiki.kernel.org/index.php/FAQ (What are the crash guarantees of overwrite-by-rename)
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/ext4.rst (see auto_da_alloc)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-15 13:30:58 -07:00
Neeraj Singh b9f5d0358d core.fsync: documentation and user-friendly aggregate options
This commit adds aggregate options for the core.fsync setting that are
more user-friendly. These options are specified in terms of 'levels of
safety', indicating which Git operations are considered to be sync
points for durability.

The new documentation is also included here in its entirety for ease of
review.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-15 12:32:55 -07:00
Neeraj Singh 844a8ad4f8 core.fsync: add configuration parsing
This change introduces code to parse the core.fsync setting and
configure the fsync_components variable.

core.fsync is configured as a comma-separated list of component names to
sync. Each time a core.fsync variable is encountered in the
configuration heirarchy, we start off with a clean state with the
platform default value. Passing 'none' resets the value to indicate
nothing will be synced. We gather all negative and positive entries from
the comma separated list and then compute the new value by removing all
the negative entries and adding all of the positive entries.

We issue a warning for components that are not recognized so that the
configuration code is compatible with configs from future versions of
Git with more repo components.

Complete documentation for the new setting is included in a later patch
in the series so that it can be reviewed once in final form.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Neeraj Singh abf38abec2 core.fsyncmethod: add writeout-only mode
This commit introduces the `core.fsyncMethod` configuration
knob, which can currently be set to `fsync` or `writeout-only`.

The new writeout-only mode attempts to tell the operating system to
flush its in-memory page cache to the storage hardware without issuing a
CACHE_FLUSH command to the storage controller.

Writeout-only fsync is significantly faster than a vanilla fsync on
common hardware, since data is written to a disk-side cache rather than
all the way to a durable medium. Later changes in this patch series will
take advantage of this primitive to implement batching of hardware
flushes.

When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the
caller is expected to do an ordinary fsync as needed.

On Apple platforms, the fsync system call does not issue a CACHE_FLUSH
directive to the storage controller. This change updates fsync to do
fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity
with existing behavior on Apple platforms by setting the default value
of the new core.fsyncMethod option.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Junio C Hamano 6dbbae17d9 Merge branch 'ew/decline-core-abbrev'
The configuration variable 'core.abbrev' can be set to 'no' to
force no abbreviation regardless of the hash algorithm.

* ew/decline-core-abbrev:
  core.abbrev=no disables abbreviations
2021-01-15 15:20:28 -08:00
Eric Wong a9ecaa06a7 core.abbrev=no disables abbreviations
This allows users to write hash-agnostic scripts and configs by
disabling abbreviations.  Using "-c core.abbrev=40" will be
insufficient with SHA-256, and "-c core.abbrev=64" won't work with
SHA-1 repos today.

Signed-off-by: Eric Wong <e@80x24.org>
[jc: tweaked implementation, added doc and a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-23 13:40:09 -08:00
Junio C Hamano 52b8c8c716 Merge branch 'ds/maintenance-part-2'
"git maintenance", an extended big brother of "git gc", continues
to evolve.

* ds/maintenance-part-2:
  maintenance: add incremental-repack auto condition
  maintenance: auto-size incremental-repack batch
  maintenance: add incremental-repack task
  midx: use start_delayed_progress()
  midx: enable core.multiPackIndex by default
  maintenance: create auto condition for loose-objects
  maintenance: add loose-objects task
  maintenance: add prefetch task
2020-10-27 15:09:47 -07:00
Derrick Stolee 18e449f86b midx: enable core.multiPackIndex by default
The core.multiPackIndex setting has been around since c4d25228eb
(config: create core.multiPackIndex setting, 2018-07-12), but has been
disabled by default. If a user wishes to use the multi-pack-index
feature, then they must enable this config and run 'git multi-pack-index
write'.

The multi-pack-index feature is relatively stable now, so make the
config option true by default. For users that do not use a
multi-pack-index, the only extra cost will be a file lookup to see if a
multi-pack-index file exists (once per process, per object directory).

Also, this config option will be referenced by an upcoming
"incremental-repack" task in the maintenance builtin, so move the config
option into the repository settings struct. Note that if
GIT_TEST_MULTI_PACK_INDEX=1, then we want to ignore the config option
and treat core.multiPackIndex as enabled.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:04 -07:00
Jonathan Tan 009be0d26d Documentation: deltaBaseCacheLimit is per-thread
Clarify that core.deltaBaseCacheLimit is per-thread, as can be seen from
the fact that cache usage (base_cache_used in struct thread_local in
builtin/index-pack.c) is tracked individually for each thread and
compared against delta_base_cache_limit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-24 13:55:06 -07:00
Kevin Willford dfaed02862 fsmonitor: update documentation for hook version and watchman hooks
A new config value for core.fsmonitorHookVersion was added to be able
to force the version of the fsmonitor hook.  Possible values are 1 or 2.
When this is not set the code will use a value of -1 and attempt to use
version 2 of the hook first and if that fails will attempt version 1.

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-23 15:10:23 -08:00
Junio C Hamano bd72a08d6c Merge branch 'ds/sparse-cone'
Management of sparsely checked-out working tree has gained a
dedicated "sparse-checkout" command.

* ds/sparse-cone: (21 commits)
  sparse-checkout: improve OS ls compatibility
  sparse-checkout: respect core.ignoreCase in cone mode
  sparse-checkout: check for dirty status
  sparse-checkout: update working directory in-process for 'init'
  sparse-checkout: cone mode should not interact with .gitignore
  sparse-checkout: write using lockfile
  sparse-checkout: use in-process update for disable subcommand
  sparse-checkout: update working directory in-process
  sparse-checkout: sanitize for nested folders
  unpack-trees: add progress to clear_ce_flags()
  unpack-trees: hash less in cone mode
  sparse-checkout: init and set in cone mode
  sparse-checkout: use hashmaps for cone patterns
  sparse-checkout: add 'cone' mode
  trace2: add region in clear_ce_flags
  sparse-checkout: create 'disable' subcommand
  sparse-checkout: add '--stdin' option to set subcommand
  sparse-checkout: 'set' subcommand
  clone: add --sparse mode
  sparse-checkout: create 'init' subcommand
  ...
2019-12-25 11:21:58 -08:00
Johannes Schindelin ac33519ddf mingw: restrict file handle inheritance only on Windows 7 and later
Turns out that it don't work so well on Vista, see
https://github.com/git-for-windows/git/issues/1742 for details.

According to https://devblogs.microsoft.com/oldnewthing/?p=8873, it
*should* work on Windows Vista and later.

But apparently there are issues on Windows Vista when pipes are
involved. Given that Windows Vista is past its end of life (official
support ended on April 11th, 2017), let's not spend *too* much time on
this issue and just disable the file handle inheritance restriction on
any Windows version earlier than Windows 7.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-23 11:17:01 +09:00
Derrick Stolee 879321eb0b sparse-checkout: add 'cone' mode
The sparse-checkout feature can have quadratic performance as
the number of patterns and number of entries in the index grow.
If there are 1,000 patterns and 1,000,000 entries, this time can
be very significant.

Create a new Boolean config option, core.sparseCheckoutCone, to
indicate that we expect the sparse-checkout file to contain a
more limited set of patterns. This is a separate config setting
from core.sparseCheckout to avoid breaking older clients by
introducing a tri-state option.

The config option does nothing right now, but will be expanded
upon in a later commit.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22 16:11:44 +09:00
Derrick Stolee c6cc4c5afd repo-settings: create feature.manyFiles setting
The feature.manyFiles setting is suitable for repos with many
files in the working directory. By setting index.version=4 and
core.untrackedCache=true, commands such as 'git status' should
improve.

While adding this setting, modify the index version precedence
tests to check how this setting overrides the default for
index.version is unset.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-13 13:33:55 -07:00
Derrick Stolee 31b1de6a09 commit-graph: turn on commit-graph by default
The commit-graph feature has seen a lot of activity in the past
year or so since it was introduced. The feature is a critical
performance enhancement for medium- to large-sized repos, and
does not significantly hurt small repos.

Change the defaults for core.commitGraph and gc.writeCommitGraph
to true so users benefit from this feature by default.

There are several places in the test suite where the environment
variable GIT_TEST_COMMIT_GRAPH is disabled to avoid reading a
commit-graph, if it exists. The config option overrides the
environment, so swap these. Some GIT_TEST_COMMIT_GRAPH assignments
remain, and those are to avoid writing a commit-graph when a new
commit is created.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-13 13:33:55 -07:00
Corentin BOMPARD 68ed71b53c doc: format pathnames and URLs as monospace.
Applying CodingGuidelines about monospace on pathnames and URLs.

See Documentation/CodingGuidelines.txt for more information.

Signed-off-by: Corentin BOMPARD <corentin.bompard@etu.univ-lyon1.fr>
Signed-off-by: Nathan BERBEZIER <nathan.berbezier@etu.univ-lyon1.fr>
Signed-off-by: Pablo CHABANNE <pablo.chabanne@etu.univ-lyon1.fr>
Signed-off-by: Matthieu MOY <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-13 11:14:22 +09:00
Jeff King c9446f0504 docs/config: clarify "text property" in core.eol
The word "property" is vague here. Let's spell out that we mean the path
must be marked with the text attribute.

While we're here, let's make the paragraph a little easier to read by
de-emphasizing the "when core.autocrlf is false" bit. Putting it in the
first sentence obscures the main content, and many readers won't care
about autocrlf (i.e., anyone who is just following the gitattributes(7)
advice, which mainly discusses "text" and "core.eol").

Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:21:33 -08:00
Junio C Hamano 6c268fdda9 Merge branch 'js/mingw-perl5lib'
Windows fix.

* js/mingw-perl5lib:
  mingw: unset PERL5LIB by default
  config: move Windows-specific config settings into compat/mingw.c
  config: allow for platform-specific core.* config settings
  config: rename `dummy` parameter to `cb` in git_default_config()
2018-11-13 22:37:20 +09:00
Junio C Hamano fd7761a1cd Merge branch 'nd/config-split'
Split the overly large Documentation/config.txt file into million
little pieces.  This potentially allows each individual piece
included into the manual page of the command it affects more easily.

* nd/config-split: (81 commits)
  config.txt: remove config/dummy.txt
  config.txt: move worktree.* to a separate file
  config.txt: move web.* to a separate file
  config.txt: move versionsort.* to a separate file
  config.txt: move user.* to a separate file
  config.txt: move url.* to a separate file
  config.txt: move uploadpack.* to a separate file
  config.txt: move uploadarchive.* to a separate file
  config.txt: move transfer.* to a separate file
  config.txt: move tag.* to a separate file
  config.txt: move submodule.* to a separate file
  config.txt: move stash.* to a separate file
  config.txt: move status.* to a separate file
  config.txt: move splitIndex.* to a separate file
  config.txt: move showBranch.* to a separate file
  config.txt: move sequencer.* to a separate file
  config.txt: move sendemail-config.txt to config/
  config.txt: move reset.* to a separate file
  config.txt: move rerere.* to a separate file
  config.txt: move repack.* to a separate file
  ...
2018-11-13 22:37:16 +09:00
Nguyễn Thái Ngọc Duy 1a394fa9ad config.txt: move core.* to a separate file
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-29 10:17:00 +09:00