mirror/git - git - git.dotya.ml

mirror of https://github.com/git/git.git synced 2024-05-28 22:06:15 +02:00

Author	SHA1	Message	Date
Reuven Y	e22f2daed0	docs: improve fast-forward in glossary content The text was somewhat confusing between the revision itself and the author. Signed-off-by: Reuven Yagel <robi@post.jce.ac.il> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-19 21:11:49 +09:00
Derrick Stolee	f6e2cd0625	read-cache: delete unused hashing methods These methods were marked as MAYBE_UNUSED in the previous change to avoid a complicated diff. Delete them entirely, since we now use the hashfile API instead of this custom hashing code. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-19 16:41:21 +09:00
Derrick Stolee	410334ed52	read-cache: use hashfile instead of git_hash_ctx The do_write_index() method in read-cache.c has its own hashing logic and buffering mechanism. Specifically, the ce_write() method was introduced by `4990aadc` (Speed up index file writing by chunking it nicely, 2005-04-20) and similar mechanisms were introduced a few months later in `c38138cd` (git-pack-objects: write the pack files with a SHA1 csum, 2005-06-26). Based on the timing, in the early days of the Git codebase, I figured that these roughly equivalent code paths were never unified only because it got lost in the shuffle. The hashfile API has since been used extensively in other file formats, such as pack-indexes, multi-pack-indexes, and commit-graphs. Therefore, it seems prudent to unify the index writing code to use the same mechanism. I discovered this disparity while trying to create a new index format that uses the chunk-format API. That API uses a hashfile as its base, so it is incompatible with the custom code in read-cache.c. This rewrite is rather straightforward. It replaces all writes to the temporary file with writes to the hashfile struct. This takes care of many of the direct interactions with the_hash_algo. There are still some git_hash_ctx uses remaining: the extension headers are hashed for use in the End of Index Entries (EOIE) extension. This use of the git_hash_ctx is left as-is. There are multiple reasons to not use a hashfile here, including the fact that the data is not actually writing to a file, just a hash computation. These hashes do not block our adoption of the chunk-format API in a future change to the index, so leave it as-is. The internals of the algorithms are mostly identical. Previously, the hashfile API used a smaller 8KB buffer instead of the 128KB buffer from read-cache.c. The previous change already unified these sizes. 
There is one subtle point: we do not pass the CSUM_FSYNC to the finalize_hashfile() method, which differs from most consumers of the hashfile API. The extra fsync() call indicated by this flag causes a significant performance degradation that is noticeable for quick commands that write the index, such as "git add". Other consumers can absorb this cost with their more complicated data structure organization, and further writing structures such as pack-files and commit-graphs is rarely in the critical path for common user interactions. Some static methods become orphaned in this diff, so I marked them as MAYBE_UNUSED. The diff is much harder to read if they are deleted during this change. Instead, they will be deleted in the following change. In addition to the test suite passing, I computed indexes using the previous binaries and the binaries compiled after this change, and found the index data to be exactly equal. Finally, I did extensive performance testing of "git update-index --force-write" on repos of various sizes, including one with over 2 million paths at HEAD. These tests demonstrated less than 1% difference in behavior. As expected, the performance should be considered unchanged. The previous changes to increase the hashfile buffer size from 8K to 128K ensured this change would not create a performance regression. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-19 16:41:21 +09:00
Derrick Stolee	2ca245f8be	csum-file.h: increase hashfile buffer size The hashfile API uses a hard-coded buffer size of 8KB and has ever since it was introduced in `c38138c` (git-pack-objects: write the pack files with a SHA1 csum, 2005-06-26). It performs a similar function to the hashing buffers in read-cache.c, but that code was updated from 8KB to 128KB in `f279894` (read-cache: make the index write buffer size 128K, 2021-02-18). The justification there was that do_write_index() improves from 1.02s to 0.72s. Since our end goal is to have the index writing code use the hashfile API, we need to unify this buffer size to avoid a performance regression. There is a buffer, 'check_buffer', that is used to verify the check_fd file descriptor. When this buffer increases to 128K to fit the data being flushed, it causes the stack to overflow the limits placed in the test suite. To avoid issues with stack size, move both 'buffer' and 'check_buffer' to be heap pointers within 'struct hashfile'. The 'check_buffer' member is left as NULL unless check_fd is set in hashfd_check(). Both buffers are cleared as part of finalize_hashfile() which also frees the full structure. Since these buffers are now on the heap, we can adjust their size based on the needs of the consumer. In particular, callers to hashfd_throughput() are expecting to report progress indicators as the buffer flushes. These callers would prefer the smaller 8k buffer to avoid large delays between updates, especially for users with slower networks. When the progress indicator is not used, the larger buffer is preferable. By adding a new trace2 region in the chunk-format API, we can see that the writing portion of 'git multi-pack-index write' lowers from ~1.49s to ~1.47s on a Linux machine. These effects may be more pronounced or diminished on other filesystems. The end-to-end timing is too noisy to have a definitive change either way.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-19 16:41:21 +09:00
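The move from stack buffers to heap buffers sized per consumer, as described in the message above, can be illustrated with a minimal sketch. This is not the code from csum-file.c: the `sketch_` names, the constructor signature, and the two size constants are assumptions chosen to mirror the 8KB and 128KB figures in the message.

```c
#include <stdlib.h>

/* Assumed sizes mirroring the commit message: small for progress, large otherwise. */
#define SKETCH_BUF_SMALL (8 * 1024)
#define SKETCH_BUF_LARGE (128 * 1024)

/* Hypothetical stand-in for 'struct hashfile'; only the buffer fields are modeled. */
struct sketch_hashfile {
	unsigned char *buffer;
	unsigned char *check_buffer; /* stays NULL unless verification is requested */
	size_t buffer_len;
};

/* Allocate both buffers on the heap so a large size cannot overflow the stack. */
struct sketch_hashfile *sketch_hashfile_new(size_t buffer_len, int want_check)
{
	struct sketch_hashfile *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	f->buffer_len = buffer_len;
	f->buffer = malloc(buffer_len);
	if (want_check)
		f->check_buffer = malloc(buffer_len);
	if (!f->buffer || (want_check && !f->check_buffer)) {
		free(f->buffer);
		free(f->check_buffer);
		free(f);
		return NULL;
	}
	return f;
}

/* Release the buffers together with the structure itself. */
void sketch_hashfile_free(struct sketch_hashfile *f)
{
	if (!f)
		return;
	free(f->buffer);
	free(f->check_buffer);
	free(f);
}
```

In this sketch a caller that reports progress would pass `SKETCH_BUF_SMALL`, and any other caller would pass `SKETCH_BUF_LARGE`.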
Jonathan Nieder	aafa5df0df	xsize_t: avoid implementation defined behavior when len < 0 The xsize_t helper aims to safely convert an off_t to a size_t, erroring out when a file offset is too large to fit into a memory address. It does this by using two casts: size_t size = (size_t) len; if (len != (off_t) size) ... error out ... On a platform with sizeof(size_t) < sizeof(off_t), this check is safe and correct. The first cast truncates to a size_t by finding the remainder modulo SIZE_MAX+1 (see C99 section 6.3.1.3 Signed and unsigned integers) and the second promotes to an off_t, meaning the result is true if and only if len is representable as a size_t. On other platforms, this two-casts strategy still works well (always succeeds) for len >= 0. But for len < 0, when the first cast succeeds and produces SIZE_MAX + 1 + len, the resulting value is too large to be represented as an off_t, so the second cast produces implementation defined behavior. In practice, it is likely to produce a result of true despite len not being representable as size_t. Simplify by replacing with a more straightforward check: compare len to the relevant bounds and then cast it. (To avoid a -Wsign-compare warning, after checking that len >= 0, we explicitly convert to a sufficiently-large unsigned type before comparing to SIZE_MAX.) In practice, this is not likely to come up since typical callers use nonnegative len. Still, it's helpful to handle this case to make the behavior easy to reason about. Historical note: the original bounds-checking in `46be82dfd0` (xsize_t: check whether we lose bits, 2010-07-28) did not produce this implementation-defined behavior, though it still did not handle negative offsets. It was not until `73560c793a` (git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings, 2017-09-21) introduced the double cast that the implementation-defined behavior was triggered. 
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-19 15:00:30 +09:00
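The "compare to the relevant bounds, then cast" approach described above can be sketched as follows. The function name `sketch_off_to_size` and its return-code convention are assumptions made for illustration; the real xsize_t helper errors out rather than returning a status.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: convert an off_t to a size_t without relying on
 * implementation-defined conversions.
 * Returns 0 and stores the result in *out on success, or -1 when len is
 * negative or does not fit into a size_t.
 */
int sketch_off_to_size(off_t len, size_t *out)
{
	assert(out != NULL);
	/*
	 * len is known to be non-negative before the second comparison, so
	 * converting it to uintmax_t is value-preserving and avoids a
	 * signed/unsigned comparison warning.
	 */
	if (len < 0 || (uintmax_t)len > SIZE_MAX)
		return -1;
	*out = (size_t)len;
	return 0;
}
```

Because the range check happens before any narrowing cast, a negative offset is rejected on every platform regardless of the relative widths of off_t and size_t.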
Jeff King	ecf7b129fa	Revert "remote-curl: fall back to basic auth if Negotiate fails" This reverts commit `1b0d9545bb`. That commit does fix the situation it intended to (avoiding Negotiate even when the credentials were provided in the URL), but it creates a more serious regression: we now never hit the conditional for "we had a username and password, tried them, but the server still gave us a 401". That has two bad effects: 1. we never call credential_reject(), and thus a bogus credential stored by a helper will live on forever 2. we never return HTTP_NOAUTH, so the error message the user gets is "The requested URL returned error: 401", instead of "Authentication failed". Doing this correctly seems non-trivial, as we don't know whether the Negotiate auth was a problem. Since this is a regression in the upcoming v2.32.0 release (for which we're in -rc0), let's revert for now and work on a fix separately. (Note that this isn't a pure revert; the previous commit added a test showing the regression, so we can now flip it to expect_success). Reported-by: Ben Humphreys <behumphreys@atlassian.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-19 10:09:58 +09:00
Jeff King	b694f1e49e	t5551: test http interaction with credential helpers We test authentication with http, and we independently test that credential helpers work, but we don't have any tests that cover the two features working together. Let's add two: 1. Make sure that a successful request asks the helper to save the credential. This works as expected. 2. Make sure that a failed request asks the helper to forget the credential. This is marked as expect_failure, as it was recently regressed by `1b0d9545bb` (remote-curl: fall back to basic auth if Negotiate fails, 2021-03-22). The symptom here is that the second request should prompt the user, but doesn't. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-19 10:09:57 +09:00
Junio C Hamano	f302c1e4aa	revisions(7): clarify that most commands take a single revision range Sometimes new people are confused by how a revision "range" works, in that it is not a random collection of commits but a set of commits that are all connected to each other, and most Git commands work on a single such "range". Give an example to clarify it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-18 10:08:03 +09:00
Derrick Stolee	68142e117c	hashfile: use write_in_full() The flush() logic in csum-file.c was introduced originally by `c38138c` (git-pack-objects: write the pack files with a SHA1 csum, 2005-06-26) and a portion of the logic performs similar utility to write_in_full() in wrapper.c. The history of write_in_full() is full of moves and renames, but was originally introduced by `7230e6d` (Add write_or_die(), a helper function, 2006-08-21). The point of these sections of code is to flush a write buffer using xwrite() and report errors in the case of disk space issues or other generic input/output errors. The logic in flush() can interpret the output of write_in_full() to provide the correct error messages to users. The logic in the hashfile API has an additional set of logic to augment the progress indicator between calls to xwrite(). This was introduced by `2a128d6` (add throughput display to git-push, 2007-10-30). It seems that since the hashfile's buffer is only 8KB, these additional progress indicators might not be incredibly necessary. Instead, update the progress only when write_in_full() completes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-18 06:32:35 +09:00
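The general shape of a "write everything or report an error" loop, which is what the message above relies on, can be sketched with plain POSIX write(). This is an illustration and not the code in wrapper.c: the name `sketch_write_fully`, the choice to retry only on EINTR, and the mapping of a zero-length write to ENOSPC are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all 'count' bytes are
 * written. Returns the number of bytes written, or -1 with errno set.
 */
ssize_t sketch_write_fully(int fd, const void *buf, size_t count)
{
	const char *p = buf;
	size_t total = 0;

	while (count > 0) {
		ssize_t written = write(fd, p, count);

		if (written < 0) {
			if (errno == EINTR)
				continue; /* interrupted by a signal, try again */
			return -1;
		}
		if (written == 0) {
			/* Assumption: treat "no progress" as running out of space. */
			errno = ENOSPC;
			return -1;
		}
		p += written;
		count -= (size_t)written;
		total += (size_t)written;
	}
	return (ssize_t)total;
}
```

A caller that wants to update a progress indicator would do so once after this function returns, instead of between the individual write() calls.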
Derrick Stolee	4279cb1c6e	sparse-index: fix uninitialized jump While testing the sparse-index, I verified a test with --valgrind and it complained about an uninitialized value being used in a jump in the path_matches_pattern_list() method. The line was this one: if (*dtype == DT_UNKNOWN) In the call stack, the culprit was the initialization of the dtype variable in convert_to_sparse_rec(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-18 06:29:17 +09:00
Matheus Tavares	3d20ed27b8	parallel-checkout: send the new object_id algo field to the workers An object_id storing a SHA-1 name has some unused bytes at the end of the hash array. Since these bytes are not used, they are usually not initialized to any value either. However, at parallel_checkout.c:send_one_item() the object_id of a cache entry is copied into a buffer which is later sent to a checkout worker through a pipe write(). This makes Valgrind complain about passing uninitialized bytes to a syscall. The worker won't use these uninitialized bytes either, but the warning could confuse someone trying to debug this code; So instead of using oidcpy(), send_one_item() uses hashcpy() to only copy the used/initialized bytes of the object_id, and leave the remaining part with zeros. However, since `cf0983213c` ("hash: add an algo member to struct object_id", 2021-04-26), using hashcpy() is no longer sufficient here as it won't copy the new algo field from the object_id. Let's add and use a new function which meets both our requirements of copying all the important object_id data while still avoiding the uninitialized bytes, by padding the end of the hash array in the destination object_id. With this change, we also no longer need the destination buffer from send_one_item() to be initialized with zeros, so let's switch from xcalloc() to xmalloc() to make this clear. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-18 05:38:54 +09:00
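The idea of copying the used hash bytes and the algo field while zero-padding the unused tail can be sketched as below. The struct layout, the `SKETCH_` constants, and the function name are assumptions for illustration; they are not taken from Git's headers.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_RAWSZ 32  /* assumed: room for a 32-byte hash */
#define SKETCH_SHA1_RAWSZ 20 /* a SHA-1 name uses 20 bytes */

/* Hypothetical stand-in for 'struct object_id'. */
struct sketch_oid {
	unsigned char hash[SKETCH_MAX_RAWSZ];
	int algo;
};

/*
 * Copy only the 'rawsz' bytes the source actually uses, zero the rest of
 * the destination hash array, and carry over the algo field. The source
 * bytes beyond 'rawsz' are never read, so they may be uninitialized.
 */
void sketch_oidcpy_padded(struct sketch_oid *dst,
			  const struct sketch_oid *src, size_t rawsz)
{
	memcpy(dst->hash, src->hash, rawsz);
	memset(dst->hash + rawsz, 0, SKETCH_MAX_RAWSZ - rawsz);
	dst->algo = src->algo;
}
```

Since every byte of the destination hash array and the algo field is written, the destination does not need to be zero-initialized beforehand, which is why a plain allocation suffices in the scenario the message describes.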
Todd Zullinger	58cf6056c9	t7500: remove non-existant C_LOCALE_OUTPUT prereq The C_LOCALE_OUTPUT prerequisite was removed in `b1e079807b` (tests: remove last uses of C_LOCALE_OUTPUT, 2021-02-11), where Ævar noted: I'm not leaving the prerequisite itself in place for in-flight changes as there currently are none that introduce new tests that rely on it, and because C_LOCALE_OUTPUT is currently a noop on the master branch we likely won't have any new submissions that use it. One more use of C_LOCALE_OUTPUT did creep in with `3d1bda6b5b` (t7500: add tests for --fixup=[amend|reword] options, 2021-03-15). This causes a number of the tests to be skipped by default: ok 35 # SKIP --fixup=reword: incompatible with --all (missing C_LOCALE_OUTPUT) ok 36 # SKIP --fixup=reword: incompatible with --include (missing C_LOCALE_OUTPUT) ok 37 # SKIP --fixup=reword: incompatible with --only (missing C_LOCALE_OUTPUT) ok 38 # SKIP --fixup=reword: incompatible with --interactive (missing C_LOCALE_OUTPUT) ok 39 # SKIP --fixup=reword: incompatible with --patch (missing C_LOCALE_OUTPUT) Remove the C_LOCALE_OUTPUT prerequisite from these tests so they are not skipped. Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-18 04:48:30 +09:00
Wolfgang Müller	e2c5993744	rev-parse: mark die() messages for translation These error messages are intended for the user. Let's touch them up since we're here from the previous commit. Signed-off-by: Wolfgang Müller <wolf@oriole.systems> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-17 18:39:53 +09:00
Wolfgang Müller	99fc555188	rev-parse: fix segfault with missing --path-format argument Calling "git rev-parse --path-format" without an argument segfaults instead of giving an error message. Commit `fac60b8925` (rev-parse: add option for absolute or relative path formatting, 2020-12-13) added the argument parsing code but forgot to handle NULL. Returning an error makes sense here because there is no default value we could use. Add a test case to verify. Signed-off-by: Wolfgang Müller <wolf@oriole.systems> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-17 18:39:29 +09:00
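The class of bug described above, using an option's value without checking that one was supplied, can be illustrated with a small sketch. The helper names, the enum, and the error texts are assumptions and do not reproduce the code in builtin/rev-parse.c; only the option name and its "absolute"/"relative" values come from the message.

```c
#include <stdio.h>
#include <string.h>

enum sketch_path_format {
	SKETCH_FORMAT_ERROR = -1,
	SKETCH_FORMAT_ABSOLUTE,
	SKETCH_FORMAT_RELATIVE
};

/* Return the text after "<name>=", or NULL when no "=value" part is present. */
static const char *sketch_option_value(const char *arg, const char *name)
{
	size_t len = strlen(name);

	if (strncmp(arg, name, len) != 0 || arg[len] != '=')
		return NULL;
	return arg + len + 1;
}

/* Parse "--path-format=<value>", reporting an error instead of dereferencing NULL. */
enum sketch_path_format sketch_parse_path_format(const char *arg)
{
	const char *value = sketch_option_value(arg, "--path-format");

	if (!value) {
		/* There is no sensible default, so a missing value is an error. */
		fprintf(stderr, "--path-format requires an argument\n");
		return SKETCH_FORMAT_ERROR;
	}
	if (!strcmp(value, "absolute"))
		return SKETCH_FORMAT_ABSOLUTE;
	if (!strcmp(value, "relative"))
		return SKETCH_FORMAT_RELATIVE;
	fprintf(stderr, "unknown argument to --path-format: %s\n", value);
	return SKETCH_FORMAT_ERROR;
}
```

The NULL check comes before any strcmp() call, which is the step whose absence would lead to a crash in a parser shaped like this one.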
Anders Höckersten	bfe35a6165	describe-doc: clarify default length of abbreviation Clarify the default length used for the abbreviated form used for commits in git describe. The behavior was modified in Git 2.11.0, but the documentation was not updated to clarify the new behavior. Signed-off-by: Anders Höckersten <anders@hockersten.se> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-17 15:56:29 +09:00
edef	72ee47ceeb	mailinfo: don't discard names under 3 characters I sometimes receive patches from people with short mononyms, and in my cultural environment these are not uncommon. To my dismay, git-am currently discards their names, and replaces them with their email addresses. Link: https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/ Signed-off-by: edef <edef@edef.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-17 07:35:43 +09:00
Alex Henrie	f5f5a61d5a	submodule: use the imperative mood to describe the --files option Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-17 07:31:40 +09:00
Alex Henrie	4901884a23	stash: don't translate literal commands Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-17 07:21:04 +09:00
Gregory Anders	cd5b33fbdc	git-send-email: add option to specify sendmail command The sendemail.smtpServer configuration option and --smtp-server command line option both support using a sendmail-like program to send emails by specifying an absolute file path. However, this is not ideal for the following reasons: 1. It overloads the meaning of smtpServer (now a program is being used for the server?) 2. It doesn't allow for non-absolute paths, arguments, or arbitrary scripting Requiring an absolute path is bad for portability, as the same program may be in different locations on different systems. If a user wishes to pass arguments to their program, they have to use the smtpServerOption option, which is cumbersome (as it must be repeated for each option) and doesn't adhere to normal git conventions. Introduce a new configuration option sendemail.sendmailCmd as well as a command line option --sendmail-cmd that can be used to specify a command (with or without arguments) or shell expression to run to send email. The name of this option is consistent with --to-cmd and --cc-cmd. This invocation honors the user's $PATH so that absolute paths are not necessary. Arbitrary shell expressions are also supported, allowing users to do basic scripting. Give this option a higher precedence over --smtp-server and sendemail.smtpServer, as the new interface is more flexible. For backward compatibility, continue to support absolute paths in --smtp-server and sendemail.smtpServer. Signed-off-by: Gregory Anders <greg@gpanders.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-17 07:06:13 +09:00
Junio C Hamano	09c0ee21fe	Sync with Git 2.32-rc0	2021-05-16 21:06:09 +09:00
Junio C Hamano	48fd717185	Merge branch 'zh/ref-filter-atom-type' into next The code to handle the "--format" option in "for-each-ref" and friends made too many string comparisons on %(atom)s used in the format string, which has been corrected by converting them into enum when the format string is parsed. * zh/ref-filter-atom-type: ref-filter: introduce enum atom_type ref-filter: add objectsize to used_atom	2021-05-16 21:05:50 +09:00
Junio C Hamano	2f4f4454ea	Merge branch 'jk/test-chainlint-softer' into next The "chainlint" feature in the test framework is a handy way to catch common mistakes in writing new tests, but tends to get expensive. A knob to selectively disable it has been introduced to help running tests that the developer has not modified. * jk/test-chainlint-softer: t: avoid sed-based chain-linting in some expensive cases	2021-05-16 21:05:50 +09:00
Junio C Hamano	4467131ca3	Merge branch 'en/prompt-under-set-u' into next The bash prompt script (in contrib/) did not work under "set -u". * en/prompt-under-set-u: git-prompt: work under set -u	2021-05-16 21:05:50 +09:00
Junio C Hamano	d90e8df2b7	Merge branch 'zh/ref-filter-push-remote-fix' into next The handling of "%(push)" formatting element of "for-each-ref" and friends was broken when the same codepath started handling "%(push:<what>)", which has been corrected. * zh/ref-filter-push-remote-fix: ref-filter: fix read invalid union member bug	2021-05-16 21:05:50 +09:00
Junio C Hamano	bf949ade81	Git 2.32-rc0 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-16 21:05:24 +09:00
Junio C Hamano	e004fd6b69	Merge branch 'ls/typofix' * ls/typofix: pretty: fix a typo in the documentation for %(trailers)	2021-05-16 21:05:24 +09:00
Junio C Hamano	a8a2491e62	Merge branch 'dl/stash-show-untracked-fixup' The code to handle options recently added to "git stash show" around untracked part of the stash segfaulted when these options were used on a stash entry that does not record untracked part. * dl/stash-show-untracked-fixup: stash show: fix segfault with --{include,only}-untracked t3905: correct test title	2021-05-16 21:05:24 +09:00
Junio C Hamano	16f91451fa	Merge branch 'wc/packed-ref-removal-cleanup' When "git update-ref -d" removes a ref that is packed, it left empty directories under $GIT_DIR/refs/ for * wc/packed-ref-removal-cleanup: refs: cleanup directories when deleting packed ref	2021-05-16 21:05:24 +09:00
Junio C Hamano	94294e92e1	Merge branch 'lh/maintenance-leakfix' * lh/maintenance-leakfix: maintenance: fix two memory leaks	2021-05-16 21:05:24 +09:00
Junio C Hamano	caf6840be0	Merge branch 'ma/typofixes' A couple of trivial typofixes. * ma/typofixes: pretty-formats.txt: add missing space git-repack.txt: remove spurious ")"	2021-05-16 21:05:24 +09:00
Junio C Hamano	c7c7c460f8	Merge branch 'ah/merge-ort-i18n' An i18n fix. * ah/merge-ort-i18n: merge-ort: split "distinct types" message into two translatable messages	2021-05-16 21:05:23 +09:00
Junio C Hamano	483932a3d8	Merge branch 'dd/mailinfo-quoted-cr' "git mailinfo" (hence "git am") learned the "--quoted-cr" option to control how lines ending with CRLF wrapped in base64 or qp are handled. * dd/mailinfo-quoted-cr: am: learn to process quoted lines that ends with CRLF mailinfo: allow stripping quoted CR without warning mailinfo: allow squelching quoted CRLF warning mailinfo: warn if CRLF found in decoded base64/QP email mailinfo: stop parsing options manually mailinfo: load default metainfo_charset lazily	2021-05-16 21:05:23 +09:00
Junio C Hamano	c8e34a7ac2	Merge branch 'ab/sparse-index-cleanup' Code clean-up. * ab/sparse-index-cleanup: sparse-index.c: remove set_index_sparse_config()	2021-05-16 21:05:23 +09:00
Junio C Hamano	502a67891c	Merge branch 'ab/streaming-simplify' Code clean-up. * ab/streaming-simplify: streaming.c: move {open,close,read} from vtable to "struct git_istream" streaming.c: stop passing around "object_info *" to open() streaming.c: remove {open,close,read}_method_decl() macros streaming.c: remove enum/function/vtbl indirection streaming.c: avoid forward declarations	2021-05-16 21:05:23 +09:00
Junio C Hamano	a737e1f1d2	Merge branch 'mt/parallel-checkout-part-3' The final part of "parallel checkout". * mt/parallel-checkout-part-3: ci: run test round with parallel-checkout enabled parallel-checkout: add tests related to .gitattributes t0028: extract encoding helpers to lib-encoding.sh parallel-checkout: add tests related to path collisions parallel-checkout: add tests for basic operations checkout-index: add parallel checkout support builtin/checkout.c: complete parallel checkout support make_transient_cache_entry(): optionally alloc from mem_pool	2021-05-16 21:05:23 +09:00
Junio C Hamano	644f4a2046	Merge branch 'jt/push-negotiation' "git push" learns to discover common ancestor with the receiving end over protocol v2. * jt/push-negotiation: send-pack: support push negotiation fetch: teach independent negotiation (no packfile) fetch-pack: refactor command and capability write fetch-pack: refactor add_haves() fetch-pack: refactor process_acks()	2021-05-16 21:05:22 +09:00
Alex Henrie	a30e43f61a	merge: don't translate literal commands These strings have not been modified in any translation, nor should they be. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-16 13:00:28 +09:00
Junio C Hamano	46aad6cb9e	Sync with master	2021-05-14 08:30:47 +09:00
Junio C Hamano	6d6a81717a	Merge branch 'ew/sha256-clone-remote-curl-fix' into next "git clone" from SHA256 repository by Git built with SHA-1 as the default hash algorithm over the dumb HTTP protocol did not correctly set up the resulting repository, which has been corrected. * ew/sha256-clone-remote-curl-fix: remote-curl: fix clone on sha256 repos	2021-05-14 08:30:32 +09:00
Junio C Hamano	316f9264c1	Merge branch 'en/dir-traversal' into next "git clean" and "git ls-files -i" had confusion around working on or showing ignored paths inside an ignored directory, which has been corrected. * en/dir-traversal: dir: introduce readdir_skip_dot_and_dotdot() helper dir: update stale description of treat_directory() dir: traverse into untracked directories if they may have ignored subfiles dir: avoid unnecessary traversal into ignored directory t3001, t7300: add testcase showcasing missed directory traversal t7300: add testcase showing unnecessary traversal into ignored directory ls-files: error out on -i unless -o or -c are specified dir: report number of visited directories and paths with trace2 dir: convert trace calls to trace2 equivalents	2021-05-14 08:30:32 +09:00
Junio C Hamano	97eea85a0a	The seventeenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-14 08:26:11 +09:00
Junio C Hamano	52371bf449	Merge branch 'mt/clean-clean' Code clean-up. * mt/clean-clean: clean: remove unnecessary variable	2021-05-14 08:26:11 +09:00
Junio C Hamano	47fa106617	Merge branch 'ow/no-dryrun-in-add-i' "git add -i --dry-run" does not dry-run, which was surprising. The combination of options has been taught to error out. * ow/no-dryrun-in-add-i: add: die if both --dry-run and --interactive are given	2021-05-14 08:26:09 +09:00
Junio C Hamano	e289f681ed	Merge branch 'jk/p4-locate-branch-point-optim' "git p4" learned to find branch points more efficiently. * jk/p4-locate-branch-point-optim: git-p4: speed up search for branch parent git-p4: ensure complex branches are cloned correctly	2021-05-14 08:26:08 +09:00
Junio C Hamano	eede71149e	Merge branch 'ba/object-info' Over-the-wire protocol learns a new request type to ask for object sizes given a list of object names. * ba/object-info: object-info: support for retrieving object info	2021-05-14 08:26:08 +09:00
Junio C Hamano	daffa8961b	Merge branch 'pw/patience-diff-clean-up' Code clean-up. * pw/patience-diff-clean-up: patience diff: remove unused variable patience diff: remove unnecessary string comparisons	2021-05-14 08:26:08 +09:00
Junio C Hamano	65c18913de	Merge branch 'pw/word-diff-zero-width-matches' The word-diff mode has been taught to work better with a word regexp that can match an empty string. * pw/word-diff-zero-width-matches: word diff: handle zero length matches	2021-05-14 08:26:06 +09:00
ZheNing Hu	1197f1a463	ref-filter: introduce enum atom_type The original ref-filter design copies the parsed atom's name and attributes to `used_atom[i].name` during the atom parsing step, and uses them again for string matching in the later step that fills in specific ref attributes. It uses a lot of string matching to determine which atom we need. Introduce the enum "atom_type"; each enum value is named `ATOM_*` and is the index of the corresponding valid_atom entry. In the first step of the atom parsing, `used_atom.atom_type` records the corresponding enum value from the valid_atom entry index; then, in the specific reference attribute filling step, we only need to compare the value of `used_atom[i].atom_type` to check the atom type. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-14 06:37:28 +09:00
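The technique of resolving an atom name to an enum value once, and then comparing integers afterwards, can be sketched as follows. The two atoms, the `SKETCH_` names, and the table layout are assumptions for illustration and are far smaller than the real valid_atom table.

```c
#include <string.h>

/* Hypothetical atom types; each value doubles as an index into the name table. */
enum sketch_atom_type {
	SKETCH_ATOM_REFNAME,
	SKETCH_ATOM_OBJECTSIZE,
	SKETCH_ATOM_UNKNOWN /* also serves as the number of valid entries */
};

static const char *const sketch_valid_atom[] = {
	[SKETCH_ATOM_REFNAME] = "refname",
	[SKETCH_ATOM_OBJECTSIZE] = "objectsize",
};

/* Parsing step: the only place where strings are compared. */
enum sketch_atom_type sketch_parse_atom(const char *name)
{
	for (int i = 0; i < SKETCH_ATOM_UNKNOWN; i++)
		if (!strcmp(name, sketch_valid_atom[i]))
			return (enum sketch_atom_type)i;
	return SKETCH_ATOM_UNKNOWN;
}

/* Filling step: dispatch on the stored enum value, no string matching needed. */
const char *sketch_describe_atom(enum sketch_atom_type type)
{
	switch (type) {
	case SKETCH_ATOM_REFNAME:
		return "fill in the ref name";
	case SKETCH_ATOM_OBJECTSIZE:
		return "fill in the object size";
	default:
		return "unknown atom";
	}
}
```

With this shape, the cost of string matching is paid once per atom in the format string rather than once per atom per ref.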
ZheNing Hu	0caf20f228	ref-filter: add objectsize to used_atom When the support for "objectsize:disk" was bolted onto the existing support for "objectsize", it didn't follow the usual pattern for handling "atomtype:modifier", which reads the <modifier> part just once while parsing the format string, and stores the parsed result in the union in the used_atom structure, so that the string form of it does not have to be parsed over and over at runtime (e.g. in grab_common_values()). Add a new member `objectsize` to the union `used_atom.u`, so that we can separate the check of <modifier> from the check of <atomtype>; this will bring scalability to atom `%(objectsize)`. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-14 06:37:27 +09:00
Jeff King	2d86a96220	t: avoid sed-based chain-linting in some expensive cases Commit `878f988350` (t/test-lib: teach --chain-lint to detect broken &&-chains in subshells, 2018-07-11) introduced additional chain-lint tests which add an extra "sed" pipeline to each test we run. This has a measurable impact on runtime. Here are timings with and without a new environment variable (added by this patch) that lets you disable just the additional sed-based chain-lint tests: Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 make test Time (mean ± σ): 64.202 s ± 1.030 s [User: 622.469 s, System: 301.402 s] Range (min … max): 61.571 s … 65.662 s 10 runs Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 make test Time (mean ± σ): 57.591 s ± 0.333 s [User: 529.368 s, System: 270.618 s] Range (min … max): 57.143 s … 58.309 s 10 runs Summary 'GIT_TEST_CHAIN_LINT_HARDER=0 make test' ran 1.11 ± 0.02 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 make test' Of course those extra lint checks are doing something useful, so paying a few extra seconds (at least on Linux) isn't so bad (though note the CPU time; we're bounded in our parallel run here by the slowest test, so it really is ~120s of CPU improvement). But we can observe that there are some test scripts where they produce a much stronger effect, and provide less value. In t0027 and t3070 we run a very large number of small tests, all driven by a series of functions/loops which are filling in the test bodies. There we get much less bang for our buck in terms of bug-finding versus CPU cost. This patch introduces a mechanism for controlling when those extra lint checks are run, at two levels: - a user can ask to disable or to force-enable the checks by setting GIT_TEST_CHAIN_LINT_HARDER - if the user hasn't specified a preference, individual scripts can disable the checks by setting GIT_TEST_CHAIN_LINT_HARDER_DEFAULT; scripts which don't set that get the current behavior of enabling them. 
In addition, this patch flips the default for t0027 and t3070's mass-generated sections to disable the extra checks. Here are the timing results for t0027: Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh Time (mean ± σ): 17.078 s ± 0.848 s [User: 14.878 s, System: 7.075 s] Range (min … max): 15.952 s … 18.421 s 10 runs Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh Time (mean ± σ): 9.063 s ± 0.759 s [User: 7.890 s, System: 3.362 s] Range (min … max): 7.747 s … 10.619 s 10 runs Benchmark #3: ./t0027-auto-crlf.sh Time (mean ± σ): 9.186 s ± 0.881 s [User: 7.957 s, System: 3.427 s] Range (min … max): 7.796 s … 10.498 s 10 runs Summary 'GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh' ran 1.01 ± 0.13 times faster than './t0027-auto-crlf.sh' 1.88 ± 0.18 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh' We can see that disabling the checks for the whole script buys us an almost 2x speedup. But the new default behavior, disabling them only for the mass-generated part, gets us most of that speedup (but still leaves the checks on for further manual tests people might write). As a side note, I'd caution about comparing runtimes and CPU seconds between this timing and the earlier "make test" one. In "make test", we're running a lot of scripts in parallel, so the CPU is throttling down (and thus a CPU second saved here would count for more during a parallel run; the same work takes more CPU seconds there). 
We get similar results for t3070: Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh Time (mean ± σ): 20.054 s ± 3.967 s [User: 16.003 s, System: 8.286 s] Range (min … max): 11.891 s … 23.671 s 10 runs Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh Time (mean ± σ): 12.399 s ± 2.256 s [User: 7.542 s, System: 5.342 s] Range (min … max): 9.606 s … 15.727 s 10 runs Benchmark #3: ./t3070-wildmatch.sh Time (mean ± σ): 10.726 s ± 3.476 s [User: 6.790 s, System: 4.365 s] Range (min … max): 5.444 s … 15.376 s 10 runs Summary './t3070-wildmatch.sh' ran 1.16 ± 0.43 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh' 1.87 ± 0.71 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh' Again, we get almost a 2x speedup disabling these. In this case, there are no tests not covered by the script's "default to disable" behavior, so the second two benchmarks should be the same (and while they do differ, you can see the variance is quite high but they're within one standard deviation). So it seems like for these two scripts, at least, disabling the extra checks is a reasonable tradeoff. Sadly, the overall runtime of "make test" on my system doesn't get much faster. But that's because we're mostly limited by the cost of the single biggest test. Here are the top-5 tests by wall-clock time from a parallel run, before my patch: 57.9192368984222 t9001-send-email.sh 45.6329638957977 t0027-auto-crlf.sh 32.5278220176697 t3070-wildmatch.sh 22.2701289653778 t7610-mergetool.sh 20.8635759353638 t1701-racy-split-index.sh And after: 57.1476998329163 t9001-send-email.sh 33.776211977005 t0027-auto-crlf.sh 21.3116669654846 t7610-mergetool.sh 20.7748689651489 t1701-racy-split-index.sh 19.6957249641418 t7112-reset-submodule.sh We dropped 12s from t0027, and t3070 dropped off our list entirely at around 16s. In both cases we're bound by t9001, but its slowness is due to the actual tests, so we'll have to deal with it in a different way. 
But this reduces overall CPU, and means that dealing with t9001 (by improving the speed of send-email or splitting it apart) will let us reduce our overall runtime even on multi-core machines. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-13 15:50:44 +09:00
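The two-level control described in the message above (a user preference that wins when set, then a per-script default, then the existing behavior of enabling the checks) can be modeled as a small decision function. The actual mechanism lives in the shell test library; this C sketch only models the precedence, and the function name and the convention that "0" disables while an unset or empty value falls through are assumptions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the precedence between the user's
 * GIT_TEST_CHAIN_LINT_HARDER value and a script's
 * GIT_TEST_CHAIN_LINT_HARDER_DEFAULT value.
 * Pass NULL (or "") for a variable that is not set.
 * Returns 1 when the extra checks should run, 0 otherwise.
 */
int sketch_chain_lint_harder(const char *user_setting,
			     const char *script_default)
{
	const char *effective = "1"; /* scripts that set nothing keep the checks */

	if (script_default && *script_default)
		effective = script_default;
	if (user_setting && *user_setting)
		effective = user_setting; /* an explicit user choice always wins */

	return strcmp(effective, "0") != 0;
}
```

Under these assumptions, a script that sets its default to "0" skips the checks unless the user explicitly sets the variable to a non-zero value.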

63370 Commits