mirror/git - git - git.dotya.ml

mirror/ git

mirror of https://github.com/git/git.git synced 2024-06-05 02:46:14 +02:00

Author	SHA1	Message	Date
Jeff King	67985e4e4a	refs: drop "broken" flag from for_each_fullref_in() No callers pass in anything but "0" here. Likewise to our sibling functions. Note that some of them ferry along the flag, but none of their callers pass anything but "0" either. Nor is anybody likely to change that. Callers which really want to see all of the raw refs use for_each_rawref(). And anybody interested in iterating a subset of the refs will likely be happy to use the now-default behavior of showing broken refs, but omitting dangling symlinks. So we can get rid of this whole feature. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	5d1f5b8cd4	repack, prune: drop GIT_REF_PARANOIA settings Now that GIT_REF_PARANOIA is the default, we don't need to selectively enable it for destructive operations. In fact, it's harmful to do so, because it overrides any GIT_REF_PARANOIA=0 setting that the user may have provided (because they're trying to work around some corruption). With these uses gone, we can further clean up the ref_paranoia global, and make it a static variable inside the refs code. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	968f12fdac	refs: turn on GIT_REF_PARANOIA by default The original point of the GIT_REF_PARANOIA flag was to include broken refs in iterations, so that possibly-destructive operations would not silently ignore them (and would generally instead try to operate on the oids and fail when the objects could not be accessed). We already turned this on by default for some dangerous operations, like "repack -ad" (where missing a reachability tip would mean dropping the associated history). But it was not on for general use, even though it could easily result in the spreading of corruption (e.g., imagine cloning a repository which simply omits some of its refs because their objects are missing; the result quietly succeeds even though you did not clone everything!). This patch turns on GIT_REF_PARANOIA by default. So a clone as mentioned above would actually fail (upload-pack tells us about the broken ref, and when we ask for the objects, pack-objects fails to deliver them). This may be inconvenient when working with a corrupted repository, but: - we are better off to err on the side of complaining about corruption, and then provide mechanisms for explicitly loosening safety. - this is only one type of corruption anyway. If we are missing any other objects in the history that _aren't_ ref tips, then we'd behave similarly (happily show the ref, but then barf when we started traversing). We retain the GIT_REF_PARANOIA variable, but simply default it to "1" instead of "0". That gives the user an escape hatch for loosening this when working with a corrupt repository. It won't work across a remote connection to upload-pack (because we can't necessarily set environment variables on the remote), but there the client has other options (e.g., choosing which refs to fetch). As a bonus, this also makes ref iteration faster in general (because we don't have to call has_object_file() for each ref), though probably not noticeably so in the general case. In a repo with a million refs, it shaved a few hundred milliseconds off of upload-pack's advertisement; that's noticeable, but most repos are not nearly that large. The possible downside here is that any operation which iterates refs but doesn't ever open their objects may now quietly claim to have X when the object is corrupted (e.g., "git rev-list new-branch --not --all" will treat a broken ref as uninteresting). But again, that's not really any different than corruption below the ref level. We might have refs/heads/old-branch as non-corrupt, but we are not actively checking that we have the entire reachable history. Or the pointed-to object could even be corrupted on-disk (but our "do we have it" check would still succeed). In that sense, this is merely bringing ref-corruption in line with general object corruption. One alternative implementation would be to actually check for broken refs, and then _immediately die_ if we see any. That would cause the "rev-list --not --all" case above to abort immediately. But in many ways that's the worst of all worlds: - it still spends time looking up the objects an extra time - it still doesn't catch corruption below the ref level - it's even more inconvenient; with the current implementation of GIT_REF_PARANOIA for something like upload-pack, we can make the advertisement and let the client choose a non-broken piece of history. If we bail as soon as we see a broken ref, they cannot even see the advertisement. The test changes here show some of the fallout. A non-destructive "git repack -adk" now fails by default (but we can override it). Deleting a broken ref now actually tells the hooks the correct "before" state, rather than a confusing null oid. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	6d751be4b6	refs: omit dangling symrefs when using GIT_REF_PARANOIA Dangling symrefs aren't actually a corruption problem. It's perfectly fine for refs/remotes/origin/HEAD to point to an unborn branch. And in particular, if you are trying to establish reachability, a symref that points nowhere doesn't matter either way. Any ref it could point to will be examined during the rest of the traversal. It's possible that a symref pointing nowhere _could_ be a sign that the ref it was meant to point to was deleted accidentally (e.g., via corruption). But there is no particular reason to think that is true for any given case, and in the meantime, GIT_REF_PARANOIA kicking in automatically for some operations means they'll fail unnecessarily. So let's loosen it just a bit. The new test in t5312 shows off an example that is safe, but currently fails (and no longer does after this patch). Note that we don't do anything if the caller explicitly asked for DO_FOR_EACH_INCLUDE_BROKEN. In that case they may be looking for dangling symrefs themselves, and setting GIT_REF_PARANOIA should not _loosen_ things from what the caller asked for. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	9aab952e85	refs-internal.h: reorganize DO_FOR_EACH_* flag documentation The documentation for the DO_FOR_EACH_* flags is sprinkled over the refs-internal.h file. We define the two flags in one spot, and then describe them in more detail far away from there, in the definitions of refs_ref_iterator_begin() and ref_iterator_advance_fn(). Let's try to organize this a bit better: - convert the #defines to an enum. This makes it clear that they are related, and that the enum shows the complete set of flags. - combine all descriptions for each flag in a single spot, next to the flag's definition - use the enum rather than a bare int for functions which take the flags. This helps readers realize which flags can be used. - clarify the mention of flags for ref_iterator_advance_fn(). It does not take flags itself, but is meant to depend on ones set up earlier. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Junio C Hamano	f0ade787ac	Merge branch 'hn/refs-iterator-peel-returns-boolean' Tiny API tweak. * hn/refs-iterator-peel-returns-boolean: refs: make explicit that ref_iterator_peel returns boolean	2021-07-16 17:42:49 -07:00
Han-Wen Nienhuys	617480d75b	refs: make explicit that ref_iterator_peel returns boolean Use -1 as error return value throughout. This removes spurious differences in the GIT_TRACE_REFS output, depending on the ref storage backend active. Before, the cached ref_iterator (but only that iterator!) would return peel_object() output directly. No callers relied on the peel_status values beyond success/failure. All calls to these functions go through peel_iterated_oid(), which returns peel_object() as a fallback, but also squashing the error values. The iteration interface already passes REF_ISSYMREF and REF_ISBROKEN through the flags argument, so the additional error values in enum peel_status provide no value. The ref iteration interface provides a separate peel() function because certain formats (eg. packed-refs and reftable) can store the peeled object next to the tag SHA1. Passing the peeled SHA1 as an optional argument to each_ref_fn maps more naturally to the implementation of ref databases. Changing the code in this way is left for a future refactoring. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-20 07:54:12 +09:00
Junio C Hamano	aaa3c8065d	Merge branch 'bc/hash-transition-interop-part-1' SHA-256 transition. * bc/hash-transition-interop-part-1: hex: print objects using the hash algorithm member hex: default to the_hash_algo on zero algorithm value builtin/pack-objects: avoid using struct object_id for pack hash commit-graph: don't store file hashes as struct object_id builtin/show-index: set the algorithm for object IDs hash: provide per-algorithm null OIDs hash: set, copy, and use algo field in struct object_id builtin/pack-redundant: avoid casting buffers to struct object_id Use the final_oid_fn to finalize hashing of object IDs hash: add a function to finalize object IDs http-push: set algorithm when reading object ID Always use oidread to read into struct object_id hash: add an algo member to struct object_id	2021-05-10 16:59:46 +09:00
brian m. carlson	14228447c9	hash: provide per-algorithm null OIDs Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
Jeff King	45a187cc34	lookup_unknown_object(): take a repository argument All of the other lookup_foo() functions take a repository argument, but lookup_unknown_object() was never converted, and it uses the_repository internally. Let's fix that. We could leave a wrapper that uses the_repository, but there aren't that many calls, so we'll just convert them all. I looked briefly at each site to see if we had a repository struct (besides the_repository) we could pass, but none of them do (so this conversion to pass the_repository is a pure noop in each case, though it does take us one step closer to eventually getting rid of the_repository). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 13:18:46 -07:00
René Scharfe	ca56dadb4b	use CALLOC_ARRAY Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 16:00:09 -08:00
Junio C Hamano	6254fa1359	Merge branch 'tb/ls-refs-optim' The ls-refs protocol operation has been optimized to narrow the sub-hierarchy of refs/ it walks to produce response. * tb/ls-refs-optim: ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets ls-refs.c: initialize 'prefixes' before using it refs: expose 'for_each_fullref_in_prefixes'	2021-02-05 16:40:45 -08:00
Junio C Hamano	973e20b83f	Merge branch 'jk/peel-iterated-oid' The peel_ref() API has been replaced with peel_iterated_oid(). * jk/peel-iterated-oid: refs: switch peel_ref() to peel_iterated_oid()	2021-02-03 15:04:49 -08:00
Taylor Blau	16b1985be5	refs: expose 'for_each_fullref_in_prefixes' This function was used in the ref-filter.c code to find the longest common prefix of among a set of refspecs, and then to iterate all of the references that descend from that prefix. A future patch will want to use that same code from ls-refs.c, so prepare by exposing and moving it to refs.c. Since there is nothing specific to the ref-filter code here (other than that it was previously the only caller of this function), this really belongs in the more generic refs.h header. The code moved in this patch is identical before and after, with the one exception of renaming some arguments to be consistent with other functions exposed in refs.h. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 18:57:27 -08:00
Jeff King	36a317929b	refs: switch peel_ref() to peel_iterated_oid() The peel_ref() interface is confusing and error-prone: - it's typically used by ref iteration callbacks that have both a refname and oid. But since they pass only the refname, we may load the ref value from the filesystem again. This is inefficient, but also means we are open to a race if somebody simultaneously updates the ref. E.g., this: int some_ref_cb(const char refname, const struct object_id oid, ...) { if (!peel_ref(refname, &peeled)) printf("%s peels to %s", oid_to_hex(oid), oid_to_hex(&peeled); } could print nonsense. It is correct to say "refname peels to..." (you may see the "before" value or the "after" value, either of which is consistent), but mentioning both oids may be mixing before/after values. Worse, whether this is possible depends on whether the optimization to read from the current iterator value kicks in. So it is actually not possible with: for_each_ref(some_ref_cb); but it _is_ possible with: head_ref(some_ref_cb); which does not use the iterator mechanism (though in practice, HEAD should never peel to anything, so this may not be triggerable). - it must take a fully-qualified refname for the read_ref_full() code path to work. Yet we routinely pass it partial refnames from callbacks to for_each_tag_ref(), etc. This happens to work when iterating because there we do not call read_ref_full() at all, and only use the passed refname to check if it is the same as the iterator. But the requirements for the function parameters are quite unclear. Instead of taking a refname, let's instead take an oid. That fixes both problems. It's a little funny for a "ref" function not to involve refs at all. The key thing is that it's optimizing under the hood based on having access to the ref iterator. So let's change the name to make it clear why you'd want this function versus just peel_object(). There are two other directions I considered but rejected: - we could pass the peel information into the each_ref_fn callback. However, we don't know if the caller actually wants it or not. For packed-refs, providing it is essentially free. But for loose refs, we actually have to peel the object, which would be wasteful in most cases. We could likewise pass in a flag to the callback indicating whether the peeled information is known, but that complicates those callbacks, as they then have to decide whether to manually peel themselves. Plus it requires changing the interface of every callback, whether they care about peeling or not, and there are many of them. - we could make a function to return the peeled value of the current iterated ref (computing it if necessary), and BUG() otherwise. I.e.: int peel_current_iterated_ref(struct object_id *out); Each of the current callers is an each_ref_fn callback, so they'd mostly be happy. But: - we use those callbacks with functions like head_ref(), which do not use the iteration code. So we'd need to handle the fallback case there, anyway. - it's possible that a caller would want to call into generic code that sometimes is used during iteration and sometimes not. This encapsulates the logic to do the fast thing when possible, and fallback when necessary. The implementation is mostly obvious, but I want to call out a few things in the patch: - the test-tool coverage for peel_ref() is now meaningless, as it all collapses to a single peel_object() call (arguably they were pretty uninteresting before; the tricky part of that function is the fast-path we see during iteration, but these calls didn't trigger that). I've just dropped it entirely, though note that some other tests relied on the tags we created; I've moved that creation to the tests where it matters. - we no longer need to take a ref_store parameter, since we'd never look up a ref now. We do still rely on a global "current iterator" variable which _could_ be kept per-ref-store. But in practice this is only useful if there are multiple recursive iterations, at which point the more appropriate solution is probably a stack of iterators. No caller used the actual ref-store parameter anyway (they all call the wrapper that passes the_repository). - the original only kicked in the optimization when the "refname" pointer matched (i.e., not string comparison). We do likewise with the "oid" parameter here, but fall back to doing an actual oideq() call. This in theory lets us kick in the optimization more often, though in practice no current caller cares. It should never be wrong, though (peeling is a property of an object, so two refs pointing to the same object would peel identically). - the original took care not to touch the peeled out-parameter unless we found something to put in it. But no caller cares about this, and anyway, it is enforced by peel_object() itself (and even in the optimized iterator case, that's where we eventually end up). We can shorten the code and avoid an extra copy by just passing the out-parameter through the stack. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 15:51:31 -08:00
Denton Liu	6436a20284	refs: allow @{n} to work with n-sized reflog This sequence works $ git checkout -b newbranch $ git commit --allow-empty -m one $ git show -s newbranch@{1} and shows the state that was immediately after the newbranch was created. But then if you do $ git reflog expire --expire=now refs/heads/newbranch $ git commit --allow=empty -m two $ git show -s newbranch@{1} you'd be scolded with fatal: log for 'newbranch' only has 1 entries While it is true that it has only 1 entry, we have enough information in that single entry that records the transition between the state in which the tip of the branch was pointing at commit 'one' to the new commit 'two' built on it, so we should be able to answer "what object newbranch was pointing at?". But we refuse to do so. Make @{0} the special case where we use the new side to look up that entry. Otherwise, look up @{n} using the old side of the (n-1)th entry of the reflog. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-11 14:13:50 -08:00
Denton Liu	95c2a71820	refs: factor out set_read_ref_cutoffs() This block of code is duplicated twice. In a future commit, it will be duplicated for a third time. Factor out the common functionality into set_read_ref_cutoffs(). In the case of read_ref_at_ent(), we are incrementing `cb->reccnt` at the beginning of the function. Move these to right before the return so that the `cb->reccnt - 1` is changed to `cb->reccnt` and it can be cleanly factored out into set_read_ref_cutoffs(). The duplication of the increment statements will be removed in a future patch. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-10 12:24:00 -08:00
Johannes Schindelin	675704c74d	init: provide useful advice about init.defaultBranch To give ample warning for users wishing to override Git's the fall-back for an unconfigured `init.defaultBranch` (in case we decide to change it in a future Git version), let's introduce some advice that is shown upon `git init` when that value is not set. Note: two test cases in Git's test suite want to verify that the `stderr` output of `git init` is empty. It is now necessary to suppress the advice, we now do that via the `init.defaultBranch` setting. While not strictly necessary, we also set this to `false` in `test_create_repo()`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 15:53:51 -08:00
Johannes Schindelin	cc0f13c57d	get_default_branch_name(): prepare for showing some advice We are about to introduce a message giving users running `git init` some advice about `init.defaultBranch`. This will necessarily be done in `repo_default_branch_name()`. Not all code paths want to show that advice, though. In particular, the `git clone` codepath _specifically_ asks for `init_db()` to be quiet, via the `INIT_DB_QUIET` flag. In preparation for showing users above-mentioned advice, let's change the function signature of `get_default_branch_name()` to accept the parameter `quiet`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 15:53:50 -08:00
Johannes Schindelin	704fed9ea2	tests: start moving to a different default main branch name To allow for an incremental conversion to a new default main branch name, let's introduce `GIT_TEST_DEFAULT_MAIN_BRANCH_NAME`. This environment variable can be set at the top of each converted test script, overriding the default main branch name to use when initializing new repositories (or cloning empty repositories). Note: the `GIT_TEST_DEFAULT_MAIN_BRANCH_NAME` is _not_ intended to be used manually; many tests require a specific main branch name and cannot simply work with another one. This `GIT_TEST_*` variable is meant purely for the transitional period while the entire test suite is converted to use `main` as the initial branch name by default. We also introduce the `PREPARE_FOR_MAIN_BRANCH` prereq that determines whether the default main branch name is `main`, and adjust a couple of test functions to use it. This prereq will be used to temporarily disable a couple test cases to allow for adjusting the test script incrementally. Once an entire test is adjusted, we will adjust the test so that it is run with `GIT_TEST_DEFAULT_MAIN_BRANCH_NAME=main`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-23 08:57:40 -07:00
Junio C Hamano	c9a04f036f	Merge branch 'hn/refs-trace-backend' Developer support. * hn/refs-trace-backend: refs: add GIT_TRACE_REFS debugging mechanism	2020-09-22 12:36:28 -07:00
Junio C Hamano	0df670bc0b	Merge branch 'jt/interpret-branch-name-fallback' "git status" has trouble showing where it came from by interpreting reflog entries that recordcertain events, e.g. "checkout @{u}", and gives a hard/fatal error. Even though it inherently is impossible to give a correct answer because the reflog entries lose some information (e.g. "@{u}" does not record what branch the user was on hence which branch 'the upstream' needs to be computed, and even if the record were available, the relationship between branches may have changed), at least hide the error to allow "status" show its output. * jt/interpret-branch-name-fallback: wt-status: tolerate dangling marks refs: move dwim_ref() to header file sha1-name: replace unsigned int with option struct	2020-09-09 13:53:09 -07:00
Han-Wen Nienhuys	4441f42707	refs: add GIT_TRACE_REFS debugging mechanism When set in the environment, GIT_TRACE_REFS makes git print operations and results as they flow through the ref storage backend. This helps debug discrepancies between different ref backends. Example: $ GIT_TRACE_REFS="1" ./git branch 15:42:09.769631 refs/debug.c:26 ref_store for .git 15:42:09.769681 refs/debug.c:249 read_raw_ref: HEAD: 0000000000000000000000000000000000000000 (=> refs/heads/ref-debug) type 1: 0 15:42:09.769695 refs/debug.c:249 read_raw_ref: refs/heads/ref-debug: `3a238e539b` (=> refs/heads/ref-debug) type 0: 0 15:42:09.770282 refs/debug.c:233 ref_iterator_begin: refs/heads/ (0x1) 15:42:09.770290 refs/debug.c:189 iterator_advance: refs/heads/b4 (0) 15:42:09.770295 refs/debug.c:189 iterator_advance: refs/heads/branch3 (0) Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-09 12:58:37 -07:00
Jonathan Tan	f24c30e0b6	wt-status: tolerate dangling marks When a user checks out the upstream branch of HEAD, the upstream branch not being a local branch, and then runs "git status", like this: git clone $URL client cd client git checkout @{u} git status no status is printed, but instead an error message: fatal: HEAD does not point to a branch (This error message when running "git branch" persists even after checking out other things - it only stops after checking out a branch.) This is because "git status" reads the reflog when determining the "HEAD detached" message, and thus attempts to DWIM "@{u}", but that doesn't work because HEAD no longer points to a branch. Therefore, when calculating the status of a worktree, tolerate dangling marks. This is done by adding an additional parameter to dwim_ref() and repo_dwim_ref(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-02 14:39:25 -07:00
Jonathan Tan	ec06b05568	refs: move dwim_ref() to header file This makes it clear that dwim_ref() is just repo_dwim_ref() without the first parameter. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-02 14:39:17 -07:00
Jonathan Tan	a4f66a7876	sha1-name: replace unsigned int with option struct In preparation for a future patch adding a boolean parameter to repo_interpret_branch_name(), which might be easily confused with an existing unsigned int parameter, refactor repo_interpret_branch_name() to take an option struct instead of the unsigned int parameter. The static function interpret_branch_mark() is also updated to take the option struct in preparation for that future patch, since it will also make use of the to-be-introduced boolean parameter. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-02 14:39:17 -07:00
Junio C Hamano	6ddd76fd6c	Merge branch 'ps/ref-transaction-hook' Code simplification by removing ineffective optimization. * ps/ref-transaction-hook: refs: remove lookup cache for reference-transaction hook	2020-08-31 15:49:53 -07:00
Junio C Hamano	e699684cf6	Merge branch 'hn/refs-pseudorefs' Accesses to two pseudorefs have been updated to properly use ref API. * hn/refs-pseudorefs: sequencer: treat REVERT_HEAD as a pseudo ref builtin/commit: suggest update-ref for pseudoref removal sequencer: treat CHERRY_PICK_HEAD as a pseudo ref refs: make refs_ref_exists public	2020-08-31 15:49:48 -07:00
Junio C Hamano	98df75b286	Merge branch 'hn/refs-fetch-head-is-special' The FETCH_HEAD is now always read from the filesystem regardless of the ref backend in use, as its format is much richer than the normal refs, and written directly by "git fetch" as a plain file.. * hn/refs-fetch-head-is-special: refs: read FETCH_HEAD and MERGE_HEAD generically refs: move gitdir into base ref_store refs: fix comment about submodule ref_stores refs: split off reading loose ref data in separate function	2020-08-27 14:04:49 -07:00
Patrick Steinhardt	0a0fbbe3ff	refs: remove lookup cache for reference-transaction hook When adding the reference-transaction hook, there were concerns about the performance impact it may have on setups which do not make use of the new hook at all. After all, it gets executed every time a reftx is prepared, committed or aborted, which linearly scales with the number of reference-transactions created per session. And as there are code paths like `git push` which create a new transaction for each reference to be updated, this may translate to calling `find_hook()` quite a lot. To address this concern, a cache was added with the intention to not repeatedly do negative hook lookups. Turns out this cache caused a regression, which was fixed via `e5256c82e5` (refs: fix interleaving hook calls with reference-transaction hook, 2020-08-07). In the process of discussing the fix, we realized that the cache doesn't really help even in the negative-lookup case. While performance tests added to benchmark this did show a slight improvement in the 1% range, this really doesn't warrent having a cache. Furthermore, it's quite flaky, too. E.g. running it twice in succession produces the following results: Test master pks-reftx-hook-remove-cache -------------------------------------------------------------------------- 1400.2: update-ref 2.79(2.16+0.74) 2.73(2.12+0.71) -2.2% 1400.3: update-ref --stdin 0.22(0.08+0.14) 0.21(0.08+0.12) -4.5% Test master pks-reftx-hook-remove-cache -------------------------------------------------------------------------- 1400.2: update-ref 2.70(2.09+0.72) 2.74(2.13+0.71) +1.5% 1400.3: update-ref --stdin 0.21(0.10+0.10) 0.21(0.08+0.13) +0.0% One case notably absent from those benchmarks is a single executable searching for the hook hundreds of times, which is exactly the case for which the negative cache was added. p1400.2 will spawn a new update-ref for each transaction and p1400.3 only has a single reference-transaction for all reference updates. So this commit adds a third benchmark, which performs an non-atomic push of a thousand references. This will create a new reference transaction per reference. But even for this case, the negative cache doesn't consistently improve performance: Test master pks-reftx-hook-remove-cache -------------------------------------------------------------------------- 1400.4: nonatomic push 6.63(6.50+0.13) 6.81(6.67+0.14) +2.7% 1400.4: nonatomic push 6.35(6.21+0.14) 6.39(6.23+0.16) +0.6% 1400.4: nonatomic push 6.43(6.31+0.13) 6.42(6.28+0.15) -0.2% So let's just remove the cache altogether to simplify the code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-25 15:34:42 -07:00
Han-Wen Nienhuys	3f9f1acccf	refs: make refs_ref_exists public This will be necessary to replace file existence checks for pseudorefs. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 11:20:10 -07:00
Han-Wen Nienhuys	e811530278	refs: read FETCH_HEAD and MERGE_HEAD generically The FETCH_HEAD and MERGE_HEAD refs must be stored in a file, regardless of the type of ref backend. This is because they can hold more than just a single ref. To accomodate them for alternate ref backends, read them from a file generically in refs_read_raw_ref() Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-19 14:08:04 -07:00
Junio C Hamano	95c687bf85	Merge branch 'hn/reftable-prep-part-2' Further preliminary change to refs API. * hn/reftable-prep-part-2: Make HEAD a PSEUDOREF rather than PER_WORKTREE. Modify pseudo refs through ref backend storage t1400: use git rev-parse for testing PSEUDOREF existence	2020-08-17 17:02:42 -07:00
Junio C Hamano	5676db2612	Merge branch 'ps/ref-transaction-hook' The logic to find the ref transaction hook script attempted to cache the path to the found hook without realizing that it needed to keep a copied value, as the API it used returned a transitory buffer space. This has been corrected. * ps/ref-transaction-hook: t1416: avoid hard-coded sha1 ids refs: fix interleaving hook calls with reference-transaction hook	2020-08-17 17:02:41 -07:00
Junio C Hamano	46b225f153	Merge branch 'jk/strvec' The argv_array API is useful for not just managing argv but any "vector" (NULL-terminated array) of strings, and has seen adoption to a certain degree. It has been renamed to "strvec" to reduce the barrier to adoption. * jk/strvec: strvec: rename struct fields strvec: drop argv_array compatibility layer strvec: update documention to avoid argv_array strvec: fix indentation in renamed calls strvec: convert remaining callers away from argv_array name strvec: convert more callers away from argv_array name strvec: convert builtin/ callers away from argv_array name quote: rename sq_dequote_to_argv_array to mention strvec strvec: rename files from argv-array to strvec argv-array: rename to strvec argv-array: use size_t for count and alloc	2020-08-10 10:23:57 -07:00
Patrick Steinhardt	e5256c82e5	refs: fix interleaving hook calls with reference-transaction hook In order to not repeatedly search for the reference-transaction hook in case it's getting called multiple times, we use a caching mechanism to only call `find_hook()` once. What was missed though is that the return value of `find_hook()` actually comes from a static strbuf, which means it will get overwritten when calling `find_hook()` again. As a result, we may call the wrong hook with parameters of the reference-transaction hook. This scenario was spotted in the wild when executing a git-push(1) with multiple references, where there are interleaving calls to both the update and the reference-transaction hook. While initial calls to the reference-transaction hook work as expected, it will stop working after the next invocation of the update hook. The result is that we now start calling the update hook with parameters and stdin of the reference-transaction hook. This commit fixes the issue by storing a copy of `find_hook()`'s return value in the cache. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-07 12:27:41 -07:00
Junio C Hamano	dc3c6fb565	Merge branch 'hn/reftable' into master Brown-paper-bag fix. * hn/reftable: refs: move the logic to add \t to reflog to the files backend	2020-08-01 13:49:13 -07:00
Han-Wen Nienhuys	25429fed5c	refs: move the logic to add \t to reflog to the files backend `523fa69c` (reflog: cleanse messages in the refs.c layer, 2020-07-10) centralized reflog normalizaton. However, the normalizaton added a leading "\t" to the message. This is an artifact of the reflog storage format in the files backend, so it should be added there. Routines that parse back the reflog (such as grab_nth_branch_switch) expect the "\t" to not be in the message, so without this fix, git with reftable cannot process the "@{-1}" syntax. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-31 10:21:51 -07:00
Junio C Hamano	3161cc6e6b	Merge branch 'hn/reftable' into master Preliminary clean-up of the refs API in preparation for adding a new refs backend "reftable". * hn/reftable: reflog: cleanse messages in the refs.c layer bisect: treat BISECT_HEAD as a pseudo ref t3432: use git-reflog to inspect the reflog for HEAD lib-t6000.sh: write tag using git-update-ref	2020-07-30 13:20:32 -07:00
Jeff King	c972bf4cf5	strvec: convert remaining callers away from argv_array name We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the remaining files, as the resulting diff is reasonably sized. The conversion was done purely mechanically with: git ls-files '.c' '.h' \| xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	dbbcd44fb4	strvec: rename files from argv-array to strvec This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '.c' '.h' \| xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:17 -07:00
Han-Wen Nienhuys	55dd8b9108	Make HEAD a PSEUDOREF rather than PER_WORKTREE. This is consistent with the definition of REF_TYPE_PSEUDOREF (uppercase in the root ref namespace). Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-27 10:06:49 -07:00
Han-Wen Nienhuys	09743417a2	Modify pseudo refs through ref backend storage The previous behavior was introduced in commit `74ec19d4be` ("pseudorefs: create and use pseudoref update and delete functions", Jul 31, 2015), with the justification "alternate ref backends still need to store pseudorefs in GIT_DIR". Refs such as REBASE_HEAD are read through the ref backend. This can only work consistently if they are written through the ref backend as well. Tooling that works directly on files under .git should be updated to use git commands to read refs instead. The following behaviors change: * Updates to pseudorefs (eg. ORIG_HEAD) with core.logAllRefUpdates=always will create reflogs for the pseudoref. * non-HEAD pseudoref symrefs are also dereferenced on deletion. Update t1405 accordingly. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-27 10:06:49 -07:00
Junio C Hamano	523fa69c36	reflog: cleanse messages in the refs.c layer Regarding reflog messages: - We expect that a reflog message consists of a single line. The file format used by the files backend may add a LF after the message as a delimiter, and output by commands like "git log -g" may complete such an incomplete line by adding a LF at the end, but philosophically, the terminating LF is not a part of the message. - We however allow callers of refs API to supply a random sequence of NUL terminated bytes. We cleanse caller-supplied message by squashing a run of whitespaces into a SP, and by trimming trailing whitespace, before storing the message. This is how we tolerate, instead of erring out, a message with LF in it (be it at the end, in the middle, or both). Currently, the cleansing of the reflog message is done by the files backend, before the log is written out. This is sufficient with the current code, as that is the only backend that writes reflogs. But new backends can be added that write reflogs, and we'd want the resulting log message we would read out of "log -g" the same no matter what backend is used, and moving the code to do so to the generic layer is a way to do so. An added benefit is that the "cleansing" function could be updated later, independent from individual backends, to e.g. allow multi-line log messages if we wanted to, and when that happens, it would help a lot to ensure we covered all bases if the cleansing function (which would be updated) is called from the generic layer. Side note: I am not interested in supporting multi-line reflog messages right at the moment (nobody is asking for it), but I envision that instead of the "squash a run of whitespaces into a SP and rtrim" cleansing, we can %urlencode problematic bytes in the message AND append a SP at the end, when a new version of Git that supports multi-line and/or verbatim reflog messages writes a reflog record. The reading side can detect the presense of SP at the end (which should have been rtrimmed out if it were written by existing versions of Git) as a signal that decoding %urlencode recovers the original reflog message. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-10 13:53:37 -07:00
Junio C Hamano	11cbda2add	Merge branch 'js/default-branch-name' The name of the primary branch in existing repositories, and the default name used for the first branch in newly created repositories, is made configurable, so that we can eventually wean ourselves off of the hardcoded 'master'. * js/default-branch-name: contrib: subtree: adjust test to change in fmt-merge-msg testsvn: respect `init.defaultBranch` remote: use the configured default branch name when appropriate clone: use configured default branch name when appropriate init: allow setting the default for the initial branch name via the config init: allow specifying the initial branch name for the new repository docs: add missing diamond brackets submodule: fall back to remote's HEAD for missing remote.<name>.branch send-pack/transport-helper: avoid mentioning a particular branch fmt-merge-msg: stop treating `master` specially	2020-07-06 22:09:17 -07:00
Junio C Hamano	d80bea479d	Merge branch 'ak/commit-graph-to-slab' A few fields in "struct commit" that do not have to always be present have been moved to commit slabs. * ak/commit-graph-to-slab: commit-graph: minimize commit_graph_data_slab access commit: move members graph_pos, generation to a slab commit-graph: introduce commit_graph_data_slab object: drop parsed_object_pool->commit_count	2020-07-06 22:09:14 -07:00
Don Goodman-Wilson	8747ebb7cd	init: allow setting the default for the initial branch name via the config We just introduced the command-line option `--initial-branch=<branch-name>` to allow initializing a new repository with a different initial branch than the hard-coded one. To allow users to override the initial branch name more permanently (i.e. without having to specify the name manually for each and every `git init` invocation), let's introduce the `init.defaultBranch` config setting. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Don Goodman-Wilson <don@goodman-wilson.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Patrick Steinhardt	6754159767	refs: implement reference transaction hook The low-level reference transactions used to update references are currently completely opaque to the user. While certainly desirable in most usecases, there are some which might want to hook into the transaction to observe all queued reference updates as well as observing the abortion or commit of a prepared transaction. One such usecase would be to have a set of replicas of a given Git repository, where we perform Git operations on all of the repositories at once and expect the outcome to be the same in all of them. While there exist hooks already for a certain subset of Git commands that could be used to implement a voting mechanism for this, many others currently don't have any mechanism for this. The above scenario is the motivation for the new "reference-transaction" hook that reaches directly into Git's reference transaction mechanism. The hook receives as parameter the current state the transaction was moved to ("prepared", "committed" or "aborted") and gets via its standard input all queued reference updates. While the exit code gets ignored in the "committed" and "aborted" states, a non-zero exit code in the "prepared" state will cause the transaction to be aborted prematurely. Given the usecase described above, a voting mechanism can now be implemented via this hook: as soon as it gets called, it will take all of stdin and use it to cast a vote to a central service. When all replicas of the repository agree, the hook will exit with zero, otherwise it will abort the transaction by returning non-zero. The most important upside is that this will catch _all_ commands writing references at once, allowing to implement strong consistency for reference updates via a single mechanism. In order to test the impact on the case where we don't have any "reference-transaction" hook installed in the repository, this commit introduce two new performance tests for git-update-refs(1). Run against an empty repository, it produces the following results: Test origin/master HEAD -------------------------------------------------------------------- 1400.2: update-ref 2.70(2.10+0.71) 2.71(2.10+0.73) +0.4% 1400.3: update-ref --stdin 0.21(0.09+0.11) 0.21(0.07+0.14) +0.0% The performance test p1400.2 creates, updates and deletes a branch a thousand times, thus averaging runtime of git-update-refs over 3000 invocations. p1400.3 instead calls `git-update-refs --stdin` three times and queues a thousand creations, updates and deletes respectively. As expected, p1400.3 consistently shows no noticeable impact, as for each batch of updates there's a single call to access(3P) for the negative hook lookup. On the other hand, for p1400.2, one can see an impact caused by this patchset. But doing five runs of the performance tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead ranged from -1.5% to +1.1%. These inconsistent performance numbers can be explained by the overhead of spawning 3000 processes. This shows that the overhead of assembling the hook path and executing access(3P) once to check if it's there is mostly outweighed by the operating system's overhead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 10:46:13 -07:00
Abhishek Kumar	6da43d937c	object: drop parsed_object_pool->commit_count `14ba97f8` (alloc: allow arbitrary repositories for alloc functions, 2018-05-15) introduced parsed_object_pool->commit_count to keep count of commits per repository and was used to assign commit->index. However, commit-slab code requires commit->index values to be unique and a global count would be correct, rather than a per-repo count. Let's introduce a static counter variable, `parsed_commits_count` to keep track of parsed commits so far. As commit_count has no use anymore, let's also drop it from the struct. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-17 14:37:14 -07:00
Junio C Hamano	d3fc8dc53a	Merge branch 'ds/log-exclude-decoration-config' The "--decorate-refs" and "--decorate-refs-exclude" options "git log" takes have learned a companion configuration variable log.excludeDecoration that sits at the lowest priority in the family. * ds/log-exclude-decoration-config: log: add log.excludeDecoration config option log-tree: make ref_filter_match() a helper method	2020-04-28 15:50:08 -07:00

1 2 3 4 5 ...

1007 Commits