From 00a09d57eb8a041e6a6b0470c53533719c049bab Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Tue, 23 Jun 2015 06:53:58 -0400
Subject: [PATCH 1/2] introduce "extensions" form of core.repositoryformatversion

Normally we try to avoid bumps of the whole-repository
core.repositoryformatversion field. However, it is
unavoidable if we want to safely change certain aspects of
git in a backwards-incompatible way (e.g., modifying the set
of ref tips that we must traverse to generate a list of
unreachable, safe-to-prune objects).

If we were to bump the repository version for every such
change, then any implementation understanding version `X`
would also have to understand `X-1`, `X-2`, and so forth,
even though the incompatibilities may be in orthogonal parts
of the system, and there is otherwise no reason we cannot
implement one without the other (or more importantly, that
the user cannot choose to use one feature without the other,
weighing the tradeoff in compatibility only for that
particular feature).

This patch documents the existing repositoryformatversion
strategy and introduces a new format, "1", which lets a
repository specify that it must run with an arbitrary set of
extensions. This can be used, for example:

 - to inform git that the objects should not be pruned based
   only on the reachability of the ref tips (e.g., because it
   has "clone --shared" children)

 - that the refs are stored in a format besides the usual
   "refs" and "packed-refs" directories

Because we bump to format "1", and because format "1"
requires that a running git knows about any extensions
mentioned, we know that older versions of the code will not
do something dangerous when confronted with these new
formats.

For example, if the user chooses to use database storage for
refs, they may set the "extensions.refbackend" config to
"db". Older versions of git will not understand format "1"
and bail.
Versions of git which understand "1" but do not know about
"refbackend", or which know about "refbackend" but not about
the "db" backend, will refuse to run. This is annoying, of
course, but much better than the alternative of claiming
that there are no refs in the repository, or writing to a
location that other implementations will not read.

Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that
is meant to be used by users and future extensions to
provide safety with older implementations.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 .../technical/repository-version.txt          | 81 +++++++++++++++++++
 cache.h                                       |  6 ++
 setup.c                                       | 37 ++++++++-
 t/t1302-repo-version.sh                       | 38 +++++++++
 4 files changed, 159 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/technical/repository-version.txt

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
new file mode 100644
index 0000000000..3d7106d93d
--- /dev/null
+++ b/Documentation/technical/repository-version.txt
@@ -0,0 +1,81 @@
+Git Repository Format Versions
+==============================
+
+Every git repository is marked with a numeric version in the
+`core.repositoryformatversion` key of its `config` file. This version
+specifies the rules for operating on the on-disk repository data. An
+implementation of git which does not understand a particular version
+advertised by an on-disk repository MUST NOT operate on that repository;
+doing so risks not only producing wrong results, but actually losing
+data.
+
+Because of this rule, version bumps should be kept to an absolute
+minimum. Instead, we generally prefer these strategies:
+
+  - bumping format version numbers of individual data files (e.g.,
+    index, packfiles, etc). This restricts the incompatibilities only to
+    those files.
+
+  - introducing new data that gracefully degrades when used by older
+    clients (e.g., pack bitmap files are ignored by older clients, which
+    simply do not take advantage of the optimization they provide).
+
+A whole-repository format version bump should only be part of a change
+that cannot be independently versioned. For instance, if one were to
+change the reachability rules for objects, or the rules for locking
+refs, that would require a bump of the repository format version.
+
+Note that this applies only to accessing the repository's disk contents
+directly. An older client which understands only format `0` may still
+connect via `git://` to a repository using format `1`, as long as the
+server process understands format `1`.
+
+The preferred strategy for rolling out a version bump (whether whole
+repository or for a single file) is to teach git to read the new format,
+and allow writing the new format with a config switch or command line
+option (for experimentation or for those who do not care about backwards
+compatibility with older gits). Then after a long period to allow the
+reading capability to become common, we may switch to writing the new
+format by default.
+
+The currently defined format versions are:
+
+Version `0`
+-----------
+
+This is the format defined by the initial version of git, including but
+not limited to the format of the repository directory, the repository
+configuration file, and the object and ref storage. Specifying the
+complete behavior of git is beyond the scope of this document.
+
+Version `1`
+-----------
+
+This format is identical to version `0`, with the following exceptions:
+
+  1. When reading the `core.repositoryformatversion` variable, a git
+     implementation which supports version 1 MUST also read any
+     configuration keys found in the `extensions` section of the
+     configuration file.
+
+  2. If a version-1 repository specifies any `extensions.*` keys that
+     the running git has not implemented, the operation MUST NOT
+     proceed. Similarly, if the value of any known key is not understood
+     by the implementation, the operation MUST NOT proceed.
+
+Note that if no extensions are specified in the config file, then
+`core.repositoryformatversion` SHOULD be set to `0` (setting it to `1`
+provides no benefit, and makes the repository incompatible with older
+implementations of git).
+
+This document will serve as the master list for extensions. Any
+implementation wishing to define a new extension should make a note of
+it here, in order to claim the name.
+
+The defined extensions are:
+
+`noop`
+~~~~~~
+
+This extension does not change git's behavior at all. It is useful only
+for testing format-1 compatibility.
diff --git a/cache.h b/cache.h
index 4f554664c5..996584c1ce 100644
--- a/cache.h
+++ b/cache.h
@@ -686,7 +686,13 @@ extern char *notes_ref_name;
 
 extern int grafts_replace_parents;
 
+/*
+ * GIT_REPO_VERSION is the version we write by default. The
+ * _READ variant is the highest number we know how to
+ * handle.
+ */
 #define GIT_REPO_VERSION 0
+#define GIT_REPO_VERSION_READ 1
 extern int repository_format_version;
 extern int check_repository_format(void);
 
diff --git a/setup.c b/setup.c
index 82c0cc2a13..0d5384683c 100644
--- a/setup.c
+++ b/setup.c
@@ -5,6 +5,7 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+static struct string_list unknown_extensions = STRING_LIST_INIT_DUP;
 
 /*
  * The input parameter must contain an absolute path, and it must already be
@@ -352,10 +353,23 @@ void setup_work_tree(void)
 
 static int check_repo_format(const char *var, const char *value, void *cb)
 {
+	const char *ext;
+
 	if (strcmp(var, "core.repositoryformatversion") == 0)
 		repository_format_version = git_config_int(var, value);
 	else if (strcmp(var, "core.sharedrepository") == 0)
 		shared_repository = git_config_perm(var, value);
+	else if (skip_prefix(var, "extensions.", &ext)) {
+		/*
+		 * record any known extensions here; otherwise,
+		 * we fall through to recording it as unknown, and
+		 * check_repository_format will complain
+		 */
+		if (!strcmp(ext, "noop"))
+			;
+		else
+			string_list_append(&unknown_extensions, ext);
+	}
 	return 0;
 }
 
@@ -366,6 +380,8 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 	config_fn_t fn;
 	int ret = 0;
 
+	string_list_clear(&unknown_extensions, 0);
+
 	if (get_common_dir(&sb, gitdir))
 		fn = check_repo_format;
 	else
@@ -383,16 +399,31 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 	 * is a good one.
 	 */
 	git_config_early(fn, NULL, repo_config);
-	if (GIT_REPO_VERSION < repository_format_version) {
+	if (GIT_REPO_VERSION_READ < repository_format_version) {
 		if (!nongit_ok)
 			die ("Expected git repo version <= %d, found %d",
-			     GIT_REPO_VERSION, repository_format_version);
+			     GIT_REPO_VERSION_READ, repository_format_version);
 		warning("Expected git repo version <= %d, found %d",
-			GIT_REPO_VERSION, repository_format_version);
+			GIT_REPO_VERSION_READ, repository_format_version);
 		warning("Please upgrade Git");
 		*nongit_ok = -1;
 		ret = -1;
 	}
+
+	if (repository_format_version >= 1 && unknown_extensions.nr) {
+		int i;
+
+		if (!nongit_ok)
+			die("unknown repository extension: %s",
+			    unknown_extensions.items[0].string);
+
+		for (i = 0; i < unknown_extensions.nr; i++)
+			warning("unknown repository extension: %s",
+				unknown_extensions.items[i].string);
+		*nongit_ok = -1;
+		ret = -1;
+	}
+
 	strbuf_release(&sb);
 	return ret;
 }
diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh
index 0d9388afc4..8dd6fd7baa 100755
--- a/t/t1302-repo-version.sh
+++ b/t/t1302-repo-version.sh
@@ -67,4 +67,42 @@ test_expect_success 'gitdir required mode' '
 	)
 '
 
+check_allow () {
+	git rev-parse --git-dir >actual &&
+	echo .git >expect &&
+	test_cmp expect actual
+}
+
+check_abort () {
+	test_must_fail git rev-parse --git-dir
+}
+
+# avoid git-config, since it cannot be trusted to run
+# in a repository with a broken version
+mkconfig () {
+	echo '[core]' &&
+	echo "repositoryformatversion = $1" &&
+	shift &&
+
+	if test $# -gt 0; then
+		echo '[extensions]' &&
+		for i in "$@"; do
+			echo "$i"
+		done
+	fi
+}
+
+while read outcome version extensions; do
+	test_expect_success "$outcome version=$version $extensions" "
+		mkconfig $version $extensions >.git/config &&
+		check_${outcome}
+	"
+done <<\EOF
+allow 0
+allow 1
+allow 1 noop
+abort 1 no-such-extension
+allow 0 no-such-extension
+EOF
+
 test_done

From 067fbd4105c5aa8260a73cc6961854be0e93fa03 Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Tue, 23 Jun 2015 06:54:11 -0400
Subject: [PATCH 2/2] introduce "preciousObjects" repository extension

If this extension is used in a repository, then no
operations should run which may drop objects from the object
storage. This can be useful if you are sharing that storage
with other repositories whose refs you cannot see.

For instance, if you do:

  $ git clone -s parent child
  $ git -C parent config extensions.preciousObjects true
  $ git -C parent config core.repositoryformatversion 1

you now have additional safety when running git in the
parent repository. Prunes and repacks will bail with an
error, and `git gc` will skip those operations (it will
continue to pack refs and do other non-object operations).
Older versions of git, when run in the repository, will
fail on every operation.

Note that we do not set the preciousObjects extension by
default when doing a "clone -s", as doing so breaks
backwards compatibility. It is a decision the user should
make explicitly.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 .../technical/repository-version.txt          |  7 ++++++
 builtin/gc.c                                  | 18 ++++++++-------
 builtin/prune.c                               |  3 +++
 builtin/repack.c                              |  3 +++
 cache.h                                       |  1 +
 environment.c                                 |  1 +
 setup.c                                       |  2 ++
 t/t1302-repo-version.sh                       | 22 +++++++++++++++++++
 8 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 3d7106d93d..00ad37986e 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -79,3 +79,10 @@ The defined extensions are:
 
 This extension does not change git's behavior at all. It is useful only
 for testing format-1 compatibility.
+
+`preciousObjects`
+~~~~~~~~~~~~~~~~~
+
+When the config key `extensions.preciousObjects` is set to `true`,
+objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
+`git repack -d`).
diff --git a/builtin/gc.c b/builtin/gc.c
index 36fe33300f..8b8dc6b610 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -352,15 +352,17 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (gc_before_repack())
 		return -1;
 
-	if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
-		return error(FAILED_RUN, repack.argv[0]);
+	if (!repository_format_precious_objects) {
+		if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
+			return error(FAILED_RUN, repack.argv[0]);
 
-	if (prune_expire) {
-		argv_array_push(&prune, prune_expire);
-		if (quiet)
-			argv_array_push(&prune, "--no-progress");
-		if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
-			return error(FAILED_RUN, prune.argv[0]);
+		if (prune_expire) {
+			argv_array_push(&prune, prune_expire);
+			if (quiet)
+				argv_array_push(&prune, "--no-progress");
+			if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
+				return error(FAILED_RUN, prune.argv[0]);
+		}
 	}
 
 	if (prune_worktrees_expire) {
diff --git a/builtin/prune.c b/builtin/prune.c
index 0c73246c72..6a58e75108 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	if (repository_format_precious_objects)
+		die(_("cannot prune in a precious-objects repo"));
+
 	while (argc--) {
 		unsigned char sha1[20];
 		const char *name = *argv++;
diff --git a/builtin/repack.c b/builtin/repack.c
index af7340c7ba..3beda2c65a 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, builtin_repack_options,
 				git_repack_usage, 0);
 
+	if (delete_redundant && repository_format_precious_objects)
+		die(_("cannot delete packs in a precious-objects repo"));
+
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/cache.h b/cache.h
index 996584c1ce..b1bc401055 100644
--- a/cache.h
+++ b/cache.h
@@ -694,6 +694,7 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_version;
+extern int repository_format_precious_objects;
 extern int check_repository_format(void);
 
 #define MTIME_CHANGED	0x0001
diff --git a/environment.c b/environment.c
index 61c685b8d9..da66e829d1 100644
--- a/environment.c
+++ b/environment.c
@@ -26,6 +26,7 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_version;
+int repository_format_precious_objects;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 int shared_repository = PERM_UMASK;
diff --git a/setup.c b/setup.c
index 0d5384683c..8b8dca9fd2 100644
--- a/setup.c
+++ b/setup.c
@@ -367,6 +367,8 @@ static int check_repo_format(const char *var, const char *value, void *cb)
 		 */
 		if (!strcmp(ext, "noop"))
 			;
+		else if (!strcmp(ext, "preciousobjects"))
+			repository_format_precious_objects = git_config_bool(var, value);
 		else
 			string_list_append(&unknown_extensions, ext);
 	}
diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh
index 8dd6fd7baa..9bcd34969f 100755
--- a/t/t1302-repo-version.sh
+++ b/t/t1302-repo-version.sh
@@ -105,4 +105,26 @@ abort 1 no-such-extension
 allow 0 no-such-extension
 EOF
 
+test_expect_success 'precious-objects allowed' '
+	mkconfig 1 preciousObjects >.git/config &&
+	check_allow
+'
+
+test_expect_success 'precious-objects blocks destructive repack' '
+	test_must_fail git repack -ad
+'
+
+test_expect_success 'other repacks are OK' '
+	test_commit foo &&
+	git repack
+'
+
+test_expect_success 'precious-objects blocks prune' '
+	test_must_fail git prune
+'
+
+test_expect_success 'gc runs without complaint' '
+	git gc
+'
+
 test_done
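
Reviewer note: the acceptance rule that the allow/abort table in
t/t1302-repo-version.sh exercises can be sketched as a standalone C
function. This is an illustration only, not code from either patch. The
`sketch_` names and the fixed table of known extensions are assumptions
made for the example, and extension names are assumed to arrive already
lowercased (git's config parser normalizes variable names), which is why
setup.c compares against "preciousobjects".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GIT_REPO_VERSION_READ. */
#define SKETCH_REPO_VERSION_READ 1

/* Extensions this illustrative implementation claims to understand. */
static const char *sketch_known_extensions[] = { "noop", "preciousobjects" };

/* Return 1 if `name` is in the known-extension table, 0 otherwise. */
static int sketch_extension_is_known(const char *name)
{
	size_t n = sizeof(sketch_known_extensions) /
		   sizeof(sketch_known_extensions[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(name, sketch_known_extensions[i]))
			return 1;
	return 0;
}

/*
 * Return 1 if an implementation may operate on the repository,
 * 0 if it must refuse. `exts` holds the keys found under
 * [extensions]; it may be NULL only when `nr` is 0.
 */
static int sketch_repo_allowed(int version, const char **exts, size_t nr)
{
	size_t i;

	assert(exts != NULL || nr == 0);

	/* A format newer than we can read is always refused. */
	if (version > SKETCH_REPO_VERSION_READ)
		return 0;

	/* Format 0 does not give extensions.* any meaning. */
	if (version < 1)
		return 1;

	/* Format 1: every listed extension must be understood. */
	for (i = 0; i < nr; i++)
		if (!sketch_extension_is_known(exts[i]))
			return 0;
	return 1;
}
```

The sketch checks only extension names; the documentation's second
requirement (refusing when a known key carries a value that is not
understood) is left out for brevity.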