1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-05-12 15:16:09 +02:00
Commit Graph

73269 Commits

Author SHA1 Message Date
Junio C Hamano d8360a86ed Merge branch 'tb/midx-write'
Code clean-up by splitting code responsible for writing midx files
into its own file.

* tb/midx-write:
  midx-write.c: use `--stdin-packs` when repacking
  midx-write.c: check count of packs to repack after grouping
  midx-write.c: factor out common want_included_pack() routine
  midx-write: move writing-related functions from midx.c
2024-04-12 11:31:39 -07:00
Junio C Hamano 28dc93bab0 Merge branch 'rs/t-prio-queue-cleanup'
t-prio-queue test has been cleaned up by using C99 compound
literals; this is meant to also serve as a weather-balloon to smoke
out folks with compilers who have trouble compiling code that uses
the feature.

* rs/t-prio-queue-cleanup:
  t-prio-queue: simplify using compound literals
2024-04-12 11:31:39 -07:00
Junio C Hamano 7fbe3ead19 Merge branch 'ps/reftable-binsearch-updates'
Reftable code clean-up and some bugfixes.

* ps/reftable-binsearch-updates:
  reftable/block: avoid decoding keys when searching restart points
  reftable/record: extract function to decode key lengths
  reftable/block: fix error handling when searching restart points
  reftable/block: refactor binary search over restart points
  reftable/refname: refactor binary search over refnames
  reftable/basics: improve `binsearch()` test
  reftable/basics: fix return type of `binsearch()` to be `size_t`
2024-04-12 11:31:39 -07:00
Junio C Hamano 847af43a3a Merge branch 'jc/checkout-detach-wo-tracking-report'
"git checkout/switch --detach foo", after switching to the detached
HEAD state, gave the tracking information for the 'foo' branch,
which was pointless.

Tested-by: M Hickford <mirth.hickford@gmail.com>
cf. <CAGJzqsmE9FDEBn=u3ge4LA3ha4fDbm4OWiuUbMaztwjELBd7ug@mail.gmail.com>

* jc/checkout-detach-wo-tracking-report:
  checkout: omit "tracking" information on a detached HEAD
2024-04-12 11:31:39 -07:00
Junio C Hamano d8800f630a Merge branch 'rs/imap-send-use-xsnprintf'
Code clean-up and duplicate reduction.

* rs/imap-send-use-xsnprintf:
  imap-send: use xsnprintf to format command
2024-04-12 11:31:38 -07:00
Junio C Hamano d842e22ebb Merge branch 'js/merge-tree-3-trees'
Match the option argument type in the help text to the correct type
updated by a recent series.

* js/merge-tree-3-trees:
  merge-tree: fix argument type of the `--merge-base` option
2024-04-12 11:31:38 -07:00
Johannes Schindelin 0c6ee971fb merge-tree: fix argument type of the `--merge-base` option
In 5f43cf5b2e (merge-tree: accept 3 trees as arguments, 2024-01-28), I
taught `git merge-tree` to perform three-way merges on trees. This
commit even changed the manual page to state that the `--merge-base`
option takes a tree-ish rather than requiring a commit.

But I forgot to adjust the in-program help text. This patch fixes that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 09:10:43 -07:00
Xing Xin 5da40be8d7 Documentation: fix typos describing date format
This commit corrects a typographical error found in both
date-formats.txt and git-fast-import.txt documentation, where the term
`email format` was mistakenly used instead of `date format`.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 09:03:03 -07:00
Patrick Steinhardt 70b81fbf3c t0612: add tests to exercise Git/JGit reftable compatibility
While the reftable format is a recent introduction in Git, JGit already
knows to read and write reftables since 2017. Given the complexity of
the format there is a very real risk of incompatibilities between those
two implementations, which is something that we really want to avoid.

Add some basic tests that verify that reftables written by Git and JGit
can be read by the respective other implementation. For now this test
suite is rather small, only covering basic functionality. But it serves
as a good starting point and can be extended over time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:51 -07:00
Patrick Steinhardt db1d63bf57 t0610: fix non-portable variable assignment
Older versions of the Dash shell fail to parse `local var=val`
assignments in some cases when `val` is unquoted. Such failures can be
observed e.g. with Ubuntu 20.04 and older, which has a Dash version that
still has this bug.

Such an assignment has been introduced in t0610. The issue wasn't
detected for a while because this test used to only run when the
GIT_TEST_DEFAULT_REF_FORMAT environment variable was set to "reftable".
We have dropped that requirement now though, meaning that it runs
unconditionally, including on jobs which use such older versions of
Ubuntu.

We have worked around such issues in the past, e.g. in ebee5580ca
(parallel-checkout: avoid dash local bug in tests, 2021-06-06), by
quoting the `val` side. Apply the same fix to t0610.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:51 -07:00
Patrick Steinhardt ca13c3e94a t06xx: always execute backend-specific tests
The tests in t06xx exercise specific ref formats. Next to probing some
basic functionality, these tests also exercise other low-level details
specific to the format. Those tests are only executed though in case
`GIT_TEST_DEFAULT_REF_FORMAT` is set to the ref format of the respective
backend-under-test.

Ideally, we would run the full test matrix for ref formats such that our
complete test suite is executed with every supported format on every
supported platform. This is quite an expensive undertaking though, and
thus we only execute e.g. the "reftable" tests on macOS and Linux. As a
result, we basically have no test coverage for the "reftable" format at
all on other platforms like Windows.

Adapt these tests so that they override `GIT_TEST_DEFAULT_REF_FORMAT`,
which means that they'll always execute. This increases test coverage on
platforms that don't run the full test matrix, which at least gives us
some basic test coverage on those platforms for the "reftable" format.

This of course comes at the cost of running those tests multiple times
on platforms where we do run the full test matrix. But arguably, this is
a good thing because it will also cause us to e.g. run those tests with
the address sanitizer and other non-standard parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:51 -07:00
Patrick Steinhardt 04ba2c7eb3 ci: install JGit dependency
We have some tests in t5310 that use JGit to verify that bitmaps can be
read both by Git and by JGit. We do not execute these tests in our CI
jobs though because we don't make JGit available there. Consequently,
the tests basically bitrot because almost nobody is ever going to have
JGit in their path.

Install JGit to plug this test gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt ca44ef3165 ci: make Perforce binaries executable for all users
The Perforce binaries are only made executable for the current user. On
GitLab CI though we execute tests as a different user than "root", and
thus these binaries may not be executable by that test user at all. This
has gone unnoticed so far because those binaries are optional -- in case
they don't exist we simply skip over tests requiring them.

Fix the setup so that we set the executable bits for all users.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt 9cdeb34b96 ci: merge scripts which install dependencies
We have two different scripts which install dependencies, one for
dockerized jobs and one for non-dockerized ones. Naturally, these
scripts have quite some duplication. Furthermore, either of these
scripts is missing some test dependencies that the respective other
script has, thus reducing test coverage.

Merge those two scripts such that there is a single source of truth for
test dependencies, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt 2c5c7639e5 ci: fix setup of custom path for GitLab CI
Part of "install-dependencies.sh" is to install some binaries required
for tests into a custom directory that gets added to the PATH. This
directory is located at "$HOME/path" and thus depends on the current
user that the script executes as.

This creates problems for GitLab CI, which installs dependencies as the
root user, but runs tests as a separate, unprivileged user. As their
respective home directories are different, we will end up using two
different custom path directories. Consequently, the unprivileged user
will not be able to find the binaries that were set up as root user.

Fix this issue by allowing CI to override the custom path, which allows
GitLab to set up a constant value that isn't derived from "$HOME".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt d1ef3d3b1d ci: merge custom PATH directories
We're downloading various executables required by our tests. Each of
these executables goes into its own directory, which is then appended to
the PATH variable. Consequently, whenever we add a new dependency and
thus a new directory, we would have to adapt to this change in several
places.

Refactor this to instead put all binaries into a single directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt 40c60f4c12 ci: convert "install-dependencies.sh" to use "/bin/sh"
We're about to merge the "install-docker-dependencies.sh" script into
"install-dependencies.sh". This will also move our Alpine-based jobs
over to use the latter script. This script uses the Bash shell though,
which is not available by default on Alpine Linux.

Refactor "install-dependencies.sh" to use "/bin/sh" instead of Bash.
This requires us to get rid of the pushd/popd invocations, which are
replaced by some more elaborate commands that download or extract
executables right to where they are needed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt 21bcb4a602 ci: drop duplicate package installation for "linux-gcc-default"
The "linux-gcc-default" job installs common Ubuntu packages. This is
already done in the distro-specific switch, so we basically duplicate
the effort here.

Drop the duplicate package installations and inline the variable that
contains those common packages.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt 11d3f1aa5f ci: skip sudo when we are already root
Our "install-dependencies.sh" script is executed by non-dockerized jobs
to install dependencies. These jobs don't run with "root" permissions,
but with a separate user. Consequently, we need to use sudo(8) there to
elevate permissions when installing packages.

We're about to merge "install-docker-dependencies.sh" into that script
though, and our Docker containers do run as "root". Using sudo(8) is
thus unnecessary there, even though it would be harmless. On some images
like Alpine Linux though there is no sudo(8) available by default, which
would consequently break the build.

Adapt the script to make "sudo" a no-op when running as "root" user.
This allows us to easily reuse the script for our dockerized jobs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt ab2b3aadf3 ci: expose distro name in dockerized GitHub jobs
Expose a distro name in dockerized jobs. This will be used in a
subsequent commit where we merge the installation scripts for dockerized
and non-dockerized jobs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:50 -07:00
Patrick Steinhardt 2d65e5b6a6 ci: rename "runs_on_pool" to "distro"
The "runs_on_pool" environment variable is used by our CI scripts to
distinguish the different kinds of operating systems. It is quite
specific to GitHub Actions though and not really a descriptive name.

Rename the variable to "distro" to clarify its intent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 08:47:49 -07:00
Pi Fisher 84a7c33a4b typo: replace 'commitish' with 'committish'
Across only three files, comments and a single function name used
'commitish' rather than 'commit-ish' or 'committish' as the spelling.
The git glossary accepts a hyphen or a double-t, but not a single-t.
Despite the typo in a translation file, none of the typos appear in
user-visible locations.

Signed-off-by: Pi Fisher <Pi.L.D.Fisher@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-11 15:14:56 -07:00
Junio C Hamano 436d4e5b14 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-10 10:00:09 -07:00
Junio C Hamano f43863e686 Merge branch 'jc/t2104-style-update'
Coding style fixes.

* jc/t2104-style-update:
  t2104: style fixes
2024-04-10 10:00:09 -07:00
Junio C Hamano 280b74ce18 Merge branch 'kn/clarify-update-ref-doc'
Doc update, as a preparation to enhance "git update-ref --stdin".

* kn/clarify-update-ref-doc:
  githooks: use {old,new}-oid instead of {old,new}-value
  update-ref: use {old,new}-oid instead of {old,new}value
2024-04-10 10:00:08 -07:00
Junio C Hamano a4a1453ad1 Merge branch 'vs/complete-with-set-u-fix'
Another "set -u" fix for the bash prompt (in contrib/) script.

* vs/complete-with-set-u-fix:
  completion: protect prompt against unset SHOWUPSTREAM in nounset mode
  completion: fix prompt with unset SHOWCONFLICTSTATE in nounset mode
2024-04-10 10:00:08 -07:00
Junio C Hamano aaf524cfb0 Merge branch 'rs/mem-pool-size-t-safety'
size_t arithmetic safety.

* rs/mem-pool-size-t-safety:
  mem-pool: use st_add() in mem_pool_strvfmt()
2024-04-10 10:00:08 -07:00
Junio C Hamano dc89c59951 Merge branch 'ds/typofix-core-config-doc'
Typofix.

* ds/typofix-core-config-doc:
  config: fix some small capitalization issues, as spotted
2024-04-10 10:00:08 -07:00
Đoàn Trần Công Danh 03e84cca5d t9604: Fix test for musl libc and new Debian
CST6CDT and the like are POSIX timezone, with no rule for transition.
And POSIX doesn't enforce how to interpret the rule if it's omitted.
Some libc (e.g. glibc) resorted back to IANA (formerly Olson) db rules
for those timezones.  Some libc (e.g. FreeBSD) uses a fixed rule.
Other libc (e.g. musl) interpret that as no transition at all [1].

In addition, distributions (notoriously Debian-derived, which uses IANA
db for CST6CDT and the like) started to split "legacy" timezones
like CST6CDT, EST5EDT into `tzdata-legacy', which will not be installed
by default [2].

In those cases, t9604 will run into failure.

Let's switch to POSIX timezone with rules to change timezone.

1: http://mm.icann.org/pipermail/tz/2024-March/058751.html
2: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1043250

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-10 09:10:31 -07:00
Junio C Hamano 8d320cec60 t2104: style fixes
We use tabs to indent, not two or four spaces.

These days, even the test fixture preparation should be done inside
test_expect_success block.

Address these two style violations in this test.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 22:27:49 -07:00
Phillip Wood b4454d5a7b t3428: restore coverage for "apply" backend
This test file assumes the "apply" backend is the default which is not
the case since 2ac0d6273f (rebase: change the default backend from "am"
to "merge", 2020-02-15). Make sure the "apply" backend is tested by
specifying it explicitly.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 16:03:19 -07:00
Phillip Wood 1ad81756b4 t3428: use test_commit_message
Using a helper function makes the tests shorter and avoids running "git
cat-file" upstream of a pipe.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 16:03:19 -07:00
Phillip Wood aac1c6e8f5 t3428: modernize test setup
Perform the setup in a dedicated test so the later tests can be run
independently. Also avoid running git upstream of a pipe and take
advantage of test_commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 16:03:19 -07:00
Junio C Hamano 91ec36f2cc The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 14:31:45 -07:00
Junio C Hamano 8f31543f3d Merge branch 'rj/use-adv-if-enabled'
Use advice_if_enabled() API to rewrite a simple pattern to
call advise() after checking advice_enabled().

* rj/use-adv-if-enabled:
  add: use advise_if_enabled for ADVICE_ADD_EMBEDDED_REPO
  add: use advise_if_enabled for ADVICE_ADD_EMPTY_PATHSPEC
  add: use advise_if_enabled for ADVICE_ADD_IGNORED_FILE
2024-04-09 14:31:45 -07:00
Junio C Hamano eacfd581d2 Merge branch 'ps/pack-refs-auto'
"git pack-refs" learned the "--auto" option, which is a useful
addition to be triggered from "git gc --auto".

Acked-by: Karthik Nayak <karthik.188@gmail.com>
cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com>

* ps/pack-refs-auto:
  builtin/gc: pack refs when using `git maintenance run --auto`
  builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs
  t6500: extract objects with "17" prefix
  builtin/gc: move `struct maintenance_run_opts`
  builtin/pack-refs: introduce new "--auto" flag
  builtin/pack-refs: release allocated memory
  refs/reftable: expose auto compaction via new flag
  refs: remove `PACK_REFS_ALL` flag
  refs: move `struct pack_refs_opts` to where it's used
  t/helper: drop pack-refs wrapper
  refs/reftable: print errors on compaction failure
  reftable/stack: gracefully handle failed auto-compaction due to locks
  reftable/stack: use error codes when locking fails during compaction
  reftable/error: discern locked/outdated errors
  reftable/stack: fix error handling in `reftable_stack_init_addition()`
2024-04-09 14:31:45 -07:00
Junio C Hamano a6abddab1e Merge branch 'es/test-cron-safety'
The test script had an incomplete and ineffective attempt to avoid
clobbering the testing user's real crontab (and its equivalents),
which has been completed.

* es/test-cron-safety:
  test-lib: fix non-functioning GIT_TEST_MAINT_SCHEDULER fallback
2024-04-09 14:31:45 -07:00
Junio C Hamano 989bf45394 Merge branch 'rj/add-p-explicit-reshow'
"git add -p" and other "interactive hunk selection" UI has learned to
skip showing the hunk immediately after it has already been shown, and
an additional action to explicitly ask to reshow the current hunk.

* rj/add-p-explicit-reshow:
  add-patch: do not print hunks repeatedly
  add-patch: introduce 'p' in interactive-patch
2024-04-09 14:31:44 -07:00
Junio C Hamano 4b4081034b Merge branch 'mg/editorconfig-makefile'
The .editorconfig file has been taught that a Makefile uses HT
indentation.

* mg/editorconfig-makefile:
  editorconfig: add Makefiles to "text files"
2024-04-09 14:31:44 -07:00
Junio C Hamano 58dd7e4b11 Merge branch 'ja/doc-markup-updates'
Documentation rules has been explicitly described how to mark-up
literal parts and a few manual pages have been updated as examples.

* ja/doc-markup-updates:
  doc: git-clone: do not autoreference the manpage in itself
  doc: git-clone: apply new documentation formatting guidelines
  doc: git-init: apply new documentation formatting guidelines
  doc: allow literal and emphasis format in doc vs help tests
  doc: rework CodingGuidelines with new formatting rules
2024-04-09 14:31:44 -07:00
Junio C Hamano 4697c8a445 Merge branch 'dg/myfirstobjectwalk-updates'
Update a more recent tutorial doc.

* dg/myfirstobjectwalk-updates:
  MyFirstObjectWalk: add stderr to pipe processing
  MyFirstObjectWalk: fix description for counting omitted objects
  MyFirstObjectWalk: fix filtered object walk
  MyFirstObjectWalk: fix misspelled "builtins/"
  MyFirstObjectWalk: use additional arg in config_fn_t
2024-04-09 14:31:44 -07:00
Junio C Hamano 39b2c6f77e Merge branch 'jc/advice-sans-trailing-whitespace'
The "hint:" messages given by the advice mechanism, when given a
message with a blank line, left a line with trailing whitespace,
which has been cleansed.

* jc/advice-sans-trailing-whitespace:
  advice: omit trailing whitespace
2024-04-09 14:31:43 -07:00
Junio C Hamano 8289a36f87 Merge branch 'jc/apply-parse-diff-git-header-names-fix'
"git apply" failed to extract the filename the patch applied to,
when the change was about an empty file created in or deleted from
a directory whose name ends with a SP, which has been corrected.

* jc/apply-parse-diff-git-header-names-fix:
  t4126: fix "funny directory name" test on Windows (again)
  t4126: make sure a directory with SP at the end is usable
  apply: parse names out of "diff --git" more carefully
2024-04-09 14:31:43 -07:00
Patrick Steinhardt 69d87802da t0610: execute git-pack-refs(1) with specified umask
The tests for git-pack-refs(1) with the `core.sharedRepository` config
execute git-pack-refs(1) outside of the shell that has the expected
umask set. This is wrong because we want to test the behaviour of that
command with different umasks. The issue went unnoticed because most
distributions have a default umask of 0022, and we only ever test with
`--shared=true`, which re-adds the group write bit.

Fix the issue by moving git-pack-refs(1) into the umask'd shell and add
a bunch of test cases that exercise behaviour more thoroughly.

Note that we drop the check for whether `core.sharedRepository` was set
to the correct value to make the test setup a bit easier. We should be
able to rely on git-init(1) doing its thing correctly. Furthermore, to
help readability, we convert tests that pass `--shared=true` to instead
pass the equivalent `--shared=group`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 14:14:00 -07:00
Patrick Steinhardt 2f960dd5fe t0610: make `--shared=` tests reusable
We have two kinds of `--shared=` tests, one for git-init(1) and one for
git-pack-refs(1). Merge them into a reusable function such that we can
easily add additional testcases with different umasks and flags for the
`--shared=` switch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 14:14:00 -07:00
Patrick Steinhardt fa74f32291 reftable/block: reuse compressed array
Similar to the preceding commit, let's reuse the `compressed` array that
we use to store compressed data in. This results in a small reduction in
memory allocations when writing many refs.

Before:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,618,257 allocs, 22,618,106 frees, 1,236,351,528 bytes allocated

So while the reduction in allocations isn't really all that big, it's a
low hanging fruit and thus there isn't much of a reason not to pick it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:42 -07:00
Patrick Steinhardt a155ab2bf4 reftable/block: reuse zstream when writing log blocks
While most reftable blocks are written to disk as-is, blocks for log
records are compressed with zlib. To compress them we use `compress2()`,
which is a simple wrapper around the more complex `zstream` interface
that would require multiple function invocations.

One downside of this interface is that `compress2()` will reallocate
internal state of the `zstream` interface on every single invocation.
Consequently, as we call `compress2()` for every single log block which
we are about to write, this can lead to quite some memory allocation
churn.

Refactor the code so that the block writer reuses a `zstream`. This
significantly reduces the number of bytes allocated when writing many
refs in a single transaction, as demonstrated by the following benchmark
that writes 100k refs in a single transaction.

Before:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,631,887 allocs, 22,631,736 frees, 1,854,670,793 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 671,931 bytes in 151 blocks
    total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:42 -07:00
Patrick Steinhardt 8aaeffe3b5 reftable/writer: reset `last_key` instead of releasing it
The reftable writer tracks the last key that it has written so that it
can properly compute the compressed prefix for the next record it is
about to write. This last key must be reset whenever we move on to write
the next block, which is done in `writer_reinit_block_writer()`. We do
this by calling `strbuf_release()` though, which needlessly deallocates
the underlying buffer.

Convert the code to use `strbuf_reset()` instead, which saves one
allocation per block we're about to write. This requires us to also
amend `reftable_writer_free()` to release the buffer's memory now as we
previously seemingly relied on `writer_reinit_block_writer()` to release
the memory for us. Releasing memory here is the right thing to do
anyway.

While at it, convert a callsite where we truncate the buffer by setting
its length to zero to instead use `strbuf_reset()`, too.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
Patrick Steinhardt 60dd319519 reftable/writer: unify releasing memory
There are two code paths which release memory of the reftable writer:

  - `reftable_writer_close()` releases internal state after it has
    written data.

  - `reftable_writer_free()` releases the block that was written to and
    the writer itself.

Both code paths free different parts of the writer, and consequently the
caller must make sure to call both. And while callers mostly do this
already, this falls apart when a write failure causes the caller to skip
calling `reftable_write_close()`.

Introduce a new function `reftable_writer_release()` that releases all
internal state and call it from both paths. Like this it is fine for the
caller to not call `reftable_writer_close()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00
Patrick Steinhardt 7e892fec47 reftable/writer: refactorings for `writer_flush_nonempty_block()`
Large parts of the reftable library do not conform to Git's typical code
style. Refactor `writer_flush_nonempty_block()` such that it conforms
better to it and add some documentation that explains some of its more
intricate behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08 17:01:41 -07:00