1
0
mirror of https://github.com/git/git.git synced 2024-09-23 08:21:16 +02:00

reftable/pq: allocation-less comparison of entry keys

The priority queue is used by the merged iterator to iterate over
reftable records from multiple tables in the correct order. The queue
ends up having one record for each table that is being iterated over,
with the record that is supposed to be shown next at the top. For
example, the key of a ref record is equal to its name so that we end up
sorting the priority queue lexicographically by ref name.

To figure out the order we need to compare the reftable record keys with
each other. This comparison is done by formatting them into a `struct
strbuf` and then doing `strbuf_strcmp()` on the result. We then discard
the buffers immediately after the comparison.

This ends up being very expensive. Because the priority queue usually
contains as many records as we have tables, we call the comparison
function `O(log($tablecount))` many times for every record we insert.
Furthermore, when iterating over many refs, we will insert at least one
record for every ref we are iterating over. So ultimately, this ends up
being called `O($refcount * log($tablecount))` many times.

Refactor the code to use the new `refatble_record_cmp()` function that
has been implemented in a preceding commit. This function does not need
to allocate memory and is thus significantly more efficient.

The following benchmark prints a single ref matching a specific pattern
out of 1 million refs via git-show-ref(1), where the reftable stack
consists of three tables:

  Benchmark 1: show-ref: single matching ref (revision = HEAD~)
    Time (mean ± σ):     224.4 ms ±   6.5 ms    [User: 220.6 ms, System: 3.6 ms]
    Range (min … max):   216.5 ms … 261.1 ms    1000 runs

  Benchmark 2: show-ref: single matching ref (revision = HEAD)
    Time (mean ± σ):     172.9 ms ±   4.4 ms    [User: 169.2 ms, System: 3.6 ms]
    Range (min … max):   166.5 ms … 204.6 ms    1000 runs

  Summary
    show-ref: single matching ref (revision = HEAD) ran
      1.30 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Patrick Steinhardt 2024-02-12 09:32:43 +01:00 committed by Junio C Hamano
parent 5730a9dccf
commit dbe4e8b3fd

View File

@ -14,20 +14,9 @@ https://developers.google.com/open-source/licenses/bsd
int pq_less(struct pq_entry *a, struct pq_entry *b)
{
struct strbuf ak = STRBUF_INIT;
struct strbuf bk = STRBUF_INIT;
int cmp = 0;
reftable_record_key(&a->rec, &ak);
reftable_record_key(&b->rec, &bk);
cmp = strbuf_cmp(&ak, &bk);
strbuf_release(&ak);
strbuf_release(&bk);
int cmp = reftable_record_cmp(&a->rec, &b->rec);
if (cmp == 0)
return a->index > b->index;
return cmp < 0;
}