1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-06-02 12:56:31 +02:00
Commit Graph

51821 Commits

Author SHA1 Message Date
Derrick Stolee c4d25228eb config: create core.multiPackIndex setting
The core.multiPackIndex config setting controls the multi-pack-
index (MIDX) feature. If false, the setting will disable all reads
from the multi-pack-index file.

Read this config setting in the new prepare_multi_pack_index_one()
which is called during prepare_packed_git(). This check is run once
per repository.

Add comparison commands in t5319-multi-pack-index.sh to check
typical Git behavior remains the same as the config setting is turned
on and off. This currently includes 'git rev-list' and 'git log'
commands to trigger several object database reads. Currently, these
would only catch an error in the prepare_multi_pack_index_one(), but
with later commits will catch errors in object lookups, abbreviations,
and approximate object counts.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 662148c435 midx: write object offsets
The final pair of chunks for the multi-pack-index file stores the object
offsets. We default to using 32-bit offsets as in the pack-index version
1 format, but if there exists an offset larger than 32-bits, we use a
trick similar to the pack-index version 2 format by storing all offsets
at least 2^31 in a 64-bit table; we use the 32-bit table to point into
that 64-bit table as necessary.

We only store these 64-bit offsets if necessary, so create a test that
manipulates a version 2 pack-index to fake a large offset. This allows
us to test that the large offset table is created, but the data does not
match the actual packfile offsets. The multi-pack-index offset does match
the (corrupted) pack-index offset, so a future feature will compare these
offsets during a 'verify' step.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee d7cacf29cc midx: write object id fanout chunk
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 0d5b3a5ef7 midx: write object ids in a chunk
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee fe1ed56f5e midx: sort and deduplicate objects from packfiles
Before writing a list of objects and their offsets to a multi-pack-index,
we need to collect the list of objects contained in the packfiles. There
may be multiple copies of some objects, so this list must be deduplicated.

It is possible to artificially get into a state where there are many
duplicate copies of objects. That can create high memory pressure if we
are to create a list of all objects before de-duplication. To reduce
this memory pressure without a significant performance drop,
automatically group objects by the first byte of their object id. Use
the IDX fanout tables to group the data, copy to a local array, then
sort.

Copy only the de-duplicated entries. Select the duplicate based on the
most-recent modified time of a packfile containing the object.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 3227565cfd midx: read pack names into array
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 32f3c541e3 multi-pack-index: write pack names in chunk
The multi-pack-index needs to track which packfiles it indexes. Store
these in our first required chunk. Since filenames are not well
structured, add padding to keep good alignment in later chunks.

Modify the 'git multi-pack-index read' subcommand to output the
existence of the pack-file name chunk. Modify t5319-multi-pack-index.sh
to reflect this new output and the new expected number of chunks.

Defense in depth: A pattern we are using in the multi-pack-index feature
is to verify the data as we write it. We want to ensure we never write
invalid data to the multi-pack-index. There are many checks that verify
that the values we are writing fit the format definitions. This mainly
helps developers while working on the feature, but it can also identify
issues that only appear when dealing with very large data sets. These
large sets are hard to encode into test cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 396f257018 multi-pack-index: read packfile list
When constructing a multi-pack-index file for a given object directory,
read the files within the enclosed pack directory and find matches that
end with ".idx" and find the correct paired packfile using
add_packed_git().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 9208e318f5 packfile: generalize pack directory list
In anticipation of sharing the pack directory listing with the
multi-pack-index, generalize prepare_packed_git_one() into
for_each_file_in_pack_dir().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 2c3813354b t5319: expand test data
As we build the multi-pack-index file format, we want to test the format
on real repositories. Add tests that create repository data including
multiple packfiles with both version 1 and version 2 formats.

The current 'git multi-pack-index write' command will always write the
same file with no "real" data. This will be expanded in future commits,
along with the test expectations.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 4d80560c54 multi-pack-index: load into memory
Create a new multi_pack_index struct for loading multi-pack-indexes into
memory. Create a test-tool builtin for reading basic information about
that multi-pack-index to verify the correct data is written.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee fc59e74844 midx: write header information to lockfile
As we begin writing the multi-pack-index format to disk, start with
the basics: the 12-byte header and the 20-byte checksum footer. Start
with these basics so we can add the rest of the format in small
increments.

As we implement the format, we will use a technique to check that our
computed offsets within the multi-pack-index file match what we are
actually writing. Each method that writes to the hashfile will return
the number of bytes written, and we will track that those values match
our expectations.

Currently, write_midx_header() returns 12, but is not checked. We will
check the return value in a later commit.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee a340773026 multi-pack-index: add 'write' verb
In anticipation of writing multi-pack-indexes, add a skeleton
'git multi-pack-index write' subcommand and send the options to a
write_midx_file() method. Also create a skeleton test script that
tests the 'write' subcommand.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:28 -07:00
Derrick Stolee 6a257f03ba multi-pack-index: add builtin
This new 'git multi-pack-index' builtin will be the plumbing access
for writing, reading, and checking multi-pack-index files. The
initial implementation is a no-op.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 11:27:26 -07:00
Derrick Stolee e0d1bcf825 multi-pack-index: add format details
The multi-pack-index feature generalizes the existing pack-index
feature by indexing objects across multiple pack-files.

Describe the basic file format, using a 12-byte header followed by
a lookup table for a list of "chunks" which will be described later.
The file ends with a footer containing a checksum using the hash
algorithm.

The header allows later versions to create breaking changes by
advancing the version number. We can also change the hash algorithm
using a different version value.

We will add the individual chunk format information as we introduce
the code that writes that information.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-12 13:55:02 -07:00
Derrick Stolee ceab693d1f multi-pack-index: add design document
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-12 13:55:02 -07:00
Junio C Hamano 53f9a3e157 Git 2.18
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-21 10:00:06 -07:00
Junio C Hamano 1fb9df7248 Merge branch 'en/rename-directory-detection-reboot'
* en/rename-directory-detection-reboot:
  merge-recursive: use xstrdup() instead of fixed buffer
2018-06-19 11:11:03 -07:00
Junio C Hamano f0ac6e3943 Merge Korean translation for l10n of Git 2.18.0 round 3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJbKFyNAAoJEMek6Rt1RHooUS0P/3zxiFjtSTlb0dfZ1VFaC3Zm
 vjowFQg4gUzBK37L1G5YD5sCJsdH/sOsfJlB1ZQ5sV2Ml2i8aJIMQYIrIQV9en3t
 Q+/zndp8i+2GdoL7VkPcoYv5kA/dvUaDn3GBohqkyXJy5oe7s6f1DuqXVCSgLBck
 9CSxbIp2lU7eS7Hm0bnV4RTTVmT0xiYT03CA/CPfIuQd59wPSR33o0zxCzrHjtJO
 ZMyDjbWWBTYFKN77KFAQOSSbhvqwcFbarjQCuN61STy+L6R5kXBgt58ZExHY41Gb
 JOmvqWmEHzKtJ0254fWGOyyIWetA7cjaxdYlE2M96p/etao6TIK1LeESLYRQBx2p
 dIGnUnNo2/7sBL4yIt4oq9BNAS+EOt+2uFM+tl+bKBhFvOGxqAMj4pgw1RtR/7kF
 9cFr+hrqliIUd3sBe4yVOaVmjnLK3vN76IYqFKlwtlE5Br8Jl/zz1PhWM+Z+91zm
 8GSqJTgY6ivtd2GSguVqkj5MNTgDOFNe2IOgAHmkXQEW27gkW+5aZ8GCWVAzwNX+
 YiMtNtuG1+TUGmEHM3kRYItUWpclhViNDwZTrqCJtdlBLwNMYXVHqHr+LkwxElhD
 ZSb4EPl4OHrvpEZmjCRTg5g9FWsZgFdUqwvApHLfb0ghYkcrRf1xGgISfiIeO1Xo
 DOpLbg7zcaVZywzVZ0xr
 =APV7
 -----END PGP SIGNATURE-----

Merge tag 'l10n-2.18.0-rnd3.1' of git://github.com/git-l10n/git-po

Merge Korean translation for l10n of Git 2.18.0 round 3

* tag 'l10n-2.18.0-rnd3.1' of git://github.com/git-l10n/git-po:
  l10n: ko.po: Update Korean translation
2018-06-19 09:29:23 -07:00
Junio C Hamano bd73c3f13f Merge branch 'cf/submodule-progress-dissociate'
* cf/submodule-progress-dissociate:
  t7400: encapsulate setup code in test_expect_success
2018-06-19 09:26:59 -07:00
Junio C Hamano 6b55779afc Merge branch 'js/rebase-i-root-fix'
* js/rebase-i-root-fix:
  t3404: check root commit in 'rebase -i --root reword root commit'
2018-06-19 09:26:28 -07:00
Stefan Beller 83f5fa58e8 t7400: encapsulate setup code in test_expect_success
When running t7400 in a shell you observe more output than expected:

    ...
    ok 8 - setup - hide init subdirectory
    ok 9 - setup - repository to add submodules to
    ok 10 - submodule add
    [master (root-commit) d79ce16] one
     Author: A U Thor <author@example.com>
     1 file changed, 1 insertion(+)
     create mode 100644 one.t
    ok 11 - redirected submodule add does not show progress
    ok 12 - redirected submodule add --progress does show progress
    ok 13 - submodule add to .gitignored path fails
    ...

Fix the output by encapsulating the setup code in test_expect_success

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-19 09:25:56 -07:00
Todd Zullinger 878810552b t3404: check root commit in 'rebase -i --root reword root commit'
When testing a reworded root commit, ensure that the squash-onto commit
which is created and amended is still the root commit.

Suggested-by: Phillip Wood <phillip.wood@talktalk.net>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-19 09:14:33 -07:00
Karthikeyan Singaravelan 1f2abe68d0 doc: fix typos in documentation and release notes
Signed-off-by: Karthikeyan Singaravelan <tir.karthi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-19 09:01:12 -07:00
Junio C Hamano 242ba98e44 Almost 2.18 final
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 11:24:21 -07:00
Junio C Hamano da34dd49bb Merge branch 'es/make-no-iconv'
"make NO_ICONV=NoThanks" did not override NEEDS_LIBICONV
(i.e. linkage of -lintl, -liconv, etc. that are platform-specific
tweaks), which has been corrected.

* es/make-no-iconv:
  Makefile: make NO_ICONV really mean "no iconv"
2018-06-18 11:23:24 -07:00
Junio C Hamano cc2beafc4b Merge branch 'sg/t7406-chain-fix'
Test fix.

* sg/t7406-chain-fix:
  t7406-submodule-update: fix broken &&-chains
2018-06-18 11:23:24 -07:00
Junio C Hamano 4229478639 Merge branch 'ks/branch-set-upstream'
A test title has been reworded to clarify it.

* ks/branch-set-upstream:
  t3200: clarify description of --set-upstream test
2018-06-18 11:23:23 -07:00
Junio C Hamano f300f5681e Merge branch 'js/rebase-i-root-fix'
A regression to "rebase -i --root" introduced during this cycle has
been fixed.

* js/rebase-i-root-fix:
  rebase --root: fix amending root commit messages
  rebase --root: demonstrate a bug while amending root commit messages
2018-06-18 11:23:22 -07:00
Junio C Hamano f35f43f565 Merge branch 'jk/ewah-bounds-check'
The code to read compressed bitmap was not careful to avoid reading
past the end of the file, which has been corrected.

* jk/ewah-bounds-check:
  ewah: adjust callers of ewah_read_mmap()
  ewah_read_mmap: bounds-check mmap reads
2018-06-18 11:23:22 -07:00
Junio C Hamano 1663e2ba68 l10n for Git 2.18.0 round 3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJbJo13AAoJEMek6Rt1RHooQngQAIDuLczx9KUC03Ig5ekbVeyb
 c9l/6DA3D03b6vG++T+C/Ro/rCtbmVWamf2VRXkTjtTk9zMo2oSoVxw3vWoIveQq
 AZ8dcT2js3otY0XUnIznsf0sVWGIN19lxvnwGspVm/1M3uu4/dVBOs84rbEvPPqi
 NtwwmBvFFMSfc7H/EIP4NzMNDWFbMNd2eJwu6rvasIP8RLnk2WaMToibfI9JNV0F
 XDcb3xpIMu3UEePzjOzdG7562pr7SAxOi3TEIP6ToRqo1HRFVIXJspIdOHlc9Lm+
 i1hV8iKnptj9GF/VyKBYdyzHHlQPUiZ+diZiZd5se6DlaE7Cpl4Cx7F1MPq07u0B
 +RKFfHzHguLljUgwEVO4LiZt/U+fWcGu++BkwzIPmTpolxqdaoW8oArBn5KPMY2k
 L3mE2M4ck71VytTrdG+rcQ/jzC9n2fb+65u9J908QV0KL797XGxlzd4E7vPmklaB
 9Ed0yRzndJgMe/aOv1J9RtG7ITpGnxffT1B73R0aOmrPuyu77cJnHY7q6clAqBOH
 vYH0/2VHTRT1qRhUZQTtCC2iYbDAu+ZMW1D6/oG3eC6cMqw2UFMM8yu1WRR1RI/N
 EZl5jLZqyVOulxDoo6f5ZXAohxbQjrBrxE6UOMnsC6Xi+nhng182snfXcahZA7O6
 DD7yY8hJPVIKQiMjXqqy
 =iiA9
 -----END PGP SIGNATURE-----

Merge tag 'l10n-2.18.0-rnd3' of git://github.com/git-l10n/git-po

l10n for Git 2.18.0 round 3

* tag 'l10n-2.18.0-rnd3' of git://github.com/git-l10n/git-po:
  l10n: zh_CN: for git v2.18.0 l10n round 1 to 3
  l10n: bg.po: Updated Bulgarian translation (3608t)
  l10n: vi.po(3608t): Update Vietnamese translation for v2.18.0 round 3
  l10n: fr.po v2.18.0 round 3
  l10n: es.po: Spanish update for v2.18.0 round 3
  l10n: git.pot: v2.18.0 round 3 (1 new, 1 removed)
  l10n: vi.po(3608t): Update Vietnamese translation for v2.18.0 round2
  l10n: bg.po: Updated Bulgarian translation (3608t)
  l10n: es.po: Spanish update for v2.18.0 round 2
  l10n: sv.po: Update Swedish translation (3608t0f0u)
  l10n: sv.po: Update Swedish translation (3470t0f0u)
  l10n: git.pot: v2.18.0 round 2 (144 new, 6 removed)
  l10n: fr.po v2.18 round 1
  l10n: vi(3470t): Updated Vietnamese translation for v2.18.0
  l10n: es.po: Spanish update for v2.18.0 round 1
  l10n: git.pot: v2.18.0 round 1 (108 new, 14 removed)
  l10n: TEAMS: remove inactive de team members
  l10n: de.po: fix typos
  l10n: Update Catalan translation
2018-06-18 10:21:24 -07:00
Junio C Hamano 1022379886 A bunch of micro-fixes before going 2.18 final
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 10:20:42 -07:00
Changwoo Ryu 4898dd2513 l10n: ko.po: Update Korean translation
Update the Korean translation and change the team leader to Gwan-gyeong
Mun.

Signed-off-by: Gwan-gyeong Mun <elongbug@gmail.com>
Signed-off-by: Changwoo Ryu <cwryu@debian.org>
Reviewed-by: Gwan-gyeong Mun <elongbug@gmail.com>
2018-06-19 02:19:42 +09:00
Junio C Hamano 698eb031bb Merge branch 'sb/blame-color'
Leakfix.

* sb/blame-color:
  blame: release string_list after use in parse_color_fields()
2018-06-18 10:18:45 -07:00
Junio C Hamano 23fc55a90c Merge branch 'mw/doc-merge-enumfix'
Fix old merge glitch in Documentation during v2.13-rc0 era.

* mw/doc-merge-enumfix:
  doc: update the order of the syntax `git merge --continue`
2018-06-18 10:18:45 -07:00
Junio C Hamano f72432d64e Merge branch 'en/rename-directory-detection'
Newly added codepath in merge-recursive had potential buffer
overrun, which has been fixed.

* en/rename-directory-detection:
  merge-recursive: use xstrdup() instead of fixed buffer
2018-06-18 10:18:44 -07:00
Junio C Hamano 929c097548 Merge branch 'rd/doc-remote-tracking-with-hyphen'
Doc update.

* rd/doc-remote-tracking-with-hyphen:
  Use hyphenated "remote-tracking branch" (docs and comments)
2018-06-18 10:18:43 -07:00
Junio C Hamano faff81287b Merge branch 'jl/zlib-restore-nul-termination'
Make zlib inflate codepath more robust against versions of zlib
that clobber unused portion of outbuf.

* jl/zlib-restore-nul-termination:
  packfile: correct zlib buffer handling
2018-06-18 10:18:43 -07:00
Junio C Hamano 094381ed79 Merge branch 'ab/cred-netrc-no-autodie'
Hotfix for contrib/ stuff broken by this cycle.

* ab/cred-netrc-no-autodie:
  git-credential-netrc: remove use of "autodie"
2018-06-18 10:18:42 -07:00
Junio C Hamano a626082629 Merge branch 'km/doc-workflows-typofix'
Typofix.

* km/doc-workflows-typofix:
  gitworkflows: fix grammar in 'Merge upwards' rule
2018-06-18 10:18:42 -07:00
Junio C Hamano e638899470 Merge branch 'ld/git-p4-updates'
"git p4" updates.

* ld/git-p4-updates:
  git-p4: auto-size the block
  git-p4: narrow the scope of exceptions caught when parsing an int
  git-p4: raise exceptions from p4CmdList based on error from p4 server
  git-p4: better error reporting when p4 fails
  git-p4: add option to disable syncing of p4/master with p4
  git-p4: disable-rebase: allow setting this via configuration
  git-p4: add options --commit and --disable-rebase
2018-06-18 10:18:41 -07:00
Junio C Hamano d676cc512a Merge branch 'rd/diff-options-typofix'
Typofix.

* rd/diff-options-typofix:
  diff-options.txt: fix minor typos, font inconsistencies, in docs
2018-06-18 10:18:41 -07:00
Junio C Hamano 1bd0e6779a Merge branch 'rd/comment-typofix-in-sha1-file'
In code comment typofix

* rd/comment-typofix-in-sha1-file:
  sha1-file.c: correct $GITDIR to $GIT_DIR in a comment
2018-06-18 10:18:40 -07:00
René Scharfe 94eff2b69a merge-recursive: use xstrdup() instead of fixed buffer
Paths can be longer than PATH_MAX.  Avoid a buffer overrun in
check_dir_renamed() by using xstrdup() to make a private copy safely.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 10:03:38 -07:00
SZEDER Gábor 2e157d134c RelNotes 2.18: minor fix to entry about dynamically loading completions
It was not "newer versions of bash" but newer versions of
bash-completion that made commit 085e2ee0e6 (completion: load
completion file for external subcommand, 2018-04-29) both necessary
and possible.

Update the corresponding RelNotes entry accordingly.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 09:50:56 -07:00
SZEDER Gábor 8de19d6be8 t7406-submodule-update: fix broken &&-chains
Three tests in 't7406-submodule-update' contain broken &&-chains, but
since they are all in subshells, chain-lint couldn't notice them.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 09:48:10 -07:00
Johannes Schindelin 76fda6ebbc rebase --root: fix amending root commit messages
The code path that triggered that "BUG" really does not want to run
without an explicit commit message. In the case where we want to amend a
commit message, we have an *implicit* commit message, though: the one of
the commit to amend. Therefore, this code path should not even be
entered.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 09:36:58 -07:00
Todd Zullinger 3a36ca0881 rebase --root: demonstrate a bug while amending root commit messages
When splitting a repository, running `git rebase -i --root` to reword
the initial commit, Git dies with

	BUG: sequencer.c:795: root commit without message.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 09:22:18 -07:00
Jeff King 1140bf01ec ewah: adjust callers of ewah_read_mmap()
The return value of ewah_read_mmap() is now an ssize_t,
since we could (in theory) process up to 32GB of data. This
would never happen in practice, but a corrupt or malicious
.bitmap or index file could convince us to do so.

Let's make sure that we don't stuff the value into an int,
which would cause us to incorrectly move our pointer
forward.  We'd always move too little, since negative values
are used for reporting errors. So the worst case is just
that we end up reporting a corrupt file, not an
out-of-bounds read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 09:13:57 -07:00
Jeff King 9d2e330b17 ewah_read_mmap: bounds-check mmap reads
The on-disk ewah format tells us how big the ewah data is,
and we blindly read that much from the buffer without
considering whether the mmap'd data is long enough, which
can lead to out-of-bound reads.

Let's make sure we have data available before reading it,
both for the ewah header/footer as well as for the bit data
itself. In particular:

  - keep our ptr/len pair in sync as we move through the
    buffer, and check it before each read

  - check the size for integer overflow (this should be
    impossible on 64-bit, as the size is given as a 32-bit
    count of 8-byte words, but is possible on a 32-bit
    system)

  - return the number of bytes read as an ssize_t instead of
    an int, again to prevent integer overflow

  - compute the return value using a pointer difference;
    this should yield the same result as the existing code,
    but makes it more obvious that we got our computations
    right

The included test is far from comprehensive, as it just
picks a static point at which to truncate the generated
bitmap. But in practice this will hit in the middle of an
ewah and make sure we're at least exercising this code.

Reported-by: Luat Nguyen <root@l4w.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18 09:13:57 -07:00