1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-06-08 19:26:10 +02:00
Commit Graph

859 Commits

Author SHA1 Message Date
Junio C Hamano 95c1a79630 Merge branch 'jk/info-alternates-fix' into maint
A regression fix for 2.11 that made the code to read the list of
alternate object stores overrun the end of the string.

* jk/info-alternates-fix:
  read_info_alternates: warn on non-trivial errors
  read_info_alternates: read contents into strbuf
2017-10-23 14:40:00 +09:00
Junio C Hamano 96c6bb566e Merge branch 'jk/write-in-full-fix' into maint
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.

* jk/write-in-full-fix:
  read_pack_header: handle signed/unsigned comparison in read result
  config: flip return value of store_write_*()
  notes-merge: use ssize_t for write_in_full() return value
  pkt-line: check write_in_full() errors against "< 0"
  convert less-trivial versions of "write_in_full() != len"
  avoid "write_in_full(fd, buf, len) != len" pattern
  get-tar-commit-id: check write_in_full() return against 0
  config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-10-23 14:37:22 +09:00
Junio C Hamano eeed979e6a Merge branch 'jk/sha1-loose-object-info-fix' into maint
Leakfix and futureproofing.

* jk/sha1-loose-object-info-fix:
  sha1_loose_object_info: handle errors from unpack_sha1_rest
2017-10-18 14:19:14 +09:00
Junio C Hamano 7c9375db0e Merge branch 'jk/drop-sha1-entry-pos' into maint
Code clean-up.

* jk/drop-sha1-entry-pos:
  sha1-lookup: remove sha1_entry_pos() from header file
  sha1_file: drop experimental GIT_USE_LOOKUP search
2017-10-18 14:19:06 +09:00
Jeff King b3ea7dd32d sha1_loose_object_info: handle errors from unpack_sha1_rest
When a caller of sha1_object_info_extended() sets the
"contentp" field in object_info, we call unpack_sha1_rest()
but do not check whether it signaled an error.

This causes two problems:

  1. We pass back NULL to the caller via the contentp field,
     but the function returns "0" for success. A caller
     might reasonably expect after a successful return that
     it can access contentp without a NULL check and
     segfault.

     As it happens, this is impossible to trigger in the
     current code. There is exactly one caller which uses
     contentp, read_object(). And the only thing it does
     after a successful call is to return the content
     pointer to its caller, using NULL as a sentinel for
     errors. So in effect it converts the success code from
     sha1_object_info_extended() back into an error!

     But this is still worth addressing avoid problems for
     future users of "contentp".

  2. Callers of unpack_sha1_rest() are expected to close the
     zlib stream themselves on error. Which means that we're
     leaking the stream.

The problem in (1) comes from from c84a1f3ed4 (sha1_file:
refactor read_object, 2017-06-21), which added the contentp
field.  Before that, we called unpack_sha1_rest() via
unpack_sha1_file(), which directly used the NULL to signal
an error.

But note that the leak in (2) is actually older than that.
The original unpack_sha1_file() directly returned the result
of unpack_sha1_rest() to its caller, when it should have
been closing the zlib stream itself on error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-06 13:04:41 +09:00
Jeff King f0f7bebef7 read_info_alternates: warn on non-trivial errors
When we fail to open $GIT_DIR/info/alternates, we silently
assume there are no alternates. This is the right thing to
do for ENOENT, but not for other errors.

A hard error is probably overkill here. If we fail to read
an alternates file then either we'll complete our operation
anyway, or we'll fail to find some needed object. Either
way, a warning is good idea. And we already have a helper
function to handle this pattern; let's just call
warn_on_fopen_error().

Note that technically the errno from strbuf_read_file()
might be from a read() error, not open(). But since read()
would never return ENOENT or ENOTDIR, and since it produces
a generic "unable to access" error, it's suitable for
handling errors from either.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-20 11:33:29 +09:00
Junio C Hamano 0db625f5d6 Merge branch 'jk/info-alternates-fix-2.11' into jk/info-alternates-fix
* jk/info-alternates-fix-2.11:
  read_info_alternates: read contents into strbuf
2017-09-20 11:33:06 +09:00
Jeff King dc732bd5cb read_info_alternates: read contents into strbuf
This patch fixes a regression in v2.11.1 where we might read
past the end of an mmap'd buffer. It was introduced in
cf3c635210.

The link_alt_odb_entries() function has always taken a
ptr/len pair as input. Until cf3c635210 (alternates: accept
double-quoted paths, 2016-12-12), we made a copy of those
bytes in a string. But after that commit, we switched to
parsing the input left-to-right, and we ignore "len"
totally, instead reading until we hit a NUL.

This has mostly gone unnoticed for a few reasons:

  1. All but one caller passes a NUL-terminated string, with
     "len" pointing to the NUL.

  2. The remaining caller, read_info_alternates(), passes in
     an mmap'd file. Unless the file is an exact multiple of
     the page size, it will generally be followed by NUL
     padding to the end of the page, which just works.

The easiest way to demonstrate the problem is to build with:

  make SANITIZE=address NO_MMAP=Nope test

Any test which involves $GIT_DIR/info/alternates will fail,
as the mmap emulation (correctly) does not add an extra NUL,
and ASAN complains about reading past the end of the buffer.

One solution would be to teach link_alt_odb_entries() to
respect "len". But it's actually a bit tricky, since we
depend on unquote_c_style() under the hood, and it has no
ptr/len variant.

We could also just make a NUL-terminated copy of the input
bytes and operate on that. But since all but one caller
already is passing a string, instead let's just fix that
caller to provide NUL-terminated input in the first place,
by swapping out mmap for strbuf_read_file().

There's no advantage to using mmap on the alternates file.
It's not expected to be large (and anyway, we're copying its
contents into an in-memory linked list). Nor is using
git_open() buying us anything here, since we don't keep the
descriptor open for a long period of time.

Let's also drop the "len" parameter entirely from
link_alt_odb_entries(), since it's completely ignored. That
will avoid any new callers re-introducing a similar bug.

Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-20 11:32:04 +09:00
Jeff King f48ecd38cb read_pack_header: handle signed/unsigned comparison in read result
The result of read_in_full() may be -1 if we saw an error.
But in comparing it to a sizeof() result, that "-1" will be
promoted to size_t. In fact, the largest possible size_t
which is much bigger than our struct size. This means that
our "< sizeof(header)" error check won't trigger.

In practice, we'd go on to read uninitialized memory and
compare it to the PACK signature, which is likely to fail.
But we shouldn't get there.

We can fix this by making a direct "!=" comparison to the
requested size, rather than "<". This means that errors get
lumped in with short reads, but that's sufficient for our
purposes here. There's no PH_ERROR tp represent our case.
And anyway, this function reads from pipes and network
sockets. A network error may racily appear as EOF to us
anyway if there's data left in the socket buffers.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:18:00 +09:00
Junio C Hamano f04f860dfa Merge branch 'sb/sha1-file-cleanup' into maint
Code clean-up.

* sb/sha1-file-cleanup:
  sha1_file: make read_info_alternates static
2017-09-10 17:03:04 +09:00
Junio C Hamano c580ce194f Merge branch 'rs/find-pack-entry-bisection' into maint
Code clean-up.

* rs/find-pack-entry-bisection:
  sha1_file: avoid comparison if no packed hash matches the first byte
2017-09-10 17:03:02 +09:00
Junio C Hamano 438776e3d4 Merge branch 'rs/unpack-entry-leakfix' into maint
Memory leak in an error codepath has been plugged.

* rs/unpack-entry-leakfix:
  sha1_file: release delta_stack on error in unpack_entry()
2017-09-10 17:02:53 +09:00
Stefan Beller 2456990dfd sha1_file: make read_info_alternates static
read_info_alternates is not used from outside, so let's make it static.

We have to declare the function before link_alt_odb_entry instead of
moving the code around, link_alt_odb_entry calls read_info_alternates,
which in turn calls link_alt_odb_entry.

Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-15 14:39:25 -07:00
René Scharfe 896dca3ab7 sha1_file: release delta_stack on error in unpack_entry()
When unpack_entry() encounters a broken packed object, it returns early.
It adjusts the reference count of the pack window, but leaks the buffer
for a big delta stack in case the small automatic one was not enough.
Jump to the cleanup code at end instead, which takes care of that.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-10 15:42:46 -07:00
Jeff King f1068efefe sha1_file: drop experimental GIT_USE_LOOKUP search
Long ago in 628522ec14 (sha1-lookup: more memory efficient
search in sorted list of SHA-1, 2007-12-29) we added
sha1_entry_pos(), a binary search that uses the uniform
distribution of sha1s to scale the selection of mid-points.
As this was a performance experiment, we tied it to the
GIT_USE_LOOKUP environment variable and never enabled it by
default.

This code was successful in reducing the number of steps in
each search. But the overhead of the scaling ends up making
it slower when the cache is warm. Here are best-of-five
timings for running rev-list on linux.git, which will have
to look up every object:

  $ time git rev-list --objects --all >/dev/null
  real	0m35.357s
  user	0m35.016s
  sys	0m0.340s

  $ time GIT_USE_LOOKUP=1 git rev-list --objects --all >/dev/null
  real	0m37.364s
  user	0m37.045s
  sys	0m0.316s

The USE_LOOKUP version might have more benefit on a cold
cache, as the time to fault in each page would dominate. But
that would be for a single lookup. In practice, most
operations tend to look up many objects, and the whole pack
.idx will end up warm.

It's possible that the code could be better optimized to
compete with a naive binary search for the warm-cache case,
and we could have the best of both worlds. But over the
years nobody has done so, and this is largely dead code that
is rarely run outside of the test suite. Let's drop it in
the name of simplicity.

This lets us remove sha1_entry_pos() entirely, as the .idx
lookup code was the only caller.  Note that sha1-lookup.c
still contains sha1_pos(), which differs from
sha1_entry_pos() in two ways:

  - it has a different interface; it uses a function pointer
    to access sha1 entries rather than a size/offset pair
    describing the table's memory layout

  - it only scales the initial selection of "mi", rather
    than each iteration of the search

We can't get rid of this function, as it's called from
several places. It may be that we could replace it with a
simple binary search, but that's out of scope for this patch
(and would need benchmarking).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-09 11:03:35 -07:00
René Scharfe 6355a76802 sha1_file: avoid comparison if no packed hash matches the first byte
find_pack_entry_one() uses the fan-out table of pack indexes to find out
which entries match the first byte of the searched hash and does a
binary search on this subset of the main index table.

If there are no matching entries then lo and hi will have the same
value.  The binary search still starts and compares the hash of the
following entry (which has a non-matching first byte, so won't cause any
trouble), or whatever comes after the sorted list of entries.

The probability of that stray comparison matching by mistake is low, but
let's not take any chances and check when entering the binary search
loop if we're actually done already.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-09 09:52:25 -07:00
Junio C Hamano 2842e06352 Merge branch 'ew/fd-cloexec-fix'
Portability/fallback fix.

* ew/fd-cloexec-fix:
  set FD_CLOEXEC properly when O_CLOEXEC is not supported
2017-07-20 16:30:00 -07:00
Eric Wong 9fb9495dae set FD_CLOEXEC properly when O_CLOEXEC is not supported
FD_CLOEXEC only applies to the file descriptor, so it needs to be
manipuluated via F_GETFD/F_SETFD.  F_GETFL/F_SETFL are for file
description flags.

Verified via strace with o_cloexec set to zero.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:52:16 -07:00
Junio C Hamano 91f6922544 Merge branch 'sb/hashmap-customize-comparison'
Update the hashmap API so that data to customize the behaviour of
the comparison function can be specified at the time a hashmap is
initialized.

* sb/hashmap-customize-comparison:
  hashmap: migrate documentation from Documentation/technical into header
  patch-ids.c: use hashmap correctly
  hashmap.h: compare function has access to a data field
2017-07-13 16:14:54 -07:00
Junio C Hamano 00b7cf2379 Merge branch 'jt/unify-object-info'
Code clean-ups.

* jt/unify-object-info:
  sha1_file: refactor has_sha1_file_with_flags
  sha1_file: do not access pack if unneeded
  sha1_file: teach sha1_object_info_extended more flags
  sha1_file: refactor read_object
  sha1_file: move delta base cache code up
  sha1_file: rename LOOKUP_REPLACE_OBJECT
  sha1_file: rename LOOKUP_UNKNOWN_OBJECT
  sha1_file: teach packed_object_info about typename
2017-07-05 13:32:57 -07:00
Junio C Hamano 5ab148dda0 Merge branch 'rs/sha1-name-readdir-optim'
Optimize "what are the object names already taken in an alternate
object database?" query that is used to derive the length of prefix
an object name is uniquely abbreviated to.

* rs/sha1-name-readdir-optim:
  sha1_file: guard against invalid loose subdirectory numbers
  sha1_file: let for_each_file_in_obj_subdir() handle subdir names
  p4205: add perf test script for pretty log formats
  sha1_name: cache readdir(3) results in find_short_object_filename()
2017-07-05 13:32:56 -07:00
Stefan Beller 7663cdc86c hashmap.h: compare function has access to a data field
When using the hashmap a common need is to have access to caller provided
data in the compare function. A couple of times we abuse the keydata field
to pass in the data needed. This happens for example in patch-ids.c.

This patch changes the function signature of the compare function
to have one more void pointer available. The pointer given for each
invocation of the compare function must be defined in the init function
of the hashmap and is just passed through.

Documentation of this new feature is deferred to a later patch.
This is a rather mechanical conversion, just adding the new pass-through
parameter.  However while at it improve the naming of the fields of all
compare functions used by hashmaps by ensuring unused parameters are
prefixed with 'unused_' and naming the parameters what they are (instead
of 'unused' make it 'unused_keydata').

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 12:49:28 -07:00
Jonathan Tan e83e71c5e1 sha1_file: refactor has_sha1_file_with_flags
has_sha1_file_with_flags() implements many mechanisms in common with
sha1_object_info_extended(). Make has_sha1_file_with_flags() a
convenience function for sha1_object_info_extended() instead.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26 10:28:58 -07:00
Jonathan Tan cd585e2a33 sha1_file: do not access pack if unneeded
Currently, regardless of the contents of the "struct object_info" passed
to sha1_object_info_extended(), that function always accesses the
packfile whenever it returns information about a packed object, since it
needs to populate "u.packed".

Add the ability to pass NULL, and use NULL-ness of the argument to
activate an optimization in which sha1_object_info_extended() does not
needlessly access the packfile. A subsequent patch will make use of this
optimization.

A similar optimization is not made for the cached and loose cases as it
would not cause a significant performance improvement.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26 10:28:58 -07:00
Jonathan Tan dfdd4afcf9 sha1_file: teach sha1_object_info_extended more flags
Improve sha1_object_info_extended() by supporting additional
flags. This allows has_sha1_file_with_flags() to be modified to use
sha1_object_info_extended() in a subsequent patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26 10:28:42 -07:00
Junio C Hamano 50f03c6676 Merge branch 'ab/free-and-null'
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.

* ab/free-and-null:
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro
  coccinelle: make use of the "expression" FREE_AND_NULL() rule
  coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
  coccinelle: make use of the "type" FREE_AND_NULL() rule
  coccinelle: add a rule to make "type" code use FREE_AND_NULL()
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-24 14:28:41 -07:00
Junio C Hamano f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
René Scharfe 70c49050d4 sha1_file: guard against invalid loose subdirectory numbers
Loose object subdirectories have hexadecimal names based on the first
byte of the hash of contained objects, thus their numerical
representation can range from 0 (0x00) to 255 (0xff).  Change the type
of the corresponding variable in for_each_file_in_obj_subdir() and
associated callback functions to unsigned int and add a range check.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24 11:09:52 -07:00
René Scharfe 0375f472d4 sha1_file: let for_each_file_in_obj_subdir() handle subdir names
The function for_each_file_in_obj_subdir() takes a object subdirectory
number and expects the name of the same subdirectory to be included in
the path strbuf.  Avoid this redundancy by letting the function append
the hexadecimal subdirectory name itself.  This makes it a bit easier
and safer to use the function -- it becomes impossible to specify
different subdirectories in subdir_nr and path.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24 11:09:50 -07:00
René Scharfe cc817ca3ef sha1_name: cache readdir(3) results in find_short_object_filename()
Read each loose object subdirectory at most once when looking for unique
abbreviated hashes.  This speeds up commands like "git log --pretty=%h"
considerably, which previously caused one readdir(3) call for each
candidate, even for subdirectories that were visited before.

The new cache is kept until the program ends and never invalidated.  The
same is already true for pack indexes.  The inherent racy nature of
finding unique short hashes makes it still fit for this purpose -- a
conflicting new object may be added at any time.  Tasks with higher
consistency requirements should not use it, though.

The cached object names are stored in an oid_array, which is quite
compact.  The bitmap for remembering which subdir was already read is
stored as a char array, with one char per directory -- that's not quite
as compact, but really simple and incurs only an overhead equivalent to
11 hashes after all.

Suggested-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-22 12:07:51 -07:00
Jonathan Tan c84a1f3ed4 sha1_file: refactor read_object
read_object() and sha1_object_info_extended() both implement mechanisms
such as object replacement, retrying the packed store after failing to
find the object in the packed store then the loose store, and being able
to mark a packed object as bad and then retrying the whole process.
Consolidating these mechanisms would be a great help to maintainability.

Therefore, consolidate them by extending sha1_object_info_extended() to
support the functionality needed, and then modifying read_object() to
use sha1_object_info_extended().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21 18:54:43 -07:00
Jonathan Tan 845b102b99 sha1_file: move delta base cache code up
In a subsequent patch, packed_object_info() will be modified to use the
delta base cache, so move the relevant code to before
packed_object_info().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21 18:54:43 -07:00
Jonathan Tan 1f0c0d36c1 sha1_file: rename LOOKUP_REPLACE_OBJECT
The LOOKUP_REPLACE_OBJECT flag controls whether the
lookup_replace_object() function is invoked by
sha1_object_info_extended(), read_sha1_file_extended(), and
lookup_replace_object_extended(), but it is not immediately clear which
functions accept that flag.

Therefore restrict this flag to only sha1_object_info_extended(),
renaming it appropriately to OBJECT_INFO_LOOKUP_REPLACE and adding some
documentation. Update read_sha1_file_extended() to have a boolean
parameter instead, and delete lookup_replace_object_extended().

parse_sha1_header() also passes this flag to
parse_sha1_header_extended() since commit 46f0344 ("sha1_file: support
reading from a loose object of unknown type", 2015-05-03), but that has
had no effect since that commit. Therefore this patch also removes this
flag from that invocation.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21 18:54:43 -07:00
Jonathan Tan 19fc5e84a7 sha1_file: rename LOOKUP_UNKNOWN_OBJECT
The LOOKUP_UNKNOWN_OBJECT flag was introduced in commit 46f0344
("sha1_file: support reading from a loose object of unknown type",
2015-05-03) in order to support a feature in cat-file subsequently
introduced in commit 39e4ae3 ("cat-file: teach cat-file a
'--allow-unknown-type' option", 2015-05-03). Despite its name and
location in cache.h, this flag is used neither in
read_sha1_file_extended() nor in any of the lookup functions, but used
only in sha1_object_info_extended().

Therefore rename this flag to OBJECT_INFO_ALLOW_UNKNOWN_TYPE, taking the
name of the cat-file flag that invokes this feature, and move it closer
to the declaration of sha1_object_info_extended(). Also add
documentation for this flag.

OBJECT_INFO_ALLOW_UNKNOWN_TYPE is defined to 2, not 1, to avoid
conflicting with LOOKUP_REPLACE_OBJECT. Avoidance of this conflict is
necessary because sha1_object_info_extended() supports both flags.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21 18:54:43 -07:00
Ævar Arnfjörð Bjarmason 6a83d90207 coccinelle: make use of the "type" FREE_AND_NULL() rule
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:03 -07:00
Brandon Williams b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Jonathan Tan 285a2984bd sha1_file: teach packed_object_info about typename
In commit 46f0344 ("sha1_file: support reading from a loose object of
unknown type", 2015-05-06), "struct object_info" gained a "typename"
field that could represent a type name from a loose object file, whether
valid or invalid, as opposed to the existing "typep" which could only
represent valid types. Some relatively complex manipulations were added
to avoid breaking packed_object_info() without modifying it, but it is
much easier to just teach packed_object_info() about the new field.
Therefore, teach packed_object_info() as described above.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 09:51:57 -07:00
Brandon Williams 82b474e025 convert: convert convert_to_git to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams d6c41c20e6 convert: convert convert_to_git_filter_fd to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Junio C Hamano d7f8a37852 Merge branch 'jk/loose-object-fsck'
Code cleanup.

* jk/loose-object-fsck:
  sha1_file: remove an used fd variable
2017-04-23 22:07:50 -07:00
Junio C Hamano eb3af74e93 Merge branch 'jk/no-looking-at-dotgit-outside-repo'
Clean up fallouts from recent tightening of the set-up sequence,
where Git barfs when repository information is accessed without
first ensuring that it was started in a repository.

* jk/no-looking-at-dotgit-outside-repo:
  test-read-cache: setup git dir
  has_sha1_file: don't bother if we are not in a repository
2017-04-19 21:37:20 -07:00
Junio C Hamano b1081e4004 Merge branch 'bc/object-id'
Conversion from unsigned char [40] to struct object_id continues.

* bc/object-id:
  Documentation: update and rename api-sha1-array.txt
  Rename sha1_array to oid_array
  Convert sha1_array_for_each_unique and for_each_abbrev to object_id
  Convert sha1_array_lookup to take struct object_id
  Convert remaining callers of sha1_array_lookup to object_id
  Make sha1_array_append take a struct object_id *
  sha1-array: convert internal storage for struct sha1_array to object_id
  builtin/pull: convert to struct object_id
  submodule: convert check_for_new_submodule_commits to object_id
  sha1_name: convert disambiguate_hint_fn to take object_id
  sha1_name: convert struct disambiguate_state to object_id
  test-sha1-array: convert most code to struct object_id
  parse-options-cb: convert sha1_array_append caller to struct object_id
  fsck: convert init_skiplist to struct object_id
  builtin/receive-pack: convert portions to struct object_id
  builtin/pull: convert portions to struct object_id
  builtin/diff: convert to struct object_id
  Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
  Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
  Define new hash-size constants for allocating memory
2017-04-19 21:37:13 -07:00
Junio C Hamano dfe46c5ce6 Merge branch 'jk/loose-object-info-report-error'
Update error handling for codepath that deals with corrupt loose
objects.

* jk/loose-object-info-report-error:
  index-pack: detect local corruption in collision check
  sha1_loose_object_info: return error for corrupted objects
2017-04-16 23:29:30 -07:00
Sebastian Schuberth 0747fb49fd sha1_file: remove an used fd variable
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:29:18 -07:00
Jonathan Nieder 3e8b7d3c77 has_sha1_file: don't bother if we are not in a repository
Most callers to this function already require that they are in a
git repository, but there is an exception: "git apply" uses
has_sha1_file to avoid work if the result of applying a binary
patch is already present in the repository. When run outside any
repository, this produces an error:

 fatal: BUG: setup_git_env called without repository

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13 18:33:11 -07:00
Jeff King 93cff9a978 sha1_loose_object_info: return error for corrupted objects
When sha1_loose_object_info() finds that a loose object file
cannot be stat(2)ed or mmap(2)ed, it returns -1 to signal an
error to the caller.  However, if it found that the loose
object file is corrupt and the object data cannot be used
from it, it stuffs OBJ_BAD into "type" field of the
object_info, but returns zero (i.e., success), which can
confuse callers.

This is due to 052fe5eac (sha1_loose_object_info: make type
lookup optional, 2013-07-12), which switched the return to a
strict success/error, rather than returning the type (but
botched the return).

Callers of regular sha1_object_info() don't notice the
difference, as that function returns the type (which is
OBJ_BAD in this case). However, direct callers of
sha1_object_info_extended() see the function return success,
but without setting any meaningful values in the object_info
struct, leading them to access potentially uninitialized
memory.

The easiest way to see the bug is via "cat-file -s", which
will happily ignore the corruption and report whatever
value happened to be in the "size" variable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-01 10:45:16 -07:00
Jeff King 1a168e5c86 convert unchecked snprintf into xsnprintf
These calls to snprintf should always succeed, because their
input is small and fixed. Let's use xsnprintf to make sure
this is the case (and to make auditing for actual truncation
easier).

These could be candidates for turning into heap buffers, but
they fall into a few broad categories that make it not worth
doing:

  - formatting single numbers is simple enough that we can
    see the result should fit

  - the size of a sha1 is likewise well-known, and I didn't
    want to cause unnecessary conflicts with the ongoing
    process to convert these constants to GIT_MAX_HEXSZ

  - the interface for curl_errorstr is dictated by curl

Signed-off-by: Jeff King <peff@peff.net>
2017-03-30 14:59:50 -07:00
Junio C Hamano ba5e05ffef Merge branch 'jk/pack-name-cleanups' into maint
Code clean-up.

* jk/pack-name-cleanups:
  index-pack: make pointer-alias fallbacks safer
  replace snprintf with odb_pack_name()
  odb_pack_keep(): stop generating keepfile name
  sha1_file.c: make pack-name helper globally accessible
  move odb_* declarations out of git-compat-util.h
2017-03-28 13:52:25 -07:00
brian m. carlson cd02599c48 Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 20 bytes, use the constant
GIT_MAX_RAWSZ, which is designed to be suitable for allocations, instead
of GIT_SHA1_RAWSZ.  This will ease the transition down the line by
distinguishing between places where we need to allocate memory suitable
for the largest hash from those where we need to handle the current
hash.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-26 22:08:21 -07:00
brian m. carlson dc01505f7f Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 40 hex characters, use the
constant GIT_MAX_HEXSZ, which is designed to be suitable for
allocations, instead of GIT_SHA1_HEXSZ.  This will ease the transition
down the line by distinguishing between places where we need to allocate
memory suitable for the largest hash from those where we need to handle
the current hash.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-26 22:08:21 -07:00