mirror/git - git - git.dotya.ml

mirror/ git

mirror of https://github.com/git/git.git synced 2024-06-01 21:46:10 +02:00

Author	SHA1	Message	Date
René Scharfe	86ad3ea5cf	sha1-file: release strbuf after use Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-07 12:28:57 -07:00
Junio C Hamano	68e65ded5b	Merge branch 'jk/check-connected-with-alternates' The tips of refs from the alternate object store can be used as starting point for reachability computation now. * jk/check-connected-with-alternates: check_everything_connected: assume alternate ref tips are valid object-store.h: move for_each_alternate_ref() from transport.h	2019-07-19 11:30:21 -07:00
Jeff King	709dfa6990	object-store.h: move for_each_alternate_ref() from transport.h There's nothing inherently transport-related about enumerating the alternate ref tips. The code has lived in transport.[ch] because the only use so far had been advertising available tips during transport. But it could be used for more, and a future patch will teach rev-list to access these refs. Let's move it alongside the other alt-odb code, declaring it in object-store.h with the implementation in sha1-file.c. This lets us drop the inclusion of transport.h from receive-pack, which perhaps shows how it was misplaced (though receive-pack is about transporting objects, transport.h is mostly about the client side). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-07-01 09:47:29 -07:00
Nguyễn Thái Ngọc Duy	d3b4705ab8	sha1-file.c: remove the_repo from read_object_with_reference() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-06-27 12:45:17 -07:00
Junio C Hamano	5d5c46b28c	Merge branch 'ds/object-info-for-prefetch-fix' Code cleanup and futureproof. * ds/object-info-for-prefetch-fix: sha1-file: split OBJECT_INFO_FOR_PREFETCH	2019-06-17 10:15:14 -07:00
Derrick Stolee	31f5256c82	sha1-file: split OBJECT_INFO_FOR_PREFETCH The OBJECT_INFO_FOR_PREFETCH bitflag was added to sha1-file.c in `0f4a4fb1` (sha1-file: support OBJECT_INFO_FOR_PREFETCH, 2019-03-29) and is used to prevent the fetch_objects() method when enabled. However, there is a problem with the current use. The definition of OBJECT_INFO_FOR_PREFETCH is given by adding 32 to OBJECT_INFO_QUICK. This is clearly stated above the definition (in a comment) that this is so OBJECT_INFO_FOR_PREFETCH implies OBJECT_INFO_QUICK. The problem is that using "flag & OBJECT_INFO_FOR_PREFETCH" means that OBJECT_INFO_QUICK also implies OBJECT_INFO_FOR_PREFETCH. Split out the single bit from OBJECT_INFO_FOR_PREFETCH into a new OBJECT_INFO_SKIP_FETCH_OBJECT as the single bit and keep OBJECT_INFO_FOR_PREFETCH as the union of two flags. This allows a clearer use of flag checking while also keeping the implication of OBJECT_INFO_QUICK. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-28 13:31:50 -07:00
Junio C Hamano	32dc15dec1	Merge branch 'jt/batch-fetch-blobs-in-diff' While running "git diff" in a lazy clone, we can upfront know which missing blobs we will need, instead of waiting for the on-demand machinery to discover them one by one. Aim to achieve better performance by batching the request for these promised blobs. * jt/batch-fetch-blobs-in-diff: diff: batch fetching of missing blobs sha1-file: support OBJECT_INFO_FOR_PREFETCH	2019-04-25 16:41:19 +09:00
Jonathan Tan	0f4a4fb1c4	sha1-file: support OBJECT_INFO_FOR_PREFETCH Teach oid_object_info_extended() to support a new flag that inhibits fetching of missing objects. This is equivalent to setting fetch_is_missing to 0, calling oid_object_info_extended(), then setting fetch_if_missing to whatever it was before. Update unpack-trees.c to use this new flag instead of repeatedly setting fetch_if_missing. This new flag complicates things slightly in that there are now 2 ways to do the same thing. But this eliminates the need to repeatedly set a global variable, and more importantly, allows prefetching to be done in parallel (in the future); hence, this patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-01 15:47:15 +09:00
brian m. carlson	95399788d1	hash: add a function to lookup hash algorithm by length There are some cases, such as the dumb HTTP transport and bundles, where we can only determine the hash algorithm in use by the length of the object IDs. Provide a function that looks up the algorithm by length. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-01 11:57:39 +09:00
Junio C Hamano	cba595ab1a	Merge branch 'jk/loose-object-cache-oid' Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references	2019-02-06 22:05:27 -08:00
Junio C Hamano	b99a579f8e	Merge branch 'sb/more-repo-in-api' The in-core repository instances are passed through more codepaths. * sb/more-repo-in-api: (23 commits) t/helper/test-repository: celebrate independence from the_repository path.h: make REPO_GIT_PATH_FUNC repository agnostic commit: prepare free_commit_buffer and release_commit_memory for any repo commit-graph: convert remaining functions to handle any repo submodule: don't add submodule as odb for push submodule: use submodule repos for object lookup pretty: prepare format_commit_message to handle arbitrary repositories commit: prepare logmsg_reencode to handle arbitrary repositories commit: prepare repo_unuse_commit_buffer to handle any repo commit: prepare get_commit_buffer to handle any repo commit-reach: prepare in_merge_bases[_many] to handle any repo commit-reach: prepare get_merge_bases to handle any repo commit-reach.c: allow get_merge_bases_many_0 to handle any repo commit-reach.c: allow remove_redundant to handle any repo commit-reach.c: allow merge_bases_many to handle any repo commit-reach.c: allow paint_down_to_common to handle any repo commit: allow parse_commit* to handle any repo object: parse_object to honor its repository argument object-store: prepare has_{sha1, object}_file to handle any repo object-store: prepare read_object_file to deal with any repo ...	2019-02-05 14:26:09 -08:00
Junio C Hamano	33e4ae9c50	Merge branch 'bc/sha-256' Add sha-256 hash and plug it through the code to allow building Git with the "NewHash". * bc/sha-256: hash: add an SHA-256 implementation using OpenSSL sha256: add an SHA-256 implementation using libgcrypt Add a base implementation of SHA-256 support commit-graph: convert to using the_hash_algo t/helper: add a test helper to compute hash speed sha1-file: add a constant for hash block size t: make the sha1 test-tool helper generic t: add basic tests for our SHA-1 implementation cache: make hashcmp and hasheq work with larger hashes hex: introduce functions to print arbitrary hashes sha1-file: provide functions to look up hash algorithms sha1-file: rename algorithm to "sha1"	2019-01-29 12:47:55 -08:00
Junio C Hamano	2c0a645d9e	Merge branch 'rs/sha1-file-close-mapped-file-on-error' Code clean-up. * rs/sha1-file-close-mapped-file-on-error: sha1-file: close fd of empty file in map_sha1_file_1()	2019-01-18 13:49:56 -08:00
Jeff King	01f8d5948a	prefer "hash mismatch" to "sha1 mismatch" To future-proof ourselves against a change in the hash, let's use the more generic "hash mismatch" to refer to integrity problems. Note that we do advertise this exact string in git-fsck(1). However, the message itself is marked for translation, meaning we do not expect it to be machine-readable. While we're touching that documentation, let's also update it for grammar and clarity. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:41:06 -08:00
Jeff King	2c319886c0	sha1-file: avoid "sha1 file" for generic use in messages These error messages say "sha1 file", which is vague and not common in user-facing documentation. Unlike the conversions from the previous commit, these do not always refer to loose objects. In finalize_object_file() we could be dealing with a packfile. Let's just say "unable to write file" instead; since we include the filename, the nature of the file is clear from the rest of the message. In force_object_loose(), we're calling into read_object(), which could actually be _any_ type of object. Just say "object". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:41:06 -08:00
Jeff King	760113574f	sha1-file: prefer "loose object file" to "sha1 file" in messages When we're reporting an error for a loose object, let's use that term. It's more consistent with other parts of Git, and it is future-proof against changes to the hash function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:41:06 -08:00
Jeff King	5d3679ee02	sha1-file: drop has_sha1_file() There are no callers left of has_sha1_file() or its with_flags() variant. Let's drop them, and convert has_object_file() from a wrapper into the "real" function. Ironically, the sha1 variant was just copying into an object_id internally, so the resulting code is actually shorter! We can also drop the coccinelle rules for catching has_sha1_file() callers. Since the function no longer exists, the compiler will do that for us. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:41:06 -08:00
Jeff King	98374a07c9	convert has_sha1_file() callers to has_object_file() The only remaining callers of has_sha1_file() actually have an object_id already. They can use the "object" variant, rather than dereferencing the hash themselves. The code changes here were completely generated by the included coccinelle patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:41:06 -08:00
Jeff King	d7a245730b	sha1-file: convert pass-through functions to object_id These two static functions, read_object() and quick_has_loose(), both have to hashcpy() their bare-sha1 arguments into object_id structs to pass them along. Since all of their callers actually have object_id structs in the first place, we can eliminate the copying by adjusting their input parameters. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:41:06 -08:00
René Scharfe	8be88dbcb1	object-store: retire odb_load_loose_cache() Inline odb_load_loose_cache() into its only remaining caller, odb_loose_cache(). The latter offers a nicer interface for loading the cache, as it doesn't require callers to deal with fanout directory numbers directly. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Jeff King	00a7760e81	sha1-file: modernize loose header/stream functions As with the open/map/close functions for loose objects that were recently converted, the functions for parsing the loose object stream use the name "sha1" and a bare "unsigned char *". Let's fix that so that unpack_sha1_header() becomes unpack_loose_header(), etc. These conversions are less clear-cut than the file access functions. You could argue that the they are parsing Git's canonical object format (i.e., "type size\0contents", over which we compute the hash), which is not strictly tied to loose storage. But in practice these functions are used only for loose objects, and using the term "loose_header" (instead of "object_header") distinguishes it from the object header found in packfiles (which contains the same information in a different format). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
René Scharfe	4cea1ce0f6	object-store: use one oid_array per subdirectory for loose cache The loose objects cache is filled one subdirectory at a time as needed. It is stored in an oid_array, which has to be resorted after each add operation. So when querying a wide range of objects, the partially filled array needs to be resorted up to 255 times, which takes over 100 times longer than sorting once. Use one oid_array for each subdirectory. This ensures that entries have to only be sorted a single time. It also avoids eight binary search steps for each cache lookup as a small bonus. The cache is used for collision checks for the log placeholders %h, %t and %p, and we can see the change speeding them up in a repository with ca. 100 objects per subdirectory: $ git count-objects 26733 objects, 68808 kilobytes Test HEAD^ HEAD -------------------------------------------------------------------- 4205.1: log with %H 0.51(0.47+0.04) 0.51(0.49+0.02) +0.0% 4205.2: log with %h 0.84(0.82+0.02) 0.60(0.57+0.03) -28.6% 4205.3: log with %T 0.53(0.49+0.04) 0.52(0.48+0.03) -1.9% 4205.4: log with %t 0.84(0.80+0.04) 0.60(0.59+0.01) -28.6% 4205.5: log with %P 0.52(0.48+0.03) 0.51(0.50+0.01) -1.9% 4205.6: log with %p 0.85(0.78+0.06) 0.61(0.56+0.05) -28.2% 4205.7: log with %h-%h-%h 0.96(0.92+0.03) 0.69(0.64+0.04) -28.1% Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Jeff King	514c5fdd03	sha1-file: modernize loose object file functions The loose object access code in sha1-file.c is some of the oldest in Git, and could use some modernizing. It mostly uses "unsigned char *" for object ids, which these days should be "struct object_id". It also uses the term "sha1_file" in many functions, which is confusing. The term "loose_objects" is much better. It clearly distinguishes them from packed objects (which didn't even exist back when the name "sha1_file" came into being). And it also distinguishes it from the checksummed-file concept in csum-file.c (which until recently was actually called "struct sha1file"!). This patch converts the functions {open,close,map,stat}_sha1_file() into open_loose_object(), etc, and switches their sha1 arguments for object_id structs. Similarly, path functions like fill_sha1_path() become fill_loose_path() and use object_ids. The function sha1_loose_object_info() already says "loose", so we can just drop the "sha1" (and teach it to use object_id). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
René Scharfe	d4e19e5163	object-store: factor out odb_clear_loose_cache() Add and use a function for emptying the loose object cache, so callers don't have to know any of its implementation details. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
René Scharfe	0000d6543f	object-store: factor out odb_loose_cache() Add and use a function for loading the entries of a loose object subdirectory for a given object ID. It frees callers from deriving the fanout key; they can use the returned oid_array reference for lookups or forward range scans. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Jeff King	cb1c8d1d3c	sha1-file: fix outdated sha1 comment references Commit `17e65451e3` (sha1_file: convert check_sha1_signature to struct object_id, 2018-03-12) switched to using the name "oid", but forgot to update the variable name in the comment. Likewise, `b4f5aca40e` (sha1_file: convert read_sha1_file to struct object_id, 2018-03-12) dropped the name read_sha1_file(), but missed a comment which mentions it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
René Scharfe	6881925ef5	sha1-file: close fd of empty file in map_sha1_file_1() map_sha1_file_1() checks if the file it is about to mmap() is empty and errors out in that case and explains the situation in an error message. It leaks the private handle to that empty file, though. Have the function clean up after itself and close the file descriptor before exiting early. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-07 08:57:18 -08:00
Junio C Hamano	3b2f8a02fa	Merge branch 'jk/loose-object-cache' Code clean-up with optimization for the codepath that checks (non-)existence of loose objects. * jk/loose-object-cache: odb_load_loose_cache: fix strbuf leak fetch-pack: drop custom loose object cache sha1-file: use loose object cache for quick existence check object-store: provide helpers for loose_objects_cache sha1-file: use an object_directory for the main object dir handle alternates paths the same as the main object dir sha1_file_name(): overwrite buffer instead of appending rename "alternate_object_database" to "object_directory" submodule--helper: prefer strip_suffix() to ends_with() fsck: do not reuse child_process structs	2019-01-04 13:33:32 -08:00
Jeff King	7317aa7153	odb_load_loose_cache: fix strbuf leak Commit `3a2e0824` ("object-store: provide helpers for loose_objects_cache", 2018-11-12) moved the cache-loading code from find_short_object_filename(), but forgot the line that releases the path strbuf. Reported-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-24 11:13:24 +09:00
Junio C Hamano	f5f0f68d61	Merge branch 'tb/print-size-t-with-uintmax-format' Code preparation to replace ulong vars with size_t vars where appropriate. * tb/print-size-t-with-uintmax-format: Upcast size_t variables to uintmax_t when printing	2018-11-19 16:24:41 +09:00
Stefan Beller	9b45f49981	object-store: prepare has_{sha1, object}_file to handle any repo Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 17:22:40 +09:00
Stefan Beller	a3b72c89bd	object-store: allow read_object_file_extended to read from any repo read_object_file_extended is not widely used, so migrate it all at once. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 17:22:40 +09:00
brian m. carlson	13eeedb5d1	Add a base implementation of SHA-256 support SHA-1 is weak and we need to transition to a new hash function. For some time, we have referred to this new function as NewHash. Recently, we decided to pick SHA-256 as NewHash. The reasons behind the choice of SHA-256 are outlined in the thread starting at [1] and in the commit history for the hash function transition document. Add a basic implementation of SHA-256 based off libtomcrypt, which is in the public domain. Optimize it and restructure it to meet our coding standards. Pull in the update and final functions from the SHA-1 block implementation, as we know these function correctly with all compilers. This implementation is slower than SHA-1, but more performant implementations will be introduced in future commits. Wire up SHA-256 in the list of hash algorithms, and add a test that the algorithm works correctly. Note that with this patch, it is still not possible to switch to using SHA-256 in Git. Additional patches are needed to prepare the code to handle a larger hash algorithm and further test fixes are needed. [1] https://public-inbox.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/ Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:53 +09:00
brian m. carlson	a2ce0a7526	sha1-file: add a constant for hash block size There is one place we need the hash algorithm block size: the HMAC code for push certs. Expose this constant in struct git_hash_algo and expose values for SHA-1 and for the largest value of any hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:52 +09:00
Junio C Hamano	879a8d4bf2	Merge branch 'jk/detect-truncated-zlib-input' A regression in Git 2.12 era made "git fsck" fall into an infinite loop while processing truncated loose objects. * jk/detect-truncated-zlib-input: cat-file: handle streaming failures consistently check_stream_sha1(): handle input underflow t1450: check large blob in trailing-garbage test	2018-11-13 22:37:17 +09:00
Jeff King	61c7711cfe	sha1-file: use loose object cache for quick existence check In cases where we expect to ask has_sha1_file() about a lot of objects that we are not likely to have (e.g., during fetch negotiation), we already use OBJECT_INFO_QUICK to sacrifice accuracy (due to racing with a simultaneous write or repack) for speed (we avoid re-scanning the pack directory). However, even checking for loose objects can be expensive, as we will stat() each one. On many systems this cost isn't too noticeable, but stat() can be particularly slow on some operating systems, or due to network filesystems. Since the QUICK flag already tells us that we're OK with a slightly stale answer, we can use that as a cue to look in our in-memory cache of each object directory. That basically trades an in-memory binary search for a stat() call. Note that it is possible for this to actually be _slower_. We'll do a full readdir() to fill the cache, so if you have a very large number of loose objects and a very small number of lookups, that readdir() may end up more expensive. This shouldn't be a big deal in practice. If you have a large number of reachable loose objects, you'll already run into performance problems (which you should remedy by repacking). You may have unreachable objects which wouldn't otherwise impact performance. Usually these would go away with the prune step of "git gc", but they may be held for up to 2 weeks in the default configuration. So it comes down to how many such objects you might reasonably expect to have, how much slower is readdir() on N entries versus M stat() calls (and here we really care about the syscall backing readdir(), like getdents() on Linux, but I'll just call this readdir() below). If N is much smaller than M (a typical packed repo), we know this is a big win (few readdirs() followed by many uses of the resulting cache). When N and M are similar in size, it's also a win. We care about the latency of making a syscall, and readdir() should be giving us many values in a single call. How many? On Linux, running "strace -e getdents ls" shows a 32k buffer getting 512 entries per call (which is 64 bytes per entry; the name itself is 38 bytes, plus there are some other fields). So we can imagine that this is always a win as long as the number of loose objects in the repository is a factor of 500 less than the number of lookups you make. It's hard to auto-tune this because we don't generally know up front how many lookups we're going to do. But it's unlikely for this to perform significantly worse. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:03 +09:00
Jeff King	3a2e08245c	object-store: provide helpers for loose_objects_cache Our object_directory struct has a loose objects cache that all users of the struct can see. But the only one that knows how to load the cache is find_short_object_filename(). Let's extract that logic in to a reusable function. While we're at it, let's also reset the cache when we re-read the object directories. This shouldn't have an impact on performance, as re-reads are meant to be rare (and are already expensive, so we avoid them with things like OBJECT_INFO_QUICK). Since the cache is already meant to be an approximation, it's tempting to skip even this bit of safety. But it's necessary to allow more code to use it. For instance, fetch-pack explicitly re-reads the object directory after performing its fetch, and would be confused if we didn't clear the cache. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:03 +09:00
Jeff King	f0eaf63819	sha1-file: use an object_directory for the main object dir Our handling of alternate object directories is needlessly different from the main object directory. As a result, many places in the code basically look like this: do_something(r->objects->objdir); for (odb = r->objects->alt_odb_list; odb; odb = odb->next) do_something(odb->path); That gets annoying when do_something() is non-trivial, and we've resorted to gross hacks like creating fake alternates (see find_short_object_filename()). Instead, let's give each raw_object_store a unified list of object_directory structs. The first will be the main store, and everything after is an alternate. Very few callers even care about the distinction, and can just loop over the whole list (and those who care can just treat the first element differently). A few observations: - we don't need r->objects->objectdir anymore, and can just mechanically convert that to r->objects->odb->path - object_directory's path field needs to become a real pointer rather than a FLEX_ARRAY, in order to fill it with expand_base_dir() - we'll call prepare_alt_odb() earlier in many functions (i.e., outside of the loop). This may result in us calling it even when our function would be satisfied looking only at the main odb. But this doesn't matter in practice. It's not a very expensive operation in the first place, and in the majority of cases it will be a noop. We call it already (and cache its results) in prepare_packed_git(), and we'll generally check packs before loose objects. So essentially every program is going to call it immediately once per program. Arguably we should just prepare_alt_odb() immediately upon setting up the repository's object directory, which would save us sprinkling calls throughout the code base (and forgetting to do so has been a source of subtle bugs in the past). But I've stopped short of that here, since there are already a lot of other moving parts in this patch. - Most call sites just get shorter. The check_and_freshen() functions are an exception, because they have entry points to handle local and nonlocal directories separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:03 +09:00
Jeff King	f3f043a103	handle alternates paths the same as the main object dir When we generate loose file paths for the main object directory, the caller provides a buffer to loose_object_path (formerly sha1_file_name). The callers generally keep their own static buffer to avoid excessive reallocations. But for alternate directories, each struct carries its own scratch buffer. This is needlessly different; let's unify them. We could go either direction here, but this patch moves the alternates struct over to the main directory style (rather than vice-versa). Technically the alternates style is more efficient, as it avoids rewriting the object directory name on each call. But this is unlikely to matter in practice, as we avoid reallocations either way (and nobody has ever noticed or complained that the main object directory is copying a few extra bytes before making a much more expensive system call). And this has the advantage that the reusable buffers are tied to particular calls, which makes the invalidation rules simpler (for example, the return value from stat_sha1_file() used to be invalidated by basically any other object call, but now it is affected only by other calls to stat_sha1_file()). We do steal the trick from alt_sha1_path() of returning a pointer to the filled buffer, which makes a few conversions more convenient. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:02 +09:00
Jeff King	b69fb867b4	sha1_file_name(): overwrite buffer instead of appending The sha1_file_name() function is used to generate the path to a loose object in the object directory. It doesn't make much sense for it to append, since the the path we write may be absolute (i.e., you cannot reliably build up a path with it). Because many callers use it with a static buffer, they have to strbuf_reset() manually before each call (and the other callers always use an empty buffer, so they don't care either way). Let's handle this automatically. Since we're changing the semantics, let's take the opportunity to give it a more hash-neutral name (which will also catch any callers from topics in flight). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:02 +09:00
Jeff King	263db403fa	rename "alternate_object_database" to "object_directory" In preparation for unifying the handling of alt odb's and the normal repo object directory, let's use a more neutral name. This patch is purely mechanical, swapping the type name, and converting any variables named "alt" to "odb". There should be no functional change, but it will reduce the noise in subsequent diffs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:02 +09:00
Torsten Bögershausen	ca473cef91	Upcast size_t variables to uintmax_t when printing When printing variables which contain a size, today "unsigned long" is used at many places. In order to be able to change the type from "unsigned long" into size_t some day in the future, we need to have a way to print 64 bit variables on a system that has "unsigned long" defined to be 32 bit, like Win64. Upcast all those variables into uintmax_t before they are printed. This is to prepare for a bigger change, when "unsigned long" will be converted into size_t for variables which may be > 4Gib. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-12 16:43:52 +09:00
Junio C Hamano	18ad13e5b2	Adjust for 2.19.x series * jk/detect-truncated-zlib-input cat-file: handle streaming failures consistently check_stream_sha1(): handle input underflow t1450: check large blob in trailing-garbage test	2018-10-31 13:12:12 +09:00
brian m. carlson	2f90b9d9b4	sha1-file: provide functions to look up hash algorithms There are several ways we might refer to a hash algorithm: by name, such as in the config file; by format ID, such as in a pack; or internally, by a pointer to the hash_algos array. Provide functions to look up hash algorithms based on these various forms and return the internal constant used for them. If conversion to another form is necessary, this internal constant can be used to look up the proper data in the hash_algos array. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-22 12:59:08 +09:00
brian m. carlson	1ccf07cbb7	sha1-file: rename algorithm to "sha1" The transition plan anticipates us using a syntax such as "^{sha1}" for disambiguation. Since this is a syntax some people will be typing a lot, it makes sense to provide a short, easy-to-type syntax. Omitting the dash doesn't create any ambiguity; however, it does make the syntax shorter and easier to type, especially for touch typists. In addition, the transition plan already uses "sha1" in this context. Rename the name of SHA-1 implementation to "sha1". Note that this change creates no backwards compatibility concerns, since we haven't yet used this field in any configuration settings. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-22 12:59:08 +09:00
Stefan Beller	33b94066f2	packfile: allow has_packed_and_bad to handle arbitrary repositories has_packed_and_bad is not widely used, so just migrate it all at once. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-19 16:21:10 +09:00
Stefan Beller	1b9b5c695e	sha1_file: allow read_object to read objects in arbitrary repositories Allow read_object (a file local functon in sha1_file) to handle arbitrary repositories by passing the repository down to oid_object_info_extended. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-19 16:21:04 +09:00
Junio C Hamano	11877b9ebe	Merge branch 'nd/the-index' Various codepaths in the core-ish part learn to work on an arbitrary in-core index structure, not necessarily the default instance "the_index". * nd/the-index: (23 commits) revision.c: reduce implicit dependency the_repository revision.c: remove implicit dependency on the_index ws.c: remove implicit dependency on the_index tree-diff.c: remove implicit dependency on the_index submodule.c: remove implicit dependency on the_index line-range.c: remove implicit dependency on the_index userdiff.c: remove implicit dependency on the_index rerere.c: remove implicit dependency on the_index sha1-file.c: remove implicit dependency on the_index patch-ids.c: remove implicit dependency on the_index merge.c: remove implicit dependency on the_index merge-blobs.c: remove implicit dependency on the_index ll-merge.c: remove implicit dependency on the_index diff-lib.c: remove implicit dependency on the_index read-cache.c: remove implicit dependency on the_index diff.c: remove implicit dependency on the_index grep.c: remove implicit dependency on the_index diff.c: remove the_index dependency in textconv() functions blame.c: rename "repo" argument to "r" combine-diff.c: remove implicit dependency on the_index ...	2018-10-19 13:34:02 +09:00
Junio C Hamano	ee99ba7afb	Merge branch 'jt/lazy-object-fetch-fix' The code to backfill objects in lazily cloned repository did not work correctly, which has been corrected. * jt/lazy-object-fetch-fix: fetch-object: set exact_oid when fetching fetch-object: unify fetch_object[s] functions	2018-09-24 10:30:52 -07:00
Nguyễn Thái Ngọc Duy	58bf2a4cc7	sha1-file.c: remove implicit dependency on the_index Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:11 -07:00

1 2

73 Commits