mirror of https://github.com/git/git.git (synced 2024-11-18 19:13:58 +01:00)
Documentation: describe pack idx v2

Lifted from the log message of c553ca25bd60dc9fd50b8bc7bd329601b81cee66
(pack-objects: learn about pack index version 2).

Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parent 29ab27f4b5
commit 71362bd552
@@ -1,9 +1,9 @@
 GIT pack format
 ===============
 
-= pack-*.pack file has the following format:
+= pack-*.pack files have the following format:
 
-   - The header appears at the beginning and consists of the following:
+   - A header appears at the beginning and consists of the following:
 
      4-byte signature:
          The signature is: {'P', 'A', 'C', 'K'}
@@ -34,18 +34,14 @@ GIT pack format
 
    - The trailer records 20-byte SHA1 checksum of all of the above.
 
-= pack-*.idx file has the following format:
+= Original (version 1) pack-*.idx files have the following format:
 
   - The header consists of 256 4-byte network byte order
     integers. N-th entry of this table records the number of
     objects in the corresponding pack, the first byte of whose
-    object name are smaller than N. This is called the
+    object name is less than or equal to N. This is called the
     'first-level fan-out' table.
 
-    Observation: we would need to extend this to an array of
-    8-byte integers to go beyond 4G objects per pack, but it is
-    not strictly necessary.
-
   - The header is followed by sorted 24-byte entries, one entry
     per object in the pack. Each entry is:
 
@@ -55,10 +51,6 @@ GIT pack format
 
     20-byte object name.
 
-    Observation: we would definitely need to extend this to
-    8-byte integer plus 20-byte object name to handle a packfile
-    that is larger than 4GB.
-
   - The file is concluded with a trailer:
 
     A copy of the 20-byte SHA1 checksum at the end of
@@ -68,31 +60,30 @@ GIT pack format
 
 Pack Idx file:
 
-idx
-            +--------------------------------+
-            | fanout[0] = 2                  |-.
-            +--------------------------------+ |
+--          +--------------------------------+
+fanout      | fanout[0] = 2 (for example)    |-.
+table       +--------------------------------+ |
             | fanout[1]                      | |
             +--------------------------------+ |
             | fanout[2]                      | |
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
-            | fanout[255]                    | |
-            +--------------------------------+ |
-main        | offset                         | |
-index       | object name 00XXXXXXXXXXXXXXXX | |
-table       +--------------------------------+ |
-            | offset                         | |
-            | object name 00XXXXXXXXXXXXXXXX | |
-            +--------------------------------+ |
-          .-| offset                         |<+
-          | | object name 01XXXXXXXXXXXXXXXX |
-          | +--------------------------------+
-          | | offset                         |
-          | | object name 01XXXXXXXXXXXXXXXX |
-          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-          | | offset                         |
-          | | object name FFXXXXXXXXXXXXXXXX |
-          | +--------------------------------+
+            | fanout[255] = total objects    |---.
+--          +--------------------------------+ | |
+main        | offset                         | | |
+index       | object name 00XXXXXXXXXXXXXXXX | | |
+table       +--------------------------------+ | |
+            | offset                         | | |
+            | object name 00XXXXXXXXXXXXXXXX | | |
+            +--------------------------------+<+ |
+          .-| offset                         |   |
+          | | object name 01XXXXXXXXXXXXXXXX |   |
+          | +--------------------------------+   |
+          | | offset                         |   |
+          | | object name 01XXXXXXXXXXXXXXXX |   |
+          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~   |
+          | | offset                         |   |
+          | | object name FFXXXXXXXXXXXXXXXX |   |
+        --| +--------------------------------+<--+
 trailer   | | packfile checksum              |
           | +--------------------------------+
           | | idxfile checksum               |
@@ -116,3 +107,40 @@ Pack file entry: <+
      20-byte base object name SHA1 (the size above is the
          size of the delta data that follows).
      delta data, deflated.
+
+
+= Version 2 pack-*.idx files support packs larger than 4 GiB, and
+  have some other reorganizations. They have the format:
+
+  - A 4-byte magic number '\377tOc' which is an unreasonable
+    fanout[0] value.
+
+  - A 4-byte version number (= 2)
+
+  - A 256-entry fan-out table just like v1.
+
+  - A table of sorted 20-byte SHA1 object names. These are
+    packed together without offset values to reduce the cache
+    footprint of the binary search for a specific object name.
+
+  - A table of 4-byte CRC32 values of the packed object data.
+    This is new in v2 so compressed data can be copied directly
+    from pack to pack during repacking without undetected
+    data corruption.
+
+  - A table of 4-byte offset values (in network byte order).
+    These are usually 31-bit pack file offsets, but large
+    offsets are encoded as an index into the next table with
+    the msbit set.
+
+  - A table of 8-byte offset entries (empty for pack files less
+    than 2 GiB). Pack files are organized with heavily used
+    objects toward the front, so most object references should
+    not need to refer to this table.
+
+  - The same trailer as a v1 idx file:
+
+    A copy of the 20-byte SHA1 checksum at the end of
+    corresponding packfile.
+
+    20-byte SHA1-checksum of all of the above.
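
The version 2 layout described in the last hunk can be sketched in a few lines of Python. This is an illustration only, not code from git: the names parse_idx_v2 and find_offset are hypothetical, the whole file is assumed to be in memory, every integer (including the version number and the 8-byte entries) is assumed to be in network byte order like the fan-out and 4-byte offset tables, and neither trailer checksum is verified.

```python
import bisect
import struct

IDX_V2_MAGIC = b"\377tOc"  # 4-byte magic number from the description above


def parse_idx_v2(data):
    """Split the bytes of a version 2 pack-*.idx file into its tables.

    Hypothetical helper; assumes big-endian integers throughout.
    """
    if data[:4] != IDX_V2_MAGIC:
        raise ValueError("missing v2 magic number")
    (version,) = struct.unpack_from(">I", data, 4)
    if version != 2:
        raise ValueError("unsupported index version %d" % version)

    # 256-entry fan-out table; fanout[255] is the total object count.
    fanout = struct.unpack_from(">256I", data, 8)
    count = fanout[255]

    pos = 8 + 256 * 4
    names = [data[pos + 20 * i:pos + 20 * (i + 1)] for i in range(count)]
    pos += 20 * count
    crcs = list(struct.unpack_from(">%dI" % count, data, pos))
    pos += 4 * count
    small = struct.unpack_from(">%dI" % count, data, pos)
    pos += 4 * count

    # Whatever lies between the 4-byte offsets and the 40-byte trailer
    # is the table of 8-byte offsets (possibly empty).
    n_large = (len(data) - 40 - pos) // 8
    large = struct.unpack_from(">%dQ" % n_large, data, pos)

    # msbit set: the low 31 bits index the 8-byte table;
    # msbit clear: the value is the pack file offset itself.
    offsets = [large[v & 0x7FFFFFFF] if v & 0x80000000 else v for v in small]
    return {"fanout": fanout, "names": names, "crcs": crcs, "offsets": offsets}


def find_offset(idx, name):
    """Return the pack file offset for a 20-byte object name, or None."""
    first = name[0]
    # fanout[N] counts the names whose first byte is <= N, so names that
    # start with `first` occupy the slice [fanout[first - 1], fanout[first]).
    lo = idx["fanout"][first - 1] if first > 0 else 0
    hi = idx["fanout"][first]
    i = bisect.bisect_left(idx["names"], name, lo, hi)
    if i == hi or idx["names"][i] != name:
        return None
    return idx["offsets"][i]
```

Keeping the object names in a table of their own, as the patch text explains, means the binary search touches only 20-byte records; the sketch mirrors that by searching the names list and consulting the offsets only after a match.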