From 6e68c914102774832c519804498538791cdddff9 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Wed, 27 Sep 2017 02:17:23 -0400 Subject: [PATCH 1/3] validate_headref: NUL-terminate HEAD buffer When we are checking to see if we have a git repo, we peek into the HEAD file and see if it's a plausible symlink, symref, or detached HEAD. For the latter two, we read the contents with read_in_full(), which means they aren't NUL-terminated. The symref check is careful to respect the length we got, but the sha1 check will happily parse up to 40 bytes, even if we read fewer. E.g.,: echo 1234 >.git/HEAD git rev-parse will parse 36 uninitialized bytes from our stack buffer. This isn't a big deal in practice. Our buffer is 256 bytes, so we know we'll never read outside of it. The worst case is that the uninitialized bytes look like valid hex, and we claim a bogus HEAD file is valid. The chances of this happening randomly are quite slim, but let's be careful. One option would be to check that "len == 41" before feeding the buffer to get_sha1_hex(). But we'd like to eventually prepare for a world with variable-length hashes. Let's NUL-terminate as soon as we've read the buffer (we already even leave a spare byte to do so!). That fixes this problem without depending on the size of an object id. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- path.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/path.c b/path.c index e50d2befcf..1ca2cf9a28 100644 --- a/path.c +++ b/path.c @@ -661,6 +661,10 @@ int validate_headref(const char *path) len = read_in_full(fd, buffer, sizeof(buffer)-1); close(fd); + if (len < 0) + return -1; + buffer[len] = '\0'; + /* * Is it a symbolic ref? */ From 7eb4b9d025e1ac2ed5a49530c614b7e69054625a Mon Sep 17 00:00:00 2001 From: Jeff King Date: Wed, 27 Sep 2017 02:17:26 -0400 Subject: [PATCH 2/3] validate_headref: use skip_prefix for symref parsing Since the previous commit guarantees that our symref buffer is NUL-terminated, we can just use skip_prefix() and friends to parse it. This is shorter and saves us having to deal with magic numbers and keeping the "len" counter up to date. While we're at it, let's name the rather obscure "buf" to "refname", since that is the thing we are parsing with it. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- path.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/path.c b/path.c index 1ca2cf9a28..650c66c32d 100644 --- a/path.c +++ b/path.c @@ -636,7 +636,8 @@ void strbuf_git_common_path(struct strbuf *sb, int validate_headref(const char *path) { struct stat st; - char *buf, buffer[256]; + char buffer[256]; + const char *refname; unsigned char sha1[20]; int fd; ssize_t len; @@ -668,14 +669,10 @@ int validate_headref(const char *path) /* * Is it a symbolic ref? */ - if (len < 4) - return -1; - if (!memcmp("ref:", buffer, 4)) { - buf = buffer + 4; - len -= 4; - while (len && isspace(*buf)) - buf++, len--; - if (len >= 5 && !memcmp("refs/", buf, 5)) + if (skip_prefix(buffer, "ref:", &refname)) { + while (isspace(*refname)) + refname++; + if (starts_with(refname, "refs/")) return 0; } From 0bca165fdb57b032e505161a9450fd5e3edfd19a Mon Sep 17 00:00:00 2001 From: Jeff King Date: Wed, 27 Sep 2017 02:17:36 -0400 Subject: [PATCH 3/3] validate_headref: use get_oid_hex for detached HEADs If a candidate HEAD isn't a symref, we check that it contains a viable sha1. But in a post-sha1 world, we should be checking whether it has any plausible object-id. We can do that by switching to get_oid_hex(). Note that both before and after this patch, we only check for a plausible object id at the start of the file, and then call that good enough. We ignore any content _after_ the hex, so a string like: 0123456789012345678901234567890123456789 foo is accepted. Though we do put extra bytes like this into some pseudorefs (e.g., FETCH_HEAD), we don't typically do so with HEAD. We could tighten this up by using parse_oid_hex(), like: if (!parse_oid_hex(buffer, &oid, &end) && *end++ == '\n' && *end == '\0') return 0; But we're probably better to remain on the loose side. We're just checking here for a plausible-looking repository directory, so heuristics are acceptable (if we really want to be meticulous, we should use the actual ref code to parse HEAD). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- path.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/path.c b/path.c index 650c66c32d..883324b10a 100644 --- a/path.c +++ b/path.c @@ -638,7 +638,7 @@ int validate_headref(const char *path) struct stat st; char buffer[256]; const char *refname; - unsigned char sha1[20]; + struct object_id oid; int fd; ssize_t len; @@ -679,7 +679,7 @@ int validate_headref(const char *path) /* * Is this a detached HEAD? */ - if (!get_sha1_hex(buffer, sha1)) + if (!get_oid_hex(buffer, &oid)) return 0; return -1;