shell: limit size of interactive commands

When git-shell is run in interactive mode (which must be enabled by creating $HOME/git-shell-commands), it reads commands from stdin, one per line, and executes them. We read the commands with git_read_line_interactively(), which uses a strbuf under the hood. That means we'll accept an input of arbitrary size (limited only by how much heap we can allocate). That creates two problems: - the rest of the code is not prepared to handle large inputs. The most serious issue here is that split_cmdline() uses "int" for most of its types, which can lead to integer overflow and out-of-bounds array reads and writes. But even with that fixed, we assume that we can feed the command name to snprintf() (via xstrfmt()), which is stuck for historical reasons using "int", and causes it to fail (and even trigger a BUG() call). - since the point of git-shell is to take input from untrusted or semi-trusted clients, it's a mild denial-of-service. We'll allocate as many bytes as the client sends us (actually twice as many, since we immediately duplicate the buffer). We can fix both by just limiting the amount of per-command input we're willing to receive. We should also fix split_cmdline(), of course, which is an accident waiting to happen, but that can come on top. Most calls to split_cmdline(), including the other one in git-shell, are OK because they are reading from an OS-provided argv, which is limited in practice. This patch should eliminate the immediate vulnerabilities. I picked 4MB as an arbitrary limit. It's big enough that nobody should ever run into it in practice (since the point is to run the commands via exec, we're subject to OS limits which are typically much lower). But it's small enough that allocating it isn't that big a deal. The code is mostly just swapping out fgets() for the strbuf call, but we have to add a few niceties like flushing and trimming line endings. We could simplify things further by putting the buffer on the stack, but 4MB is probably a bit much there. Note that we'll _always_ allocate 4MB, which for normal, non-malicious requests is more than we would before this patch. But on the other hand, other git programs are happy to use 96MB for a delta cache. And since we'd never touch most of those pages, on a lazy-allocating OS like Linux they won't even get allocated to actual RAM. The ideal would be a version of strbuf_getline() that accepted a maximum value. But for a minimal vulnerability fix, let's keep things localized and simple. We can always refactor further on top. The included test fails in an obvious way with ASan or UBSan (which notice the integer overflow and out-of-bounds reads). Without them, it fails in a less obvious way: we may segfault, or we may try to xstrfmt() a long string, leading to a BUG(). Either way, it fails reliably before this patch, and passes with it. Note that we don't need an EXPENSIVE prereq on it. It does take 10-15s to fail before this patch, but with the new limit, we fail almost immediately (and the perl process generating 2GB of data exits via SIGPIPE). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-05-17 14:16:10 +02:00 · 2022-09-28 18:52:48 -04:00 · 2022-09-28 18:52:48 -04:00 · 71ad7fe1bc
parent 32696a4cbe
commit 71ad7fe1bc
2 changed files with 36 additions and 4 deletions
--- a/shell.c
+++ b/shell.c
@ -47,6 +47,8 @@ static void cd_to_homedir(void)
 		die("could not chdir to user's home directory");
 }

+#define MAX_INTERACTIVE_COMMAND (4*1024*1024)
+
 static void run_shell(void)
 {
 	int done = 0;
@ -67,22 +69,46 @@ static void run_shell(void)
 	run_command_v_opt(help_argv, RUN_SILENT_EXEC_FAILURE);

 	do {
-		struct strbuf line = STRBUF_INIT;
 		const char *prog;
 		char *full_cmd;
 		char *rawargs;
+		size_t len;
 		char *split_args;
 		const char **argv;
 		int code;
 		int count;

 		fprintf(stderr, "git> ");
-		if (git_read_line_interactively(&line) == EOF) {
+
+		/*
+		 * Avoid using a strbuf or git_read_line_interactively() here.
+		 * We don't want to allocate arbitrary amounts of memory on
+		 * behalf of a possibly untrusted client, and we're subject to
+		 * OS limits on command length anyway.
+		 */
+		fflush(stdout);
+		rawargs = xmalloc(MAX_INTERACTIVE_COMMAND);
+		if (!fgets(rawargs, MAX_INTERACTIVE_COMMAND, stdin)) {
 			fprintf(stderr, "\n");
-			strbuf_release(&line);
+			free(rawargs);
 			break;
 		}
-		rawargs = strbuf_detach(&line, NULL);
+		len = strlen(rawargs);
+
+		/*
+		 * If we truncated due to our input buffer size, reject the
+		 * command. That's better than running bogus input, and
+		 * there's a good chance it's just malicious garbage anyway.
+		 */
+		if (len >= MAX_INTERACTIVE_COMMAND - 1)
+			die("invalid command format: input too long");
+
+		if (len > 0 && rawargs[len - 1] == '\n') {
+			if (--len > 0 && rawargs[len - 1] == '\r')
+				--len;
+			rawargs[len] = '\0';
+		}
+
 		split_args = xstrdup(rawargs);
 		count = split_cmdline(split_args, &argv);
 		if (count < 0) {
--- a/t/t9850-shell.sh
+++ b/t/t9850-shell.sh
@ -28,4 +28,10 @@ test_expect_success 'shell allows interactive command' '
 	test_cmp expect actual
 '

+test_expect_success 'shell complains of overlong commands' '
+	perl -e "print \"a\" x 2**12 for (0..2**19)" |
+	test_must_fail git shell 2>err &&
+	grep "too long" err
+'
+
 test_done