1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-05-23 18:26:08 +02:00
Commit Graph

208 Commits

Author SHA1 Message Date
Junio C Hamano 11529ecec9 Merge branch 'jk/tighten-alloc'
Update various codepaths to avoid manually-counted malloc().

* jk/tighten-alloc: (22 commits)
  ewah: convert to REALLOC_ARRAY, etc
  convert ewah/bitmap code to use xmalloc
  diff_populate_gitlink: use a strbuf
  transport_anonymize_url: use xstrfmt
  git-compat-util: drop mempcpy compat code
  sequencer: simplify memory allocation of get_message
  test-path-utils: fix normalize_path_copy output buffer size
  fetch-pack: simplify add_sought_entry
  fast-import: simplify allocation in start_packfile
  write_untracked_extension: use FLEX_ALLOC helper
  prepare_{git,shell}_cmd: use argv_array
  use st_add and st_mult for allocation size computation
  convert trivial cases to FLEX_ARRAY macros
  use xmallocz to avoid size arithmetic
  convert trivial cases to ALLOC_ARRAY
  convert manual allocations to argv_array
  argv-array: add detach function
  add helpers for allocating flex-array structs
  harden REALLOC_ARRAY and xcalloc against size_t overflow
  tree-diff: catch integer overflow in combine_diff_path allocation
  ...
2016-02-26 13:37:16 -08:00
Jeff King 850d2fec53 convert manual allocations to argv_array
There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:

  1. We no longer have to manually compute the correct final
     array size (so it's one less thing we can screw up).

  2. In many cases we had to make a separate pass to count,
     then allocate, then fill in the array. Now we can do it
     in one pass, making the code shorter and easier to
     follow.

  3. argv_array handles memory ownership for us, making it
     more obvious when things should be free()d and and when
     not.

Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:50:32 -08:00
Junio C Hamano 8f309aeb82 strbuf: introduce strbuf_getline_{lf,nul}()
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago.  No useful caller that uses other value has emerged.

By now, it is clear that the interface is overly broad without a
good reason.  Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.

This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them.  The changes contained in this patch are:

 * introduction of these two functions in strbuf.[ch]

 * mechanical conversion of all callers to strbuf_getline() with
   either '\n' or '\0' as the third parameter to instead call the
   respective thin wrapper.

After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller.  An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15 10:12:51 -08:00
Junio C Hamano c3c592ef95 Merge branch 'rs/daemon-plug-child-leak'
"git daemon" uses "run_command()" without "finish_command()", so it
needs to release resources itself, which it forgot to do.

* rs/daemon-plug-child-leak:
  daemon: plug memory leak
  run-command: factor out child_process_clear()
2015-11-03 15:13:12 -08:00
René Scharfe b1b49ff8d4 daemon: plug memory leak
Call child_process_clear() when a child ends to release the memory
allocated for its environment.  This is necessary because unlike all
other users of start_command() we don't call finish_command(), which
would have taken care of that for us.

This leak was introduced by f063d38b (daemon: use cld->env_array
when re-spawning).

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-02 15:01:23 -08:00
Jeff King f063d38b80 daemon: use cld->env_array when re-spawning
This avoids an ugly strcat into a fixed-size buffer. It's
not wrong (the buffer is plenty large enough for an IPv6
address plus some minor formatting), but it takes some
effort to verify that.

Unfortunately we are still stuck with some fixed-size
buffers to hold the output of inet_ntop. But at least we now
pass very easy-to-verify parameters, rather than doing a
manual computation to account for other data in the buffer.

As a side effect, this also fixes the case where we might
pass an uninitialized portbuf buffer through the
environment. This probably couldn't happen in practice, as
it would mean that addr->sa_family was neither AF_INET nor
AF_INET6 (and that is all we are listening on).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King 5096d4909f convert trivial sprintf / strcpy calls to xsnprintf
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.

However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano 1f76a10b2d write_file(): drop caller-supplied LF from calls to create a one-liner file
All of the callsites covered by this change call write_file() or
write_file_gently() to create a one-liner file.  Drop the caller
supplied LF and let these callees to append it as necessary.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-25 12:49:19 -07:00
Junio C Hamano 12d6ce1dba write_file(): drop "fatal" parameter
All callers except three passed 1 for the "fatal" parameter to ask
this function to die upon error, but to a casual reader of the code,
it was not all obvious what that 1 meant.  Instead, split the
function into two based on a common write_file_v() that takes the
flag, introduce write_file_gently() as a new way to attempt creating
a file without dying on error, and make three callers to call it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-24 13:09:02 -07:00
Junio C Hamano 9e4d2f6d45 Merge branch 'jc/daemon-no-ipv6-for-2.4.1'
"git daemon" fails to build from the source under NO_IPV6
configuration (regression in 2.4).

* jc/daemon-no-ipv6-for-2.4.1:
  daemon: unbreak NO_IPV6 build regression
2015-05-11 14:23:53 -07:00
Junio C Hamano 68a2e6a2c8 Merge branch 'nd/multiple-work-trees'
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.

* nd/multiple-work-trees: (41 commits)
  prune --worktrees: fix expire vs worktree existence condition
  t1501: fix test with split index
  t2026: fix broken &&-chain
  t2026 needs procondition SANITY
  git-checkout.txt: a note about multiple checkout support for submodules
  checkout: add --ignore-other-wortrees
  checkout: pass whole struct to parse_branchname_arg instead of individual flags
  git-common-dir: make "modules/" per-working-directory directory
  checkout: do not fail if target is an empty directory
  t2025: add a test to make sure grafts is working from a linked checkout
  checkout: don't require a work tree when checking out into a new one
  git_path(): keep "info/sparse-checkout" per work-tree
  count-objects: report unused files in $GIT_DIR/worktrees/...
  gc: support prune --worktrees
  gc: factor out gc.pruneexpire parsing code
  gc: style change -- no SP before closing parenthesis
  checkout: clean up half-prepared directories in --to mode
  checkout: reject if the branch is already checked out elsewhere
  prune: strategies for linked checkouts
  checkout: support checking out into a new working directory
  ...
2015-05-11 14:23:39 -07:00
Junio C Hamano d358f771e3 daemon: unbreak NO_IPV6 build regression
When 01cec54e (daemon: deglobalize hostname information, 2015-03-07)
wrapped the global variables such as hostname inside a struct, it
forgot to convert one location that spelled "hostname" that needs to
be updated to "hi->hostname".

This was inside NO_IPV6 block, and was not caught by anybody.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05 11:03:24 -07:00
René Scharfe 01cec54e13 daemon: deglobalize hostname information
Move the variables related to the client-supplied hostname into its own
struct, let execute() own an instance of that instead of storing the
information in global variables and pass the struct to any function that
needs to access it as a parameter.

The lifetime of the variables is easier to see this way.  Allocated
memory is released within execute().  The strbufs don't have to be reset
anymore because they are written to only once at most: parse_host_arg()
is only called once by execute() and lookup_hostname() guards against
being called twice using hostname_lookup_done.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-09 18:18:07 -07:00
René Scharfe 7a646cec5b daemon: use strbuf for hostname info
Convert hostname, canon_hostname, ip_address and tcp_port to strbuf.
This allows to get rid of the helpers strbuf_addstr_or_null() and STRARG
because a strbuf always represents a valid (initially empty) string.

sanitize_client() is not needed anymore and sanitize_client_strbuf()
takes its place and name.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-09 18:17:18 -07:00
Junio C Hamano 4c3dbbf722 Merge branch 'jk/daemon-interpolate'
The "interpolated-path" option of "git daemon" inserted any string
client declared on the "host=" capability request without checking.
Sanitize and limit %H and %CH to a saner and a valid DNS name.

* jk/daemon-interpolate:
  daemon: sanitize incoming virtual hostname
  t5570: test git-daemon's --interpolated-path option
  git_connect: let user override virtual-host we send to daemon
2015-03-03 14:37:06 -08:00
Junio C Hamano ef4cdb8bb7 Merge branch 'rs/daemon-interpolate'
"git daemon" looked up the hostname even when "%CH" and "%IP"
interpolations are not requested, which was unnecessary.

* rs/daemon-interpolate:
  daemon: use callback to build interpolated path
  daemon: look up client-supplied hostname lazily
2015-03-03 14:37:04 -08:00
René Scharfe dc8edc8f7d daemon: use callback to build interpolated path
Provide a callback function for strbuf_expand() instead of using the
helper strbuf_expand_dict_cb().  While the resulting code is longer, it
only looks up the canonical hostname and IP address if at least one of
the placeholders %CH and %IP are used with --interpolated-path.

Use a struct for passing the directory to the callback function instead
of passing it directly to avoid having to cast away its const qualifier.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 13:40:49 -08:00
René Scharfe edef953e48 daemon: look up client-supplied hostname lazily
Look up canonical hostname and IP address using getaddrinfo(3) or
gethostbyname(3) only if --interpolated-path or --access-hook were
specified.

Do that by introducing getter functions for canon_hostname and
ip_address and using them for all read accesses.  These wrappers call
the new helper lookup_hostname(), which sets the variables only at its
first call.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 13:40:36 -08:00
Jeff King b485373052 daemon: sanitize incoming virtual hostname
We use the daemon_avoid_alias function to make sure that the
pathname the user gives us is sane. However, after applying
that check, we might then interpolate the path using a
string given by the server admin, but which may contain more
untrusted data from the client. We should be sure to
sanitize this data, as well.

We cannot use daemon_avoid_alias here, as it is more strict
than we need in requiring a leading '/'. At the same time,
we can be much more strict here. We are interpreting a
hostname, which should not contain slashes or excessive runs
of dots, as those things are not allowed in DNS names.

Note that in addition to cleansing the hostname field, we
must check the "canonical hostname" (%CH) as well as the
port (%P), which we take as a raw string.  For the canonical
hostname, this comes from an actual DNS lookup on the
accessed IP, which makes it a much less likely vector for
problems. But it does not hurt to sanitize it in the same
way. Unfortunately we cannot test this case easily, as it
would involve a custom hostname lookup.

We do not need to check %IP, as it comes straight from
inet_ntop, so must have a sane form.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 13:15:30 -08:00
Nguyễn Thái Ngọc Duy 91aacda85a use new wrapper write_file() for simple file writing
This fixes common problems in these code about error handling,
forgetting to close the file handle after fprintf() fails, or not
printing out the error string..

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01 11:00:16 -08:00
Junio C Hamano a8f01f87d0 Merge branch 'rs/daemon-fixes' into maint
* rs/daemon-fixes:
  daemon: remove write-only variable maxfd
  daemon: fix error message after bind()
  daemon: handle gethostbyname() error
2014-10-29 10:35:09 -07:00
Junio C Hamano dc11fc2de8 Merge branch 'rs/daemon-fixes'
"git daemon" (with NO_IPV6 build configuration) used to incorrectly
use the hostname even when gethostbyname() reported that the given
hostname is not found.

* rs/daemon-fixes:
  daemon: remove write-only variable maxfd
  daemon: fix error message after bind()
  daemon: handle gethostbyname() error
2014-10-14 10:49:23 -07:00
René Scharfe 107efbeb24 daemon: remove write-only variable maxfd
It became unused when 6573faff (NO_IPV6 support for git daemon) replaced
select() with poll().

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:34:56 -07:00
René Scharfe 9d1b9aa9e1 daemon: fix error message after bind()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:34:54 -07:00
René Scharfe eb6c403500 daemon: handle gethostbyname() error
If the user-supplied hostname can't be found then we should not use it.
We already avoid doing that in the non-NO_IPV6 case by checking if the
return value of getaddrinfo() is zero (success).  Do the same in the
NO_IPV6 case and make sure the return value of gethostbyname() isn't
NULL before dereferencing this pointer.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:34:53 -07:00
Junio C Hamano 825fd93767 Merge branch 'rs/child-process-init'
Code clean-up.

* rs/child-process-init:
  run-command: inline prepare_run_command_v_opt()
  run-command: call run_command_v_opt_cd_env() instead of duplicating it
  run-command: introduce child_process_init()
  run-command: introduce CHILD_PROCESS_INIT
2014-09-11 10:33:27 -07:00
René Scharfe d318027932 run-command: introduce CHILD_PROCESS_INIT
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20 09:53:37 -07:00
Tanay Abhra 8939d32d81 daemon.c: replace `git_config()` with `git_config_get_bool()` family
Use `git_config_get_bool()` family instead of `git_config()` to take advantage of
the config-set API which provides a cleaner control flow.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07 13:33:25 -07:00
Junio C Hamano dcc1b38517 Merge branch 'cc/replace-edit'
Teach "git replace" an "--edit" mode.

* cc/replace-edit:
  replace: use argv_array in export_object
  avoid double close of descriptors handed to run_command
  replace: replace spaces with tabs in indentation
2014-07-16 11:25:47 -07:00
Jeff King 28bf9429ef avoid double close of descriptors handed to run_command
When a file descriptor is given to run_command via the
"in", "out", or "err" parameters, run_command takes
ownership. The descriptor will be closed in the parent
process whether the process is spawned successfully or not,
and closing it again is wrong.

In practice this has not caused problems, because we usually
close() right after start_command returns, meaning no other
code has opened a descriptor in the meantime. So we just get
EBADF and ignore it (rather than accidentally closing
somebody else's descriptor!).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-25 15:27:24 -07:00
Jeff King d12c24d2a9 daemon: use skip_prefix to avoid magic numbers
Like earlier cases, we can use skip_prefix to avoid magic
numbers that must match the length of starts_with prefixes.
However, the numbers are a little more complicated here, as
we keep parsing past the prefix. We can solve it by keeping
a running pointer as we parse; its final value is the
location we want.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:45:18 -07:00
Jeff King ae021d8791 use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:

  if (starts_with(foo, "bar"))
	  foo += 3;

This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes.  We can use skip_prefix to avoid the magic
numbers here.

Note that some of these conversions could be much shorter.
For example:

  if (starts_with(arg, "--foo=")) {
	  bar = arg + 6;
	  continue;
  }

could become:

  if (skip_prefix(arg, "--foo=", &bar))
	  continue;

However, I have left it as:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = v;
	  continue;
  }

to visually match nearby cases which need to actually
process the string. Like:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = atoi(v);
	  continue;
  }

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:44:45 -07:00
Jeff King 1055a890f0 daemon: mark some strings as const
None of these strings is modified; marking them as const
will help later refactoring.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-18 14:56:24 -07:00
Junio C Hamano b4516df9b8 Merge branch 'jk/daemon-tolower'
* jk/daemon-tolower:
  daemon/config: factor out duplicate xstrdup_tolower
2014-06-16 10:07:15 -07:00
Jeff King 88d5a6f6cd daemon/config: factor out duplicate xstrdup_tolower
We have two implementations of the same function; let's drop
that to one. We take the name from daemon.c, but the
implementation (which is just slightly more efficient) from
the config code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-23 12:39:44 -07:00
Nguyễn Thái Ngọc Duy de0957ce2e daemon: move daemonize() to libgit.a
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10 10:46:35 -08:00
Junio C Hamano 0c52457b7c Merge branch 'nd/daemon-informative-errors-typofix'
* nd/daemon-informative-errors-typofix:
  daemon: be strict at parsing parameters --[no-]informative-errors
2014-01-10 10:32:59 -08:00
Nguyễn Thái Ngọc Duy 82246b765b daemon: be strict at parsing parameters --[no-]informative-errors
Use strcmp() instead of starts_with()/!prefixcmp() to stop accepting
--informative-errors-just-a-little

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-20 14:05:07 -08:00
Christian Couder 5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Junio C Hamano 0c544a22f9 Merge branch 'sb/misc-fixes'
Assorted code cleanups and a minor fix.

* sb/misc-fixes:
  diff.c: Do not initialize a variable, which gets reassigned anyway.
  commit: Fix a memory leak in determine_author_info
  daemon.c:handle: Remove unneeded check for null pointer.
2013-07-24 19:20:59 -07:00
Junio C Hamano cb29dfde48 Merge branch 'tr/protect-low-3-fds'
When "git" is spawned in such a way that any of the low 3 file
descriptors is closed, our first open() may yield file descriptor 2,
and writing error message to it would screw things up in a big way.

* tr/protect-low-3-fds:
  git: ensure 0/1/2 are open in main()
  daemon/shell: refactor redirection of 0/1/2 from /dev/null
2013-07-22 11:23:35 -07:00
Thomas Rast 1d999ddd1d daemon/shell: refactor redirection of 0/1/2 from /dev/null
Both daemon.c and shell.c contain logic to open FDs 0/1/2 from
/dev/null if they are not already open.  Move the function in daemon.c
to setup.c and use it in shell.c, too.

While there, remove a 'not' that inverted the meaning of the comment.
The point is indeed to *avoid* messing up.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 12:50:34 -07:00
Stefan Beller 5d9cfa29d2 daemon.c:handle: Remove unneeded check for null pointer.
addr doesn't need to be checked at that line as it it already accessed
7 lines before in the if (addr->sa_family).

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 09:45:11 -07:00
Junio C Hamano e013bdab0f Merge branch 'jk/pkt-line-cleanup'
Clean up pkt-line API, implementation and its callers to make them
more robust.

* jk/pkt-line-cleanup:
  do not use GIT_TRACE_PACKET=3 in tests
  remote-curl: always parse incoming refs
  remote-curl: move ref-parsing code up in file
  remote-curl: pass buffer straight to get_remote_heads
  teach get_remote_heads to read from a memory buffer
  pkt-line: share buffer/descriptor reading implementation
  pkt-line: provide a LARGE_PACKET_MAX static buffer
  pkt-line: move LARGE_PACKET_MAX definition from sideband
  pkt-line: teach packet_read_line to chomp newlines
  pkt-line: provide a generic reading function with options
  pkt-line: drop safe_write function
  pkt-line: move a misplaced comment
  write_or_die: raise SIGPIPE when we get EPIPE
  upload-archive: use argv_array to store client arguments
  upload-archive: do not copy repo name
  send-pack: prefer prefixcmp over memcmp in receive_status
  fetch-pack: fix out-of-bounds buffer offset in get_ack
  upload-pack: remove packet debugging harness
  upload-pack: do not add duplicate objects to shallow list
  upload-pack: use get_sha1_hex to parse "shallow" lines
2013-04-01 08:59:37 -07:00
Junio C Hamano 2b0dda5318 Merge branch 'dm/ni-maxhost-may-be-missing' into maint-1.8.1
Some sources failed to compile on systems that lack NI_MAXHOST in
their system header.

* dm/ni-maxhost-may-be-missing:
  git-compat-util.h: Provide missing netdb.h definitions
2013-03-25 13:45:42 -07:00
Junio C Hamano 31ccd35df4 Merge branch 'dm/ni-maxhost-may-be-missing'
On systems without NI_MAXHOST in their system header files,
connect.c (hence most of the transport) did not compile.

* dm/ni-maxhost-may-be-missing:
  git-compat-util.h: Provide missing netdb.h definitions
2013-03-19 12:18:21 -07:00
David Michael 3b130ade45 git-compat-util.h: Provide missing netdb.h definitions
Some platforms may lack the NI_MAXHOST and NI_MAXSERV values in their
system headers, so ensure they are available.

Signed-off-by: David Michael <fedora.dm0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-25 12:16:08 -08:00
Jeff King 4981fe750b pkt-line: share buffer/descriptor reading implementation
The packet_read function reads from a descriptor. The
packet_get_line function is similar, but reads from an
in-memory buffer, and uses a completely separate
implementation. This patch teaches the generic packet_read
function to accept either source, and we can do away with
packet_get_line's implementation.

There are two other differences to account for between the
old and new functions. The first is that we used to read
into a strbuf, but now read into a fixed size buffer. The
only two callers are fine with that, and in fact it
simplifies their code, since they can use the same
static-buffer interface as the rest of the packet_read_line
callers (and we provide a similar convenience wrapper for
reading from a buffer rather than a descriptor).

This is technically an externally-visible behavior change in
that we used to accept arbitrary sized packets up to 65532
bytes, and now cap out at LARGE_PACKET_MAX, 65520. In
practice this doesn't matter, as we use it only for parsing
smart-http headers (of which there is exactly one defined,
and it is small and fixed-size). And any extension headers
would be breaking the protocol to go over LARGE_PACKET_MAX
anyway.

The other difference is that packet_get_line would return
on error rather than dying. However, both callers of
packet_get_line are actually improved by dying.

The first caller does its own error checking, but we can
drop that; as a result, we'll actually get more specific
reporting about protocol breakage when packet_read dies
internally. The only downside is that packet_read will not
print the smart-http URL that failed, but that's not a big
deal; anybody not debugging can already see the remote's URL
already, and anybody debugging would want to run with
GIT_CURL_VERBOSE anyway to see way more information.

The second caller, which is just trying to skip past any
extra smart-http headers (of which there are none defined,
but which we allow to keep room for future expansion), did
not error check at all. As a result, it would treat an error
just like a flush packet. The resulting mess would generally
cause an error later in get_remote_heads, but now we get
error reporting much closer to the source of the problem.

Brown-paper-bag-fixes-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:14:15 -08:00
Jeff King 74543a0423 pkt-line: provide a LARGE_PACKET_MAX static buffer
Most of the callers of packet_read_line just read into a
static 1000-byte buffer (callers which handle arbitrary
binary data already use LARGE_PACKET_MAX). This works fine
in practice, because:

  1. The only variable-sized data in these lines is a ref
     name, and refs tend to be a lot shorter than 1000
     characters.

  2. When sending ref lines, git-core always limits itself
     to 1000 byte packets.

However, the only limit given in the protocol specification
in Documentation/technical/protocol-common.txt is
LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in
pack-protocol.txt, and then only describing what we write,
not as a specific limit for readers.

This patch lets us bump the 1000-byte limit to
LARGE_PACKET_MAX. Even though git-core will never write a
packet where this makes a difference, there are two good
reasons to do this:

  1. Other git implementations may have followed
     protocol-common.txt and used a larger maximum size. We
     don't bump into it in practice because it would involve
     very long ref names.

  2. We may want to increase the 1000-byte limit one day.
     Since packets are transferred before any capabilities,
     it's difficult to do this in a backwards-compatible
     way. But if we bump the size of buffer the readers can
     handle, eventually older versions of git will be
     obsolete enough that we can justify bumping the
     writers, as well. We don't have plans to do this
     anytime soon, but there is no reason not to start the
     clock ticking now.

Just bumping all of the reading bufs to LARGE_PACKET_MAX
would waste memory. Instead, since most readers just read
into a temporary buffer anyway, let's provide a single
static buffer that all callers can use. We can further wrap
this detail away by having the packet_read_line wrapper just
use the buffer transparently and return a pointer to the
static storage.  That covers most of the cases, and the
remaining ones already read into their own LARGE_PACKET_MAX
buffers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:22 -08:00
Jeff King 819b929d33 pkt-line: teach packet_read_line to chomp newlines
The packets sent during ref negotiation are all terminated
by newline; even though the code to chomp these newlines is
short, we end up doing it in a lot of places.

This patch teaches packet_read_line to auto-chomp the
trailing newline; this lets us get rid of a lot of inline
chomping code.

As a result, some call-sites which are not reading
line-oriented data (e.g., when reading chunks of packfiles
alongside sideband) transition away from packet_read_line to
the generic packet_read interface. This patch converts all
of the existing callsites.

Since the function signature of packet_read_line does not
change (but its behavior does), there is a possibility of
new callsites being introduced in later commits, silently
introducing an incompatibility.  However, since a later
patch in this series will change the signature, such a
commit would have to be merged directly into this commit,
not to the tip of the series; we can therefore ignore the
issue.

This is an internal cleanup and should produce no change of
behavior in the normal case. However, there is one corner
case to note. Callers of packet_read_line have never been
able to tell the difference between a flush packet ("0000")
and an empty packet ("0004"), as both cause packet_read_line
to return a length of 0. Readers treat them identically,
even though Documentation/technical/protocol-common.txt says
we must not; it also says that implementations should not
send an empty pkt-line.

By stripping out the newline before the result gets to the
caller, we will now treat the newline-only packet ("0005\n")
the same as an empty packet, which in turn gets treated like
a flush packet. In practice this doesn't matter, as neither
empty nor newline-only packets are part of git's protocols
(at least not for the line-oriented bits, and readers who
are not expecting line-oriented packets will be calling
packet_read directly, anyway). But even if we do decide to
care about the distinction later, it is orthogonal to this
patch.  The right place to tighten would be to stop treating
empty packets as flush packets, and this change does not
make doing so any harder.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00