1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-06-06 12:36:09 +02:00
Commit Graph

258 Commits

Author SHA1 Message Date
David Turner a6720955f1 unpack-trees: fix accidentally quadratic behavior
While unpacking trees (e.g. during git checkout), when we hit a cache
entry that's past and outside our path, we cut off iteration.

This provides about a 45% speedup on git checkout between master and
master^20000 on Twitter's monorepo.  Speedup in general will depend on
repostitory structure, number of changes, and packfile packing
decisions.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-22 13:03:10 -08:00
David Turner d9c2bd560e do_compare_entry: use already-computed path
In traverse_trees, we generate the complete traverse path for a
traverse_info.  Later, in do_compare_entry, we used to go do a bunch
of work to compare the traverse_info to a cache_entry's name without
computing that path.  But since we already have that path, we don't
need to do all that work.  Instead, we can just put the generated
path into the traverse_info, and do the comparison more directly.

We copy the path because prune_traversal might mutate `base`. This
doesn't happen in any codepaths where do_compare_entry is called,
but it's better to be safe.

This makes git checkout much faster -- about 25% on Twitter's
monorepo.  Deeper directory trees are likely to benefit more than
shallower ones.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-05 13:39:46 -08:00
Jeff King 75faa45ae0 replace trivial malloc + sprintf / strcpy calls with xstrfmt
It's a common pattern to do:

  foo = xmalloc(strlen(one) + strlen(two) + 1 + 1);
  sprintf(foo, "%s %s", one, two);

(or possibly some variant with strcpy()s or a more
complicated length computation).  We can switch these to use
xstrfmt, which is shorter, involves less error-prone manual
computation, and removes many sprintf and strcpy calls which
make it harder to audit the code for real buffer overflows.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano f0bc854623 Sync with 2.5.2 2015-09-09 14:30:35 -07:00
Junio C Hamano 3d3caf0b78 Sync with 2.4.9 2015-09-04 10:43:23 -07:00
Junio C Hamano 8267cd11d6 Sync with 2.2.3 2015-09-04 10:29:28 -07:00
Jeff King f514ef9787 verify_absent: allow filenames longer than PATH_MAX
When unpack-trees wants to know whether a path will
overwrite anything in the working tree, we use lstat() to
see if there is anything there. But if we are going to write
"foo/bar", we can't just lstat("foo/bar"); we need to look
for leading prefixes (e.g., "foo"). So we use the lstat cache
to find the length of the leading prefix, and copy the
filename up to that length into a temporary buffer (since
the original name is const, we cannot just stick a NUL in
it).

The copy we make goes into a PATH_MAX-sized buffer, which
will overflow if the prefix is longer than PATH_MAX. How
this happens is a little tricky, since in theory PATH_MAX is
the biggest path we will have read from the filesystem. But
this can happen if:

  - the compiled-in PATH_MAX does not accurately reflect
    what the filesystem is capable of

  - the leading prefix is not _quite_ what is on disk; it
    contains the next element from the name we are checking.
    So if we want to write "aaa/bbb/ccc/ddd" and "aaa/bbb"
    exists, the prefix of interest is "aaa/bbb/ccc". If
    "aaa/bbb" approaches PATH_MAX, then "ccc" can overflow
    it.

So this can be triggered, but it's hard to do. In
particular, you cannot just "git clone" a bogus repo. The
verify_absent checks happen before unpack-trees writes
anything to the filesystem, so there are never any leading
prefixes during the initial checkout, and the bug doesn't
trigger. And by definition, these files are larger than
PATH_MAX, so writing them will fail, and clone will
complain (though it may write a partial path, which will
cause a subsequent "git checkout" to hit the bug).

We can fix it by creating the temporary path on the heap.
The extra malloc overhead is not important, as we are
already making at least one stat() call (and probably more
for the prefix discovery).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-04 08:50:50 -07:00
Junio C Hamano 8c9155e031 Merge branch 'jk/git-path'
git_path() and mkpath() are handy helper functions but it is easy
to misuse, as the callers need to be careful to keep the number of
active results below 4.  Their uses have been reduced.

* jk/git-path:
  memoize common git-path "constant" files
  get_repo_path: refactor path-allocation
  find_hook: keep our own static buffer
  refs.c: remove_empty_directories can take a strbuf
  refs.c: avoid git_path assignment in lock_ref_sha1_basic
  refs.c: avoid repeated git_path calls in rename_tmp_log
  refs.c: simplify strbufs in reflog setup and writing
  path.c: drop git_path_submodule
  refs.c: remove extra git_path calls from read_loose_refs
  remote.c: drop extraneous local variable from migrate_file
  prefer mkpathdup to mkpath in assignments
  prefer git_pathdup to git_path in some possibly-dangerous cases
  add_to_alternates_file: don't add duplicate entries
  t5700: modernize style
  cache.h: complete set of git_path_submodule helpers
  cache.h: clarify documentation for git_path, et al
2015-08-19 14:48:56 -07:00
Junio C Hamano 4f66e44300 Merge branch 'as/sparse-checkout-removal' into maint
"sparse checkout" misbehaved for a path that is excluded from the
checkout when switching between branches that differ at the path.

* as/sparse-checkout-removal:
  unpack-trees: don't update files with CE_WT_REMOVE set
2015-08-19 14:41:27 -07:00
Junio C Hamano 9ad8474b98 Merge branch 'dt/unpack-trees-cache-tree-revalidate'
The code to perform multi-tree merges has been taught to repopulate
the cache-tree upon a successful merge into the index, so that
subsequent "diff-index --cached" (hence "status") and "write-tree"
(hence "commit") will go faster.

The same logic in "git checkout" may now be removed, but that is a
separate issue.

* dt/unpack-trees-cache-tree-revalidate:
  unpack-trees: populate cache-tree on successful merge
2015-08-12 14:09:57 -07:00
Jeff King fcd12db6af prefer git_pathdup to git_path in some possibly-dangerous cases
Because git_path uses a static buffer that is shared with
calls to git_path, mkpath, etc, it can be dangerous to
assign the result to a variable or pass it to a non-trivial
function. The value may change unexpectedly due to other
calls.

None of the cases changed here has a known bug, but they're
worth converting away from git_path because:

  1. It's easy to use git_pathdup in these cases.

  2. They use constructs (like assignment) that make it
     hard to tell whether they're safe or not.

The extra malloc overhead should be trivial, as an
allocation should be an order of magnitude cheaper than a
system call (which we are clearly about to make, since we
are constructing a filename). The real cost is that we must
remember to free the result.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:12 -07:00
Junio C Hamano 8348bf1b69 Merge branch 'as/sparse-checkout-removal'
"sparse checkout" misbehaved for a path that is excluded from the
checkout when switching between branches that differ at the path.

* as/sparse-checkout-removal:
  unpack-trees: don't update files with CE_WT_REMOVE set
2015-08-03 11:01:28 -07:00
Brian Degenhardt 52fca2184d unpack-trees: populate cache-tree on successful merge
When we unpack trees into an existing index, we discard the old
index and replace it with the new, merged index.  Ensure that this
index has its cache-tree populated.  This will make subsequent git
status and commit commands faster.

Signed-off-by: Brian Degenhardt <bmd@bmdhacks.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-28 13:43:13 -07:00
David Turner 7d782416cb unpack-trees: don't update files with CE_WT_REMOVE set
Don't update files in the worktree from cache entries which are
flagged with CE_WT_REMOVE.

When a user does a sparse checkout, git removes files that are
marked with CE_WT_REMOVE (because they are out-of-scope for the
sparse checkout). If those files are also marked CE_UPDATE (for
instance, because they differ in the branch that is being checked
out and the outgoing branch), git would previously recreate them.
This patch prevents them from being recreated.

These erroneously-created files would also interfere with merges,
causing pre-merge revisions of out-of-scope files to appear in the
worktree.

apply_sparse_checkout() is the function where all "action"
manipulation (add, delete, update files..) for sparse checkout
occurs; it should not ask to delete and update both at the same
time.

Signed-off-by: Anatole Shaw <git-devel@omni.poc.net>
Signed-off-by: David Turner <dturner@twopensource.com>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-21 13:19:20 -07:00
Nguyễn Thái Ngọc Duy e931371a8f untracked cache: invalidate at index addition or removal
Ideally we should implement untracked_cache_remove_from_index() and
untracked_cache_add_to_index() so that they update untracked cache
right away instead of invalidating it and wait for read_directory()
next time to deal with it. But that may need some more work in
unpack-trees.c. So stay simple as the first step.

The new call in add_index_entry_with_check() may look strange because
new calls usually stay close to cache_tree_invalidate_path(). We do it
a bit later than c_t_i_p() in this function because if it's about
replacing the entry with the same name, we don't care (but cache-tree
does).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Junio C Hamano 3f1509809e Sync with v2.2.1
* maint:
  Git 2.2.1
  Git 2.1.4
  Git 2.0.5
  Git 1.9.5
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-18 12:30:53 -08:00
Junio C Hamano 58f1d950e3 Sync with v2.0.5
* maint-2.0:
  Git 2.0.5
  Git 1.9.5
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:42:28 -08:00
Junio C Hamano 5e519fb8b0 Sync with v1.9.5
* maint-1.9:
  Git 1.9.5
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:28:54 -08:00
Junio C Hamano 6898b79721 Sync with v1.8.5.6
* maint-1.8.5:
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:20:31 -08:00
Junio C Hamano 2aa9100846 Merge branch 'dotgit-case-maint-1.8.5' into maint-1.8.5
* dotgit-case-maint-1.8.5:
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:11:15 -08:00
Jeff King 4616918013 unpack-trees: propagate errors adding entries to the index
When unpack_trees tries to write an entry to the index,
add_index_entry may report an error to stderr, but we ignore
its return value. This leads to us returning a successful
exit code for an operation that partially failed. Let's make
sure to propagate this code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17 10:57:53 -08:00
Junio C Hamano c09988ad94 Merge branch 'jc/unpack-trees-plug-leak'
* jc/unpack-trees-plug-leak:
  unpack_trees: plug leakage of o->result
2014-12-12 14:31:33 -08:00
Junio C Hamano a16cc8b247 unpack_trees: plug leakage of o->result
Most of the time the caller specifies to which destination variable
the resulting index_state should be assigned by passing a non-NULL
pointer in o->dst_index to receive that state, but for a caller that
gives a NULL o->dst_index, the resulting index simply leaked.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-17 13:34:07 -08:00
Junio C Hamano a28e876b9d Merge branch 'jn/unpack-trees-checkout-m-carry-deletion' into maint
* jn/unpack-trees-checkout-m-carry-deletion:
  checkout -m: attempt merge when deletion of path was staged
  unpack-trees: use 'cuddled' style for if-else cascade
  unpack-trees: simplify 'all other failures' case
2014-09-19 14:05:12 -07:00
Junio C Hamano f28763d756 Merge branch 'jn/unpack-trees-checkout-m-carry-deletion'
"git checkout -m" did not switch to another branch while carrying
the local changes forward when a path was deleted from the index.

* jn/unpack-trees-checkout-m-carry-deletion:
  checkout -m: attempt merge when deletion of path was staged
  unpack-trees: use 'cuddled' style for if-else cascade
  unpack-trees: simplify 'all other failures' case
2014-09-11 10:33:36 -07:00
Jonathan Nieder 6a143aa2b2 checkout -m: attempt merge when deletion of path was staged
twoway_merge() is missing an o->gently check in the case where a file
that needs to be modified is missing from the index but present in the
old and new trees.  As a result, in this case 'git checkout -m' errors
out instead of trying to perform a merge.

Fix it by checking o->gently.  While at it, inline the o->gently check
into reject_merge to prevent future call sites from making the same
mistake.

Noticed by code inspection.  The test for the motivating case was
added by JC.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-25 15:17:34 -07:00
Jonathan Nieder 6c1db1b388 unpack-trees: use 'cuddled' style for if-else cascade
Match the predominant style in git by following K&R style for if/else
cascades.  Documentation/CodingStyle from linux.git explains:

  Note that the closing brace is empty on a line of its own, _except_ in
  the cases where it is followed by a continuation of the same statement,
  ie a "while" in a do-statement or an "else" in an if-statement, like
  this:

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

  Rationale: K&R.

  Also, note that this brace-placement also minimizes the number of empty
  (or almost empty) lines, without any loss of readability.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-13 10:32:12 -07:00
Stefan Beller 0ecd180a27 unpack-trees: simplify 'all other failures' case
In the 'if (current)' block of twoway_merge, we handle the boring
errors by checking if the entry from the old tree, current index, and
new tree are present, to get a pathname for the error message from one
of them:

	if (oldtree)
		return o->gently ? -1 : reject_merge(oldtree, o);
	if (current)
		return o->gently ? -1 : reject_merge(current, o);
	if (newtree)
		return o->gently ? -1 : reject_merge(newtree, o);
	return -1;

Since this is guarded by 'if (current)', the second test is guaranteed
to succeed.  Moreover, any of the three entries, if present, would
have the same path because there is no rename detection in this code
path.   Even if some day in the future the entries' paths differ, the
'current' path used in the index and worktree would presumably be the
most recognizable for the end user.

Simplify by just using 'current'.

Noticed by coverity, Id:290002

Signed-off-by: Stefan Beller <stefanbeller@gmail.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-13 10:32:08 -07:00
Junio C Hamano 788cef81d4 Merge branch 'nd/split-index'
An experiment to use two files (the base file and incremental
changes relative to it) to represent the index to reduce I/O cost
of rewriting a large index when only small part of the working tree
changes.

* nd/split-index: (32 commits)
  t1700: new tests for split-index mode
  t2104: make sure split index mode is off for the version test
  read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
  read-tree: note about dropping split-index mode or index version
  read-tree: force split-index mode off on --index-output
  rev-parse: add --shared-index-path to get shared index path
  update-index --split-index: do not split if $GIT_DIR is read only
  update-index: new options to enable/disable split index mode
  split-index: strip pathname of on-disk replaced entries
  split-index: do not invalidate cache-tree at read time
  split-index: the reading part
  split-index: the writing part
  read-cache: mark updated entries for split index
  read-cache: save deleted entries in split index
  read-cache: mark new entries for split index
  read-cache: split-index mode
  read-cache: save index SHA-1 after reading
  entry.c: update cache_changed if refresh_cache is set in checkout_entry()
  cache-tree: mark istate->cache_changed on prime_cache_tree()
  cache-tree: mark istate->cache_changed on cache tree update
  ...
2014-07-16 11:25:40 -07:00
Junio C Hamano 3b8e8af187 Merge branch 'jk/xstrfmt'
* jk/xstrfmt:
  setup_git_env(): introduce git_path_from_env() helper
  unique_path: fix unlikely heap overflow
  walker_fetch: fix minor memory leak
  merge: use argv_array when spawning merge strategy
  sequencer: use argv_array_pushf
  setup_git_env: use git_pathdup instead of xmalloc + sprintf
  use xstrfmt to replace xmalloc + strcpy/strcat
  use xstrfmt to replace xmalloc + sprintf
  use xstrdup instead of xmalloc + strcpy
  use xstrfmt in favor of manual size calculations
  strbuf: add xstrfmt helper
2014-07-09 11:34:05 -07:00
Jeremiah Mahler ccdd4a0f3c cleanup duplicate name_compare() functions
We often represent our strings as a counted string, i.e. a pair of
the pointer to the beginning of the string and its length, and the
string may not be NUL terminated to that length.

To compare a pair of such counted strings, unpack-trees.c and
read-cache.c implement their own name_compare() functions
identically.  In addition, the cache_name_compare() function in
read-cache.c is nearly identical.  The only difference is when one
string is the prefix of the other string, in which case
name_compare() returns -1/+1 to show which one is longer, and
cache_name_compare() returns the difference of the lengths to show
the same information.

Unify these three functions by using the implementation from
cache_name_compare().  This does not make any difference to the
existing and future callers, as they must be paying attention only
to the sign of the returned value (and not the magnitude) because
the original implementations of these two functions return values
returned by memcmp(3) when the one string is not a prefix of the
other string, and the only thing memcmp(3) guarantees its callers is
the sign of the returned value, not the magnitude.

Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:12:14 -07:00
Jeff King fa3f60b783 use xstrfmt in favor of manual size calculations
In many parts of the code, we do an ugly and error-prone
malloc like:

  const char *fmt = "something %s";
  buf = xmalloc(strlen(foo) + 10 + 1);
  sprintf(buf, fmt, foo);

This makes the code brittle, and if we ever get the
allocation wrong, is a potential heap overflow. Let's
instead favor xstrfmt, which handles the allocation
automatically, and makes the code shorter and more readable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19 12:25:17 -07:00
Nguyễn Thái Ngọc Duy 078a58e825 read-cache: mark updated entries for split index
The large part of this patch just follows CE_ENTRY_CHANGED
marks. replace_index_entry() is updated to update
split_index->base->cache[] as well so base->cache[] does not reference
to a freed entry.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:40 -07:00
Nguyễn Thái Ngọc Duy 5fc2fc8fa2 read-cache: split-index mode
This split-index mode is designed to keep write cost proportional to
the number of changes the user has made, not the size of the work
tree. (Read cost is another matter, to be dealt separately.)

This mode stores index info in a pair of $GIT_DIR/index and
$GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over
time while "index" is smaller and updated often. Format details are in
index-format.txt, although not everything is implemented in this
patch.

Shared indexes are not automatically removed, because it's unclear if
the shared index is needed by any (even temporary) indexes by just
looking at it. After a while you'll collect stale shared indexes. The
good news is one shared index is useable for long, until
$GIT_DIR/index becomes too big and sluggish that the new shared index
must be created.

The safest way to clean shared indexes is to turn off split index
mode, so shared files are all garbage, delete them all, then turn on
split index mode again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy e93021b20a read-cache: save index SHA-1 after reading
Also update SHA-1 after writing. If we do not do that, the second
read_index() will see "initialized" variable already set and not read
.git/index again, which is fine, except istate->sha1 now has a stale
value.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy d4a2024aef entry.c: update cache_changed if refresh_cache is set in checkout_entry()
Other fill_stat_cache_info() is on new entries, which should set
CE_ENTRY_ADDED in cache_changed, so we're safe.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy a5400efe29 cache-tree: mark istate->cache_changed on cache tree invalidation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy a5c446f116 unpack-trees: be specific what part of the index has changed
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Junio C Hamano 0963008cbf Merge branch 'nd/i18n-progress'
Mark the progress indicators from various time-consuming commands
for i18n/l10n.

* nd/i18n-progress:
  i18n: mark all progress lines for translation
2014-03-14 14:26:31 -07:00
Junio C Hamano d637d1b9a8 Merge branch 'kb/fast-hashmap'
Improvements to our hash table to get it to meet the needs of the
msysgit fscache project, with some nice performance improvements.

* kb/fast-hashmap:
  name-hash: retire unused index_name_exists()
  hashmap.h: use 'unsigned int' for hash-codes everywhere
  test-hashmap.c: drop unnecessary #includes
  .gitignore: test-hashmap is a generated file
  read-cache.c: fix memory leaks caused by removed cache entries
  builtin/update-index.c: cleanup update_one
  fix 'git update-index --verbose --again' output
  remove old hash.[ch] implementation
  name-hash.c: remove cache entries instead of marking them CE_UNHASHED
  name-hash.c: use new hash map implementation for cache entries
  name-hash.c: remove unreferenced directory entries
  name-hash.c: use new hash map implementation for directories
  diffcore-rename.c: use new hash map implementation
  diffcore-rename.c: simplify finding exact renames
  diffcore-rename.c: move code around to prepare for the next patch
  buitin/describe.c: use new hash map implementation
  add a hashtable implementation that supports O(1) removal
  submodule: don't access the .gitmodules cache entry after removing it
2014-02-27 14:01:09 -08:00
Nguyễn Thái Ngọc Duy 754dbc43f0 i18n: mark all progress lines for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 09:08:37 -08:00
Junio C Hamano 4766036ecd Merge branch 'jk/two-way-merge-corner-case-fix' into maint
"git am --abort" sometimes complained about not being able to write
a tree with an 0{40} object in it.

* jk/two-way-merge-corner-case-fix:
  t1005: add test for "read-tree --reset -u A B"
  t1005: reindent
  unpack-trees: fix "read-tree -u --reset A B" with conflicted index
2013-12-17 11:32:17 -08:00
Antoine Pelisse fc2b621454 Prevent buffer overflows when path is too long
Some buffers created with PATH_MAX length are not checked when being
written, and can overflow if PATH_MAX is not big enough to hold the
path.

Replace those buffers by strbufs so that their size is automatically
grown if necessary. They are created as static local variables to avoid
reallocating memory on each call. Note that prefix_filename() returns
this static buffer so each callers should copy or use the string
immediately (this is currently true).

Reported-by: Wataru Noguchi <wnoguchi.0727@gmail.com>
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-16 14:06:19 -08:00
Junio C Hamano 3979580265 Merge branch 'jk/two-way-merge-corner-case-fix'
Fix a rather longstanding corner-case bug in twoway "reset to
there" merge, which is most often seen in "git am --abort".

* jk/two-way-merge-corner-case-fix:
  t1005: add test for "read-tree --reset -u A B"
  t1005: reindent
  unpack-trees: fix "read-tree -u --reset A B" with conflicted index
2013-12-05 12:59:25 -08:00
Karsten Blees 419a597f64 name-hash.c: remove cache entries instead of marking them CE_UNHASHED
The new hashmap implementation supports remove, so really remove unused
cache entries from the name hashmap instead of just marking them.

The CE_UNHASHED flag and CE_STATE_MASK are no longer needed.

Keep the CE_HASHED flag to prevent adding entries twice.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-18 13:04:24 -08:00
Karsten Blees 8b013788a1 name-hash.c: use new hash map implementation for cache entries
Note: the "ce->next = NULL;" in unpack-trees.c::do_add_entry can safely be
removed, as ce->next (now ce->ent.next) is always properly initialized in
name-hash.c::hash_index_entry.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-18 13:04:24 -08:00
Jeff King b018ff6085 unpack-trees: fix "read-tree -u --reset A B" with conflicted index
When we call "read-tree --reset -u HEAD ORIG_HEAD", the first thing we
do with the index is to call read_cache_unmerged.  Originally that
would read the index, leaving aside any unmerged entries.  However, as
of d1a43f2 (reset --hard/read-tree --reset -u: remove unmerged new
paths, 2008-10-15), it actually creates a new cache entry to serve as
a placeholder, so that we later know to update the working tree.

However, we later noticed that the sha1 of that unmerged entry was
just copied from some higher stage, leaving you with random content in
the index.  That was fixed by e11d7b5 ("reset --merge": fix unmerged
case, 2009-12-31), which instead puts the null sha1 into the newly
created entry, and sets a CE_CONFLICTED flag. At the same time, it
teaches the unpack-trees machinery to pay attention to this flag, so
that oneway_merge throws away the current value.

However, it did not update the code paths for twoway_merge, which is
where we end up in the two-way read-tree with --reset. We notice that
the HEAD and ORIG_HEAD versions are the same, and say "oh, we can just
reuse the current version". But that's not true. The current version
is bogus.

Notice this case and make sure we do not keep the bogus entry; either
we do not have that path in the tree we are moving to (i.e. remove
it), or we want to have the cache entry we created for the tree we are
moving to (i.e. resolve by explicitly saying the "newtree" version is
what we want).

[jc: this is from the almost year-old $gmane/212316]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-04 10:13:27 -08:00
Eric Sunshine ebbd7439b1 employ new explicit "exists in index?" API
Each caller of index_name_exists() knows whether it is looking for a
directory or a file, and can avoid the unnecessary indirection of
index_name_exists() by instead calling index_dir_exists() or
index_file_exists() directly.

Invoking the appropriate search function explicitly will allow a
subsequent patch to relieve callers of the artificial burden of having
to add a trailing '/' to the pathname given to index_dir_exists().

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17 10:07:37 -07:00
Felipe Contreras e28f764159 unpack-trees: plug a memory leak
Before overwriting the destination index, first let's discard its
contents.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Tested-by: Лежанкин Иван <abyss.7@gmail.com> wrote:
Reviewed-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-13 14:37:30 -07:00
Nguyễn Thái Ngọc Duy 9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00