1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-06-01 20:36:33 +02:00
Commit Graph

466 Commits

Author SHA1 Message Date
Jeff King 2608c24940 log-tree: make name_decoration hash static
In the previous commit, we made add_name_decoration global
so that adders would not have to access the hash directly.
We now make the hash itself static so that callers _have_ to
add through our function, making sure that all additions go
through a single point.  To do this, we have to add one more
accessor function: a way to lookup entries in the hash.

Since the only caller doesn't actually look at the returned
value, but rather only asks whether there is a decoration or
not, we could provide only a boolean "has_name_decoration".
That would allow us to make "struct name_decoration" local
to log-tree, as well.

However, it's unlikely to cause any maintainability harm
making the actual data public, and this interface is more
flexible if we need to look at decorations from other parts
of the code in the future.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-26 10:34:26 -07:00
Nguyễn Thái Ngọc Duy 429bb40abd pathspec: convert some match_pathspec_depth() to ce_path_match()
This helps reduce the number of match_pathspec_depth() call sites and
show how match_pathspec_depth() is used.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:36:52 -08:00
Junio C Hamano 63763273de Merge branch 'jc/revision-range-unpeel'
"git log --left-right A...B" lost the "leftness" of commits
reachable from A when A is a tag as a side effect of a recent
bugfix.  This is a regression in 1.8.4.x series.

* jc/revision-range-unpeel:
  revision: propagate flag bits from tags to pointees
  revision: mark contents of an uninteresting tree uninteresting
2014-01-27 10:44:10 -08:00
Junio C Hamano a74352867e revision: propagate flag bits from tags to pointees
With the previous fix 895c5ba3 (revision: do not peel tags used in
range notation, 2013-09-19), handle_revision_arg() that processes
command line arguments for the "git log" family of commands no
longer directly places the object pointed by the tag in the pending
object array when it sees a tag object.  We used to place pointee
there after copying the flag bits like UNINTERESTING and
SYMMETRIC_LEFT.

This change meant that any flag that is relevant to later history
traversal must now be propagated to the pointed objects (most often
these are commits) while starting the traversal, which is partly
done by handle_commit() that is called from prepare_revision_walk().
We did propagate UNINTERESTING, but did not do so for others, most
notably SYMMETRIC_LEFT.  This caused "git log --left-right v1.0..."
(where "v1.0" is a tag) to start losing the "leftness" from the
commit the tag points at.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 15:53:51 -08:00
Junio C Hamano 2ac5e4470b revision: mark contents of an uninteresting tree uninteresting
"git rev-list --objects ^A^{tree} B^{tree}" ought to mean "I want a
list of objects inside B's tree, but please exclude the objects that
appear inside A's tree".

we see the top-level tree marked as uninteresting (i.e. ^A^{tree} in
the above example) and call mark_tree_uninteresting() on it; this
unfortunately prevents us from recursing into the tree and marking
the objects in the tree as uninteresting.

The reason why "git log ^A A" yields an empty set of commits,
i.e. we do not have a similar issue for commits, is because we call
mark_parents_uninteresting() after seeing an uninteresting commit.
The uninteresting-ness of the commit itself does not prevent its
parents from being marked as uninteresting.

Introduce mark_tree_contents_uninteresting() and structure the code
in handle_commit() in such a way that it makes it the responsibility
of the callchain leading to this function to mark commits, trees and
blobs as uninteresting, and also make it the responsibility of the
helpers called from this function to mark objects that are reachable
from them.

Note that this is a very old bug that probably dates back to the day
when "rev-list --objects" was introduced.  The line to clear
tree->object.parsed at the end of mark_tree_contents_uninteresting()
can be removed when this fix is merged to the codebase after
6e454b9a (clear parsed flag when we free tree buffers, 2013-06-05).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 15:48:58 -08:00
Junio C Hamano ad70448576 Merge branch 'cc/starts-n-ends-with'
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.

* cc/starts-n-ends-with:
  replace {pre,suf}fixcmp() with {starts,ends}_with()
  strbuf: introduce starts_with() and ends_with()
  builtin/remote: remove postfixcmp() and use suffixcmp() instead
  environment: normalize use of prefixcmp() by removing " != 0"
2013-12-17 12:02:44 -08:00
Christian Couder 5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Junio C Hamano 10167eb251 Merge branch 'jc/ref-excludes'
People often wished a way to tell "git log --branches" (and "git
log --remotes --not --branches") to exclude some local branches
from the expansion of "--branches" (similarly for "--tags", "--all"
and "--glob=<pattern>").  Now they have one.

* jc/ref-excludes:
  rev-parse: introduce --exclude=<glob> to tame wildcards
  rev-list --exclude: export add/clear-ref-exclusion and ref-excluded API
  rev-list --exclude: tests
  document --exclude option
  revision: introduce --exclude=<glob> to tame wildcards
2013-12-05 12:59:09 -08:00
Junio C Hamano c6f1b920ac Merge branch 'nd/literal-pathspecs'
Fixes a regression on 'master' since v1.8.4.

* nd/literal-pathspecs:
  pathspec: stop --*-pathspecs impact on internal parse_pathspec() uses
2013-11-18 14:31:29 -08:00
Junio C Hamano ff32d3420a rev-list --exclude: export add/clear-ref-exclusion and ref-excluded API
... while updating their function signature.  To be squashed into
the initial patch to rev-list.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-01 13:09:24 -07:00
Felipe Contreras 9e57ac55ce revision: trivial style fixes
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-31 13:48:05 -07:00
Junio C Hamano 4cebbe6f55 Merge branch 'nd/magic-pathspec'
All callers to parse_pathspec() must choose between getting no
pathspec or one path that is limited to the current directory
when there is no paths given on the command line, but there were
two callers that violated this rule, triggering a BUG().

* nd/magic-pathspec:
  Fix calling parse_pathspec with no paths nor PATHSPEC_PREFER_* flags
2013-10-30 12:10:33 -07:00
Junio C Hamano 2d99baab2f Merge branch 'jc/revision-range-unpeel'
"git rev-list --objects ^v1.0^ v1.0" gave v1.0 tag itself in the
output, but "git rev-list --objects v1.0^..v1.0" did not.

* jc/revision-range-unpeel:
  revision: do not peel tags used in range notation
2013-10-28 10:43:16 -07:00
Nguyễn Thái Ngọc Duy 4a2d5ae262 pathspec: stop --*-pathspecs impact on internal parse_pathspec() uses
Normally parse_pathspec() is used on command line arguments where it
can do fancy thing like parsing magic on each argument or adding magic
for all pathspecs based on --*-pathspecs options.

There's another use of parse_pathspec(), where pathspec is needed, but
the input is known to be pure paths. In this case we usually don't
want --*-pathspecs to interfere. And we definitely do not want to
parse magic in these paths, regardless of --literal-pathspecs.

Add new flag PATHSPEC_LITERAL_PATH for this purpose. When it's set,
--*-pathspecs are ignored, no magic is parsed. And if the caller
allows PATHSPEC_LITERAL (i.e. the next calls can take literal magic),
then PATHSPEC_LITERAL will be set.

This fixes cases where git chokes when GIT_*_PATHSPECS are set because
parse_pathspec() indicates it won't take any magic. But
GIT_*_PATHSPECS add them anyway. These are

   export GIT_LITERAL_PATHSPECS=1
   git blame -- something
   git log --follow something
   git log --merge

"git ls-files --with-tree=path" (aka parse_pathspec() in
overlay_tree_on_cache()) is safe because the input is empty, and
producing one pathspec due to PATHSPEC_PREFER_CWD does not take any
magic into account.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-28 09:57:36 -07:00
Nguyễn Thái Ngọc Duy c8556c6213 Fix calling parse_pathspec with no paths nor PATHSPEC_PREFER_* flags
When parse_pathspec() is called with no paths, the behavior could be
either return no paths, or return one path that is cwd. Some commands
do the former, some the latter. parse_pathspec() itself does not make
either the default and requires the caller to specify either flag if
it may run into this situation.

I've grep'd through all parse_pathspec() call sites. Some pass
neither, but those are guaranteed never pass empty path to
parse_pathspec(). There are two call sites that may pass empty path
and are fixed with this patch.

[jc: added a test from Antoine's bug report]

Reported-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-22 10:49:43 -07:00
Junio C Hamano 895c5ba3c1 revision: do not peel tags used in range notation
A range notation "A..B" means exactly the same thing as what "^A B"
means, i.e. the set of commits that are reachable from B but not
from A.  But the internal representation after the revision parser
parsed these two notations are subtly different.

 - "rev-list ^A B" leaves A and B in the revs->pending.objects[]
   array, with the former marked as UNINTERESTING and the revision
   traversal machinery propagates the mark to underlying commit
   objects A^0 and B^0.

 - "rev-list A..B" peels tags and leaves A^0 (marked as
   UNINTERESTING) and B^0 in revs->pending.objects[] array before
   the traversal machinery kicks in.

This difference usually does not matter, but starts to matter when
the --objects option is used.  For example, we see this:

    $ git rev-list --objects v1.8.4^1..v1.8.4 | grep $(git rev-parse v1.8.4)
    $ git rev-list --objects v1.8.4 ^v1.8.4^1 | grep $(git rev-parse v1.8.4)
    04f013dc38 v1.8.4

With the former invocation, the revision traversal machinery never
hears about the tag v1.8.4 (it only sees the result of peeling it,
i.e. the commit v1.8.4^0), and the tag itself does not appear in the
output.  The latter does send the tag object itself to the output.

Make the range notation keep the unpeeled objects and feed them to
the traversal machinery to fix this inconsistency.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-15 16:17:09 -07:00
Junio C Hamano f406140baa Merge branch 'fc/at-head'
Instead of typing four capital letters "HEAD", you can say "@" now,
e.g. "git log @".

* fc/at-head:
  Add new @ shortcut for HEAD
  sha1-name: pass len argument to interpret_branch_name()
2013-09-20 12:38:10 -07:00
Junio C Hamano b8f23112f0 Merge branch 'jk/free-tree-buffer'
* jk/free-tree-buffer:
  clear parsed flag when we free tree buffers
2013-09-17 11:37:33 -07:00
Junio C Hamano b02f5aeda6 Merge branch 'jl/submodule-mv'
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
2013-09-09 14:36:15 -07:00
Felipe Contreras cf99a761d3 sha1-name: pass len argument to interpret_branch_name()
This is useful to make sure we don't step outside the boundaries of what
we are interpreting at the moment. For example while interpreting
foobar@{u}~1, the job of interpret_branch_name() ends right before ~1,
but there's no way to figure that out inside the function, unless the
len argument is passed.

So let's do that.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 11:33:00 -07:00
Junio C Hamano e7b432c521 revision: introduce --exclude=<glob> to tame wildcards
People often find "git log --branches" etc. that includes _all_
branches is cumbersome to use when they want to grab most but except
some.  The same applies to --tags, --all and --glob.

Teach the revision machinery to remember patterns, and then upon the
next such a globbing option, exclude those that match the pattern.

With this, I can view only my integration branches (e.g. maint,
master, etc.) without topic branches, which are named after two
letters from primary authors' names, slash and topic name.

    git rev-list --no-walk --exclude=??/* --branches |
    git name-rev --refs refs/heads/* --stdin

This one shows things reachable from local and remote branches that
have not been merged to the integration branches.

    git log --remotes --branches --not --exclude=??/* --branches

It may be a bit rough around the edges, in that the pattern to give
the exclude option depends on what globbing option follows.  In
these examples, the pattern "??/*" is used, not "refs/heads/??/*",
because the globbing option that follows the -"-exclude=<pattern>"
is "--branches".  As each use of globbing option resets previously
set "--exclude", this may not be such a bad thing, though.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 16:37:55 -07:00
Thomas Rast 838f9a1566 log: use true parents for diff when walking reflogs
The reflog walking logic (git log -g) replaces the true parent list
with the preceding commit in the reflog.  This results in bogus commit
diffs when combined with options such as -p; the diff is against the
reflog predecessor, not the parent of the commit.

Save the true parents on the side, extending the functions from the
previous commit.  The diff logic picks them up and uses them to show
the correct diffs.

We do have to be somewhat careful about repeated calling of
save_parents(), since the reflog may list a commit more than once.  We
now store (commit_list*)-1 to distinguish the "not saved yet" and
"root commit" cases.  This lets us preserve an empty parent list even
if save_parents() is repeatedly called.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-05 08:27:00 -07:00
Thomas Rast 53d00b39ce log: use true parents for diff even when rewriting
When using pathspec filtering in combination with diff-based log
output, parent simplification happens before the diff is computed.
The diff is therefore against the *simplified* parents.

This works okay, arguably by accident, in the normal case:
simplification reduces to one parent as long as the commit is TREESAME
to it.  So the simplified parent of any given commit must have the
same tree contents on the filtered paths as its true (unfiltered)
parent.

However, --full-diff breaks this guarantee, and indeed gives pretty
spectacular results when comparing the output of

  git log --graph --stat ...
  git log --graph --full-diff --stat ...

(--graph internally kicks in parent simplification, much like
--parents).

To fix it, store a copy of the parent list before simplification (in a
slab) whenever --full-diff is in effect.  Then use the stored parents
instead of the simplified ones in the commit display code paths.  The
latter do not actually check for --full-diff to avoid duplicated code;
they just grab the original parents if save_parents() has not been
called for this revision walk.

For ordinary commits it should be obvious that this is the right thing
to do.

Merge commits are a bit subtle.  Observe that with default
simplification, merge simplification is an all-or-nothing decision:
either the merge is TREESAME to one parent and disappears, or it is
different from all parents and the parent list remains intact.
Redundant parents are not pruned, so the existing code also shows them
as a merge.

So if we do show a merge commit, the parent list just consists of the
rewrite result on each parent.  Running, e.g., --cc on this in
--full-diff mode is not very useful: if any commits were skipped, some
hunks will disagree with all sides of the merge (with one side,
because commits were skipped; with the others, because they didn't
have those changes in the first place).  This triggers --cc showing
these hunks spuriously.

Therefore I believe that even for merge commits it is better to show
the diffs wrt. the original parents.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-01 10:25:48 -07:00
Nguyễn Thái Ngọc Duy 9a08727443 remove init_pathspec() in favor of parse_pathspec()
While at there, move free_pathspec() to pathspec.c

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy bd1928df1d remove diff_tree_{setup,release}_paths
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 0fdc2ae512 convert some get_pathspec() calls to parse_pathspec()
These call sites follow the pattern:

   paths = get_pathspec(prefix, argv);
   init_pathspec(&pathspec, paths);

which can be converted into a single parse_pathspec() call.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy 9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00
Junio C Hamano 534f0e0996 Merge branch 'jc/topo-author-date-sort'
"git log" learned the "--author-date-order" option, with which the
output is topologically sorted and commits in parallel histories
are shown intermixed together based on the author timestamp.

* jc/topo-author-date-sort:
  t6003: add --author-date-order test
  topology tests: teach a helper to set author dates as well
  t6003: add --date-order test
  topology tests: teach a helper to take abbreviated timestamps
  t/lib-t6000: style fixes
  log: --author-date-order
  sort-in-topological-order: use prio-queue
  prio-queue: priority queue of pointers to structs
  toposort: rename "lifo" field
2013-07-01 12:41:23 -07:00
Junio C Hamano ede63a195c Merge branch 'mh/reflife'
Define memory ownership and lifetime rules for what for-each-ref
feeds to its callbacks (in short, "you do not own it, so make a
copy if you want to keep it").

* mh/reflife: (25 commits)
  refs: document the lifetime of the args passed to each_ref_fn
  register_ref(): make a copy of the bad reference SHA-1
  exclude_existing(): set existing_refs.strdup_strings
  string_list_add_refs_by_glob(): add a comment about memory management
  string_list_add_one_ref(): rename first parameter to "refname"
  show_head_ref(): rename first parameter to "refname"
  show_head_ref(): do not shadow name of argument
  add_existing(): do not retain a reference to sha1
  do_fetch(): clean up existing_refs before exiting
  do_fetch(): reduce scope of peer_item
  object_array_entry: fix memory handling of the name field
  find_first_merges(): remove unnecessary code
  find_first_merges(): initialize merges variable using initializer
  fsck: don't put a void*-shaped peg in a char*-shaped hole
  object_array_remove_duplicates(): rewrite to reduce copying
  revision: use object_array_filter() in implementation of gc_boundary()
  object_array: add function object_array_filter()
  revision: split some overly-long lines
  cmd_diff(): make it obvious which cases are exclusive of each other
  cmd_diff(): rename local variable "list" -> "entry"
  ...
2013-06-14 08:46:14 -07:00
Junio C Hamano b27a79d16b Merge branch 'kb/full-history-compute-treesame-carefully-2'
Major update to the revision traversal logic to improve culling of
irrelevant parents while traversing a mergy history.

* kb/full-history-compute-treesame-carefully-2:
  revision.c: make default history consider bottom commits
  revision.c: don't show all merges for --parents
  revision.c: discount side branches when computing TREESAME
  revision.c: add BOTTOM flag for commits
  simplify-merges: drop merge from irrelevant side branch
  simplify-merges: never remove all TREESAME parents
  t6012: update test for tweaked full-history traversal
  revision.c: Make --full-history consider more merges
  Documentation: avoid "uninteresting"
  rev-list-options.txt: correct TREESAME for P
  t6111: add parents to tests
  t6111: allow checking the parents as well
  t6111: new TREESAME test set
  t6019: test file dropped in -s ours merge
  decorate.c: compact table when growing
2013-06-14 08:45:59 -07:00
Junio C Hamano 81c6b38b67 log: --author-date-order
Sometimes people would want to view the commits in parallel
histories in the order of author dates, not committer dates.

Teach "topo-order" sort machinery to do so, using a commit-info slab
to record the author dates of each commit, and prio-queue to sort
them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-11 15:15:21 -07:00
Junio C Hamano 08f704f294 toposort: rename "lifo" field
The primary invariant of sort_in_topological_order() is that a
parent commit is not emitted until all children of it are.  When
traversing a forked history like this with "git log C E":

    A----B----C
     \
      D----E

we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.

In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.

Most of the time, we would want to see C and B emitted together, and
then E and D, and finally A (i.e. the --topo-order output).  The
"lifo" parameter of the sort_in_topological_order() function is used
to control this behaviour.  We start the traversal by knowing two
commits, C and E.  While keeping in mind that we also need to
inspect E later, we pick C first to inspect, and we notice and
record that B needs to be inspected.  By structuring the "work to be
done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E.

When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.
After inspecting C, we add B to the "work to be done" set, but the
next commit we inspect from the set is E which is newer than B.

The name "lifo", however, is too strongly tied to the way how the
function implements its behaviour, and does not describe what the
behaviour _means_.

Replace this field with an enum rev_sort_order, with two possible
values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
update the existing code.  The mechanical replacement rule is:

  "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
  "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-11 15:15:21 -07:00
Jeff King 6e454b9a31 clear parsed flag when we free tree buffers
Many code paths will free a tree object's buffer and set it
to NULL after finishing with it in order to keep memory
usage down during a traversal. However, out of 8 sites that
do this, only one actually unsets the "parsed" flag back.
Those sites that don't are setting a trap for later users of
the tree object; even after calling parse_tree, the buffer
will remain NULL, causing potential segfaults.

It is not known whether this is triggerable in the current
code. Most commands do not do an in-memory traversal
followed by actually using the objects again. However, it
does not hurt to be safe for future callers.

In most cases, we can abstract this out to a
"free_tree_buffer" helper. However, there are two
exceptions:

  1. The fsck code relies on the parsed flag to know that we
     were able to parse the object at one point. We can
     switch this to using a flag in the "flags" field.

  2. The index-pack code sets the buffer to NULL but does
     not free it (it is freed by a caller). We should still
     unset the parsed flag here, but we cannot use our
     helper, as we do not want to free the buffer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-06 10:29:12 -07:00
Junio C Hamano ed73fe5642 Merge branch 'tr/line-log'
* tr/line-log:
  git-log(1): remove --full-line-diff description
  line-log: fix documentation formatting
  log -L: improve comments in process_all_files()
  log -L: store the path instead of a diff_filespec
  log -L: test merge of parallel modify/rename
  t4211: pass -M to 'git log -M -L...' test
  log -L: fix overlapping input ranges
  log -L: check range set invariants when we look it up
  Speed up log -L... -M
  log -L: :pattern:file syntax to find by funcname
  Implement line-history search (git log -L)
  Export rewrite_parents() for 'log -L'
  Refactor parse_loc
2013-06-02 16:00:44 -07:00
Michael Haggerty 31faeb2088 object_array_entry: fix memory handling of the name field
Previously, the memory management of the object_array_entry::name
field was inconsistent and undocumented.  object_array_entries are
ultimately created by a single function, add_object_array_with_mode(),
which has an argument "const char *name".  This function used to
simply set the name field to reference the string pointed to by the
name parameter, and nobody on the object_array side ever freed the
memory.  Thus, it assumed that the memory for the name field would be
managed by the caller, and that the lifetime of that string would be
at least as long as the lifetime of the object_array_entry.  But
callers were inconsistent:

* Some passed pointers to constant strings or argv entries, which was
  OK.

* Some passed pointers to newly-allocated memory, but didn't arrange
  for the memory ever to be freed.

* Some passed the return value of sha1_to_hex(), which is a pointer to
  a statically-allocated buffer that can be overwritten at any time.

* Some passed pointers to refnames that they received from a
  for_each_ref()-type iteration, but the lifetimes of such refnames is
  not guaranteed by the refs API.

Bring consistency to this mess by changing object_array to make its
own copy for the object_array_entry::name field and free this memory
when an object_array_entry is deleted from the array.

Many callers were passing the empty string as the name parameter, so
as a performance optimization, treat the empty string specially.
Instead of making a copy, store a pointer to a statically-allocated
empty string to object_array_entry::name.  When deleting such an
entry, skip the free().

Change the callers that were already passing copies to
add_object_array_with_mode() to either skip the copy, or (if the
memory needed to be allocated anyway) freeing the memory itself.

A part of this commit effectively reverts

    70d26c6e76 read_revisions_from_stdin: make copies for handle_revision_arg

because the copying introduced by that commit (which is still
necessary) is now done at a deeper level.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:28:46 -07:00
Michael Haggerty be6754c67f revision: use object_array_filter() in implementation of gc_boundary()
Use object_array_filter(), which will soon be made smarter about
cleaning up discarded entries properly.  Also add a function comment.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-28 09:25:01 -07:00
Michael Haggerty ff5f5f268f revision: split some overly-long lines
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-28 09:25:01 -07:00
Michael Haggerty df835d3a0c add_rev_cmdline(): make a copy of the name argument
Instead of assuming that the memory pointed to by the name argument
will live forever, make a local copy of it before storing it in the
ref_cmdline_info.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-28 09:25:00 -07:00
Kevin Bracey 141efdba57 revision.c: make default history consider bottom commits
Previously, the default history treated bottom commits the same as any
other UNINTERESTING commit, which could force it down side branches.

Consider the following history:

   *A--*B---D--*F         * marks !TREESAME parent paths
     \     /*
      `-C-'

When requesting "B..F", B is UNINTERESTING but TREESAME to D. C is
!UNINTERESTING.

So default following would go from D into the irrelevant side branch C
to A, rather than to B.  Note also that if there had been an extra
!UNINTERESTING commit B1 between B and D, it wouldn't have gone down C.

Change the default following to test relevant_commit() instead of
!UNINTERESTING, so it can proceed straight from D to B, thus finishing
the traversal of that path.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:51:10 -07:00
Kevin Bracey bf3418b08b revision.c: don't show all merges for --parents
When using --parents or --children, get_commit_action() previously showed
all merges, even if TREESAME to both parents.

This was intended to tie together the topology of the rewritten parents,
but it was excessive - in fact we only need to show merges that have two
or more relevant parents. Merges at the boundary do not necessarily need
to be shown.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:51:10 -07:00
Kevin Bracey 4d826608e9 revision.c: discount side branches when computing TREESAME
Use the BOTTOM flag to define relevance for pruning. Relevant commits
are those that are !UNINTERESTING or BOTTOM, and this allows us to
identify irrelevant side branches (UNINTERESTING && !BOTTOM).

If a merge has relevant parents, and it is TREESAME to them, then do not
let irrelevant parents cause the merge to be treated as !TREESAME.

When considering simplification, don't always include all merges -
merges with exactly one relevant parent can be simplified, if TREESAME
according to the above rule.

These two changes greatly increase simplification in limited, pruned
revision lists.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:51:10 -07:00
Kevin Bracey 7f34a46ff5 revision.c: add BOTTOM flag for commits
When performing edge-based operations on the revision graph, it can be
useful to be able to identify the INTERESTING graph's connection(s) to
the bottom commit(s) specified by the user.

Conceptually when the user specifies "A..B" (== B ^A), they are asking
for the history from A to B. The first connection from A onto the
INTERESTING graph is part of that history, and should be considered. If
we consider only INTERESTING nodes and their connections, then we're
really only considering the history from A's immediate descendants to B.

This patch does not change behaviour, but adds a new BOTTOM flag to
indicate the bottom commits specified by the user, ready to be used by
following patches.

We immediately use the BOTTOM flag to return collect_bottom_commits() to
its original approach of examining the pending commit list rather than
the command line. This will ensure alignment of the definition of
"bottom" with future patches.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:51:10 -07:00
Kevin Bracey 143f1eafdb simplify-merges: drop merge from irrelevant side branch
Reimplement commit 4b7f53da on top of the new simplify-merges
infrastructure, tightening the condition to only consider root parents;
the original version incorrectly dropped parents that were TREESAME to
anything.

Original log message follows.

The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially.  Namely, in a history with this
shape:

	---o---o---M
	          /
         x---x---x

where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.

That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:51:09 -07:00
Kevin Bracey 9c129eab99 simplify-merges: never remove all TREESAME parents
When simplifying an odd merge, such as one that used "-s ours", we may
find ourselves TREESAME to apparently redundant parents. Prevent
simplify_merges() from removing every TREESAME parent; if this would
happen reinstate the first TREESAME parent - the one that the default
log would have followed.

This avoids producing a totally disjoint history from the default log
when the default log is a better explanation of the end result, and aids
visualisation of odd merges.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:51:09 -07:00
Kevin Bracey d0af663e42 revision.c: Make --full-history consider more merges
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.

While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by "-s
ours", then:

  git log -m -p --full-history -Schange file

would successfully locate "change"'s addition but would not locate the
merge that resolved against it.

Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.

Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.

Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.

This fixes some t6111 failures, but creates a couple of new ones -
we are now showing some merges that don't need to be shown.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:51:09 -07:00
Kevin Bracey a765499a08 revision.c: treat A...B merge bases as if manually specified
The documentation assures users that "A...B" is defined as "A B --not
$(git merge-base --all A B)". This wasn't in fact quite true, because
the calculated merge bases were not sent to add_rev_cmdline().

The main effect of this was that although

  git rev-list --ancestry-path A B --not $(git merge-base --all A B)

worked, the simpler form

  git rev-list --ancestry-path A...B

failed with a "no bottom commits" error.

Other potential users of bottom commits could also be affected by this
problem, if they examine revs->cmdline_info; I came across the issue in
my proposed history traversal refinements series.

So ensure that the calculated merge bases are sent to add_rev_cmdline(),
flagged with new 'whence' enum value REV_CMD_MERGE_BASE.

Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 11:45:34 -07:00
Junio C Hamano e52e6f79cc Merge branch 'nd/pretty-formats'
pretty-printing body of the commit that is stored in non UTF-8
encoding did not work well.  The early part of this series fixes
it.  And then it adds %C(auto) specifier that turns the coloring on
when we are emitting to the terminal, and adds column-aligning
format directives.

* nd/pretty-formats:
  pretty: support %>> that steal trailing spaces
  pretty: support truncating in %>, %< and %><
  pretty: support padding placeholders, %< %> and %><
  pretty: add %C(auto) for auto-coloring
  pretty: split color parsing into a separate function
  pretty: two phase conversion for non utf-8 commits
  utf8.c: add reencode_string_len() that can handle NULs in string
  utf8.c: add utf8_strnwidth() with the ability to skip ansi sequences
  utf8.c: move display_mode_esc_sequence_len() for use by other functions
  pretty: share code between format_decoration and show_decorations
  pretty-formats.txt: wrap long lines
  pretty: get the correct encoding for --pretty:format=%e
  pretty: save commit encoding from logmsg_reencode if the caller needs it
2013-04-23 11:22:48 -07:00
Junio C Hamano 8d41addacb Merge branch 'tr/copy-revisions-from-stdin'
A fix to a long-standing issue in the command line parser for
revisions, which was triggered by mv/sequence-pick-error-diag topic.

* tr/copy-revisions-from-stdin:
  read_revisions_from_stdin: make copies for handle_revision_arg
2013-04-19 13:40:13 -07:00
Nguyễn Thái Ngọc Duy 5a10d23658 pretty: save commit encoding from logmsg_reencode if the caller needs it
The commit encoding is parsed by logmsg_reencode, there's no need for
the caller to re-parse it again. The reencoded message now has the new
encoding, not the original one. The caller would need to read commit
object again before parsing.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-18 16:28:27 -07:00
Thomas Rast 70d26c6e76 read_revisions_from_stdin: make copies for handle_revision_arg
read_revisions_from_stdin() has passed pointers to its read buffer
down to handle_revision_arg() since its inception way back in 42cabc3
(Teach rev-list an option to read revs from the standard input.,
2006-09-05).  Even back then, this was a bug: through
add_pending_object, the argument was recorded in the object_array's
'name' field.

Fix it by making a copy whenever read_revisions_from_stdin() passes an
argument down the callchain.  The other caller runs handle_revision_arg()
on argv[], where it would be redundant to make a copy.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-16 11:17:48 -07:00