mirror of https://github.com/git/git.git synced 2024-06-02 14:06:12 +02:00

1036 Commits

Author SHA1 Message Date
Junio C Hamano b76a9e1648 Merge branch 'ap/maint-diff-rename-avoid-overlap' into maint
* ap/maint-diff-rename-avoid-overlap:
  tests: make sure rename pretty print works
  diff: prevent pprint_rename from underrunning input
  diff: Fix rename pretty-print when suffix and prefix overlap
2013-04-01 09:19:47 -07:00
Junio C Hamano caf217a3b8 Merge branch 'ap/maint-diff-rename-avoid-overlap'
The logic used by "git diff -M --stat" to shorten the names of
files before and after a rename did not work correctly when the
common prefix and suffix between the two filenames overlapped.

* ap/maint-diff-rename-avoid-overlap:
  tests: make sure rename pretty print works
  diff: prevent pprint_rename from underrunning input
  diff: Fix rename pretty-print when suffix and prefix overlap
2013-03-25 14:00:37 -07:00
Max Nanasy c9fc4415e2 diff.c: diff.renamelimit => diff.renameLimit in message
In the warning message printed when rename or unmodified copy
detection was skipped due to too many files, change "diff.renamelimit"
to "diff.renameLimit", in order to make it consistent with git
documentation, which consistently uses "diff.renameLimit".

Signed-off-by: Max Nanasy <max.nanasy@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-21 14:06:49 -07:00
Thomas Rast dd281f09b7 diff: prevent pprint_rename from underrunning input
The logic described in d020e27 (diff: Fix rename pretty-print when
suffix and prefix overlap, 2013-02-23) is wrong: The proof in the
comment is valid only if both strings are the same length.  *One* of
old/new can reach a-1 (b-1, resp.) if 'a' is a suffix of 'b' (or vice
versa).

Since the intent was to let the loop run down to the '/' at the end of
the common prefix, fix it by making that distinction explicit: if
there is no prefix, allow no underrun.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-26 13:01:34 -08:00
Antoine Pelisse d020e27fda diff: Fix rename pretty-print when suffix and prefix overlap
When considering a rename for two files that have a suffix and a prefix
that can overlap, a confusing line is shown. As an example, renaming
"a/b/b/c" to "a/b/c" shows "a/b/{ => }/b/c".

Currently, what we do is calculate the common prefix ("a/b/"), and the
common suffix ("/b/c"), but the same "/b/" is actually counted both in
prefix and suffix. Then when calculating the size of the non-common part,
we end-up with a negative value which is reset to 0, thus the "{ => }".

Do not allow the common suffix to overlap the common prefix and stop
when reaching a "/" that would be in both.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-23 23:52:39 -08:00
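The prefix/suffix logic described in d020e27, together with the underrun guard from dd281f09b7, can be sketched as follows. This is a Python illustration, not the C code in diff.c; the function name mirrors pprint_rename for readability, and path quoting is left out entirely.

```python
def pprint_rename(a, b):
    """Shorten a rename "a => b" to the form pfx{mid-a => mid-b}sfx."""
    # Common prefix: counted only up to (and including) the last shared '/'.
    pfx = 0
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        if a[i] == "/":
            pfx = i + 1
        i += 1

    # Common suffix: scan backwards, recording each shared '/'.
    # With a prefix, the scan may revisit the '/' that ends it (index
    # pfx - 1) but go no further; without one it must stop at index 0.
    floor = pfx - 1 if pfx else 0
    sfx = 0
    ia, ib = len(a) - 1, len(b) - 1
    while ia >= floor and ib >= floor and a[ia] == b[ib]:
        if a[ia] == "/":
            sfx = len(a) - ia
        ia -= 1
        ib -= 1

    # A negative middle length means prefix and suffix share one '/'.
    a_mid = max(len(a) - pfx - sfx, 0)
    b_mid = max(len(b) - pfx - sfx, 0)
    old_mid = a[pfx:pfx + a_mid]
    new_mid = b[pfx:pfx + b_mid]
    if pfx + sfx:
        return a[:pfx] + "{" + old_mid + " => " + new_mid + "}" + a[len(a) - sfx:]
    return old_mid + " => " + new_mid


print(pprint_rename("a/b/b/c", "a/b/c"))  # a/b/{b => }/c
print(pprint_rename("c", "x/c"))          # c => x/c
```

With the `floor` bound removed, the backward scan for "a/b/b/c" keeps going into the prefix and yields the confusing "a/b/{ => }/b/c" from the commit message.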
Junio C Hamano abea4dc76a Merge branch 'mp/diff-algo-config'
Add diff.algorithm configuration so that the user does not have to
type "diff --histogram".

* mp/diff-algo-config:
  diff: Introduce --diff-algorithm command line option
  config: Introduce diff.algorithm variable
  git-completion.bash: Autocomplete --minimal and --histogram for git-diff
2013-02-17 15:25:52 -08:00
Junio C Hamano a1d68bea89 Merge branch 'jk/diff-graph-cleanup'
Refactors many repetitive code sequences out of the graph drawing
code and adds the line prefix to the combined diff output.

* jk/diff-graph-cleanup:
  combine-diff.c: teach combined diffs about line prefix
  diff.c: use diff_line_prefix() where applicable
  diff: add diff_line_prefix function
  diff.c: make constant string arguments const
  diff: write prefix to the correct file
  graph: output padding for merge subsequent parents
2013-02-14 10:29:59 -08:00
John Keeping 30997bb8f1 diff.c: use diff_line_prefix() where applicable
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
John Keeping f192223447 diff: add diff_line_prefix function
This is a helper function to call the diff output_prefix function and
return its value as a C string, allowing us to greatly simplify
everywhere that needs to get the output prefix.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
John Keeping 32b367e444 diff.c: make constant string arguments const
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
John Keeping 3bf25c23cd diff: write prefix to the correct file
Write the prefix for an output line to the same file as the actual
content.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
Michal Privoznik 07924d4d50 diff: Introduce --diff-algorithm command line option
Since command line options have higher priority than config file
variables, and taking the previous commit into account, we need a way
to specify the myers algorithm on the command line. However,
inventing `--myers` is not the right answer. We need a far more
general option, and that is `--diff-algorithm`.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 09:41:18 -08:00
Michal Privoznik 07ab4dec80 config: Introduce diff.algorithm variable
Some users or projects prefer different algorithms over others, e.g.
patience over myers or similar. However, specifying the appropriate
argument every time diff is used is impractical. Moreover,
creating an alias doesn't play nicely with other tools based on diff
(git-show for instance). Hence, a configuration variable that
sets a specific algorithm is needed. For now, these four values are
accepted: 'myers' (which has the same effect as not setting the config
variable at all), 'minimal', 'patience' and 'histogram'.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 09:37:45 -08:00
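The precedence described in the two commits above (command line over diff.algorithm, with myers as the fallback) can be sketched like this. The function and its arguments are hypothetical, and raising an error on an unknown value is an assumption made for the sketch, not a statement about how git reports it.

```python
VALID_ALGORITHMS = ("myers", "minimal", "patience", "histogram")


def resolve_diff_algorithm(config_value=None, cli_value=None):
    """Pick the algorithm: command line first, then config, then myers."""
    for source, value in (("--diff-algorithm", cli_value),
                          ("diff.algorithm", config_value)):
        if value is None:
            continue
        if value not in VALID_ALGORITHMS:
            # Assumption for this sketch: unknown names are rejected.
            raise ValueError("unknown value for %s: %r" % (source, value))
        return value
    return "myers"


# A project configured for patience can still ask for myers explicitly.
print(resolve_diff_algorithm(config_value="patience", cli_value="myers"))
```

This also shows why a dedicated `--myers` flag would not generalize: one option that takes the algorithm name covers every value the config variable accepts.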
Junio C Hamano 90d0b8a9f0 Merge branch 'jc/blame-no-follow'
Teaches "--no-follow" option to "git blame" to disable its
whole-file rename detection.

* jc/blame-no-follow:
  blame: pay attention to --no-follow
  diff: accept --no-follow option
2013-01-14 08:15:51 -08:00
Junio C Hamano a4eab8f38e Merge branch 'lt/diff-stat-show-0-lines'
"git diff --stat" miscounted the total number of changed lines when
binary files were involved and hidden beyond --stat-count.  It also
miscounted the total number of changed files when there were
unmerged paths.

* lt/diff-stat-show-0-lines:
  t4049: refocus tests
  diff --shortstat: do not count "unmerged" entries
  diff --stat: do not count "unmerged" entries
  diff --stat: move the "total count" logic to the last loop
  diff --stat: use "file" temporary variable to refer to data->files[i]
  diff --stat: status of unmodified pair in diff-q is not zero
  test: add failing tests for "diff --stat" to t4049
2012-11-29 12:53:54 -08:00
Junio C Hamano 20c8cde456 diff --shortstat: do not count "unmerged" entries
Fix the same issue as the previous one for "git diff --stat";
unmerged entries were counted twice.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 14:19:36 -08:00
Junio C Hamano 82dfc2c44e diff --stat: do not count "unmerged" entries
Even though we show a separate *UNMERGED* entry in the patch and
diffstat output (or in the --raw format, for that matter) in
addition to and separately from the diff against the specified stage
(defaulting to #2) for unmerged paths, they should not be counted in
the total number of files affected---that would lead to counting the
same path twice.

The separation done by the previous step makes this fix simple and
straightforward.  Among the filepairs in diff_queue, paths that
weren't modified and the extra "unmerged" entries do not count
toward the total number of files.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
Junio C Hamano a20d3c0de1 diff --stat: move the "total count" logic to the last loop
The diffstat generation logic, with --stat-count limit, is
implemented as three loops.

 - The first counts the width necessary to show stats up to
   specified number of entries, and notes up to how many entries in
   the data we need to iterate to show the graph;

 - The second iterates that many times to draw the graph, adjusts
   the number of "total modified files", and counts the total
   added/deleted lines for the part that was shown in the graph;

 - The third iterates over the remainder and only does the part to
   count "total added/deleted lines" and to adjust "total modified
   files" without drawing anything.

Move the logic to count added/deleted lines and modified files from
the second loop to the third loop.

This incidentally fixes a bug.  The third loop was not filtering
binary changes (counted in bytes) from the total added/deleted as it
should.  The second loop implemented this correctly, so if a binary
change appeared earlier than the --stat-count cutoff, the code
counted number of added/deleted lines correctly, but if it appeared
beyond the cutoff, the number of lines would have mixed with the
byte count in the buggy third loop.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
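A rough sketch of the totals pass after this change: a single loop over every entry, independent of the --stat-count cutoff, that skips byte counts from binary files and the extra "unmerged" entries (the latter per 82dfc2c44e). The FileStat record and its field names are invented for illustration, and the handling of unmodified pairs is left out.

```python
from dataclasses import dataclass


@dataclass
class FileStat:
    name: str
    added: int = 0
    deleted: int = 0
    is_binary: bool = False    # added/deleted hold byte counts when True
    is_unmerged: bool = False  # the extra "unmerged" entry for a path


def diffstat_totals(files):
    """Return (files changed, lines added, lines deleted) over all entries."""
    total_files = total_added = total_deleted = 0
    for f in files:
        if f.is_unmerged:
            continue            # same path is counted via its real entry
        total_files += 1
        if f.is_binary:
            continue            # bytes must not be mixed into line totals
        total_added += f.added
        total_deleted += f.deleted
    return total_files, total_added, total_deleted


files = [
    FileStat("a.c", 3, 1),
    FileStat("img.png", 2048, 1024, is_binary=True),
    FileStat("b.c", is_unmerged=True),
    FileStat("b.c", 2, 2),
]
print(diffstat_totals(files))  # (3, 5, 3)
```

Because the totals no longer depend on which loop an entry falls into, a binary file beyond the cutoff is treated the same as one before it.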
Junio C Hamano af0ed819c5 diff --stat: use "file" temporary variable to refer to data->files[i]
The generated code shouldn't change but it is easier to read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
Junio C Hamano 99bfd40700 diff --stat: status of unmodified pair in diff-q is not zero
It is spelled DIFF_STATUS_UNKNOWN these days, and is different from zero.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
Junio C Hamano be95387af2 Merge branch 'rr/submodule-diff-config'
Allow "git diff --submodule=log" to be set as the default via
configuration.

* rr/submodule-diff-config:
  submodule: display summary header in bold
  diff: rename "set" variable
  diff: introduce diff.submodule configuration variable
  Documentation: move diff.wordRegex from config.txt to diff-config.txt
2012-11-25 18:44:50 -08:00
Junio C Hamano 76c39289ba Merge branch 'lt/diff-stat-show-0-lines'
We failed to mention a file without any content change but whose
permission bit was modified, or (worse yet) a new file without any
content in the "git diff --stat" output.

* lt/diff-stat-show-0-lines:
  Fix "git diff --stat" for interesting - but empty - file changes
2012-11-25 18:44:06 -08:00
Ramkumar Ramachandra 4e215131d2 submodule: display summary header in bold
Currently, 'git diff --submodule' displays output with a bold diff
header for non-submodules.  So this part is in bold:

    diff --git a/file1 b/file1
    index 30b2f6c..2638038 100644
    --- a/file1
    +++ b/file1

For submodules, the header looks like this:

    Submodule submodule1 012b072..248d0fd:

Unfortunately, it's easy to miss in the output because it's not bold.
Change this.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-18 19:18:13 -08:00
Jeff King d9c552f17a diff: rename "set" variable
Once upon a time the builtin_diff function used one color, and the color
variables were called "set" and "reset". Nowadays it is a much longer
function and we use several colors (e.g., "add", "del"). Rename "set" to
"meta" to show that it is the color for showing diff meta-info (it still
does not indicate that it is a "color", but at least it matches the
scheme of the other color variables).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-18 19:18:13 -08:00
Ramkumar Ramachandra c47ef57caa diff: introduce diff.submodule configuration variable
Introduce a diff.submodule configuration variable corresponding to the
'--submodule' command-line option of 'git diff'.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-18 19:18:13 -08:00
Jeff King 19fb613695 Merge branch 'nd/builtin-to-libgit'
Code cleanups so that libgit.a does not depend on anything in the
builtin/ directory.

* nd/builtin-to-libgit:
  fetch-pack: move core code to libgit.a
  fetch-pack: remove global (static) configuration variable "args"
  send-pack: move core code to libgit.a
  Move setup_diff_pager to libgit.a
  Move print_commit_list to libgit.a
  Move estimate_bisect_steps to libgit.a
  Move try_merge_command and checkout_fast_forward to libgit.a
2012-11-09 12:51:06 -05:00
Jeff King 8736c9010c Merge branch 'mh/maint-parse-dirstat-fix'
Cleans up some code and avoids a potential bug.

* mh/maint-parse-dirstat-fix:
  parse_dirstat_params(): use string_list to split comma-separated string
2012-11-09 12:42:21 -05:00
Nguyễn Thái Ngọc Duy 4914c9629c Move setup_diff_pager to libgit.a
This is used by diff-no-index.c, part of libgit.a while it stays in
builtin/diff.c. Move it to diff.c so that we won't get undefined
reference if a program that uses libgit.a happens to pull it in.

While at it, move check_pager from git.c to pager.c. It makes more
sense there and pager.c is also part of libgit.a

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 03:08:30 -04:00
Michael Haggerty 02e8ca0e50 parse_dirstat_params(): use string_list to split comma-separated string
Use string_list_split_in_place() to split the comma-separated
parameters string.  This simplifies the code and also fixes a bug: the
old code made calls like

    memcmp(p, "lines", p_len)

which needn't work if p_len is different than the length of the
constant string (and could illegally access memory if p_len is larger
than the length of the constant string).

When p_len was less than the length of the constant string, the old
code would have allowed some abbreviations to be accepted (e.g., "cha"
for "changes") but this seems to have been a bug rather than a
feature, because (1) it is not documented; (2) no attempt was made to
handle ambiguous abbreviations, like "c" for "changes" vs
"cumulative".

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 02:52:41 -04:00
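The abbreviation problem in 02e8ca0e50 can be illustrated with a small Python sketch. The first helper imitates a length-limited memcmp() comparison; the second splits on commas and compares whole tokens. Both function names are made up, the keyword tuple is only the subset named in the commit message, and numeric parameters are ignored.

```python
KNOWN_PARAMS = ("changes", "lines", "cumulative")  # illustrative subset


def old_style_match(param, keyword):
    # Imitates memcmp(p, keyword, p_len) == 0 for p_len <= len(keyword):
    # only the first p_len bytes are compared. (In C, a larger p_len would
    # additionally read past the end of the constant string.)
    return keyword[:len(param)] == param


def parse_dirstat_params(params):
    """Split on commas and accept only exact keyword matches."""
    accepted, rejected = [], []
    for token in params.split(","):
        if token in KNOWN_PARAMS:
            accepted.append(token)
        else:
            rejected.append(token)
    return accepted, rejected


# "c" is ambiguous under prefix matching: it matches two keywords.
print([k for k in KNOWN_PARAMS if old_style_match("c", k)])
print(parse_dirstat_params("lines,cha"))
```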
Linus Torvalds 74faaa16f0 Fix "git diff --stat" for interesting - but empty - file changes
The behavior of "git diff --stat" is rather odd for files that have
zero lines of changes: it will discount them entirely unless they were
renames.

Which means that the stat output will simply not show files that only
had "other" changes: they were created or deleted, or their mode was
changed.

Now, those changes do show up in the summary, but so do renames, so
the diffstat logic is inconsistent. Why does it show renames with zero
lines changed, but not mode changes or added files with zero lines
changed?

So change the logic to not check for "is_renamed", but for
"is_interesting" instead, where "interesting" is judged to be any
action but a pure data change (because a pure data change with zero
data changed really isn't worth showing, if we ever get one in our
diffpairs).

So if you did

   chmod +x Makefile
   git diff --stat

before, it would show empty (" 0 files changed"), with this it shows

 Makefile | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

which I think is a more correct diffstat (and then with "--summary" it
shows *what* the metadata change to Makefile was - this is completely
consistent with our handling of renamed files).

Side note: the old behavior was *really* odd. With no changes at all,
"git diff --stat" output was empty. With just a chmod, it said "0
files changed". No way is our legacy behavior sane.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-17 11:50:50 -07:00
Jeff Muizelaar 6468a4e548 diff: diff.context configuration gives default to -U
Introduce a configuration variable diff.context that tells
Porcelain commands to use a non-default number of context
lines instead of 3 (the default).  With this variable, users
do not have to keep repeating "git log -U8" from the command
line; instead, it becomes sufficient to say "git config
diff.context 8" just once.

Signed-off-by: Jeff Muizelaar <jmuizelaar@mozilla.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-30 20:16:01 -07:00
Junio C Hamano aebbcf5797 diff: accept --no-follow option
Once you do

	$ alias glogone git log --follow

there is no way to say

	$ glogone --no-follow ...

Not that "log --follow" is all that useful, but it is cheap to
support the common "you can defeat an undesirable option with a
'no-' variant of it later on the command line" pattern.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-21 13:49:18 -07:00
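The "defeat an option with its 'no-' variant later on the command line" pattern amounts to last-one-wins parsing. A minimal sketch, with a hypothetical helper name and covering only this pair of options:

```python
def follow_enabled(args):
    """Return the effective --follow setting; the last occurrence wins."""
    follow = False
    for arg in args:
        if arg == "--follow":
            follow = True
        elif arg == "--no-follow":
            follow = False
    return follow


# An alias that bakes in --follow can still be overridden by the caller.
print(follow_enabled(["--follow", "--no-follow"]))  # False
```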
Junio C Hamano 8ef2794ba8 Merge branch 'nd/maint-diffstat-summary' into maint
* nd/maint-diffstat-summary:
  Revert diffstat back to English
2012-09-20 15:55:31 -07:00
Junio C Hamano 06e211acc6 Merge branch 'jc/make-static'
Turn many file-scope private symbols to static to reduce the
global namespace contamination.

* jc/make-static:
  sequencer.c: mark a private file-scope symbol as static
  ident.c: mark private file-scope symbols as static
  trace.c: mark a private file-scope symbol as static
  wt-status.c: mark a private file-scope symbol as static
  read-cache.c: mark a private file-scope symbol as static
  strbuf.c: mark a private file-scope symbol as static
  sha1-array.c: mark a private file-scope symbol as static
  symlinks.c: mark private file-scope symbols as static
  notes.c: mark a private file-scope symbol as static
  rerere.c: mark private file-scope symbols as static
  graph.c: mark private file-scope symbols as static
  diff.c: mark a private file-scope symbol as static
  commit.c: mark a file-scope private symbol as static
  builtin/notes.c: mark file-scope private symbols as static
2012-09-18 14:37:46 -07:00
Junio C Hamano 9e40b6e595 Merge branch 'nd/maint-diffstat-summary'
Earlier we made the diffstat summary line that shows the number of
lines added/deleted localizable, but it was found irritating having
to see them in various languages on a list whose discussion language
is English.

The original had a trivial thinko in reverting Q_(), which has been
fixed.

* nd/maint-diffstat-summary:
  Revert diffstat back to English
2012-09-17 15:57:22 -07:00
Junio C Hamano d2aea1371b diff.c: mark a private file-scope symbol as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 22:58:20 -07:00
Nguyễn Thái Ngọc Duy 218adaaaa0 Revert diffstat back to English
This reverts the i18n part of 7f81463 (Use correct grammar in diffstat
summary line - 2012-02-01) but still keeps the grammar correctness for
English. It also reverts b354f11 (Fix tests under GETTEXT_POISON on
diffstat - 2012-08-27). The result is diffstat always in English
for all commands.

This helps stop users from accidentally sending localized
format-patch'd patches.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-14 09:52:16 -07:00
Junio C Hamano 1c88a6d174 Sync with 1.7.11.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-11 11:23:54 -07:00
Junio C Hamano d9b983fc26 Merge branch 'ab/diff-write-incomplete-line' into maint-1.7.11
* ab/diff-write-incomplete-line:
  Fix '\ No newline...' annotation in rewrite diffs
2012-09-11 11:08:30 -07:00
Junio C Hamano 738c218760 Merge branch 'tr/void-diff-setup-done' into maint-1.7.11
* tr/void-diff-setup-done:
  diff_setup_done(): return void
2012-09-11 10:53:40 -07:00
Junio C Hamano e3f26752b5 Merge branch 'maint-1.7.11' into maint
* maint-1.7.11:
  Almost 1.7.11.6
  gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO
  rebase -i: use full onto sha1 in reflog
  sh-setup: protect from exported IFS
  receive-pack: do not leak output from auto-gc to standard output
  t/t5400: demonstrate breakage caused by informational message from prune
  setup: clarify error messages for file/revisions ambiguity
  send-email: improve RFC2047 quote parsing
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-09-10 15:31:06 -07:00
Junio C Hamano 03adeeaad6 Merge branch 'jk/maint-null-in-trees' into maint-1.7.11
"git diff" had a confusion between taking data from a path in the
working tree and taking data from an object that happens to have
name 0{40} recorded in a tree.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-09-10 15:24:54 -07:00
Junio C Hamano e6daf0ac22 Merge branch 'ab/diff-write-incomplete-line'
The output from "git diff -B" for a file that ends with an
incomplete line did not put "\ No newline..." on a line of its own.

* ab/diff-write-incomplete-line:
  Fix '\ No newline...' annotation in rewrite diffs
2012-08-27 11:54:46 -07:00
Junio C Hamano 3b753148b6 Merge branch 'jk/maint-null-in-trees'
We do not want a link to 0{40} object stored anywhere in our objects.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-08-27 11:54:28 -07:00
Junio C Hamano 9cd33bbc52 Merge branch 'tr/void-diff-setup-done'
Remove unnecessary code.

* tr/void-diff-setup-done:
  diff_setup_done(): return void
2012-08-22 11:52:27 -07:00
Adam Butcher 35e2d03c2c Fix '\ No newline...' annotation in rewrite diffs
When a file that ends with an incomplete line is expressed as a
complete rewrite with the -B option, git diff incorrectly
appends the incomplete line indicator "\ No newline at end of
file" after such a line, rather than writing it on a line of its
own (the output codepath for normal output without -B does not
have this problem).  Add a LF after the incomplete line before
writing the "\ No newline ..." out to fix this.

Add a couple of tests to confirm that the indicator comment is
generated on its own line in both plain diff and rewrite mode.

Signed-off-by: Adam Butcher <dev.lists@jessamine.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-05 12:37:52 -07:00
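The fix in 35e2d03c2c can be pictured with a sketch of how one side of a rewrite diff is emitted: every line gets its sign, and an incomplete final line is terminated with a LF before the indicator is written. The helper name is hypothetical, and the real code works on buffers rather than Python strings.

```python
def emit_rewrite_side(data, sign):
    """Prefix each line of data with sign ('-' or '+') for a rewrite diff."""
    lines = data.split("\n")
    incomplete = lines.pop()  # text after the last LF; "" if data ends in LF
    out = [sign + line + "\n" for line in lines]
    if incomplete:
        # Terminate the incomplete line first so that the indicator
        # lands on a line of its own.
        out.append(sign + incomplete + "\n")
        out.append("\\ No newline at end of file\n")
    return "".join(out)


print(emit_rewrite_side("a\nb", "-"), end="")
```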
Thomas Rast 28452655af diff_setup_done(): return void
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09).  The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.

Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.

Note that the function can still die().

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-03 12:11:07 -07:00
Junio C Hamano 97c7934049 Merge branch 'nd/maint-i18n-diffstat'
* nd/maint-i18n-diffstat:
  i18n: leave \n out of translated diffstat
2012-07-31 09:43:07 -07:00
Junio C Hamano 70f6be7aa9 Merge branch 'jv/maint-no-ext-diff' into maint
"git diff --no-ext-diff" did not output anything for a typechange
filepair when GIT_EXTERNAL_DIFF is in effect.

* jv/maint-no-ext-diff:
  diff: test precedence of external diff drivers
  diff: correctly disable external_diff with --no-ext-diff
2012-07-30 13:04:59 -07:00
Jeff King e54501004a diff: do not use null sha1 as a sentinel value
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).

The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.

We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.

This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.

One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-29 15:04:32 -07:00
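The sentinel confusion described in e54501004a comes down to who decides validity. A sketch, using plain dictionaries in place of the diff_filespec struct and two invented constructor names: the first infers validity from the sha1 value, the second takes it from the caller.

```python
NULL_SHA1 = "0" * 40


def filespec_inferred(sha1, mode):
    # Validity inferred from the value: the null sha1 doubles as a sentinel.
    return {"sha1": sha1, "sha1_valid": sha1 != NULL_SHA1, "mode": mode}


def filespec_explicit(sha1, sha1_valid, mode):
    # Validity stated by the caller, independent of the sha1 value.
    return {"sha1": sha1, "sha1_valid": sha1_valid, "mode": mode}


def content_source(spec):
    """Where a patch-producing diff would read the content from."""
    return "object" if spec["sha1_valid"] else "working tree"


# A corrupt tree entry carrying the null sha1:
print(content_source(filespec_inferred(NULL_SHA1, 0o100644)))
print(content_source(filespec_explicit(NULL_SHA1, True, 0o100644)))
```

With the inferred form, the corrupt entry is mistaken for a working tree file; with the explicit flag it stays an object reference, so "--raw" can still report it even though no content exists to diff.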
Nguyễn Thái Ngọc Duy 8212333012 i18n: leave \n out of translated diffstat
GETTEXT_POISON scrapes everything in translated strings, including \n.
t4205.12 however needs this \n in matching the end result. Keep this
\n out of translation to make t4205.12 happy.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-26 10:48:02 -07:00
Junio C Hamano 7ccb945973 Merge branch 'jv/maint-no-ext-diff'
"git diff --no-ext-diff" did not output anything for a typechange
filepair when GIT_EXTERNAL_DIFF is in effect.

* jv/maint-no-ext-diff:
  diff: test precedence of external diff drivers
  diff: correctly disable external_diff with --no-ext-diff
2012-07-23 20:56:03 -07:00
Junio C Hamano 106ef55f3a Merge branch 'jc/refactor-diff-stdin' into maint
"git diff", "git status" and anything that internally uses the
comparison machinery was utterly broken when the difference
involved a file with "-" as its name.  This was due to the way "git
diff --no-index" was incorrectly bolted on to the system, making
any comparison that involves a file "-" at the root level
incorrectly read from the standard input.

* jc/refactor-diff-stdin:
  diff-index.c: "git diff" has no need to read blob from the standard input
  diff-index.c: unify handling of command line paths
  diff-index.c: do not pretend paths are pathspecs
2012-07-22 13:01:23 -07:00
Junio C Hamano bd8c1a9b49 diff: correctly disable external_diff with --no-ext-diff
Upon seeing a type-change filepair, "diff --no-ext-diff" does not
show the usual "deletion followed by addition" split patch and does
not run the external diff driver either.

This is because the logic to disable external diff was placed at a
wrong level in the callchain.  run_diff_cmd() decides to show the
split patch only when external diff driver is not configured or
specified via GIT_EXTERNAL_DIFF environment, but this is done before
checking if --no-ext-diff was given.  To make things worse,
run_diff_cmd() checks --no-ext-diff and disables the output for such
a filepair completely, as the callchain below it (e.g. builtin_diff)
does not want to handle typechange filepairs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-17 22:51:11 -07:00
Junio C Hamano d7afe648dc Merge branch 'jc/refactor-diff-stdin'
Due to the way "git diff --no-index" is bolted onto by touching the
low level code that is shared with the rest of the "git diff" code,
even though it has to work in a very different way, any comparison
that involves a file "-" at the root level incorrectly tried to read
from the standard input.  This cleans up the no-index codepath
further to remove code that reads from the standard input from the
core side, which is never necessary when git is running its usual
diff operation.

* jc/refactor-diff-stdin:
  diff-index.c: "git diff" has no need to read blob from the standard input
  diff-index.c: unify handling of command line paths
  diff-index.c: do not pretend paths are pathspecs
2012-07-13 15:38:05 -07:00
Junio C Hamano 4682d8521c diff-index.c: "git diff" has no need to read blob from the standard input
Only "diff --no-index -" does.  Bolting the logic into the low-level
function diff_populate_filespec() was a layering violation from day
one.  Move populate_from_stdin() function out of the generic diff.c
to its only user, diff-index.c.

Also make sure "-" from the command line stays a special token "read
from the standard input", even if we later decide to sanitize the
result from prefix_filename() function in a few obvious ways,
e.g. removing unnecessary "./" prefix, duplicated slashes "//" in
the middle, etc.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-28 16:18:19 -07:00
Junio C Hamano 0b6e913c8b Merge branch 'as/diff-shortstat-ignore-binary'
# By Alexander Strasser
* as/diff-shortstat-ignore-binary:
  diff: Only count lines in show_shortstats
2012-06-15 15:00:53 -07:00
Alexander Strasser de9658b511 diff: Only count lines in show_shortstats
Do not mix byte and line counts. Binary files have byte counts;
skip them when accumulating line insertions/deletions.

The regression was introduced in e18872b.

Signed-off-by: Alexander Strasser <eclipse7@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-15 15:00:04 -07:00
Junio C Hamano fc1320bfe2 Merge branch 'zj/diff-empty-chmod'
"git diff --stat" used to fully count a binary file with modified
execution bits whose contents are unmodified, which was not right.

By Zbigniew Jędrzejewski-Szmek (4) and Johannes Sixt (1)
* zj/diff-empty-chmod:
  t4006: Windows do not have /dev/zero
  diff --stat: do not run diff on indentical files
  diff --stat: report mode-only changes for binary files like text files
  tests: check --[short]stat output after chmod
  test: modernize style of t4006

Conflicts:
	diff.c
2012-05-07 13:29:08 -07:00
Junio C Hamano 29c2a3dbad Merge branch 'zj/diff-stat-smaller-num-columns'
Spend only the minimum number of columns necessary to show the number of lines
in the output from "diff --stat", instead of always allocating 4 columns
even when showing changes that are much smaller than 1000 lines.

By Zbigniew Jędrzejewski-Szmek
* zj/diff-stat-smaller-num-columns:
  diff --stat: use less columns for change counts
2012-05-02 13:53:28 -07:00
Junio C Hamano 73ff8cf784 Merge branch 'lp/diffstat-with-graph'
"log --graph" was not very friendly with "--stat" option and its output
had line breaks at wrong places.

By Lucian Poston (5) and Zbigniew Jędrzejewski-Szmek (2)
* lp/diffstat-with-graph:
  t4052: work around shells unable to set COLUMNS to 1
  Prevent graph_width of stat width from falling below min
  t4052: Test diff-stat output with minimum columns
  t4052: Adjust --graph --stat output for prefixes
  Adjust stat width calculations to take --graph output into account
  Add output_prefix_length to diff_options
  t4052: test --stat output with --graph
2012-05-02 13:51:59 -07:00
Zbigniew Jędrzejewski-Szmek 352ca4e105 diff --stat: do not run diff on indentical files
If two objects are known to be equal, there is no point running the diff.

Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-01 21:29:03 -07:00
Zbigniew Jędrzejewski-Szmek e18872b2f0 diff --stat: report mode-only changes for binary files like text files
Mode-only changes to binary files without content change were reported as
if they were rewritten, but text files in the same situation were reported
as "unchanged". Let's treat binary files like text files here, and simply
say that they are unchanged.

Output of --shortstat is modified in the same way.

Reported-by: Martin Mareš <mj@ucw.cz>
Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-01 21:26:46 -07:00
Zbigniew Jędrzejewski-Szmek dc801e71a7 diff --stat: use less columns for change counts
Number of columns required for change counts is now computed based on
the maximum number of changed lines instead of being fixed. This means
that usually a few more columns will be available for the filenames
and the graph.

The graph width logic is also modified to include enough space for
"Bin XXX -> YYY bytes".

If changes to binary files are mixed with changes to text files,
change counts are padded to take at least three columns. And the other
way around, if change counts require more than three columns, then
"Bin"s are padded to align with the change count. This way, the +-
part starts in the same column as "XXX -> YYY" part for binary files.
This makes the graph easier to parse visually thanks to the empty
column. This mimics the layout of diff --stat before this change.
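
The width computation described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in diff.c: `count_columns` and its parameters are hypothetical names, and the floor of three columns stands for the width of "Bin".

```c
/* Number of decimal digits needed to print n. */
static int decimal_width(unsigned long n)
{
	int width = 1;

	while (n >= 10) {
		n /= 10;
		width++;
	}
	return width;
}

/*
 * Hypothetical helper: columns reserved for the change-count field.
 * When binary files are mixed in, reserve at least 3 columns so that
 * the counts line up with "Bin".
 */
static int count_columns(unsigned long max_change, int has_binary)
{
	int width = decimal_width(max_change);

	if (has_binary && width < 3)
		width = 3;
	return width;
}
```

With a largest change of 7 lines this reserves one column for text-only diffs and three when a binary file is present; a change of 1234 lines needs four columns either way.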

Tests and the tutorial are updated to reflect the new --stat output.
This means either the removal of extra padding and/or the addition of
up to three extra characters to truncated filenames. One test is added
to check the graph alignment when a binary file change and text file
change of more than 999 lines are committed together.

Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-30 14:17:26 -07:00
Junio C Hamano 31a199a76e Merge branch 'lp/maint-diff-three-dash-with-graph'
"log -p --graph" used with "--stat" had a few formatting errors.

By Lucian Poston
* lp/maint-diff-three-dash-with-graph:
  t4202: add test for "log --graph --stat -p" separator lines
  log --graph: fix break in graph lines
  log --graph --stat: three-dash separator should come after graph lines
2012-04-23 12:57:21 -07:00
Lucian Poston 678c574111 Prevent graph_width of stat width from falling below min
Update tests in t4052 fixed by this change.

Signed-off-by: Lucian Poston <lucian.poston@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-18 16:08:11 -07:00
Junio C Hamano c0599f6993 Merge branch 'jk/diff-no-rename-empty'
Forbids rename detection logic from matching two empty files as renames
during merge-recursive to prevent mismerges.

By Jeff King
* jk/diff-no-rename-empty:
  merge-recursive: don't detect renames of empty files
  teach diffcore-rename to optionally ignore empty content
  make is_empty_blob_sha1 available everywhere
  drop casts from users EMPTY_TREE_SHA1_BIN
2012-04-16 12:41:49 -07:00
Lucian Poston 3f1451326a Adjust stat width calculations to take --graph output into account
The recent change to compute the width of diff --stat did not take into
consideration the output from --graph. The consequence is that when both
options are used, e.g. in 'log --stat --graph', the lines are too long.

Adjust stat width calculations to take --graph output into account.

Signed-off-by: Lucian Poston <lucian.poston@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-16 11:28:39 -07:00
Junio C Hamano 3bec29bb07 Merge branch 'tr/maint-word-diff-regex-sticky'
The regexp configured with wordregex was incorrectly reused across files.

By Thomas Rast (2) and Johannes Sixt (1)
* tr/maint-word-diff-regex-sticky:
  diff: tweak a _copy_ of diff_options with word-diff
  diff: refactor the word-diff setup from builtin_diff_cmd
  t4034: diff.*.wordregex should not be "sticky" in --word-diff
2012-04-15 22:51:34 -07:00
Junio C Hamano 86c340e082 Merge branch 'jc/diff-algo-cleanup'
Resurrects the preparatory clean-up patches from another topic that was
discarded, as this would give a saner foundation to build on diff.algo
configuration option series.

* jc/diff-algo-cleanup:
  xdiff: PATIENCE/HISTOGRAM are not independent option bits
  xdiff: remove XDL_PATCH_* macros
2012-04-15 22:51:15 -07:00
Jeff King 90d43b0768 teach diffcore-rename to optionally ignore empty content
Our rename detection is a heuristic, matching pairs of
removed and added files with similar or identical content.
It's unlikely to be wrong when there is actual content to
compare, and we already take care not to do inexact rename
detection when there is not enough content to produce good
results.

However, we always do exact rename detection, even when the
blob is tiny or empty. It's easy to get false positives with
an empty blob, simply because it is an obvious content to
use as a boilerplate (e.g., when telling git that an empty
directory is worth tracking via an empty .gitignore).

This patch lets callers specify whether or not they are
interested in using empty files as rename sources and
destinations. The default is "yes", keeping the original
behavior. It works by detecting the empty-blob sha1 for
rename sources and destinations.
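
A minimal sketch of the check described above, assuming the caller passes object names as 40-character hex strings. `is_rename_candidate` and `ignore_empty` are hypothetical names used only for illustration; the hex string is the well-known SHA-1 of the empty blob.

```c
#include <string.h>

/* SHA-1 object name of a blob with no content. */
#define EMPTY_BLOB_SHA1_HEX "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"

/*
 * Hypothetical filter: returns 1 if a file may take part in rename
 * detection, 0 if it should be skipped.  With ignore_empty == 0
 * (the default) every file is a candidate, as before.
 */
static int is_rename_candidate(const char *hex_sha1, int ignore_empty)
{
	if (ignore_empty && !strcmp(hex_sha1, EMPTY_BLOB_SHA1_HEX))
		return 0;
	return 1;
}
```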

One more flexible alternative would be to allow the caller
to specify a minimum size for a blob to be "interesting" for
rename detection. But that would catch small boilerplate
files, not large ones (e.g., if you had the GPL COPYING file
in many directories).

A better alternative would be to allow a "-rename"
gitattribute to allow boilerplate files to be marked as
such. I'll leave the complexity of that solution until such
time as somebody actually wants it. The complaints we've
seen so far revolve around empty files, so let's start with
the simple thing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-23 13:52:49 -07:00
Lucian Poston b18e97ceb9 log --graph: fix break in graph lines
Output from "git log --graph --stat -p" broke the ancestry graph lines
with a single empty line between the diffstat and the patch.

Signed-off-by: Lucian Poston <lucian.poston@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-20 12:30:56 -07:00
Thomas Rast 6440d3417c diff: tweak a _copy_ of diff_options with word-diff
When using word diff, the code sets the word_regex from various
defaults if it was not set already.  The problem is that it does this
on the original diff_options, which will also be used in subsequent
diffs.

This means that when the word_regex is not given on the command line,
only the word_regex setting found for the first diff that has one (either from
attributes or diff.wordRegex) ever takes effect.  This value then
propagates to the rest of the diff runs and in particular prevents
further attribute lookups.

Fix the problem of changing diff state once and for all, by working
with a _copy_ of the diff_options.

Noticed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-14 14:41:20 -07:00
Thomas Rast 77d1a520fb diff: refactor the word-diff setup from builtin_diff_cmd
Quite a chunk of builtin_diff_cmd deals with word-diff setup, defaults
and such.  This makes the function a bit hard to read, but is also
asymmetric because the corresponding teardown lives in free_diff_words_data
already.

Refactor into a new function init_diff_words_data.  For simplicity,
also shuffle around some functions it depends on.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-14 14:40:15 -07:00
Junio C Hamano fce8b5d82f Merge branch 'jc/maint-diff-patch-header' into maint
"git diff-index" and its friends at the plumbing level showed the
"diff --git" header and nothing else for a path whose cached stat
info is dirty without actual difference when asked to produce a
patch. This was a longstanding bug that we could have fixed a long
time ago.

By Junio C Hamano
* jc/maint-diff-patch-header:
  diff -p: squelch "diff --git" header for stat-dirty paths
  t4011: illustrate "diff-index -p" on stat-dirty paths
  t4011: modernise style
2012-03-12 15:46:32 -07:00
Junio C Hamano 239d6eddcd Merge branch 'jc/maint-diff-patch-header'
By Junio C Hamano
* jc/maint-diff-patch-header:
  diff -p: squelch "diff --git" header for stat-dirty paths
  t4011: illustrate "diff-index -p" on stat-dirty paths
  t4011: modernise style
2012-03-06 14:53:07 -08:00
Junio C Hamano af050219e4 Merge branch 'zj/diff-stat-dyncol'
By Zbigniew Jędrzejewski-Szmek (8) and Junio C Hamano (1)
* zj/diff-stat-dyncol:
  : This breaks tests. Perhaps it is not worth using the decimal-width stuff
  : for this series, at least initially.
  diff --stat: add config option to limit graph width
  diff --stat: enable limiting of the graph part
  diff --stat: add a test for output with COLUMNS=40
  diff --stat: use a maximum of 5/8 for the filename part
  merge --stat: use the full terminal width
  log --stat: use the full terminal width
  show --stat: use the full terminal width
  diff --stat: use the full terminal width
  diff --stat: tests for long filenames and big change counts
2012-03-06 14:53:06 -08:00
Junio C Hamano b3f01ff29f diff -p: squelch "diff --git" header for stat-dirty paths
The plumbing "diff" commands look at the working tree files without
refreshing the index themselves for performance reasons (the calling
script is expected to do that upfront just once, before calling one or
more of them).  In the early days of git, they showed the "diff --git"
header before they actually ask the xdiff machinery to produce patches,
and ended up showing only these headers if the real contents are the same
and the difference they noticed was only because the stat info cached in
the index did not match that of the working tree. It was too late for the
implementation to take the header that it already emitted back.

But 3e97c7c (No diff -b/-w output for all-whitespace changes, 2009-11-19)
introduced necessary logic to keep the meta-information headers in a
strbuf and delay their output until the xdiff machinery noticed actual
changes. This was primarily in order to generate patches that ignore
whitespaces. When operating under "-w" mode, we wouldn't know if the
header is needed until we actually look at the resulting patch, so it was
a sensible thing to do, but we did not realize that the same reasoning
applies to stat-dirty paths.

Later, 296c6bb (diff: fix "git show -C -C" output when renaming a binary
file, 2010-05-26) generalized this machinery and added must_show_header
toggle.  This is turned on when the header must be shown even when there
is no patch to be produced, e.g. only the mode was changed, or the path
was renamed, without changing the contents.  However, when it did so, it
still kept the special case for the "-w" mode, which meant that the
plumbing would keep showing these phantom changes.

This corrects this historical inconsistency by allowing the plumbing to
omit paths that are only stat-dirty from its output in the same way as it
handles whitespace only changes under "-w" option.

The change in the behaviour can be seen in the updated test.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-01 12:00:01 -08:00
Zbigniew Jędrzejewski-Szmek df44483a5d diff --stat: add config option to limit graph width
Config option diff.statGraphWidth=<width> is equivalent to
--stat-graph-width=<width>, except that the config option is ignored
by format-patch.

For the graph-width limiting to be usable, it should happen
'automatically' once configured, hence the config option.
Nevertheless, graph width limiting only makes sense when used on a
wide terminal, so it should not influence the output of format-patch,
which adheres to the 80-column standard.

Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-01 09:15:58 -08:00
Zbigniew Jędrzejewski-Szmek 969fe57b84 diff --stat: enable limiting of the graph part
A new option --stat-graph-width=<width> can be used to limit the width
of the graph part even if more space is available. Up to <width>
columns will be used for the graph.

If commits changing a lot of lines are displayed in a wide terminal
window (200 or more columns), and the +- graph uses the full width,
the output can be hard to comfortably scan with a horizontal movement
of human eyes. Messages wrapped to about 80 columns would be
interspersed with very long +- lines. It makes sense to limit the
width of the graph part to a fixed value (e.g. 70 columns), even if
more columns are available.

Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-01 09:15:47 -08:00
Zbigniew Jędrzejewski-Szmek 1b058bc30d diff --stat: use a maximum of 5/8 for the filename part
The way that available columns are divided between the filename part
and the graph part is modified to use as many columns as necessary for
the filenames and the rest for the graph.

If there aren't enough columns to print both the filename and the
graph, at least 5/8 of available space is devoted to filenames. On a
standard 80 column terminal, or if not connected to a terminal and
using the default of 80 columns, this gives the same partition as
before.
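
The partition rule might be sketched like this. It is a simplified illustration, not the code in diff.c: `split_columns`, `struct stat_layout` and the single `overhead` parameter (standing in for the change-count column and separators) are assumptions, as is the floor of one graph column.

```c
struct stat_layout {
	int name_width;
	int graph_width;
};

/*
 * Hypothetical sketch: name_width and graph_width are the widths each
 * part would like to have; width is the total number of columns.
 */
static struct stat_layout split_columns(int width, int name_width,
					int graph_width, int overhead)
{
	struct stat_layout out = { name_width, graph_width };

	if (name_width + overhead + graph_width > width) {
		/* The graph (plus overhead) may claim at most 3/8 of the width. */
		int graph_cap = width * 3 / 8 - overhead;

		if (out.graph_width > graph_cap)
			out.graph_width = graph_cap;
		if (out.graph_width < 1)
			out.graph_width = 1;	/* assumed floor */

		if (out.name_width > width - overhead - out.graph_width)
			/* Names are too long: they get the remaining 5/8. */
			out.name_width = width - overhead - out.graph_width;
		else
			/* Names are short: hand the spare columns to the graph. */
			out.graph_width = width - overhead - out.name_width;
	}
	return out;
}
```

With 80 columns and an assumed overhead of 10, long names and a long graph end up with 50 and 20 columns, so the names hold 5/8 of the width. Short names (20 columns) leave 50 columns for the graph.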

The effect of this change is visible in the patch to the test vector
in t4052; for a small change to a file with a long filename, it stops
truncating the name part too short, and also allocates a few more columns
to the graph for larger changes.

Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-01 09:14:58 -08:00
Zbigniew Jędrzejewski-Szmek af9fedc128 diff --stat: use the full terminal width
Default to the real terminal width for diff --stat output, instead
of the hard-coded 80 columns.

Some projects (especially in Java), have long filename paths, with
nested directories or long individual filenames. When files are
renamed, the filename part in stat output can be almost useless. If
the middle part between { and } is long (because the file was moved to
a completely different directory), then most of the path would be
truncated.

It makes sense to detect and use the full terminal width and display
full filenames if possible.

There are commands like diff, show, and log, which can adapt the output
to the terminal width. There are also commands like format-patch,
whose output should be independent of the terminal width. Since it is
safer to use the 80-column default, the real terminal width is only
used if requested by the calling code by setting diffopts.stat_width=-1.
Normally this value is 0, and can be set by the user only to a
non-negative value, so -1 is safe to use internally.
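
The sentinel convention above could look roughly like this. It is a sketch only: `resolve_stat_width` and `term_columns` are hypothetical names, and how the terminal width is obtained is left to the caller.

```c
#define DEFAULT_STAT_WIDTH 80

/*
 * Hypothetical helper: map the stat_width setting to a column count.
 *   -1  requested internally by the calling code: use the terminal width
 *    0  nothing requested: use the safe 80-column default
 *   >0  explicit value given by the user
 */
static int resolve_stat_width(int stat_width, int term_columns)
{
	if (stat_width == -1)
		return term_columns;
	if (stat_width == 0)
		return DEFAULT_STAT_WIDTH;
	return stat_width;
}
```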

This patch only changes the diff builtin to use the full terminal width.

Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-01 09:13:06 -08:00
Junio C Hamano db65f0fc3b Merge branches zj/decimal-width, zj/term-columns and jc/diff-stat-scaler
2012-02-24 16:07:04 -08:00
Junio C Hamano a67c235448 Merge branch 'jc/diff-stat-scaler' into maint
* jc/diff-stat-scaler:
  diff --stat: show bars of same length for paths with same amount of changes
2012-02-21 15:00:33 -08:00
Junio C Hamano 8c60fcbcfd Merge branch 'jc/diff-stat-scaler'
* jc/diff-stat-scaler:
  diff --stat: show bars of same length for paths with same amount of changes
2012-02-20 00:15:15 -08:00
Junio C Hamano 307ab20b33 xdiff: PATIENCE/HISTOGRAM are not independent option bits
Because the default Myers, patience and histogram algorithms cannot be in
effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not
independent bits.  Instead of wasting one bit per algorithm, define a few
macros to access the few bits they occupy and update the code that accesses
them.
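
One way to picture mutually exclusive algorithm choices sharing a small bit field is the sketch below. The macro names, the shift value and the encoding are assumptions made for illustration and are not the definitions in xdiff.

```c
/* Hypothetical layout: two bits of the flags word select the algorithm. */
#define ALG_SHIFT	5
#define ALG_MASK	(3u << ALG_SHIFT)
#define ALG_MYERS	(0u << ALG_SHIFT)	/* default */
#define ALG_PATIENCE	(1u << ALG_SHIFT)
#define ALG_HISTOGRAM	(2u << ALG_SHIFT)

/* Extract the algorithm selection from a flags word. */
#define GET_ALG(flags)	((flags) & ALG_MASK)

/* Replace the algorithm selection, leaving all other flag bits alone. */
static unsigned set_alg(unsigned flags, unsigned alg)
{
	return (flags & ~ALG_MASK) | (alg & ALG_MASK);
}
```

Callers then compare `GET_ALG(flags)` against one of the values instead of testing individual bits, so two algorithms can never appear to be selected at once.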

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-19 15:36:55 -08:00
Junio C Hamano 2eeeef24ff diff --stat: show bars of same length for paths with same amount of changes
When commit 3ed74e6 (diff --stat: ensure at least one '-' for deletions,
and one '+' for additions, 2006-09-28) improved the output for files with
tiny modifications, we accidentally broke the logic to ensure that two
equal sized changes are shown with the bars of the same length, even when
rounding errors exist.

Compute the length of the graph bars, using the same "non-zero changes is
shown with at least one column" scaling logic, but by scaling the sum of
additions and deletions to come up with the total length of the bar (this
ensures that two equal sized changes result in bars of the same length),
and then scaling the smaller of the additions or deletions. The other side
is computed as the difference between the two.

This makes the apportioning between additions and deletions less accurate
due to rounding errors, but it is much less noticeable than two files with
the same amount of change showing bars of different length.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-14 14:21:49 -08:00
Junio C Hamano 63d37c3062 Merge branch 'jk/userdiff-config-simplify'
* jk/userdiff-config-simplify:
  drop odd return value semantics from userdiff_config
2012-02-14 12:57:17 -08:00
Jeff King 6680a0874f drop odd return value semantics from userdiff_config
When the userdiff_config function was introduced in be58e70
(diff: unify external diff and funcname parsing code,
2008-10-05), it used a return value convention unlike any
other config callback. Like other callbacks, it used "-1" to
signal error. But it returned "1" to indicate that it found
something, and "0" otherwise; other callbacks simply
returned "0" to indicate that no error occurred.

This distinction was necessary at the time, because the
userdiff namespace overlapped slightly with the color
configuration namespace. So "diff.color.foo" could mean "the
'foo' slot of diff coloring" or "the 'foo' component of the
"color" userdiff driver". Because the color-parsing code
would die on an unknown color slot, we needed the userdiff
code to indicate that it had matched the variable, letting
us bypass the color-parsing code entirely.

Later, in 8b8e862 (ignore unknown color configuration,
2009-12-12), the color-parsing code learned to silently
ignore unknown slots. This means we no longer need to
protect userdiff-matched variables from reaching the
color-parsing code.

We can therefore change the userdiff_config calling
convention to a more normal one. This drops some code from
each caller, which is nice. But more importantly, it reduces
the cognitive load for readers who may wonder why
userdiff_config is unlike every other config callback.

There's no need to add a new test confirming that this
works; t4020 already contains a test that sets
diff.color.external.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-07 10:44:54 -08:00
Nguyễn Thái Ngọc Duy 7f814632f5 Use correct grammar in diffstat summary line
"git diff --stat" and "git apply --stat" now learn to print the line
"%d files changed, %d insertions(+), %d deletions(-)" in singular form
whenever applicable. "0 insertions" and "0 deletions" are also omitted
unless they are both zero.
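
A small sketch of the rule, assuming a caller-supplied buffer. `format_stat_summary` is a hypothetical name, and the sketch ignores translation, which the real change also has to handle.

```c
#include <stdio.h>

/*
 * Hypothetical formatter for the summary line: singular forms where
 * applicable, and a zero count is omitted unless both counts are zero.
 */
static void format_stat_summary(char *buf, size_t size,
				int files, int insertions, int deletions)
{
	int n = snprintf(buf, size, "%d file%s changed",
			 files, files == 1 ? "" : "s");

	if (n < 0 || (size_t)n >= size)
		return;	/* output was truncated */
	if (insertions || !deletions) {
		n += snprintf(buf + n, size - n, ", %d insertion%s(+)",
			      insertions, insertions == 1 ? "" : "s");
		if ((size_t)n >= size)
			return;
	}
	if (deletions || !insertions)
		snprintf(buf + n, size - n, ", %d deletion%s(-)",
			 deletions, deletions == 1 ? "" : "s");
}
```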

This matches how versions of "diffstat" that are not prehistoric produced
their output, and also makes this line translatable.

[jc: with help from Thomas Dickey in archaeology of "diffstat"]
[jc: squashed Jonathan's updates to illustrations in tutorials and a test]

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-03 23:19:42 -08:00
Junio C Hamano 05c65cb116 Merge branch 'tr/maint-word-diff-incomplete-line'
* tr/maint-word-diff-incomplete-line:
  word-diff: ignore '\ No newline at eof' marker
2012-01-18 15:16:19 -08:00
Thomas Rast c7c2bc0ac9 word-diff: ignore '\ No newline at eof' marker
The word-diff logic accumulates + and - lines until another line type
appears (normally [ @\]), at which point it generates the word diff.
This is usually correct, but it breaks when the preimage does not have
a newline at EOF:

  $ printf "%s" "a a a" >a
  $ printf "%s\n" "a ab a" >b
  $ git diff --no-index --word-diff a b
  diff --git 1/a 2/b
  index 9f68e94..6a7c02f 100644
  --- 1/a
  +++ 2/b
  @@ -1 +1 @@
  [-a a a-]
   No newline at end of file
  {+a ab a+}

Because of the order of the lines in a unified diff

  @@ -1 +1 @@
  -a a a
  \ No newline at end of file
  +a ab a

the '\' line flushed the buffers, and the - and + lines were never
matched with each other.

A proper fix would defer such markers until the end of the hunk.
However, word-diff is inherently whitespace-ignoring, so as a cheap
fix simply ignore the marker (and hide it from the output).

We use a prefix match for '\ ' to parallel the logic in
apply.c:parse_fragment().  We currently do not localize this string
(just accept other variants of it in git-apply), but this should be
future-proof.
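
The prefix match could be as small as the sketch below. `is_no_newline_marker` is a hypothetical name; only the first two bytes are inspected, so the wording after the backslash and space does not matter.

```c
#include <stddef.h>

/*
 * Hypothetical check: does this diff line start with "\ ", the prefix
 * of the "\ No newline at end of file" marker?
 */
static int is_no_newline_marker(const char *line, size_t len)
{
	return len >= 2 && line[0] == '\\' && line[1] == ' ';
}
```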

Noticed-by: Ivan Shirokoff <shirokoff@yandex-team.ru>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-12 11:27:41 -08:00
Junio C Hamano eee947fb95 Merge branch 'jc/maint-diffstat-numstat-context' into maint
* jc/maint-diffstat-numstat-context:
  diff: teach --stat/--numstat to honor -U$num
2011-11-01 16:10:56 -07:00
Junio C Hamano 713b85c758 Merge branch 'rs/diff-cleanup-records-fix' into maint
* rs/diff-cleanup-records-fix:
  diff: resurrect XDF_NEED_MINIMAL with --minimal
  Revert removal of multi-match discard heuristic in 27af01
2011-10-21 10:49:25 -07:00
Junio C Hamano 9b55aa03da Merge branch 'rs/diff-whole-function'
* rs/diff-whole-function:
  diff: add option to show whole functions as context
  xdiff: factor out get_func_line()
2011-10-19 10:49:13 -07:00
Junio C Hamano 7a63a920fd Merge branch 'rs/diff-cleanup-records-fix'
* rs/diff-cleanup-records-fix:
  diff: resurrect XDF_NEED_MINIMAL with --minimal
  Revert removal of multi-match discard heuristic in 27af01
2011-10-13 19:03:22 -07:00
Junio C Hamano 7ddd582402 Merge branch 'jc/maint-diffstat-numstat-context'
* jc/maint-diffstat-numstat-context:
  diff: teach --stat/--numstat to honor -U$num
2011-10-10 15:56:18 -07:00
René Scharfe 14937c2c06 diff: add option to show whole functions as context
Add the option -W/--function-context to git diff.  It is similar to
the same option of git grep and expands the context of change hunks
so that the whole surrounding function is shown.  This "natural"
context can allow changes to be understood better.

Note: GNU patch doesn't like diffs generated with the new option;
it seems to expect context lines to be the same before and after
changes.  git apply doesn't complain.

This implementation has the same shortcoming as the one in grep,
namely that there is no way to explicitly find the end of a
function.  That means that a few lines of extra context are shown,
right up to where the next recognized function begins.  It's already
useful in its current form, though.

The function get_func_line() in xdiff/xemit.c is extended to work
forward as well as backward to find post-context as well as
pre-context.  It returns the position of the first found matching
line.  The func_line parameter is made optional, as we don't need
it for -W.

The enhanced function is then used in xdl_emit_diff() to extend
the context as needed.  If the added context overlaps with the
next change, it is merged into the current hunk.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-10 12:05:07 -07:00
Junio C Hamano 81b568c839 diff: resurrect XDF_NEED_MINIMAL with --minimal
Earlier, 582aa00 (git diff too slow for a file, 2010-05-02)
unconditionally dropped XDF_NEED_MINIMAL option from the internal xdiff
invocation to help performance on pathological cases, while hinting that a
follow-up patch could reintroduce it with "--minimal" option from the
command line.

Make it so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-03 11:58:18 -07:00
Junio C Hamano f01cae918f diff: teach --stat/--numstat to honor -U$num
"git diff -p" piped to external diffstat and "git diff --stat" may see
different patch text (both are valid and describe the same change
correctly) when counting the number of added and deleted lines, arriving
at different results to confuse the users, as --stat/--numstat codepath
always uses the hardcoded -U0 as the context length.

Make the --stat/--numstat codepath honor the context length the same way
as the textual patch codepath does to avoid this problem.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-22 10:54:47 -07:00
Junio C Hamano f946b465d7 Merge branch 'jk/color-and-pager'
* jk/color-and-pager:
  want_color: automatically fallback to color.ui
  diff: don't load color config in plumbing
  config: refactor get_colorbool function
  color: delay auto-color decision until point of use
  git_config_colorbool: refactor stdout_is_tty handling
  diff: refactor COLOR_DIFF from a flag into an int
  setup_pager: set GIT_PAGER_IN_USE
  t7006: use test_config helpers
  test-lib: add helper functions for config
  t7006: modernize calls to unset

Conflicts:
	builtin/commit.c
	parse-options.c
2011-08-28 21:19:16 -07:00
Jeff King 3e1dd17a89 diff: don't load color config in plumbing
The diff config callback is split into two functions: one
which loads "ui" config, and one which loads "basic" config.
The former chains to the latter, as the diff UI config is a
superset of the plumbing config.

The color.diff variable is only loaded in the UI config.
However, the basic config actually chains to
git_color_default_config, which loads color.ui. This doesn't
actually cause any bugs, because the plumbing diff code does
not actually look at the value of color.ui.

However, it is somewhat nonsensical, and it makes it
difficult to refactor the color code. It probably came about
because there is no git_color_config to load only color
config, but rather just git_color_default_config, which
loads color config and chains to git_default_config.

This patch splits out the color-specific portion of
git_color_default_config so that the diff UI config can call
it directly. This is perhaps better explained by the
chaining of callbacks. Before we had:

  git_diff_ui_config
    -> git_diff_basic_config
      -> git_color_default_config
        -> git_default_config

Now we have:

  git_diff_ui_config
    -> git_color_config
    -> git_diff_basic_config
      -> git_default_config

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-19 15:51:38 -07:00
Jeff King daa0c3d971 color: delay auto-color decision until point of use
When we read a color value either from a config file or from
the command line, we use git_config_colorbool to convert it
from the tristate always/never/auto into a single yes/no
boolean value.

This has some timing implications with respect to starting
a pager.

If we start (or decide not to start) the pager before
checking the colorbool, everything is fine. Either isatty(1)
will give us the right information, or we will properly
check for pager_in_use().

However, if we decide to start a pager after we have checked
the colorbool, things are not so simple. If stdout is a tty,
then we will have already decided to use color. However, the
user may also have configured color.pager not to use color
with the pager. In this case, we need to actually turn off
color. Unfortunately, the pager code has no idea which color
variables were turned on (and there are many of them
throughout the code, and they may even have been manipulated
after the colorbool selection by something like "--color" on
the command line).

This bug can be seen any time a pager is started after
config and command line options are checked. This has
affected "git diff" since 89d07f7 (diff: don't run pager if
user asked for a diff style exit code, 2007-08-12). It has
also affected the log family since 1fda91b (Fix 'git log'
early pager startup error case, 2010-08-24).

This patch splits the notion of parsing a colorbool and
actually checking the configuration. The "use_color"
variables now have an additional possible value,
GIT_COLOR_AUTO. Users of the variable should use the new
"want_color()" wrapper, which will lazily determine and
cache the auto-color decision.
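
The lazy evaluation could be sketched along these lines. The names are illustrative, and the probe is reduced to a plain isatty() check; the real decision also has to consider the pager and its configuration, which this sketch leaves out.

```c
#include <unistd.h>

enum { COLOR_NEVER = 0, COLOR_ALWAYS = 1, COLOR_AUTO = 2 };

/* Counts probes, only to show that the decision is made once. */
static int auto_probe_count;

/* Simplified stand-in for the real auto-color check. */
static int check_auto_color(void)
{
	auto_probe_count++;
	return isatty(STDOUT_FILENO) ? 1 : 0;
}

/*
 * Hypothetical wrapper: "always" and "never" pass through unchanged,
 * while "auto" is resolved on first use and cached afterwards.
 */
static int want_color_sketch(int var)
{
	static int want_auto = -1;

	if (var == COLOR_AUTO) {
		if (want_auto < 0)
			want_auto = check_auto_color();
		return want_auto;
	}
	return var;
}
```

Because the variable keeps the value "auto" until someone actually asks, a pager started after option parsing is still taken into account when the question is finally answered.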

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-19 15:51:34 -07:00
Jeff King e269eb7946 git_config_colorbool: refactor stdout_is_tty handling
Usually this function figures out for itself whether stdout
is a tty. However, it has an extra parameter just to allow
git-config to override the auto-detection for its
--get-colorbool option.

Instead of an extra parameter, let's just use a global
variable. This makes calling easier in the common case, and
will make refactoring the colorbool code much simpler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-18 14:48:29 -07:00
Jeff King f1c9626105 diff: refactor COLOR_DIFF from a flag into an int
This lets us store more than just a bit flag for whether we
want color; we can also store whether we want automatic
colors. This can be useful for making the automatic-color
decision closer to the point of use.

This mostly just involves replacing DIFF_OPT_* calls with
manipulations of the flag. The biggest exception is that
calls to DIFF_OPT_TST must check for "o->use_color > 0",
which lets an "unknown" value (i.e., the default) stay at
"no color". In the previous code, a value of "-1" was not
propagated at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-18 14:35:53 -07:00
Junio C Hamano ca01600306 Merge branch 'rc/histogram-diff'
* rc/histogram-diff:
  xdiff/xhistogram: drop need for additional variable
  xdiff/xhistogram: rely on xdl_trim_ends()
  xdiff/xhistogram: rework handling of recursed results
  xdiff: do away with xdl_mmfile_next()
  Make test number unique
  xdiff/xprepare: use a smaller sample size for histogram diff
  xdiff/xprepare: skip classification
  teach --histogram to diff
  t4033-diff-patience: factor out tests
  xdiff/xpatience: factor out fall-back-diff function
  xdiff/xprepare: refactor abort cleanups
  xdiff/xprepare: use memset()
2011-08-17 17:36:06 -07:00
Junio C Hamano a35d78c0f4 Merge branch 'jc/zlib-wrap' into maint
* jc/zlib-wrap:
  zlib: allow feeding more than 4GB in one go
  zlib: zlib can only process 4GB at a time
  zlib: wrap deflateBound() too
  zlib: wrap deflate side of the API
  zlib: wrap inflateInit2 used to accept only for gzip format
  zlib: wrap remaining calls to direct inflate/inflateEnd
  zlib wrapper: refactor error message formatter
2011-08-16 11:23:26 -07:00
Junio C Hamano e10e476fb1 Merge branch 'jk/combine-diff-binary-etc' into maint
* jk/combine-diff-binary-etc:
  combine-diff: respect textconv attributes
  refactor get_textconv to not require diff_filespec
  combine-diff: handle binary files as binary
  combine-diff: calculate mode_differs earlier
  combine-diff: split header printing into its own function
2011-08-16 11:23:24 -07:00
Junio C Hamano eb4f4076aa Merge branch 'jc/zlib-wrap'
* jc/zlib-wrap:
  zlib: allow feeding more than 4GB in one go
  zlib: zlib can only process 4GB at a time
  zlib: wrap deflateBound() too
  zlib: wrap deflate side of the API
  zlib: wrap inflateInit2 used to accept only for gzip format
  zlib: wrap remaining calls to direct inflate/inflateEnd
  zlib wrapper: refactor error message formatter

Conflicts:
	sha1_file.c
2011-07-19 09:33:04 -07:00
Tay Ray Chuan 8c912eea94 teach --histogram to diff
Port JGit's HistogramDiff algorithm over to C. Rough numbers (TODO) show
that it is faster than its --patience cousin, as well as the default
Myers algorithm.

The implementation has been reworked to use structs and pointers,
instead of bitmasks, thus doing away with JGit's 2^28 line limit.

We also use xdiff's default hash table implementation (xdl_hash_bits()
with XDL_HASHLONG()) for convenience.

Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-12 09:29:20 -07:00
Junio C Hamano a852aac48d Merge branch 'mg/diff-stat-count'
* mg/diff-stat-count:
  diff --stat-count: finishing touches
  diff-options.txt: describe --stat-{width,name-width,count}
  diff: introduce --stat-lines to limit the stat lines
  diff.c: omit hidden entries from namelen calculation with --stat
2011-06-29 17:03:10 -07:00
Junio C Hamano dbae1a1336 Merge branch 'jk/combine-diff-binary-etc'
* jk/combine-diff-binary-etc:
  combine-diff: respect textconv attributes
  refactor get_textconv to not require diff_filespec
  combine-diff: handle binary files as binary
  combine-diff: calculate mode_differs earlier
  combine-diff: split header printing into its own function
2011-06-29 17:03:10 -07:00
Junio C Hamano ef49a7a012 zlib: zlib can only process 4GB at a time
The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.

But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.

In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.

Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:52:15 -07:00
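The looping idea described in the zlib message above can be sketched as follows. This is a minimal illustration using Python's zlib module, not git's C wrapper: the name deflate_in_chunks and the ZLIB_BUF_MAX cap are hypothetical stand-ins for whatever a single zlib call can accept.

```python
import zlib

# Hypothetical cap: the largest count a 32-bit uInt field could express.
ZLIB_BUF_MAX = 0xFFFFFFFF


def deflate_in_chunks(data: bytes, chunk_max: int = ZLIB_BUF_MAX) -> bytes:
    """Compress data without handing zlib more than chunk_max bytes per call."""
    compressor = zlib.compressobj()
    pieces = []
    for start in range(0, len(data), chunk_max):
        # Each call sees at most chunk_max bytes of input.
        pieces.append(compressor.compress(data[start:start + chunk_max]))
    pieces.append(compressor.flush())
    return b"".join(pieces)
```

The caller passes the whole buffer once and never sees the per-call limit, which is the "illusion" the message refers to. A tiny chunk_max makes the loop easy to exercise on small inputs.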
Junio C Hamano 225a6f1068 zlib: wrap deflateBound() too
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:18:17 -07:00
Junio C Hamano 55bb5c9147 zlib: wrap deflate side of the API
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().

There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:10:29 -07:00
Junio C Hamano 456a4c08b8 Merge branch 'jk/diff-not-so-quick'
* jk/diff-not-so-quick:
  diff: futureproof "stop feeding the backend early" logic
  diff_tree: disable QUICK optimization with diff filter

Conflicts:
	diff.c
2011-06-06 11:40:14 -07:00
Junio C Hamano b3c89315a3 Merge branch 'jc/rename-degrade-cc-to-c' into maint
* jc/rename-degrade-cc-to-c:
  diffcore-rename: fall back to -C when -C -C busts the rename limit
  diffcore-rename: record filepair for rename src
  diffcore-rename: refactor "too many candidates" logic
  builtin/diff.c: remove duplicated call to diff_result_code()
2011-05-31 12:00:02 -07:00
Junio C Hamano 28b9264dd6 diff: futureproof "stop feeding the backend early" logic
Refactor the "do not stop feeding the backend early" logic into a small
helper function and use it in both run_diff_files() and diff_tree(), which
have the stop-early optimization. We may later add other types of diffcore
transformation that need to look at the whole result, as diff-filter
does, and having the logic in a single place is essential for longer-term
maintainability.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31 09:21:36 -07:00
Junio C Hamano e5f85df87e diff --stat-count: finishing touches
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 21:50:39 -07:00
Michael J Gruber 808e1db231 diff: introduce --stat-lines to limit the stat lines
Often one is interested in the full --stat output only for commits which
change a few files, but not others, because larger restructuring gives a
--stat which fills a few screens.

Introduce a new option --stat-count=<count> which limits the --stat output
to the first <count> lines, followed by a "..." line. It can
also be given as the third parameter in
--stat=<width>,<name-width>,<count>.

Also, the unstuck form is supported analogous to the other two stat
parameters.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 10:44:34 -07:00
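The truncation described in the --stat-count message above could look roughly like this. It is a sketch only: limit_stat_lines is a hypothetical name, and emitting the "..." line only when something was actually cut is an assumption, not something the message states.

```python
def limit_stat_lines(stat_lines, count=None):
    """Keep the first `count` stat lines and mark the cut with a "..." line."""
    if count is None or len(stat_lines) <= count:
        # Nothing to hide, so no marker line is added (assumption).
        return list(stat_lines)
    return list(stat_lines[:count]) + ["..."]
```

For example, limit_stat_lines(["a | 1 +", "b | 2 ++", "c | 3 +++"], 2) keeps the first two entries and appends the marker.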
Michael J Gruber 358e460eeb diff.c: omit hidden entries from namelen calculation with --stat
Currently, --stat calculates the longest name from all items but then
drops some (mode changes) from the output later on.

Instead, drop them from the namelen generation and calculation.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 10:44:02 -07:00
Junio C Hamano d9ac3e41c3 Merge branch 'jm/maint-diff-words-with-sbe' into maint
* jm/maint-diff-words-with-sbe:
  do not read beyond end of malloc'd buffer
2011-05-26 09:43:00 -07:00
Jeff King 3813e69031 refactor get_textconv to not require diff_filespec
This function actually does two things:

  1. Load the userdiff driver for the filespec.

  2. Decide whether the driver has a textconv component, and
     initialize the textconv cache if applicable.

Only part (1) requires the filespec object, and some callers
may not have a filespec at all. So let's split it into
two functions, and put part (2) with the userdiff code,
which is a better fit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-23 15:46:02 -07:00
Junio C Hamano 34ad5a52b4 Merge branch 'jm/maint-diff-words-with-sbe'
* jm/maint-diff-words-with-sbe:
  do not read beyond end of malloc'd buffer
2011-05-23 10:27:42 -07:00
Jim Meyering 42536dd9b9 do not read beyond end of malloc'd buffer
With diff.suppress-blank-empty=true, "git diff --word-diff" would
output data that had been read from uninitialized heap memory.
The problem was that fn_out_consume did not account for the
possibility of a line with length 1, i.e., the empty context line
that diff.suppress-blank-empty=true converts from " \n" to "\n".
Since it assumed there would always be a prefix character (the space),
it decremented "len" unconditionally, thus passing len=0 to emit_line,
which would then blindly call emit_line_0 with len=-1 which would
pass that value on to fwrite as SIZE_MAX.  Boom.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20 11:39:49 -07:00
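The wrap-around in the message above (a length of -1 reaching fwrite as SIZE_MAX) can be reproduced with plain arithmetic. The sketch below assumes a 64-bit size_t, and strip_context_prefix is a hypothetical guard shown for illustration, not the code of the actual fix.

```python
SIZE_MAX = 2**64 - 1  # assumes a 64-bit size_t


def as_size_t(n: int) -> int:
    """Model converting a signed int to an unsigned size_t."""
    return n % (SIZE_MAX + 1)


def strip_context_prefix(line: str) -> str:
    """Drop the leading space only when the line really has one."""
    if line.startswith(" "):
        return line[1:]
    return line
```

as_size_t(-1) equals SIZE_MAX, which is why an unconditional decrement on a one-character line is dangerous. The guarded helper leaves "\n" untouched and turns " \n" into "\n".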
Junio C Hamano df54e2bfd6 Merge branch 'jh/dirstat-lines'
* jh/dirstat-lines:
  Mark dirstat error messages for translation
  Improve error handling when parsing dirstat parameters
  New --dirstat=lines mode, doing dirstat analysis based on diffstat
  Allow specifying --dirstat cut-off percentage as a floating point number
  Add config variable for specifying default --dirstat behavior
  Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file
  Make --dirstat=0 output directories that contribute < 0.1% of changes
  Add several testcases for --dirstat and friends
2011-05-13 11:01:32 -07:00
Junio C Hamano a613b534bc Merge branch 'jc/fix-diff-files-unmerged' into maint
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-13 10:41:54 -07:00
Junio C Hamano 22dbeee715 Merge branch 'jc/fix-diff-files-unmerged'
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-06 10:52:58 -07:00
Junio C Hamano f5bf1b5f6b Merge branch 'jh/dirstat' into maint
* jh/dirstat:
  --dirstat: In case of renames, use target filename instead of source filename
  Teach --dirstat not to completely ignore rearranged lines within a file
  --dirstat-by-file: Make it faster and more correct
  --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-05-04 14:59:07 -07:00
Johan Herland 7478ac57c4 Mark dirstat error messages for translation
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:56 -07:00
Johan Herland 51670fc87e Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.

This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).

The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).

The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:56 -07:00
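The "keep parsing, let the caller decide" structure from the message above might be modelled as below. This is a sketch under assumptions: the function returns a list of messages instead of a status code plus an 'errmsg' buffer, only integer limits are handled, the message wording is made up, and apply_config_value and apply_command_line are hypothetical callers.

```python
import sys


def parse_dirstat_params(params, options):
    """Apply every recognised parameter; collect a message for each bad one."""
    errors = []
    for token in params.split(","):
        if token in ("changes", "lines", "files"):
            options["mode"] = token
        elif token == "cumulative":
            options["cumulative"] = True
        elif token == "noncumulative":
            options["cumulative"] = False
        elif token.isascii() and token.isdigit():
            options["limit"] = int(token)
        else:
            # Do not stop: later parameters may still be valid.
            errors.append("unknown dirstat parameter '%s'" % token)
    return errors


def apply_config_value(value, options):
    """diff.dirstat: warn about bad parameters, then carry on."""
    for message in parse_dirstat_params(value, options):
        print("warning: " + message, file=sys.stderr)


def apply_command_line(value, options):
    """--dirstat: bad parameters are fatal."""
    errors = parse_dirstat_params(value, options)
    if errors:
        raise SystemExit("fatal: " + "; ".join(errors))
```

With this split, a config value such as "files,bogus,10" still sets the mode and the limit, while the same string on the command line aborts.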
Johan Herland 1c57a627bf New --dirstat=lines mode, doing dirstat analysis based on diffstat
This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.

The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).

For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.

In linux-2.6.git, running the three different --dirstat modes:

  time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null

yields the following average runtimes on my machine:

 - "changes" (default): ~6.0 s
 - "lines":             ~9.6 s
 - "files":             ~0.1 s

So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.

The patch also includes documentation and tests for the new dirstat mode.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:55 -07:00
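The binary-file heuristic in the message above amounts to a single division. The sketch below uses a hypothetical helper name; rounding up, so that a small binary change does not scale down to zero, is an assumption and not something the message specifies.

```python
def line_damage(added: int, deleted: int, is_binary: bool) -> int:
    """Damage for one file in "lines" mode, scaling binary byte counts down."""
    damage = added + deleted
    if is_binary:
        # Byte counts are divided by 64 (rounded up here, by assumption).
        damage = (damage + 63) // 64
    return damage
```

As for the quoted timings, (9.6 - 6.0) / 6.0 = 0.6, which is where the ~60% figure comes from.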
Johan Herland 712d2c7dd8 Allow specifying --dirstat cut-off percentage as a floating point number
Only the first digit after the decimal point is kept, as the dirstat
calculations all happen in permille.

Selftests verifying floating-point percentage input have been added.

Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:20:11 -07:00
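Keeping only the first digit after the decimal point, as the message above describes, can be done without any floating-point arithmetic. This is a sketch with a hypothetical function name and an assumed input grammar (digits, optionally followed by a dot and more digits).

```python
import re

# Assumed grammar: "<digits>" or "<digits>.<digits>"
_PERCENT = re.compile(r"([0-9]+)(?:\.([0-9]*))?")


def percent_to_permille(text: str) -> int:
    """Convert a percentage string to permille, dropping digits past the first."""
    match = _PERCENT.fullmatch(text)
    if match is None:
        raise ValueError("not a percentage: %r" % text)
    whole, fraction = match.group(1), match.group(2) or ""
    permille = int(whole) * 10
    if fraction:
        permille += int(fraction[0])  # later digits are ignored
    return permille
```

So "10.57" becomes 105 permille, and "0.09" becomes 0.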
Johan Herland 2d17495196 Add config variable for specifying default --dirstat behavior
The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.

When not specified, the --dirstat defaults are 'changes,noncumulative,3'.

The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:20:03 -07:00
Johan Herland 333f3fb0c5 Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file
Instead of having multiple interconnected dirstat-related options, teach
the --dirstat option itself to accept all behavior modifiers as parameters.

 - Preserve the current --dirstat=<limit> (where <limit> is an integer
   specifying a cut-off percentage)
 - Add --dirstat=cumulative, replacing --cumulative
 - Add --dirstat=files, replacing --dirstat-by-file
 - Also add --dirstat=changes and --dirstat=noncumulative for specifying the
   current default behavior. These allow the user to reset other --dirstat
   parameters (e.g. 'cumulative' and 'files') occurring earlier on the
   command line.

The deprecated options (--cumulative and --dirstat-by-file) are still
functional, although they have been removed from the documentation.

Allow multiple parameters to be separated by commas, e.g.:
  --dirstat=files,10,cumulative

Update the documentation accordingly, and add testcases verifying the
behavior of the new syntax.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:17:36 -07:00
Johan Herland 58a8756a98 Make --dirstat=0 output directories that contribute < 0.1% of changes
The expected output from --dirstat=0 is to include any directory with
changes, even if those changes contribute a minuscule portion of the total
changes. However, currently, directories that contribute less than 0.1% are
not included, since their 'permille' value is 0, and there is an
'if (permille)' check in gather_dirstat() that causes them to be ignored.

This test is obviously intended to exclude directories that contribute no
changes whatsoever, but in this case, it hits too broadly. The correct
check is against 'this_dir' from which the permille is calculated. Only if
this value is 0 does the directory truly contribute no changes, and should
be skipped from the output.

This patch fixes this issue, and updates the corresponding testcases to
expect the new behavior.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:17:36 -07:00
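The difference between testing 'permille' and testing 'this_dir', as discussed in the message above, shows up as soon as integer division rounds a small share down to zero. The helper below is a hypothetical model of that check, not the body of gather_dirstat().

```python
def directory_is_reported(this_dir: int, total: int, limit_permille: int) -> bool:
    """Decide whether a directory's share of the changes gets printed."""
    if this_dir == 0:
        # Only a directory with no changes at all is skipped outright.
        return False
    permille = this_dir * 1000 // total
    return permille >= limit_permille
```

With this_dir=1 and total=5000 the permille value is 0, yet the directory is still reported when the limit is 0, which is the behavior the patch restores.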
Junio C Hamano 50d3062ab2 Merge branch 'jc/diff-irreversible-delete'
* jc/diff-irreversible-delete:
  git diff -D: omit the preimage of deletes
2011-04-28 14:11:47 -07:00
Junio C Hamano 76a89d6d82 Merge branch 'jc/rename-degrade-cc-to-c'
* jc/rename-degrade-cc-to-c:
  diffcore-rename: fall back to -C when -C -C busts the rename limit
  diffcore-rename: record filepair for rename src
  diffcore-rename: refactor "too many candidates" logic
  builtin/diff.c: remove duplicated call to diff_result_code()
2011-04-28 14:11:43 -07:00
Junio C Hamano d98a509ec3 Merge branch 'jh/dirstat'
* jh/dirstat:
  --dirstat: In case of renames, use target filename instead of source filename
  Teach --dirstat not to completely ignore rearranged lines within a file
  --dirstat-by-file: Make it faster and more correct
  --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-04-28 14:11:19 -07:00
Junio C Hamano fa7b290895 diff: remove often unused parameters from diff_unmerge()
e9c8409 (diff-index --cached --raw: show tree entry on the LHS for
unmerged entries., 2007-01-05) added a <mode, object name> pair as
parameters to this function, to store them in the pre-image side of an
unmerged file pair.  Now that the function returns the filepair it
queued, we can make the caller on the special-case codepath do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:34:43 -07:00
Junio C Hamano 76399c0195 diff.c: return filepair from diff_unmerge()
The underlying diff_queue() returns diff_filepair so that the caller can
further add information to it, and the helper function diff_unmerge()
utilizes the feature itself, but does not expose it to its callers, which
was kind of selfish.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:34:43 -07:00
Johan Herland 2ca8671470 --dirstat: In case of renames, use target filename instead of source filename
This changes --dirstat analysis to count "damage" toward the target filename,
rather than the source filename. For renames within a directory, this won't
matter to the final output, but when moving files between directories, the
output now lists the target directory rather than the source directory.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-12 11:29:34 -07:00
Johan Herland 2ff3a80334 Teach --dirstat not to completely ignore rearranged lines within a file
Currently, the --dirstat analysis ignores when lines within a file are
rearranged, because the "damage" calculated by show_dirstat() is 0.
However, if the object name has changed, we already know that there is
some damage, and it is unintuitive to claim there is _no_ damage.

Teach show_dirstat() to assign a minimum amount of damage (== 1) to
entries for which the analysis otherwise yields zero damage, to still
represent that these files are changed, instead of saying that there
is no change.

Also, skip --dirstat analysis when the object names are the same (e.g. for
a pure file rename).

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11 11:16:15 -07:00
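The two rules in the message above (skip identical objects, otherwise never report zero damage) fit in a few lines. The function name and the string object names are illustrative only.

```python
def changes_damage(old_oid: str, new_oid: str, counted_damage: int) -> int:
    """Damage for one file pair under the default "changes" analysis."""
    if old_oid == new_oid:
        # Same object name, e.g. a pure rename: nothing to analyse.
        return 0
    # The object changed, so report at least a minimum amount of damage.
    return max(counted_damage, 1)
```

A file whose lines were merely rearranged has counted_damage of 0 but a different object name, so it now contributes 1 instead of disappearing.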
Johan Herland 0133dab75d --dirstat-by-file: Make it faster and more correct
Currently, when using --dirstat-by-file, it first does the full --dirstat
analysis (using diffcore_count_changes()), and then resets 'damage' to 1,
if any damage was found by diffcore_count_changes().

But --dirstat-by-file is not interested in the file damage per se. It only
cares if the file changed at all. In that sense it only cares if the blob
object for a file has changed. We therefore only need to compare the
object names of each file pair in the diff queue and we can skip the
entire --dirstat analysis and simply set 'damage' to 1 for each entry
where the object name has changed.

This makes --dirstat-by-file faster, and also bypasses --dirstat's practice
of ignoring rearranged lines within a file.

The patch also contains an added testcase verifying that --dirstat-by-file
now detects changes that only rearrange lines within a file.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11 10:12:24 -07:00
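The shortcut described in the message above, comparing object names instead of analysing content, can be sketched as a per-directory tally. The tuple layout and function name are assumptions made for the example.

```python
import posixpath
from collections import Counter


def dirstat_by_file(pairs):
    """Count changed files per directory.

    `pairs` is an iterable of (path, old_object_name, new_object_name).
    """
    damage = Counter()
    for path, old_oid, new_oid in pairs:
        if old_oid != new_oid:
            # Any change at all counts as exactly 1 for this file.
            damage[posixpath.dirname(path)] += 1
    return damage
```

No content is read at all, which is why this mode is both faster and unaffected by how the lines inside a file were changed.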
Junio C Hamano 467ddc14fe git diff -D: omit the preimage of deletes
When reviewing a patch while concentrating primarily on the text after
the change, wading through pages of deleted text involves a cognitive
burden.

Introduce the -D option that omits the preimage text from the patch output
for deleted files.  When used with -B (represent total rewrite as a single
wholesale deletion followed by a single wholesale addition), the preimage
text is also omitted.

To prevent such a patch from being applied by mistake, the output is
designed not to be usable by "git apply" (or GNU "patch"); it is strictly
for human consumption.

It of course is possible to "apply" such a patch by hand, as a human can
read the intention out of such a patch.  It however is impossible to apply
such a patch even manually in reverse, as the whole point of this option
is to omit the information necessary to do so from the output.

Initial request by Mart Sõmermaa, documentation and tests helped by
Michael J Gruber.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-02 23:52:20 -07:00
Junio C Hamano f31027c99c diffcore-rename: fall back to -C when -C -C busts the rename limit
When there are too many paths in the project, the number of rename source
candidates "git diff -C -C" finds will exceed the rename detection limit,
and no inexact rename detection is performed.  We however could fall back
to "git diff -C" if the number of modified paths is sufficiently small.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 14:29:07 -07:00
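The fallback described in the message above can be modelled as trying the larger source set first and the smaller one second. This is only a sketch: the message does not spell out the limit test, so the product comparison below is an assumption, and all names are hypothetical.

```python
def too_many_candidates(num_dst: int, num_src: int, limit: int) -> bool:
    """Assumed test: the destination x source matrix must fit in limit squared."""
    return num_dst * num_src > limit * limit


def choose_copy_sources(num_dst, num_modified, num_all_paths, limit):
    """Return "all" (-C -C), "modified" (-C), or None (skip inexact detection)."""
    if not too_many_candidates(num_dst, num_all_paths, limit):
        return "all"
    if not too_many_candidates(num_dst, num_modified, limit):
        # Too many paths overall, but few enough modified ones.
        return "modified"
    return None
```

In a project with 100000 paths, 50 modified paths and 10 new files, a limit of 400 rules out "all" but still allows "modified".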
Johannes Schindelin c0aa335c95 Remove unused variables
Noticed by gcc 4.6.0.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 11:43:27 -07:00
Stephen Boyd c2e86addb8 Fix sparse warnings
Fix warnings from 'make check'.

 - These files don't include 'builtin.h' causing sparse to complain that
   cmd_* isn't declared:

   builtin/clone.c:364, builtin/fetch-pack.c:797,
   builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
   builtin/merge-index.c:69, builtin/merge-recursive.c:22
   builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
   builtin/notes.c:822, builtin/pack-redundant.c:596,
   builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
   builtin/remote.c:1512, builtin/remote-ext.c:240,
   builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
   builtin/unpack-file.c:25, builtin/var.c:75

 - These files have symbols which should be marked static since they're
   only file scope:

   submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
   submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
   unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
   url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48

 - These files redeclare symbols to be different types:

   builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
   usage.c:49, usage.c:58, usage.c:63, usage.c:72

 - These files use a literal integer 0 when they really should use a NULL
   pointer:

   daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362

While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 10:16:54 -07:00
Junio C Hamano 0ce6a51b43 Merge branch 'jk/merge-rename-ux'
* jk/merge-rename-ux:
  pull: propagate --progress to merge
  merge: enable progress reporting for rename detection
  add inexact rename detection progress infrastructure
  commit: stop setting rename limit
  bump rename limit defaults (again)
  merge: improve inexact rename limit warning
2011-03-19 23:23:56 -07:00
Junio C Hamano 4e530c5049 Merge branch 'jk/diffstat-binary' into maint
* jk/diffstat-binary:
  diff: don't retrieve binary blobs for diffstat
  diff: handle diffstat of rewritten binary files
2011-03-16 16:47:26 -07:00
Jonathan Nieder 9cba13ca5d standardize brace placement in struct definitions
In struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:

 struct foo {
	int bar;
	char *baz;
 };

Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.

Linus sayeth:

 Heretic people all over the world have claimed that this inconsistency
 is ...  well ...  inconsistent, but all right-thinking people know that
 (a) K&R are _right_ and (b) K&R are right.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16 12:49:02 -07:00
Jeff King abb371a1ef diff: don't retrieve binary blobs for diffstat
We only need the size, which is much cheaper to get,
especially if it is a big binary file.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 10:58:18 -08:00
Jeff King ded0abc73c diff: handle diffstat of rewritten binary files
The logic in builtin_diffstat assumes that a
complete_rewrite pair should have its lines counted. This is
nonsensical for binary files and leads to confusing things
like:

  $ git diff --stat --summary HEAD^ HEAD
   foo.rand |  Bin 4096 -> 4096 bytes
   1 files changed, 0 insertions(+), 0 deletions(-)

  $ git diff --stat --summary -B HEAD^ HEAD
   foo.rand |   34 +++++++++++++++-------------------
   1 files changed, 15 insertions(+), 19 deletions(-)
   rewrite foo.rand (100%)

So let's reorder the function to handle binary files first
(which from diffstat's perspective look like complete
rewrites anyway), then rewrites, then actual diffstats.

There are two bonus prizes to this reorder:

  1. It gets rid of a now-superfluous goto.

  2. The binary case is at the top, which means we can
     further optimize it in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 10:57:58 -08:00
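The reordering described in the message above comes down to which test runs first. The sketch below is a hypothetical classifier, not builtin_diffstat() itself, and the returned labels are made up for the example.

```python
def diffstat_kind(is_binary: bool, complete_rewrite: bool) -> str:
    """Pick how a file pair is summarised, checking the binary case first."""
    if is_binary:
        return "binary"    # report byte sizes, never line counts
    if complete_rewrite:
        return "rewrite"   # count all old lines deleted, all new lines added
    return "diffstat"      # run the regular line-based diffstat
```

A binary file that is also a complete rewrite now lands in the first branch, so -B no longer produces line counts for it.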
Jeff King 92c57e5c1d bump rename limit defaults (again)
We did this once before in 5070591 (bump rename limit
defaults, 2008-04-30). Back then, we were shooting for about
1 second for a diff/log calculation, and 5 seconds for a
merge.

There are a few new things to consider, though:

  1. Average processors are faster now.

  2. We've seen on the mailing list some ugly merges where
     not using inexact rename detection leads to many more
     conflicts. Merges of this size take a long time
     anyway, so users are probably happy to spend a little
     bit of time computing the renames.

Let's bump the diff/merge default limits from 200/500 to
400/1000. Those are 2 seconds and 10 seconds respectively on
my modern hardware.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-21 10:23:36 -08:00
Junio C Hamano 6ae7a51a2e Merge branch 'ks/blame-worktree-textconv-cached'
* ks/blame-worktree-textconv-cached:
  fill_textconv(): Don't get/put cache if sha1 is not valid
  t/t8006: Demonstrate blame is broken when cachetextconv is on
2010-12-21 14:30:52 -08:00
Kirill Smelkov 9ec09b0495 fill_textconv(): Don't get/put cache if sha1 is not valid
When blaming files in the working tree, the filespec is marked with
!sha1_valid, as we have not given the contents an object name yet.  The
function to cache textconv results (keyed on the object name), however,
didn't check this condition, and ended up storing the cached result
under a random object name.

Cc: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Cc: Clément Poulain <clement.poulain@ensimag.imag.fr>
Cc: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Cc: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-19 18:41:32 -08:00
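The guard described in the message above, bypassing the cache when there is no valid object name to key on, might be sketched like this. The dictionary cache and the function signature are assumptions for illustration and differ from git's C interface.

```python
def fill_textconv(cache, oid, oid_valid, content, convert):
    """Return converted content, caching it only under a valid object name."""
    if not oid_valid:
        # Working-tree contents have no object name yet: neither get nor put.
        return convert(content)
    if oid not in cache:
        cache[oid] = convert(content)
    return cache[oid]
```

Without the first check, the result for a working-tree file would be stored under whatever happened to be in the object-name field.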
Junio C Hamano cf7a64b54a Merge branch 'kb/diff-C-M-synonym'
* kb/diff-C-M-synonym:
  diff: use "find" instead of "detect" as prefix for long forms of -M and -C
  diff: add --detect-copies-harder as a synonym for --find-copies-harder
2010-12-16 12:58:59 -08:00
Yann Dirson f611ddc774 diff: use "find" instead of "detect" as prefix for long forms of -M and -C
It is more consistent with existing --find-copies-harder; luckily "detect"
variant has not appeared in any officially released version of git.

Signed-off-by: Yann Dirson <ydirson@altern.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-10 13:52:05 -08:00
Junio C Hamano 8577def6fc Merge branch 'np/diff-in-corrupt-repository' into maint
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-12-09 10:36:39 -08:00
Junio C Hamano ae0a37cd6b Merge branch 'cm/diff-check-at-eol' into maint
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-12-09 10:36:10 -08:00
Junio C Hamano f04aa35eb6 Merge branch 'jk/diff-CBM'
* jk/diff-CBM:
  diff: report bogus input to -C/-M/-B
2010-12-08 11:24:11 -08:00
Junio C Hamano f5a5531e4e Merge branch 'np/diff-in-corrupt-repository'
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-11-29 17:52:33 -08:00
Junio C Hamano 039e84e30d Merge branch 'cm/diff-check-at-eol'
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-11-29 17:52:31 -08:00
Kevin Ballard 150a5daad0 diff: add --detect-copies-harder as a synonym for --find-copies-harder
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 16:58:27 -08:00
Junio C Hamano 9cffe2018a Merge branch 'cb/diff-fname-optim' into maint
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-24 12:46:26 -08:00
Junio C Hamano 78bce6c7e9 Merge branch 'jk/no-textconv-symlink' into maint
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-11-24 12:46:20 -08:00
Junio C Hamano 8cf666c9ee Merge branch 'cb/diff-fname-optim'
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-17 14:59:16 -08:00
Junio C Hamano 6a2e93f107 Merge branch 'jk/no-textconv-symlink'
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-11-17 14:59:10 -08:00
Junio C Hamano 329351feeb Merge branch 'kb/merge-recursive-rename-threshold'
* kb/merge-recursive-rename-threshold:
  diff: add synonyms for -M, -C, -B
  merge-recursive: option to specify rename threshold

Conflicts:
	Documentation/diff-options.txt
	Documentation/merge-strategies.txt
2010-10-26 21:54:04 -07:00
Junio C Hamano d7806967bd Merge branch 'maint'
* maint:
  Fix copy-pasted comments related to tree diff handling.
2010-10-26 15:04:05 -07:00
Yann Dirson c3fced6498 Fix copy-pasted comments related to tree diff handling.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-25 00:21:56 -07:00
Nicolas Pitre c50c4316e1 diff: don't presume empty file when corresponding object is missing
The low-level diff code will happily produce totally bogus diff output
with a broken repository via format-patch and friends by treating missing
objects as empty files.  Let's prevent that from happening any longer.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 22:23:34 -07:00
Jeff King 07cd726527 diff: report bogus input to -C/-M/-B
We already detect invalid input to these functions, but we
simply exit with an error code, never saying anything as
simple as "your input was wrong". Let's fix that.

Before:

  $ git diff -CM
  $ echo $?
  128

After:

  $ git diff -CM
  error: invalid argument to -C: M
  $ echo $?
  128

There should be no problems with having diff_opt_parse print
to stderr, as there is already precedent in complaining
about bogus --color and --output arguments.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 15:44:53 -07:00
Christoph Mallon 8837d33595 diff --check: correct line numbers of new blank lines at EOF
The whitespace check printed the value of the wrong variable, i.e. the
beginning of the block of blank lines at the EOF (possibly absent) in the
old file.

As "git diff --check" is used by users to check their changes before
making a commit, we should point at the line number in the file after
the change.

Signed-off-by: Christoph Mallon <christoph.mallon@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-16 18:57:35 -07:00
Junio C Hamano b3c16ee454 Merge branch 'jc/pickaxe-grep'
* jc/pickaxe-grep:
  diff/log -G<pattern>: tests
  git log/diff: add -G<regexp> that greps in the patch text
  diff: pass the entire diff-options to diffcore_pickaxe()
  gitdiffcore doc: update pickaxe description
2010-09-29 13:49:03 -07:00
Matthieu Moy 9ec26eb7cd diff: trivial fix for --output file error message
The option argument is either after the equal sign in --output=... or in
the next command-line argument. optarg is the reliable way to access it.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:25:17 -07:00
Kevin Ballard 37ab5156ae diff: add synonyms for -M, -C, -B
Add new long-form options --detect-renames[=<n>], --detect-copies[=<n>],
and --break-rewrites[=[<n>][/<m>]] as synonyms for the -M, -C, and -B
options (respectively).

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:18:04 -07:00
Kevin Ballard 10ae7526be merge-recursive: option to specify rename threshold
The recursive merge strategy turns on rename detection but leaves the
rename threshold at the default. Add a strategy option to allow the user
to specify a rename threshold to use.

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:15:56 -07:00
Clemens Buchacher ad14b450c0 do not search functions for patch ID
Visual aids, such as the function name in the hunk
header, are not necessary for the purposes of
computing a patch ID.

This is a performance optimization.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 18:35:07 -07:00
Jeff King d391c0ff94 diff: don't use pathname-based diff drivers for symlinks
When we're diffing symlinks, we consider the contents to be
the pathname that the symlink points to. When a user sets up
a userdiff driver like "*.pdf diff=pdf", their "diff.pdf.*"
config generally tells us what to do with the content of
pdf files.

With the current code, we will actually process a symlink
like "link.pdf" using a configured pdf driver, meaning we
are using contents which consist of a pathname with
configuration that is expecting contents that consist of an
actual pdf file.

The most noticeable example of this would have been
textconv; however, it was already protected in its own
textconv-specific code path. We can still see the breakage
with something like "diff.*.binary", though. You could
also see it with diff.*.funcname, though it is a bit harder
to trigger accidentally there.

This patch adds a check for S_ISREG lower in the callstack
than the textconv-specific check, which should block use of
any userdiff config for non-regular files. We can drop the
check in the textconv code, which is now redundant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 18:32:32 -07:00
Junio C Hamano 8ac8cf5bc1 Merge branch 'maint'
* maint:
  xdiff-interface.c: always trim trailing space from xfuncname matches
  diff.c: call regfree to free memory allocated by regcomp when necessary
2010-09-09 17:29:40 -07:00
Brandon Casey ef5644ea6e diff.c: call regfree to free memory allocated by regcomp when necessary
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-09 17:18:04 -07:00
Junio C Hamano 7cc1e385a0 Merge branch 'cb/binary-patch-id'
* cb/binary-patch-id:
  hash binary sha1 into patch id
2010-08-31 16:24:48 -07:00
Junio C Hamano f506b8e8b5 git log/diff: add -G<regexp> that greps in the patch text
Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the
"git diff" family of commands.  This limits the diff queue to filepairs
whose patch text actually has an added or a deleted line that matches the
given regexp.  Unlike "-S<regexp>", changing other parts of the line that
has a substring that matches the given regexp IS counted as a change, as
such a change would appear as one deletion followed by one addition in a
patch text.

Unlike -S (pickaxe) that is intended to be used to quickly detect a commit
that changes the number of occurrences of hits between the preimage and
the postimage to serve as a part of larger toolchain, this is meant to be
used as the top-level Porcelain feature.

The implementation unfortunately has to run "diff" twice if you are
running "log" family of commands to produce patches in the final output
(e.g. "git log -p" or "git format-patch").  I think we _could_ cache the
result in-core if we wanted to, but that would require larger surgery to
the diffcore machinery (i.e. adding an extra pointer in the filepair
structure to keep a pointer to a strbuf around, stuff the textual diff to
the strbuf inside diffgrep_consume(), and make use of it in later stages
when it is available) and it may not be worth it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31 14:30:29 -07:00
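The difference between -S and -G described in f506b8e8b5 above can be sketched roughly as follows. This is only an illustration in Python, not git's C implementation: the helper names (changed_lines, pickaxe_s, pickaxe_g) are hypothetical, difflib stands in for xdiff, and -S is treated as a plain substring count.

```python
import difflib
import re


def changed_lines(old, new):
    """Return the lines a line-based diff marks as removed or added."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ";
    # "? " hint lines and "  " context lines are skipped.
    return [line[2:] for line in diff if line.startswith(("- ", "+ "))]


def pickaxe_s(old, new, needle):
    """-S style: hit only when the number of occurrences changes."""
    return old.count(needle) != new.count(needle)


def pickaxe_g(old, new, pattern):
    """-G style: hit when any added or removed line matches the regexp."""
    return any(re.search(pattern, line) for line in changed_lines(old, new))


old = "int frotz(int x);\nint other;\n"
new = "int frotz(long x);\nint other;\n"

# The line mentioning frotz changed, but the count of "frotz" did not.
print(pickaxe_s(old, new, "frotz"))  # False
print(pickaxe_g(old, new, "frotz"))  # True
```

The example shows the case the commit message calls out: changing another part of a line that contains a match counts as a change for -G but not for -S.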
Junio C Hamano 382f013bc4 diff: pass the entire diff-options to diffcore_pickaxe()
This makes it easier to add enhanced features to the
pickaxe transformation.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31 14:30:28 -07:00
Junio C Hamano e40b34b1ec Merge branch 'mm/shortopt-detached'
* mm/shortopt-detached:
  log: parse separate option for --glob
  log: parse separate options like git log --grep foo
  diff: parse separate options --stat-width n, --stat-name-width n
  diff: split off a function for --stat-* option parsing
  diff: parse separate options like -S foo

Conflicts:
	revision.c
2010-08-21 23:28:31 -07:00
Junio C Hamano bd3a97a27a Merge branch 'jc/maint-follow-rename-fix'
* jc/maint-follow-rename-fix:
  log: test for regression introduced in v1.7.2-rc0~103^2~2
  diff --follow: do call diffcore_std() as necessary
  diff --follow: do not waste cycles while recursing
2010-08-18 12:47:18 -07:00
Junio C Hamano cc34bb0b02 Merge branch 'jl/submodule-ignore-diff'
* jl/submodule-ignore-diff:
  Add tests for the diff.ignoreSubmodules config option
  Add the 'diff.ignoreSubmodules' config setting
  Submodules: Use "ignore" settings from .gitmodules too for diff and status
  Submodules: Add the new "ignore" config option for diff and status

Conflicts:
	diff.c
2010-08-18 12:36:25 -07:00
Clemens Buchacher 34597c1f5a hash binary sha1 into patch id
Since commit 2f82f760 (Take binary diffs into
account for "git rebase"), binary files are
included in patch ID computation. Binary files are
diffed using the text diff algorithm, however,
which has a huge impact on performance. The
following tests performance for a 50000 line file
marked as binary in .gitattributes.

$ git format-patch --stdout --ignore-if-in-upstream master

real    0m0.367s
user    0m0.354s
sys     0m0.010s

Instead of diffing the binary files, hash the pre-
and post-image sha1, which is just as unique. As a
result, performance is much improved.

$ git format-patch --stdout --ignore-if-in-upstream master

real    0m0.016s
user    0m0.015s
sys     0m0.001s

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-16 18:31:37 -07:00
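The idea in 34597c1f5a above, hashing the pre- and post-image object names instead of diffing binary content, can be sketched as follows. This is a simplified illustration: blob_sha1 follows git's blob object naming, but binary_patch_id is a hypothetical helper, and git's real patch ID computation mixes in more than these two names.

```python
import hashlib


def blob_sha1(data: bytes) -> str:
    """Object name git assigns to a blob: sha1 of 'blob <size>\\0' + content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def binary_patch_id(pre: bytes, post: bytes) -> str:
    """Hypothetical helper: identify a binary change by its two blob names."""
    ctx = hashlib.sha1()
    ctx.update(blob_sha1(pre).encode("ascii"))
    ctx.update(blob_sha1(post).encode("ascii"))
    return ctx.hexdigest()


# Two different edits of the same binary file get different IDs,
# and no textual diff of the contents is ever computed.
base = bytes(range(256)) * 100
print(binary_patch_id(base, base + b"\x00"))
print(binary_patch_id(base, base + b"\x01"))
```

In git itself the blob names are already recorded in the tree, so feeding them to the hash costs far less than running a text diff over the file contents.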
Junio C Hamano 44c48a909a diff --follow: do call diffcore_std() as necessary
Usually, diff frontends populate the output queue with filepairs without
any rename information and call diffcore_std() to sort the renames out.
When --follow is in effect, however, diff-tree family of frontend has a
hack that looks like this:

    diff-tree frontend
    -> diff_tree_sha1()
       . populate diff_queued_diff
       . if --follow is in effect and there is only one change that
         creates the target path, then
       -> try_to_follow_renames()
	  -> diff_tree_sha1() with no pathspec but with -C
	  -> diffcore_std() to find renames
	  . if rename is found, tweak diff_queued_diff and put a
	    single filepair that records the found rename there
    -> diffcore_std()
       . tweak elements on diff_queued_diff by
       - rename detection
       - path ordering
       - pickaxe filtering

We need to skip the parts of the second call to diffcore_std() that are
related to rename detection, and do so only when try_to_follow_renames() did
find a rename.  Earlier 1da6175 (Make diffcore_std only can run once before a
diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it
unconditionally disabled any second call to diffcore_std().

This hopefully fixes the breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-13 12:17:45 -07:00
Jakub Narebski d8faea9d18 diff: strip extra "/" when stripping prefix
There are two ways a user might want to use "diff --relative":

  1. For a file in a directory, like "subdir/file", the user
     can use "--relative=subdir/" to strip the directory.

  2. To strip part of a filename, like "foo-10", they can
     use "--relative=foo-".

We currently handle both of those situations. However, if the user passes
"--relative=subdir" (without the trailing slash), we produce inconsistent
results. For the unified diff format, we collapse the double-slash of
"a//file" correctly into "a/file". But for other formats (raw, stat,
name-status), we end up with "/file".

We can do what the user means here and strip the extra "/" (and only a
slash).  We are not hurting any existing users of (2) above with this
behavior change because the existing output for this case was nonsensical.

Patch by Jakub, tests and commit message by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 09:46:47 -07:00
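The prefix handling described in d8faea9d18 above can be sketched as follows. This is a minimal illustration in Python rather than git's C code; strip_relative_prefix is a hypothetical name, and returning None for paths outside the prefix is an assumption of the sketch.

```python
def strip_relative_prefix(path, prefix):
    """Strip `prefix` from `path`, then at most one leading slash.

    Returns None when the path does not start with the prefix.
    """
    if not path.startswith(prefix):
        return None
    rest = path[len(prefix):]
    # Strip the extra "/" (and only a slash) left behind when the
    # user wrote --relative=subdir without the trailing slash.
    if rest.startswith("/"):
        rest = rest[1:]
    return rest


print(strip_relative_prefix("subdir/file", "subdir/"))  # file
print(strip_relative_prefix("subdir/file", "subdir"))   # file
print(strip_relative_prefix("foo-10", "foo-"))          # 10
```

Both spellings of the directory prefix now give the same result, and use case (2) from the commit message is unaffected because its remainder never starts with a slash.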
Johannes Schindelin be4f2b408e Add the 'diff.ignoreSubmodules' config setting
When you have a lot of submodules checked out, the time penalty to check
for dirty submodules can easily imply a multiplication of the total time
by the factor 20. This makes the difference between almost instantaneous
(< 2 seconds) and unbearably slow (> 50 seconds) here, since the disk
caches are constantly overloaded.

To this end, the submodule.*.ignore config option was introduced, but it
is per-submodule.

This commit introduces a global config setting to set a default
(porcelain) value for the --ignore-submodules option, keeping the
default at 'none'. It can be overridden by the submodule.*.ignore
setting and by the --ignore-submodules option.

Incidentally, this commit fixes an issue with the overriding logic:
multiple --ignore-submodules options would not clear the previously
set flags.

While at it, fix a typo in the documentation for submodule.*.ignore.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 09:11:50 -07:00
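The override order described in be4f2b408e above can be sketched as a simple lookup chain. This is an assumption-laden illustration: effective_ignore and its parameter names are hypothetical, and the sketch assumes the command line wins over the per-submodule setting, which in turn wins over diff.ignoreSubmodules.

```python
def effective_ignore(cli=None, submodule_cfg=None, diff_cfg=None):
    """Pick the ignore mode for one submodule.

    cli           -- value of --ignore-submodules, if given
    submodule_cfg -- value of submodule.<name>.ignore, if set
    diff_cfg      -- value of diff.ignoreSubmodules, if set
    """
    for value in (cli, submodule_cfg, diff_cfg):
        if value is not None:
            return value
    return "none"  # the built-in default stays at 'none'


print(effective_ignore())                                          # none
print(effective_ignore(diff_cfg="dirty"))                          # dirty
print(effective_ignore(submodule_cfg="all", diff_cfg="dirty"))     # all
print(effective_ignore(cli="none", submodule_cfg="all"))           # none
```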
Jens Lehmann aee9c7d654 Submodules: Add the new "ignore" config option for diff and status
The new "ignore" config option controls the default behavior for "git
status" and the diff family. It specifies under what circumstances they
consider submodules as modified and can be set separately for each
submodule.

The command line option "--ignore-submodules=" has been extended to accept
the new parameter "none" for both status and diff.

Users that chose submodules to get rid of long work tree scanning times
might want to set the "dirty" option for those submodules. This brings
back the pre 1.7.0 behavior, where submodule work trees were never
scanned for modifications. By using "--ignore-submodules=none" on the
command line the status and diff commands can be told to do a full scan.

This option can be set to the following values (which have the same name
and meaning as for the "--ignore-submodules" option of status and diff):

"all": All changes to the submodule will be ignored.

"dirty": Only differences of the commit recorded in the superproject and
	the submodule's HEAD will be considered modifications; all changes
	to the work tree of the submodule will be ignored. When using this
	value, the submodule will not be scanned for work tree changes at
	all, leading to a performance benefit on large submodules.

"untracked": Only untracked files in the submodules work tree are ignored,
	a changed HEAD and/or modified files in the submodule will mark it
	as modified.

"none" (which is the default): Either untracked or modified files in a
	submodule's work tree or a difference between the submodule's HEAD
	and the commit recorded in the superproject will make it show up
	as changed. This value is added as a new parameter for the
	"--ignore-submodules" option of the diff family and "git status"
	so the user can override the settings in the configuration.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 09:01:52 -07:00
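The four "ignore" values described in aee9c7d654 above can be summarized as a small decision function. This is a sketch of the documented semantics only; submodule_is_modified and its boolean parameters are hypothetical names, not git's internal API.

```python
def submodule_is_modified(ignore, head_differs, modified_files, untracked_files):
    """Decide whether a submodule shows up as changed.

    head_differs    -- submodule HEAD differs from the commit in the superproject
    modified_files  -- tracked files in the submodule work tree are modified
    untracked_files -- the submodule work tree contains untracked files
    """
    if ignore == "all":
        return False
    if ignore == "dirty":
        # The work tree is not scanned at all; only the recorded commit counts.
        return head_differs
    if ignore == "untracked":
        return head_differs or modified_files
    if ignore == "none":
        return head_differs or modified_files or untracked_files
    raise ValueError("unknown ignore value: %r" % ignore)


# A submodule that only contains untracked build products:
for mode in ("all", "dirty", "untracked", "none"):
    print(mode, submodule_is_modified(mode, False, False, True))
```

Only "none" reports such a submodule as changed, which matches the description of the default behavior.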
Matthieu Moy 1e57208ef0 diff: parse separate options --stat-width n, --stat-name-width n
Part of a campaign for unstuck forms of options.

[jn: with some refactoring]

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:36 -07:00
Jonathan Nieder 4d7f7a4ae7 diff: split off a function for --stat-* option parsing
As an optimization, the diff_opt_parse() switchboard has
a single case for all the --stat-* options.  Split it
off into a separate function so we can enhance it
without bringing code dangerously close to the right
margin.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:28 -07:00
Matthieu Moy dea007fb4c diff: parse separate options like -S foo
Change the option parsing logic in revision.c to accept separate forms
like `-S foo' in addition to `-Sfoo'. The rest of git already accepted
this form, but revision.c still used its own option parsing.

Short options affected are -S<string>, -l<num> and -O<orderfile>, for
which an empty string wouldn't make sense, hence -<option> <arg> isn't
ambiguous.

This patch does not handle --stat-name-width and --stat-width, which are
special cases where diff_long_opt does not apply. They are handled in a
separate patch to ease review.

Original patch by Matthieu Moy, plus refactoring by Jonathan Nieder.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:22 -07:00
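The stuck and separate option forms discussed in dea007fb4c above can be sketched as follows. This is a hypothetical helper in Python, not the C parser in git; parse_short_opt and its return convention (value, number of arguments consumed) are assumptions made for the illustration.

```python
def parse_short_opt(args, i, letter):
    """Parse -<letter><arg> or -<letter> <arg> at position i.

    Returns (value, consumed); consumed is 0 when args[i] is another option.
    """
    flag = "-" + letter
    arg = args[i]
    if not arg.startswith(flag):
        return None, 0
    if len(arg) > len(flag):
        # Stuck form: -Sfoo
        return arg[len(flag):], 1
    if i + 1 < len(args):
        # Separate form: -S foo. An empty argument would make no sense for
        # these options, so a bare -S always takes the next word.
        return args[i + 1], 2
    raise ValueError("option %s requires a value" % flag)


print(parse_short_opt(["-Sfoo"], 0, "S"))      # ('foo', 1)
print(parse_short_opt(["-S", "foo"], 0, "S"))  # ('foo', 2)
print(parse_short_opt(["-p"], 0, "S"))         # (None, 0)
```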
Junio C Hamano bb89e84f95 Merge branch 'sv/maint-diff-q-clear-fix' into maint
* sv/maint-diff-q-clear-fix:
  Fix DIFF_QUEUE_CLEAR refactoring
2010-08-03 15:17:34 -07:00
Junio C Hamano ee38d823f7 Fix DIFF_QUEUE_CLEAR refactoring
It introduced a macro to reduce repeated assignments to three fields,
but an unrelated and incorrect change snuck in by mistake, which broke
commands like "git diff-files -p --submodule".

Noticed by Sven Verdoolaege.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 08:30:02 -07:00
Bo Yang e13f38a33e diff.c: fix a graph output bug
When --graph is in effect, the line-prefix typically has colored graph
line segments and ends with reset.  The color sequence "set" given to
this function is for showing the metainfo part of the patch text and
(1) it should not be applied to the graph lines, and (2) it will be
reset at the end of line_prefix so it won't be in effect anyway.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-08 18:09:14 -07:00
Junio C Hamano a76b2084fb Merge branch 'jl/status-ignore-submodules'
* jl/status-ignore-submodules:
  Add the option "--ignore-submodules" to "git status"
  git submodule: ignore dirty submodules for summary and status

Conflicts:
	builtin/commit.c
	t/t7508-status.sh
	wt-status.c
	wt-status.h
2010-06-30 11:55:39 -07:00
Junio C Hamano e1165dd144 Merge branch 'jl/maint-diff-ignore-submodules'
* jl/maint-diff-ignore-submodules:
  t4027,4041: Use test -s to test for an empty file
  Add optional parameters to the diff option "--ignore-submodules"
  git diff: rename test that had a conflicting name
2010-06-30 11:55:37 -07:00
Junio C Hamano 4af574dbdc Merge branch 'ab/blame-textconv'
* ab/blame-textconv:
  t/t8006: test textconv support for blame
  textconv: support for blame
  textconv: make the API public

Conflicts:
	diff.h
2010-06-27 12:07:44 -07:00
Jens Lehmann 46a958b3da Add the option "--ignore-submodules" to "git status"
In some use cases it is not desirable that "git status" considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is undesirable for submodules to show up as changed when they
only contain changes to their work tree (this was the behavior before
1.7.0). One example is scripts that only want to check for submodule
commits while ignoring any changes to the work tree. Users with large
submodules that are known not to change might also want to use this option,
as the "dirty" parameter saves the (sometimes substantial) time it takes to
scan the submodule work tree(s).

And if you want to ignore any changes to submodules, you can now do that
by using this option without parameters or with "all" (when the config
option status.submodulesummary is set, using "all" will also suppress the
output of the submodule summary).

A new function handle_ignore_submodules_arg() is introduced to parse this
option new to "git status" in a single location, as "git diff" already
knew it.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-25 11:30:25 -07:00
Junio C Hamano 262657dce6 Merge branch 'maint'
* maint:
  Update draft release notes to 1.7.1.1
  tests: remove unnecessary '^' from 'expr' regular expression

Conflicts:
	diff.c
2010-06-22 09:35:36 -07:00
Junio C Hamano 3c656899cd Merge branch 'cc/maint-diff-CC-binary' into maint
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-22 09:27:07 -07:00
Junio C Hamano cb2af93ac1 Merge branch 'bw/diff-metainfo-color' into maint
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-21 05:40:10 -07:00
Junio C Hamano 60335534a6 Merge branch 'rs/diff-no-minimal' into maint
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-21 05:38:50 -07:00
Junio C Hamano 5977744d04 Merge branch 'cc/maint-diff-CC-binary'
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano 98ad90fbab Merge branch 'by/diff-graph'
* by/diff-graph:
  Make --color-words work well with --graph
  graph.c: register a callback for graph output
  Emit a whole line in one go
  diff.c: Output the text graph padding before each diff line
  Output the graph columns at the end of the commit message
  Add a prefix output callback to diff output

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano 18fd805583 Merge branch 'jh/diff-index-line-abbrev'
* jh/diff-index-line-abbrev:
  diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s

Conflicts:
	diff.c
2010-06-18 11:16:56 -07:00
Junio C Hamano 2621ac50cc Merge branch 'ec/diff-noprefix-config'
* ec/diff-noprefix-config:
  diff: add configuration option for disabling diff prefixes.
2010-06-18 11:16:55 -07:00
Junio C Hamano 448598b508 Merge branch 'bw/diff-metainfo-color'
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-13 11:21:25 -07:00
Junio C Hamano 39b5977b13 Merge branch 'rs/diff-no-minimal'
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-13 11:20:46 -07:00
Jens Lehmann dd44d419d3 Add optional parameters to the diff option "--ignore-submodules"
In some use cases it is not desirable that the diff family considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is undesirable for submodules to show up as changed when they
only contain changes to their work tree. One example is scripts that only
want to check for submodule commits while ignoring any changes to the work
tree. Users with large submodules that are known not to change might also
want to use this option, as it saves the (sometimes substantial) time it
takes to scan the submodule work tree(s).

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:33:17 -07:00
Axel Bonnet a788d7d58b textconv: make the API public
The textconv functionality allows one to convert a file into text before
running diff. But this functionality can be useful to other features
such as blame.

Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr>
Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:17:57 -07:00
Christian Couder 296c6bb21a diff: fix "git show -C -C" output when renaming a binary file
A bug was introduced in 3e97c7c6af
(No diff -b/-w output for all-whitespace changes, Nov 19 2009)
that made the lines:

  diff --git a/bar b/sub/bar
  similarity index 100%
  rename from bar
  rename to sub/bar

disappear from "git show -C -C" output when file bar is a binary
file.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-06 15:14:27 -07:00
Bo Yang 4297c0aeb5 Make --color-words work well with --graph
'--color-words' algorithm can be described as:

  1. collect the minus/plus lines of a diff hunk, divided into
     minus-lines and plus-lines;

  2. break both minus-lines and plus-lines into words and
     place them into two mmfile_t with one word for each line;

  3. use xdiff to run diff on the two mmfile_t to get the words level diff;

For the parts common to both files, we output the plus-side text.
diff_words->current_plus tracks the current position in the plus file, i.e.
how much of it has been printed. diff_words->last_minus tracks the last minus
word printed.

For '--graph' to work with '--color-words', we need to output the graph prefix
on each line of color words output. Generally, there are two conditions on
which we should output the prefix.

  1. diff_words->last_minus == 0 &&
     diff_words->current_plus == diff_words->plus.text.ptr

     that is: the plus text must start as a new line, and if there is no minus
     word printed, a graph prefix must be printed.

  2. diff_words->current_plus > diff_words->plus.text.ptr &&
     *(diff_words->current_plus - 1) == '\n'

     that is: a graph prefix must be printed following a '\n'

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:20 -07:00
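Taken together, the two conditions in 4297c0aeb5 above amount to printing the graph prefix at the start of the output and again after every newline. A minimal sketch of that rule, with the hypothetical name emit_with_prefix and a plain string standing in for the colored graph columns:

```python
def emit_with_prefix(text, prefix):
    """Insert `prefix` at the start of every line of `text`."""
    out = []
    at_line_start = True  # nothing printed yet: condition 1
    for ch in text:
        if at_line_start:
            out.append(prefix)
            at_line_start = False
        out.append(ch)
        if ch == "\n":
            at_line_start = True  # previous character was '\n': condition 2
    return "".join(out)


print(emit_with_prefix("old new words\nsecond line\n", "| "), end="")
```

No prefix is emitted after the final newline, because a prefix is only written when there is more text to print.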
Bo Yang 2efcc97764 Emit a whole line in one go
Since the graph prefix is printed whenever emit_line is called, the
function should be used only to emit one complete line at a time.
No one should call emit_line to output just part of a line.
Use a strbuf to compose the whole line, and then
call emit_line to output it in one go.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:04 -07:00
Bo Yang 7be5761073 diff.c: Output the text graph padding before each diff line
Change output from diff with -p/--dirstat/--binary/--numstat/--stat/
--shortstat/--check/--summary options to align with graph paddings.

Thanks Jeff King <peff@peff.net> for reporting the '--summary' bug and
his initial patch.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Bo Yang a3c158d4a5 Add a prefix output callback to diff output
The callback can be used to add some prefix string to each line of
diff output.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Johan Herland 3e5a188f1d diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s
In the metainfo section of git diffs there's an "index" line providing
abbreviated (unless --full-index is used) blob SHA1s from the
pre-/post-images used to generate the diff. These provide hints that
can be used to reconstruct a 3-way merge when applying the patch
(see the --3way option to 'git am' for more details).

In order for this to work, however, the blob SHA1s must not be
abbreviated into ambiguity.

This patch eliminates the possible ambiguity by using find_unique_abbrev()
to produce the abbreviated SHA1s (instead of blind abbreviation by way of
"%.*s").

A testcase verifying the fix is also included.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 17:44:01 -07:00
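The difference between blind truncation and unique abbreviation in 3e5a188f1d above can be sketched as follows. This is not how find_unique_abbrev() is implemented in git, which consults the object database; unique_abbrev, the in-memory list of object names, and the starting length are assumptions of the sketch.

```python
def unique_abbrev(name, all_names, min_len=7):
    """Shortest prefix of `name` (at least min_len long) matching one object."""
    for length in range(min_len, len(name) + 1):
        prefix = name[:length]
        matches = sum(1 for other in all_names if other.startswith(prefix))
        if matches <= 1:
            return prefix
    return name


# Shortened, made-up object names for illustration.
objects = ["abcdef1234", "abcdef1299", "ffff000011"]

print(objects[0][:7])                      # abcdef1 (blind, ambiguous)
print(unique_abbrev(objects[0], objects))  # abcdef123
print(unique_abbrev(objects[2], objects))  # ffff000
```

The blind abbreviation "abcdef1" matches two objects, so it could not be resolved when reconstructing a 3-way merge; the unique abbreviation grows until only one object matches.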
Junio C Hamano 82c531b3b6 Merge branch 'by/log-follow'
* by/log-follow:
  tests: rename duplicate t4205
  Make git log --follow find copies among unmodified files.
  Make diffcore_std only can run once before a diff_flush
  Add a macro DIFF_QUEUE_CLEAR.
2010-05-21 04:02:23 -07:00
Junio C Hamano 1bdd46cd3a Merge branch 'tr/word-diff'
* tr/word-diff:
  diff: add --word-diff option that generalizes --color-words

Conflicts:
	diff.c
2010-05-21 04:02:17 -07:00
Bert Wesarg 374664478f diff: fix coloring of extended diff headers
Coloring of the extended headers was done as a whole, not per line. less with
option -R (which is the default from git) does not support this coloring
mode for performance reasons. The -r option would be an alternative
but has problems with lines that are longer than the screen. Therefore
stick to the idiom of coloring each line separately. The problem is that the
result of fill_metainfo() is also used as a parameter to an external
diff driver, so we need to disable coloring in this case.

Because coloring is now done inside fill_metainfo() we can simply add this
string to the diff header and therefore keep the last newline in the
extended header. As a result, the external diff
driver now gets this last newline too, which is a change in behavior
but a good one.

Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-19 21:06:40 -07:00
Will Palmer 1c9eecff97 diff-options: make --patch a synonym for -p
Here we simply make --patch a synonym for -p, whose mnemonic was "patch"
all along.

Signed-off-by: Will Palmer <wmpalmer@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:50:03 -07:00
Eli Collins f89504ddb9 diff: add configuration option for disabling diff prefixes.
With the new configuration option "diff.noprefix", "git diff" does not show a
source or destination prefix, a la "git diff --no-prefix".

Signed-off-by: Eli Collins <eli@cloudera.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:31:51 -07:00
Junio C Hamano dd75d07899 Merge branch 'jk/cached-textconv'
* jk/cached-textconv:
  diff: avoid useless filespec population
  diff: cache textconv output
  textconv: refactor calls to run_textconv
  introduce notes-cache interface
  make commit_tree a library function
2010-05-08 22:33:08 -07:00
Bo Yang 1da6175d43 Make diffcore_std only can run once before a diff_flush
When file rename/copy detection is turned on, a
second diffcore_std will degrade a 'C' pair to an 'R' pair.

This may happen when we run 'git log --follow' with
harder copy detection. That is, try_to_follow_renames()
runs diffcore_std to find the copies, and then
'git log' issues another diffcore_std, which reduces
'src->rename_used' and recognizes this copy as a rename.
This is not what we want.

So, I think we really don't need to run diffcore_std more
than once.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:28 -07:00
Bo Yang 9ca5df9061 Add a macro DIFF_QUEUE_CLEAR.
Refactor the diff_queue_struct code; this macro helps
to reset the structure.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:27 -07:00
Junio C Hamano 6b6f5d4664 Merge branch 'maint-1.7.0' into maint
* maint-1.7.0:
  remove ecb parameter from xdi_diff_outf()
2010-05-04 15:20:47 -07:00
René Scharfe dfea79004c remove ecb parameter from xdi_diff_outf()
xdi_diff_outf() overrides the structure members of its last parameter,
ignoring any value that callers pass in.  It's no surprise then that all
callers pass a pointer to an uninitialized structure.  They also don't
read it after the call, so the parameter is neither used for input nor
for output.   Turn it into a local variable of xdi_diff_outf().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-04 15:19:14 -07:00
René Scharfe 582aa00bdf git diff too slow for a file
Ever since the xdiff library had been introduced to git, all its callers
have used the flag XDF_NEED_MINIMAL.  It makes sure that the smallest
possible diff is produced, but that takes quite some time if there are
lots of differences that can be expressed in multiple ways.

This flag makes a difference for only 0.1% of the non-merge commits in
the git repo of Linux, both in terms of diff size and execution time.
The patches there are mostly nice and small.

SungHyun Nam however reported a case in a different repo where a diff
took more than 20 times longer to generate with XDF_NEED_MINIMAL than
without.  Rebasing became really slow.

This patch removes this flag from all callers.  The default of xdiff is
saner because it has minimal to no impact in the normal case of small
diffs and doesn't incur that much of a speed penalty for large ones.

A follow-up patch may introduce a command line option to set the flag if
the user needs it, similar to GNU diff's -d/--minimal.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-02 07:59:50 -07:00
Junio C Hamano 4fd8145c0c Merge branch 'jk/maint-diffstat-overflow' into maint
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-22 22:29:13 -07:00
Junio C Hamano bc32d342c2 Merge branch 'jk/maint-diffstat-overflow'
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-18 21:31:50 -07:00
Jeff King 0974c117ff diff: use large integers for diffstat calculations
The diffstat "added" and "changed" fields generally store
line counts; however, for binary files, they store file
sizes. Since we store and print these values as ints, a
diffstat on a file larger than 2G can show a negative size.
Instead, let's use uintmax_t, which should be at least 64
bits on modern platforms.
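
The overflow described above can be sketched in isolation. This is a minimal illustration, not git's diffstat code: store_narrow() and print_wide() are hypothetical helpers, and int32_t stands in for a plain int on platforms where int is 32 bits wide.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: squeeze a byte count into a 32-bit signed
 * field.  Converting an out-of-range value is implementation-defined;
 * common compilers wrap modulo 2^32, so sizes above 2G come out
 * negative.
 */
static int32_t store_narrow(uintmax_t bytes)
{
	return (int32_t)bytes;
}

/*
 * Hypothetical helper: keep the count in uintmax_t and print it with
 * the matching PRIuMAX conversion.  Returns what snprintf() returns.
 */
static int print_wide(char *buf, size_t len, uintmax_t bytes)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIuMAX " bytes", bytes);
}
```

Passing a 3 GiB size (3221225472 bytes) to store_narrow() yields a negative number on typical platforms, while print_wide() formats the same value intact.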

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-17 11:30:21 -07:00
Thomas Rast 882749a04f diff: add --word-diff option that generalizes --color-words
This teaches the --color-words engine a more general interface that
supports two new modes:

* --word-diff=plain, inspired by the 'wdiff' utility (most similar to
  'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+}

* --word-diff=porcelain, which generates an ad-hoc machine readable
  format:
  - each diff unit is prefixed by [-+ ] and terminated by newline as
    in unified diff
  - newlines in the input are output as a line consisting only of a
    tilde '~'

Both of these formats still support color if it is enabled, using it
to highlight the differences.  --color-words becomes a synonym for
--word-diff=color, which is the color-only format.  Also adds some
compatibility/convenience options.
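
As a rough sketch of the two new output shapes, the code below renders an already-computed list of word changes in both styles. It is not the --color-words engine: struct word, render_plain() and render_porcelain() are hypothetical names, the word splitting and diffing are assumed to have happened elsewhere, and hunk headers and color are left out.

```c
#include <assert.h>
#include <string.h>

enum word_op { WORD_SAME, WORD_REMOVED, WORD_ADDED, WORD_NEWLINE };

/* Hypothetical diff unit; text is ignored for WORD_NEWLINE. */
struct word {
	enum word_op op;
	const char *text;
};

/* Append s to the NUL-terminated buffer, truncating if it is full. */
static void append(char *out, size_t cap, const char *s)
{
	size_t used = strlen(out);

	if (used + 1 < cap)
		strncat(out, s, cap - used - 1);
}

/* Plain style: changes stay inline as [-removed-] and {+added+}. */
static void render_plain(char *out, size_t cap,
			 const struct word *w, size_t n)
{
	assert(cap > 0);
	out[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		switch (w[i].op) {
		case WORD_REMOVED:
			append(out, cap, "[-");
			append(out, cap, w[i].text);
			append(out, cap, "-]");
			break;
		case WORD_ADDED:
			append(out, cap, "{+");
			append(out, cap, w[i].text);
			append(out, cap, "+}");
			break;
		case WORD_NEWLINE:
			append(out, cap, "\n");
			break;
		default:
			append(out, cap, w[i].text);
			break;
		}
	}
}

/*
 * Porcelain style: one unit per line, prefixed by '-', '+' or ' ';
 * a newline in the input becomes a line holding only '~'.
 */
static void render_porcelain(char *out, size_t cap,
			     const struct word *w, size_t n)
{
	assert(cap > 0);
	out[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		if (w[i].op == WORD_NEWLINE) {
			append(out, cap, "~\n");
			continue;
		}
		append(out, cap, w[i].op == WORD_REMOVED ? "-" :
				 w[i].op == WORD_ADDED ? "+" : " ");
		append(out, cap, w[i].text);
		append(out, cap, "\n");
	}
}
```

The porcelain form trades readability for easy parsing: a consumer only needs to look at the first byte of each line.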

Thanks to Junio C Hamano and Miles Bader for good ideas.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-14 10:56:53 -07:00
Junio C Hamano daaf2e8892 Merge branch 'jc/conflict-marker-size' into maint
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-09 22:38:34 -07:00
Junio C Hamano 7ec1eb93f7 Merge early parts of jk/cached-textconv 2010-04-08 23:31:51 -07:00
Junio C Hamano aed6ca52e7 diff.c: work around pointer constness warnings
The textconv leak fix introduced two invocations of free() to release
memory pointed to by "const char *", which trigger annoying compiler warnings.
2010-04-08 23:30:49 -07:00
Junio C Hamano 3f3f8d9d09 Merge branch 'jc/conflict-marker-size'
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-06 14:50:46 -07:00
Jeff King b337398266 diff: avoid useless filespec population
builtin_diff calls fill_mmfile fairly early, which in turn
calls diff_populate_filespec, which actually retrieves the
file's blob contents into a buffer. Long ago, this was
sensible as we would need to look at the blobs eventually.

These days, however, we may not ever want those blobs if we
end up using a textconv cache, and for large binary files
(exactly the sort for which you might have a textconv
cache), just retrieving the objects can be costly.

This patch just pushes the fill_mmfile call a bit later, so
we can avoid populating the filespec in some cases.  There
is one thing to note that looks like a bug but isn't. We
push the fill_mmfile down into the first branch of a
conditional. It seems like we would need it on the other
branch, too, but we don't; fill_textconv does it for us (in
fact, before this, we were just writing over the results of
the fill_mmfile on that branch).

Here's a timing sample on a commit with 45 changed jpgs and
avis. The result is fully textconv cached, but we still
wasted a lot of time just pulling the blobs from storage.
The total size of the blobs (source and dest) is about
180M.

  [before]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

  [after]
  $ time git show >/dev/null
  real    0m0.009s
  user    0m0.004s
  sys     0m0.004s

And that's on a warm cache. On a cold cache, the "after"
case is not much worse, but the "before" case has to do an
extra 180M of I/O.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:11:20 -07:00
Jeff King d9bae1a178 diff: cache textconv output
Running a textconv filter can take a long time. It's
particularly bad for a large file which needs to be spooled
to disk, but even for small files, the fork+exec overhead
can add up for something like "git log -p".

This patch uses the notes-cache mechanism to keep a fast
cache of textconv output. Caches are stored in
refs/notes/textconv/$x, where $x is the userdiff driver
defined in gitattributes.

Caching is enabled only if diff.$x.cachetextconv is true.
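
The effect of the cache can be sketched as plain memoization. Everything below is an assumption made for illustration: a small in-memory table stands in for the notes ref, entries are keyed by a blob id string, and run_filter() and get_textconv() are hypothetical names rather than functions from diff.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8

/* In-memory stand-in for the notes-backed cache. */
struct cache_entry {
	int used;
	char key[41];
	char text[64];
};

static struct cache_entry cache[CACHE_SLOTS];
static int filter_runs; /* how many times the slow filter ran */

/* Hypothetical slow textconv filter (fork+exec in real life). */
static void run_filter(const char *blob_id, char *out, size_t cap)
{
	filter_runs++;
	snprintf(out, cap, "converted:%s", blob_id);
}

/*
 * Return the converted text for blob_id.  With caching enabled, a hit
 * skips the filter and a miss stores the result for next time.
 */
static const char *get_textconv(const char *blob_id, int cache_enabled)
{
	static char scratch[64];
	int i;

	assert(blob_id != NULL);
	if (cache_enabled)
		for (i = 0; i < CACHE_SLOTS; i++)
			if (cache[i].used && !strcmp(cache[i].key, blob_id))
				return cache[i].text;

	run_filter(blob_id, scratch, sizeof(scratch));
	if (!cache_enabled)
		return scratch;

	for (i = 0; i < CACHE_SLOTS; i++) {
		if (cache[i].used)
			continue;
		snprintf(cache[i].key, sizeof(cache[i].key), "%s", blob_id);
		snprintf(cache[i].text, sizeof(cache[i].text), "%s", scratch);
		cache[i].used = 1;
		return cache[i].text;
	}
	return scratch; /* table full: fall back to the uncached result */
}
```

Calling get_textconv() twice for the same id with caching enabled runs the filter once; with caching disabled it runs on every call.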

In my test repo, on a commit with 45 jpg and avi files
changed and a textconv to show their exif tags:

  [before]
  $ time git show >/dev/null
  real    0m13.724s
  user    0m12.057s
  sys     0m1.624s

  [after, first run]
  $ git config diff.mfo.cachetextconv true
  $ time git show >/dev/null
  real    0m14.252s
  user    0m12.197s
  sys     0m1.800s

  [after, subsequent runs]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

So for a slight (3.8%) cost on the first run, we achieve an
almost 40x speed up on subsequent runs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:05:31 -07:00
Jeff King 840383b2c2 textconv: refactor calls to run_textconv
This patch adds a fill_textconv wrapper, which centralizes
some minor logic like error checking and handling the case
of no-textconv.

In addition to reducing the number of lines, this will make
it easier in future patches to handle multiple types of
textconv.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:01:57 -07:00
Jeff King b76c056b95 fix textconv leak in emit_rewrite_diff
We correctly free() for the normal diff case, but leak for
rewrite diffs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-01 23:49:29 -07:00
Junio C Hamano 890a13a452 Sync with 1.7.0.4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-31 15:14:27 -07:00
Johannes Sixt da1fbed3ff diff: fix textconv error zombies
To make the code simpler, run_textconv lumps all of its
error checking into one conditional. However, the
short-circuit means that an error in reading will prevent us
from calling finish_command, leaving a zombie child.
Clean up properly after errors.
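
The short-circuit problem is easy to reproduce in miniature. The sketch below uses hypothetical stand-ins (child_start(), child_read(), child_finish()) in place of the real run-command calls, and a counter to record whether the child was reaped.

```c
#include <assert.h>

static int finish_calls; /* how many times the child was reaped */

/* Hypothetical stand-ins for starting, reading from, and reaping a child. */
static int child_start(void)
{
	return 0;
}

static int child_read(int fail)
{
	return fail ? -1 : 0;
}

static int child_finish(void)
{
	finish_calls++;
	return 0;
}

/* Lumped check: a read error short-circuits past child_finish(). */
static int run_lumped(int read_fails)
{
	if (child_start() != 0 ||
	    child_read(read_fails) < 0 ||
	    child_finish() != 0)
		return -1;
	return 0;
}

/* Separate checks: once started, the child is always reaped. */
static int run_fixed(int read_fails)
{
	int err = 0;

	if (child_start() != 0)
		return -1;
	if (child_read(read_fails) < 0)
		err = 1;
	if (child_finish() != 0)
		err = 1;
	return err ? -1 : 0;
}
```

Both variants report the read failure, but only run_fixed() reaches child_finish() in that case, which is what prevents the zombie.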

Based-on-work-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-30 14:46:33 -07:00
Junio C Hamano a757c646ee diff --check: honor conflict-marker-size attribute
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-24 19:35:34 -07:00
Junio C Hamano b6a7a06aa6 Merge branch 'jl/submodule-diff-dirtiness'
* jl/submodule-diff-dirtiness:
  git status: ignoring untracked files must apply to submodules too
  git status: Fix false positive "new commits" output for dirty submodules
  Refactor dirty submodule detection in diff-lib.c
  git status: Show detailed dirty status of submodules in long format
  git diff --submodule: Show detailed dirty status of submodules
2010-03-24 16:25:43 -07:00
Jens Lehmann 85adbf2f75 git status: Fix false positive "new commits" output for dirty submodules
Testing if the output "new commits" should appear in the long format of
"git status" is done by comparing the hashes of the diffpair. This always
resulted in printing "new commits" for submodules that contained untracked
or modified content, even if they did not contain new commits. The reason
was that match_stat_with_submodule() set the "changed" flag for dirty
submodules, resulting in two->sha1 being set to the null_sha1 at the call
sites, which indicates that new commits are present. This is changed so
that when no new commits are present, the same object names are in the
sha1 field for both sides of the filepair, and the working tree side will
have the "dirty_submodule" flag set when appropriate. For a submodule to
be seen as modified even when it just has a dirty work tree, some
conditions had to be extended to also check for the "dirty_submodule"
flag.
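
The distinction drawn above between "new commits" and "modified" can be sketched as two predicates. The struct and function names are hypothetical; only the sha1 field and the "dirty_submodule" flag come from the description.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, trimmed-down view of one side of a filepair. */
struct side {
	unsigned char sha1[20];
	unsigned dirty_submodule : 1;
};

/* "new commits" applies only when the object names differ. */
static int has_new_commits(const struct side *one, const struct side *two)
{
	return memcmp(one->sha1, two->sha1, sizeof(one->sha1)) != 0;
}

/* A submodule also counts as modified when its work tree is dirty. */
static int is_modified(const struct side *one, const struct side *two)
{
	return has_new_commits(one, two) || two->dirty_submodule;
}
```

A submodule with identical object names on both sides and a dirty work tree is modified without having new commits.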

Unfortunately the test case that should have found this bug had been
changed incorrectly too. It is fixed and extended to test for other
combinations too.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:24 -08:00
Jens Lehmann ae6d5c1b6f Refactor dirty submodule detection in diff-lib.c
Move duplicated code into the new function match_stat_with_submodule().
Replace the implicit activation of detailed checks for the dirtiness of
submodules when DIFF_FORMAT_PATCH was selected with an explicit setting of
the recently added DIFF_OPT_DIRTY_SUBMODULES option in diff_setup_done().

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:17 -08:00