Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.
On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().
In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Some file systems that can host git repositories and their working copies
do not support symbolic links. But then if the repository contains a symbolic
link, it is impossible to check out the working copy.
This patch enables partial support of symbolic links so that it is possible
to check out a working copy on such a file system. A new flag
core.symlinks (which is true by default) can be set to false to indicate
that the filesystem does not support symbolic links. In this case, symbolic
links that exist in the trees are checked out as small plain files, and
checking in modifications of these files preserve the symlink property in
the database (as long as an entry exists in the index).
Of course, this does not magically make symbolic links work on such defective
file systems; hence, this solution does not help if the working copy relies
on that an entry is a real symbolic link.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/crlf:
Teach core.autocrlf to 'git apply'
t0020: add test for auto-crlf
Make AutoCRLF ternary variable.
Lazy man's auto-CRLF
* jc/apply-config:
t4119: test autocomputing -p<n> for traditional diff input.
git-apply: guess correct -p<n> value for non-git patches.
git-apply: notice "diff --git" patch again
Fix botched "leak fix"
t4119: add test for traditional patch and different p_value
apply: fix memory leak in prefix_one()
git-apply: require -p<n> when working in a subdirectory.
git-apply: do not lose cwd when run from a subdirectory.
Teach 'git apply' to look at $HOME/.gitconfig even outside of a repository
Teach 'git apply' to look at $GIT_DIR/config
The settings in /etc/gitconfig can be overridden in ~/.gitconfig,
which in turn can be overridden in .git/config.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows you to do:
[core]
AutoCRLF = input
and it should do only the CRLF->LF translation (ie it simplifies CRLF only
when reading working tree files, but when checking out files, it leaves
the LF alone, and doesn't turn it into a CRLF).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It currently does NOT know about file attributes, so it does its
conversion purely based on content. Maybe that is more in the "git
philosophy" anyway, since content is king, but I think we should try to do
the file attributes to turn it off on demand.
Anyway, BY DEFAULT it is off regardless, because it requires a
[core]
AutoCRLF = true
in your config file to be enabled. We could make that the default for
Windows, of course, the same way we do some other things (filemode etc).
But you can actually enable it on UNIX, and it will cause:
- "git update-index" will write blobs without CRLF
- "git diff" will diff working tree files without CRLF
- "git checkout" will write files to the working tree _with_ CRLF
and things work fine.
Funnily, it actually shows an odd file in git itself:
git clone -n git test-crlf
cd test-crlf
git config core.autocrlf true
git checkout
git diff
shows a diff for "Documentation/docbook-xsl.css". Why? Because we have
actually checked in that file *with* CRLF! So when "core.autocrlf" is
true, we'll always generate a *different* hash for it in the index,
because the index hash will be for the content _without_ CRLF.
Is this complete? I dunno. It seems to work for me. It doesn't use the
filename at all right now, and that's probably a deficiency (we could
certainly make the "is_binary()" heuristics also take standard filename
heuristics into account).
I don't pass in the filename at all for the "index_fd()" case
(git-update-index), so that would need to be passed around, but this
actually works fine.
NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours
truly. I will not guarantee that they work at all reasonable. Caveat
emptor. But it _is_ simple, and it _is_ safe, since it's all off by
default.
The patch is pretty simple - the biggest part is the new "convert.c" file,
but even that is really just basic stuff that anybody can write in
"Teaching C 101" as a final project for their first class in programming.
Not to say that it's bug-free, of course - but at least we're not talking
about rocket surgery here.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The "git-config --rename-section" implementation would match sections
that are substrings of the section name to be renamed.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/bare:
Disallow working directory commands in a bare repository.
git-fetch: allow updating the current branch in a bare repository.
Introduce is_bare_repository() and core.bare configuration variable
Move initialization of log_all_ref_updates
Suggested by Jakub Narebski <jnareb@gmail.com> on the list.
When we send a value to store_write_pair(), make sure that the value
that gets read out matches the one passed in. This means that for any
value that contains leading or trailing whitespace or any comment
character (# and ;), we need to surround it in quotes.
Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We need to check that the writes we perform during the update of
the users configuration work. Convert to using write_in_full().
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the old is_bare_git_dir(const char *) to ask if a
directory, if it is a GIT_DIR, is a bare repository, and
replaces it with is_bare_repository(void *). The function looks
at core.bare configuration variable if exists but uses the old
heuristics: if it is ".git" or ends with "/.git", then it does
not look like a bare repository, otherwise it does.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* master:
Documentation/config.txt (and repo-config manpage): mark-up fix.
Teach Git how to parse standard power of 2 suffixes.
Use /dev/null for update hook stdin.
Redirect update hook stdout to stderr.
Remove unnecessary argc parameter from run_command_v.
Automatically detect a bare git repository.
Replace "GIT_DIR" with GIT_DIR_ENVIRONMENT.
Use PATH_MAX constant for --bare.
Force core.filemode to false on Cygwin.
Fix formatting for urls section of fetch, pull, and push manpages
Fix yet another subtle xdl_merge() bug
i18n: drop "encoding" header in the output after re-coding.
commit-tree: cope with different ways "utf-8" can be spelled.
Move commit reencoding parameter parsing to revision.c
Documentation: minor rewording for git-log and git-show pages.
Documentation: i18n commit log message notes.
t3900: test log --encoding=none
commit re-encoding: fix confusion between no and default conversion.
Sometimes its necessary to supply a value as a power of two in a
configuration parameter. In this case the user may want to use the
standard suffixes such as K, M, or G to indicate that the numerical
value should be multiplied by a constant base before being used.
Shell scripts/etc. can also benefit from this automatic option
parsing with `git repo-config --int`.
[jc: with a couple of test and a slight input tightening]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
In some cases we did not even bother to check the return value of
mmap() and just assume it worked. This is bad, because if we are
out of virtual address space the kernel returned MAP_FAILED and we
would attempt to dereference that address, segfaulting without any
real error output to the user.
We are replacing all calls to mmap() with xmmap() and moving all
MAP_FAILED checking into that single location. If a mmap call
fails we try to release enough least-recently-used pack windows
to possibly succeed, then retry the mmap() attempt. If we cannot
mmap even after releasing pack memory then we die() as none of our
callers have any reasonable recovery strategy for a failed mmap.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We cannot allow a window to be smaller than 2 system pages.
This limitation is necessary to support the feature of use_pack()
where we always supply at least 20 bytes after the offset to help
the object header and delta base parsing routines.
If packedGitWindowSize were allowed to be as small as 1 system page
then we would be completely unable to access an object header which
spanned over a page as we would never be able to arrange a mapping
such that the header was contiguous in virtual memory.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This finally turns on the sliding window behavior for packfile data
access by mapping limited size windows and chaining them under the
packed_git->windows list.
We consider a given byte offset to be within the window only if there
would be at least 20 bytes (one hash worth of data) accessible after
the requested offset. This range selection relates to the contract
that use_pack() makes with its callers, allowing them to access
one hash or one object header without needing to call use_pack()
for every byte of data obtained.
In the worst case scenario we will map the same page of data twice
into memory: once at the end of one window and once again at the
start of the next window. This duplicate page mapping will happen
only when an object header or a delta base reference is spanned
over the end of a window and is always limited to just one page of
duplication, as no sane operating system will ever have a page size
smaller than a hash.
I am assuming that the possible wasted page of virtual address
space is going to perform faster than the alternatives, which
would be to copy the object header or ref delta into a temporary
buffer prior to parsing, or to check the window range on every byte
during header parsing. We may decide to revisit this decision in
the future since this is just a gut instinct decision and has not
actually been proven out by experimental testing.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Rather than hardcoding the maximum number of bytes which can be
mmapped from pack files we should make this value configurable,
allowing the end user to increase or decrease this limit on a
per-repository basis depending on the size of the repository
and the capabilities of their operating system.
In general users should not need to manually tune such a low-level
setting within the core code, but being able to artifically limit
the number of bytes which we can mmap at once from pack files will
make it easier to craft test cases for the new mmap sliding window
implementation.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It is plausible for somebody to want to view the commit log in a
different encoding from i18n.commitencoding -- the project's
policy may be UTF-8 and the user may be using a commit message
hook to run iconv to conform to that policy (and either not have
i18n.commitencoding to default to UTF-8 or have it explicitly
set to UTF-8). Even then, Latin-1 may be more convenient for
the usual pager and the terminal the user uses.
The new variable i18n.logoutputencoding is used in preference to
i18n.commitencoding to decide what encoding to recode the log
output in when git-log and friends formats the commit log message.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/clone:
Move "no merge candidate" warning into git-pull
Use preprocessor constants for environment variable names.
Do not create $GIT_DIR/remotes/ directory anymore.
Introduce GIT_TEMPLATE_DIR
Revert "fix testsuite: make sure they use templates freshly built from the source"
fix testsuite: make sure they use templates freshly built from the source
git-clone: lose the traditional 'no-separate-remote' layout
git-clone: lose the artificial "first" fetch refspec
git-pull: refuse default merge without branch.*.merge
git-clone: use wildcard specification for tracking branches
This is a mechanical clean-up of the way *.c files include
system header files.
(1) sources under compat/, platform sha-1 implementations, and
xdelta code are exempt from the following rules;
(2) the first #include must be "git-compat-util.h" or one of
our own header file that includes it first (e.g. config.h,
builtin.h, pkt-line.h);
(3) system headers that are included in "git-compat-util.h"
need not be included in individual C source files.
(4) "git-compat-util.h" does not have to include subsystem
specific header files (e.g. expat.h).
Signed-off-by: Junio C Hamano <junkio@cox.net>
We broke the discipline Linus set up to allow compiler help us
avoid typos in environment names in the early days of git over
time. This defines a handful preprocessor constants for
environment variable names used in relatively core parts of the
system.
I've left out variable names specific to subsystems such as HTTP
and SSL as I do not think they are big problems.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Given a config like this:
# A config
[very.interesting.section]
not
The command
$ git repo-config --rename-section very.interesting.section bla.1
will lead to this config:
# A config
[bla "1"]
not
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
While adding colour to the branch command it was pointed out that a
config option like "branch.color" conflicts with the pre-existing
"branch.something" namespace used for specifying default merge urls and
branches. The suggested solution was to flip the order of the
components to "color.branch", which I did for colourising branch.
This patch does the same thing for
- git-log (color.diff)
- git-status (color.status)
- git-diff (color.diff)
- pager (color.pager)
I haven't removed the old config options; but they should probably be
deprecated and eventually removed to prevent future namespace
collisions. I've done this deprecation by changing the documentation
for the config file to match the new names; and adding the "color.XXX"
options to contrib/completion/git-completion.bash.
Unfortunately git-svn reads "diff.color" and "pager.color"; which I
don't like to change unilaterally.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
I need this in order to allow aliases of the same form as "ls-tree",
"rev-parse" etc, so that I can use
[alias]
my-cat=--paginate cat-file -p
to add a "git my-cat" command.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
enable/disable colored output when the pager is in use
Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The pack-file format is slightly different from the traditional git
object format, in that it has a much denser binary header encoding.
The traditional format uses an ASCII string with type and length
information, which is somewhat wasteful.
A new object format starts with uncompressed binary header
followed by compressed payload -- this will allow us later to
copy the payload straight to packfiles.
Obviously they cannot be read by older versions of git, so for
now new object files are created with the traditional format.
core.legacyheaders configuration item, when set to false makes
the code write in new format for people to experiment with.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With the change in default, "git add ." on kernel dir is about
twice as fast as before, with only minimal (0.5%) change in
object size. The speed difference is even more noticeable
when committing large files, which is now up to 8 times faster.
The configurability is through setting core.compression = [-1..9]
which maps to the zlib constants; -1 is the default, 0 is no
compression, and 1..9 are various speed/size tradeoffs, 9
being slowest.
Signed-off-by: Joachim B Haga (cjhaga@fys.uio.no)
Acked-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This cleans up the use of safe_strncpy() even more. Since it has the
same semantics as strlcpy() use this name instead. Also move the
definition from inside path.c to its own file compat/strlcpy.c, and use
it conditionally at compile time, since some platforms already has
strlcpy(). It's included in the same way as compat/setenv.c.
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch is based on Pasky's, with three notable differences:
- I did not yet update the documentation
- I named it .gitconfig, not .gitrc
- git-repo-config does not barf when a unique key is overridden locally
The last means that if you have something like
[alias]
l = log --stat -M
in ~/.gitconfig, and
[alias]
l = log --stat -M next..
in $GIT_DIR/config, then
git-repo-config alias.l
returns only one value, namely the value from $GIT_DIR/config.
If you set the environment variable GIT_CONFIG, $HOME/.gitconfig is not
read, and neither $GIT_DIR/config, but $GIT_CONFIG instead.
If you set GIT_CONFIG_LOCAL instead, it is interpreted instead of
$GIT_DIR/config, but $HOME/.gitconfig is still read.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When setting a config variable, git_config_set() ignored the variables
GIT_CONFIG and GIT_CONFIG_LOCAL. Now, when GIT_CONFIG_LOCAL is set, it
will write to that file. If not, GIT_CONFIG is checked, and only as a
fallback, the change is written to $GIT_DIR/config.
Add a test for it, and also future-proof the test for the upcoming
$HOME/.gitconfig support.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add $GIT_CONFIG environment variable whose content is used instead
of .git/config if set. Also add $GIT_CONFIG_LOCAL as a
forward-compatibility cue for whenever we will finally come to support]
global configuration files (properly).
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
There were a few calls to adjust_shared_perm() that were
missing:
- init-db creates refs, refs/heads, and refs/tags before
reading from templates that could specify sharedrepository in
the config file;
- updating config file created it under user's umask without
adjusting;
- updating refs created it under user's umask without
adjusting;
- switching branches created .git/HEAD under user's umask
without adjusting.
This moves adjust_shared_perm() from sha1_file.c to path.c,
since a few SIMPLE_PROGRAM need to call repository configuration
functions which in turn need to call adjust_shared_perm().
sha1_file.c needs to link with SHA1 computation library which
is usually not linked to SIMPLE_PROGRAM.
Signed-off-by: Junio C Hamano <junkio@cox.net>
If config parameter core.logAllRefUpdates is true or the log
file already exists then append a line to ".git/logs/refs/<ref>"
whenever git-update-ref <ref> is executed. Each log line contains
the following information:
oldsha1 <SP> newsha1 <SP> committer <LF>
where committer is the current user, date, time and timezone in
the standard GIT ident format. If the caller is unable to append
to the log file then git-update-ref will fail without updating <ref>.
An optional message may be included in the log line with the -m flag.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This updates the hierarchical section name syntax to
[section<space>+"<randomstring>"]
where the only rule for "randomstring" is that it can't contain a newline,
and if you really want to insert a double-quote, you do it with \".
It turns that into the section name "secion.randomstring". The
"section" part is still case insensitive, but the "randomstring"
part is case sensitive.
So you could use this for things like
[email "torvalds@osdl.org"]
name = Linus Torvalds
if you wanted to do the "email->name" conversion as part of the config
file format (I'm not claiming that is sensible, I'm just giving it as an
insane example). That would show up as the association
email.torvalds@osdl.org.name -> Linus Torvalds
which is easy to parse (the "." in the email _looks_ ambiguous, but it
isn't: you know that there will always be a single key-name, so you find
the key name with "strrchr(name, '.')" and things are entirely
unambiguous).
Repo-config is updated to be able to parse the new format, and also
write things out in the new format.
[jc: rolled two patches from Linus and one fix-up from Sean into one,
with additional adjustments for t/t1300 test to check the case
insensitiveness of section base and variable and case sensitiveness
of the extended section part. Then stripped some part off to make
the result applicable to the stale 1.3.X series that does not have
recent enhancements. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If the variable we need to store should go into a section
that currently only has a single variable (not matching
the one we're trying to insert), we will already be into
the next section before we notice we've bypassed the correct
location to insert the variable.
To handle this case we store the current location as soon
as we find a variable matching the section of our new
variable.
This breakage was brought up by Linus.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When inspecting a project whose build infrastructure used to
assume that .git/HEAD is a symlink ref, core.prefersymlinkrefs
in the config file of such a project would help to bisect its
history.
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from 9f0bb90d161edf8c43f5261d12bf83f14eb02ff4 commit)
Earlier, calling
git-repo-config core.hello
on a .git/config like this:
[core]
hello = world ; a comment
would yield "world " (i.e. with a trailing space).
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from c1aee1fd8d94da9b3c5d2dc1d4264f7e73a58f80 commit)