Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use git plumbing for upload rather than porcelain #5621

Closed
wants to merge 8 commits into from

Conversation

zeripath
Copy link
Contributor

@zeripath zeripath commented Jan 2, 2019

When uploading files using the GUI, Gitea currently does:

a) A full clone - Including the whole object DB
b) A full branch checkout - every file
c) Copies the uploaded file to this cloned repository
d) Add/stages the changes
e) Commits
f) Pushes to the real repository
g) Cleans up the cloned repository - deleting the whole object db and checkout.

This is extremely inefficient.

This PR takes a different approach and, instead of using the git porcelain commands (git add and the like), uses the plumbing commands to (hopefully) significantly speed up this process. See my comments in #601 (#601 (comment)).

Currently my approach is to use the git-cli commands which can/could be migrated to go-git commands later.

Will fix #5600.

@zeripath zeripath force-pushed the make-upload-not-clone branch 2 times, most recently from c1d1704 to fbee8f8 Compare January 2, 2019 18:42
@codecov-io
Copy link

codecov-io commented Jan 2, 2019

Codecov Report

Merging #5621 into master will increase coverage by 0.11%.
The diff coverage is 44.78%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5621      +/-   ##
==========================================
+ Coverage   37.81%   37.93%   +0.11%     
==========================================
  Files         322      322              
  Lines       47485    47682     +197     
==========================================
+ Hits        17957    18087     +130     
- Misses      26939    26995      +56     
- Partials     2589     2600      +11
Impacted Files Coverage Δ
routers/repo/editor.go 28.51% <26.47%> (-0.51%) ⬇️
models/repo_editor.go 31.47% <45.9%> (+17.49%) ⬆️
models/repo_branch.go 58.27% <56%> (-0.46%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9e90103...23b43cd. Read the comment docs.

@bkcsoft bkcsoft added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jan 2, 2019
@zeripath
Copy link
Contributor Author

zeripath commented Jan 2, 2019

I don't currently set the SSH_ORIGINAL_COMMAND environment variable, meaning that although the gitea post-receive hooks et al are run, they short circuit and don't do anything.

It wouldn't be hard to set this - the value isn't actually checked (it just needs to be set) - however, that will make a change from the previous settings.

@zeripath
Copy link
Contributor Author

zeripath commented Jan 2, 2019

I've completed reimplementing repo_editor.go. So it's good to look at but I don't think it's ready for merging.

@zeripath
Copy link
Contributor Author

zeripath commented Jan 2, 2019

My simple tests show a 10x speed up for preview diff of a very small file in a very small repo, from 340ms to 34ms. I saw similar speed ups for repouploadfile. I didn't actually check the timings for repouploadfiles but expect it to be similar.

Next steps are to move most, if not all of the helper functionality into the git module.

Having to do git ls-tree -r | git update-index --index-info (or thereabouts) seems wasteful and I think it should be possible to further optimise this. However, the timings above seem to indicate that there's still significant speed ups even with this.


OK, this command can be replaced by:

git read-tree HEAD with suitable options.

Further if you place your working directory as the bare repository then commands become a lot simpler.

I've therefore rebased my work on this.

@techknowlogick techknowlogick added the pr/wip This PR is not ready for review label Jan 3, 2019
Signed-off-by: Andrew Thornton <art27@cantab.net>
@zeripath zeripath changed the title WIP: Use git plumbing for upload rather than porcelain Use git plumbing for upload rather than porcelain Jan 4, 2019
@techknowlogick techknowlogick added type/refactoring Existing code has been cleaned up. There should be no new functionality. and removed pr/wip This PR is not ready for review labels Jan 5, 2019
@yasuokav
Copy link
Contributor

yasuokav commented Jan 6, 2019

Will this PR fix #4778?

@zeripath
Copy link
Contributor Author

zeripath commented Jan 6, 2019

My current implementation doesn't account for LFS. It should be relatively easy to add though.

@zeripath
Copy link
Contributor Author

zeripath commented Jan 8, 2019

@filipnavara would it be possible for you to take a look at this and tell me if it works for you and your large repos?

@filipnavara
Copy link
Contributor

@zeripath I just finished updating my Gitea installation and called it a day... Will review and give it a try tomorrow. :-)

@lunny lunny added this to the 1.8.0 milestone Jan 9, 2019
@zeripath
Copy link
Contributor Author

zeripath commented Jan 9, 2019

@yasuokav ok, it looks like this might be possible however I need to refactor these code paths up and out models to allow us to use the LFS module, otherwise when it comes to store we have to reimplement the ContentStore. (Finding that out was this evening's work.) I might be able to take a look again at this tomorrow.

@zeripath
Copy link
Contributor Author

Ok I would have had an updated pr to push if I hadn't spent an hour debugging failing uploads because I had written if err == nil instead of != however it looks like it will work.

@zeripath
Copy link
Contributor Author

OK, I have made this work with LFS. However, in order to make this PR a bit more readable I'm going to close it and reopen with a cleaner history.

@zeripath
Copy link
Contributor Author

@filipnavara @yasuokav @xxxTonixxx @techknowlogick @lunny @typeless I've decided this PR is too busy and have reopened in #5702 with LFS support.

@lunny lunny removed this from the 1.8.0 milestone Jan 12, 2019
zeripath added a commit to zeripath/gitea that referenced this pull request Jan 20, 2019
zeripath added a commit to zeripath/gitea that referenced this pull request Jan 20, 2019
zeripath added a commit to zeripath/gitea that referenced this pull request Jan 20, 2019
zeripath added a commit to zeripath/gitea that referenced this pull request Jan 20, 2019
lafriks pushed a commit that referenced this pull request Feb 12, 2019
… LFS support (#5702)

* Use git plumbing for upload: #5621 repo_editor.go: UploadRepoFile

* Use git plumbing for upload: #5621 repo_editor.go: GetDiffPreview

* Use git plumbing for upload: #5621 repo_editor.go: DeleteRepoFile

* Use git plumbing for upload: #5621 repo_editor.go: UploadRepoFiles

* Move branch checkout functions out of repo_editor.go as they are no longer used there

* BUGFIX: The default permissions should be 100644

    This is a change from the previous code but is more in keeping
    with the default behaviour of git.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Standardise cleanUploadFilename to more closely match git

See verify_path in: https://github.com/git/git/blob/7f4e64169352e03476b0ea64e7e2973669e491a2/read-cache.c#L951

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Redirect on bad paths

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Refactor to move the uploading functions out to a module

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add LFS support

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Update upload.go attribution header

Upload.go is essentially the remnants of repo_editor.go. The remaining code is essentially unchanged from the Gogs code, hence the Gogs attribution.

* Delete upload files after session committed

* Ensure that GIT_AUTHOR_NAME etc. are valid for git

see #5774

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add in test cases per @lafriks comment

* Add space between gitea and github imports

Signed-off-by: Andrew Thornton <art27@cantab.net>

* more examples in TestCleanUploadName

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix formatting

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Set the SSH_ORIGINAL_COMMAND to ensure hooks are run

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Switch off SSH_ORIGINAL_COMMAND

Signed-off-by: Andrew Thornton <art27@cantab.net>
@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. type/refactoring Existing code has been cleaned up. There should be no new functionality.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gitea server should not checkout on upload
7 participants