-
-
Notifications
You must be signed in to change notification settings - Fork 6k
Closed
Labels
performance/speedperformance issues with slow downsperformance issues with slow downstopic/packagestype/bug
Description
Description
In addition to #25953 there is atleast on more very slow scenario for a very lot of packages when viewing all packages of an organisation. The total time was about 4m.
I discovered atleast two very slow queries:
exec sp_executesql N'SELECT * FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 AND (pv2.id IS NULL) ORDER BY [package_version].[created_unix] DESC OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2
and
exec sp_executesql N'SELECT count(*) FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 AND (pv2.id IS NULL)',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2
The first one run 2m46s and the second one 1m20s.
There might be more slow queries.
I'm happy to help and execute queries for test if necessary
Gitea Version
Gitea version 1.21.1 built with GNU Make 4.3, go1.21.4 : bindata, sqlite, sqlite_unlock_notify
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
No response
Git Version
git version 2.39.1.windows.1
Operating System
Windows Server 2016 - Version 1607 (Build 14393.6452)
How are you running Gitea?
- Using
gitea-1.21.1-windows-4.0-amd64.exe
from your download page - Registered as a Windows service with the following command line:
D:\gitea\gitea.exe web --config D:\gitea\custom\conf\app.ini
Database
MSSQL
Metadata
Metadata
Assignees
Labels
performance/speedperformance issues with slow downsperformance issues with slow downstopic/packagestype/bug
Type
Projects
Milestone
Relationships
Development
Select code repository
Activity
lunny commentedon Nov 28, 2023
What's your Gitea version.Mik4sa commentedon Nov 28, 2023
Hi @lunny, as stated above
Gitea version 1.21.1
KN4CK3R commentedon Nov 28, 2023
Could you please check if the queries use indices? All used fields should be covered but maybe we need a combined index.
Mik4sa commentedon Nov 28, 2023
Does this help you? This is the execution plan of the first query. I hope german is fine. Tell me otherwise

Note: As far as I could see all fields have an index, yes. Also I just refreshed all statistics using the maintenance plan "Gitea"
My workmate just told me that he has created that maintenance plan. He isn't sure though ;)
Mik4sa commentedon Nov 28, 2023
When moving the condition
(pv2.id IS NULL)
from the WHERE clause to the LEFT JOIN clause the query is fast. It finishes in under 1 second. Shouldn't be this fine?So instead of
I executed this one:
KN4CK3R commentedon Nov 28, 2023
Immer doch! 😄
That should just produce a fast
SELECT * FROM package_version
result I guess.Could you provide screenshots from mouse-hovering the "Parallelität" blocks?
Mik4sa commentedon Nov 28, 2023
I'm unsure what you want to tell me with that. The changed query from above is fast. I just tested it. But personally I'm just not 100% sure whether the query is still the same and correct. So that I didn't modified it in a way so it returns different results
Here the images for the first, original query:

KazzmanK commentedon Nov 28, 2023
If you add
OPTION (MAXDOP 1)
it will remove all the parallel stuff from query plan and make it easier to tune and identify issues.
looks like query gets latest package versions. Not so good design to handle large amount of data.
Probably larger ID belongs to larger created_unix, so no need of OR, OR is evil here.
Also , select top 1 with ties + order by row_number () over(partition by version order by id desc) may perform better, but mssql specific.
KN4CK3R commentedon Dec 5, 2023
@Mik4sa Could you please test how fast/slow this query is?
SELECT * FROM package_version WHERE package_version.id IN (SELECT MAX(package_version.id) FROM package_version INNER JOIN package ON package.id = package_version.package_id WHERE package_version.is_internal=0 AND package.owner_id=2 GROUP BY package_version.package_id) ORDER BY package_version.created_unix DESC, package_version.id ASC
It should list all latest versions of packages with the owner = 2 like your query does. In my tests there are no differences in the result set but it's way faster.
Mik4sa commentedon Dec 5, 2023
Yes, the query is fast, under 1 second.
The result set is identical but (just in case you missed) not in terms of sorting and not in terms of column count (everything after the first
download_count
column is "missing").Also, you search for the latest package version by id as it seems. Should we change this so that a date is used? I mean, what if the ids get reused some day for example?
Mik4sa commentedon Feb 20, 2024
@KN4CK3R
What is the current status of this? The query was blazing fast and I would be very happy to have this in the final product.
Is there anything missing here?
Mik4sa commentedon Apr 15, 2024
@KN4CK3R Still no update? I thought we get this done very fast. It felt so when you were first replying.
Are there any open tasks?
10 remaining items