Skip to content

Performance issue when viewing all packages of an organisation #28255

@Mik4sa

Description

@Mik4sa
Contributor

Description

In addition to #25953 there is atleast on more very slow scenario for a very lot of packages when viewing all packages of an organisation. The total time was about 4m.

I discovered atleast two very slow queries:

exec sp_executesql N'SELECT * FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 AND (pv2.id IS NULL) ORDER BY [package_version].[created_unix] DESC OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2

and

exec sp_executesql N'SELECT count(*) FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 AND (pv2.id IS NULL)',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2

The first one run 2m46s and the second one 1m20s.
There might be more slow queries.

I'm happy to help and execute queries for test if necessary

Gitea Version

Gitea version 1.21.1 built with GNU Make 4.3, go1.21.4 : bindata, sqlite, sqlite_unlock_notify

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

git version 2.39.1.windows.1

Operating System

Windows Server 2016 - Version 1607 (Build 14393.6452)

How are you running Gitea?

  • Using gitea-1.21.1-windows-4.0-amd64.exe from your download page
  • Registered as a Windows service with the following command line: D:\gitea\gitea.exe web --config D:\gitea\custom\conf\app.ini

Database

MSSQL

Activity

lunny

lunny commented on Nov 28, 2023

@lunny
Member

What's your Gitea version.

Mik4sa

Mik4sa commented on Nov 28, 2023

@Mik4sa
ContributorAuthor

Hi @lunny, as stated above Gitea version 1.21.1

KN4CK3R

KN4CK3R commented on Nov 28, 2023

@KN4CK3R
Member

Could you please check if the queries use indices? All used fields should be covered but maybe we need a combined index.

Mik4sa

Mik4sa commented on Nov 28, 2023

@Mik4sa
ContributorAuthor

Does this help you? This is the execution plan of the first query. I hope german is fine. Tell me otherwise
image

Note: As far as I could see all fields have an index, yes. Also I just refreshed all statistics using the maintenance plan "Gitea"
My workmate just told me that he has created that maintenance plan. He isn't sure though ;)

Mik4sa

Mik4sa commented on Nov 28, 2023

@Mik4sa
ContributorAuthor

When moving the condition (pv2.id IS NULL) from the WHERE clause to the LEFT JOIN clause the query is fast. It finishes in under 1 second. Shouldn't be this fine?

So instead of

exec sp_executesql N'SELECT * FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 AND (pv2.id IS NULL) ORDER BY [package_version].[created_unix] DESC OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2

I executed this one:

exec sp_executesql N'SELECT * FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 AND (pv2.id IS NULL) INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 ORDER BY [package_version].[created_unix] DESC OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2
KN4CK3R

KN4CK3R commented on Nov 28, 2023

@KN4CK3R
Member

I hope german is fine.

Immer doch! 😄

I executed this one:

exec sp_executesql N'SELECT * FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 AND (pv2.id IS NULL) INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 ORDER BY [package_version].[created_unix] DESC OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2

That should just produce a fast SELECT * FROM package_version result I guess.

Could you provide screenshots from mouse-hovering the "Parallelität" blocks?

Mik4sa

Mik4sa commented on Nov 28, 2023

@Mik4sa
ContributorAuthor

I executed this one:

exec sp_executesql N'SELECT * FROM [package_version] LEFT JOIN [package_version] [pv2] ON (package_version.package_id = pv2.package_id AND (package_version.created_unix < pv2.created_unix OR (package_version.created_unix = pv2.created_unix AND package_version.id < pv2.id))) AND pv2.is_internal=@p1 AND (pv2.id IS NULL) INNER JOIN [package] ON package.id = package_version.package_id WHERE package_version.is_internal=@p2 AND package.owner_id=@p3 ORDER BY [package_version].[created_unix] DESC OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY',N'@p1 bit,@p2 bit,@p3 bigint',@p1=0,@p2=0,@p3=2

That should just produce a fast SELECT * FROM package_version result I guess.

I'm unsure what you want to tell me with that. The changed query from above is fast. I just tested it. But personally I'm just not 100% sure whether the query is still the same and correct. So that I didn't modified it in a way so it returns different results

Could you provide screenshots from mouse-hovering the "Parallelität" blocks?

Here the images for the first, original query:
image


image


image


image


image

KazzmanK

KazzmanK commented on Nov 28, 2023

@KazzmanK
Contributor

If you add
OPTION (MAXDOP 1)
it will remove all the parallel stuff from query plan and make it easier to tune and identify issues.

looks like query gets latest package versions. Not so good design to handle large amount of data.
Probably larger ID belongs to larger created_unix, so no need of OR, OR is evil here.
Also , select top 1 with ties + order by row_number () over(partition by version order by id desc) may perform better, but mssql specific.

KN4CK3R

KN4CK3R commented on Dec 5, 2023

@KN4CK3R
Member

@Mik4sa Could you please test how fast/slow this query is?
SELECT * FROM package_version WHERE package_version.id IN (SELECT MAX(package_version.id) FROM package_version INNER JOIN package ON package.id = package_version.package_id WHERE package_version.is_internal=0 AND package.owner_id=2 GROUP BY package_version.package_id) ORDER BY package_version.created_unix DESC, package_version.id ASC
It should list all latest versions of packages with the owner = 2 like your query does. In my tests there are no differences in the result set but it's way faster.

Mik4sa

Mik4sa commented on Dec 5, 2023

@Mik4sa
ContributorAuthor

Yes, the query is fast, under 1 second.
The result set is identical but (just in case you missed) not in terms of sorting and not in terms of column count (everything after the first download_count column is "missing").

Also, you search for the latest package version by id as it seems. Should we change this so that a date is used? I mean, what if the ids get reused some day for example?

Mik4sa

Mik4sa commented on Feb 20, 2024

@Mik4sa
ContributorAuthor

@KN4CK3R
What is the current status of this? The query was blazing fast and I would be very happy to have this in the final product.
Is there anything missing here?

Mik4sa

Mik4sa commented on Apr 15, 2024

@Mik4sa
ContributorAuthor

@KN4CK3R Still no update? I thought we get this done very fast. It felt so when you were first replying.
Are there any open tasks?

10 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @lunny@KN4CK3R@Mik4sa@KazzmanK

      Issue actions

        Performance issue when viewing all packages of an organisation · Issue #28255 · go-gitea/gitea