
[BUG] Dashboard is very slow after upgrading to 2.17.0 (Response time increased a lot) #9069

Open
narendraalla opened this issue Dec 17, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@narendraalla

Describe the bug
After upgrading to OpenSearch 2.17.0, the UI became very slow: the loading spinner keeps rotating and data loads very slowly (this happens even with a 15-minute time range).

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'Discover'
  2. Click on any index
  3. The response time is very slow

Expected behavior
The dashboard on 2.6.0 is very quick (tested with a second cluster that has the same data and the same resources); on the upgraded cluster, however, the dashboard is very slow and you can see the spinning wheel keep rotating.

OpenSearch Version
2.17.0
Dashboards Version
2.17.0
Plugins
Only the S3 plugin and the security plugin; no plugins were installed or newly added after the upgrade.
Screenshots
[Screenshot attached: Screenshot 2024-12-17 at 12 52 36 PM]

Host/Environment
These clusters run on LKE. The cluster running OpenSearch 2.17.0 uses OpenSearch operator 2.6.1, and the cluster running OpenSearch 2.6.0 uses operator 2.3.2.

  • Browser and version: tried Chrome, Firefox, and Safari; the behavior is the same in all of them.

Additional context
We run this OpenSearch cluster in an LKE environment. It is a large cluster with around 25 hot nodes, 38 warm nodes, and 4 coordinator nodes (roughly 300-400 TB of data). The JVM heap is set to the maximum allowed, i.e. 31 GB, on the hot and warm nodes. Each dashboard pod has 8 GB of memory and 8 CPU cores, and we now run 3 dashboard pods. I don't see any memory or CPU pressure in the pod usage, yet the UI is very slow compared with the other cluster running 2.6.0 (both clusters have almost the same resources).
When we tailed the dashboard logs on both clusters (2.17.0 and 2.6.0), we saw a huge increase in response times on the 2.17.0 dashboards, whereas the comparable cluster running OpenSearch 2.6.0 responds in very little time; a small log-parsing sketch for comparing response times follows the example logs below.
Example logs:

On 2.17.0:

{"type":"response","@timestamp":"2024-12-17T05:55:46Z","tags":[],"pid":1,"method":"post","statusCode":200,"req":{"url":"/internal/search/opensearch-with-long-numerals","method":"post","headers":{"host":"opensearch-dashboard.access.com","x-request-id":"2b5b05f8c9dd0042abfec639b8131083","x-real-ip":"IPADDRESS","x-forwarded-for":"IPADDRESS","x-forwarded-host":"opensearch-dashboard.access.com","x-forwarded-port":"443","x-forwarded-proto":"https","x-forwarded-scheme":"https","x-scheme":"https","x-original-forwarded-for":"IPADDRESS","content-length":"933","referer":"https://opensearch-dashboard.access.com/app/data-explorer/discover","origin":"https://opensearch-dashboard.access.com","x-proxy-remote-user":"xxxx","sec-ch-ua":"\"Chromium\";v=\"128\", "Not;A=Brand";v="24", "Google Chrome";v="128"","dnt":"1","sec-ch-ua-mobile":"?0","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36","osd-version":"2.17.0","content-type":"application/json","osd-xsrf":"osd-fetch","sec-ch-ua-platform":""macOS"","accept":"/","sec-fetch-site":"same-origin","sec-fetch-mode":"cors","sec-fetch-dest":"empty","accept-encoding":"gzip, deflate, zstd","accept-language":"en-GB,en;q=0.9,en-US;q=0.8,bn;q=0.7","x-ray-id":"15985731603335575049","x-ray-path":"10.2.5.5,unix:/var/run/nginx/dialin2642060_1,127.0.0.1"},"remoteAddress":"10.2.11.130","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36","referer":"https://opensearch-dashboard.access.com/app/data-explorer/discover"},"res":{"statusCode":200,"responseTime":13401,"contentLength":9},"message":"POST /internal/search/opensearch-with-long-numerals 200 13401ms - 9.0B"}

On 2.6.0:

{"type":"response","@timestamp":"2024-12-17T07:14:59Z","tags":[],"pid":1,"method":"get","statusCode":200,"req":{"url":"/ui/fonts/roboto_mono/RobotoMono-Regular.ttf","method":"get","headers":{"host":"opensearch-1-dashboard.access.com","x-request-id":"aeeafc11c36345dce8f3e1c4f5cb83d0","x-real-ip":"IPADDRESS","x-forwarded-for":"IPADDRESS","x-forwarded-host":"opensearch-1-dashboard.access.com","x-forwarded-port":"443","x-forwarded-proto":"https","x-forwarded-scheme":"https","x-scheme":"https","x-original-forwarded-for":"IPADDRESS","referer":"https://opensearch-1-dashboard.access.com/app/home","origin":"https://opensearch-1-dashboard.access.com","x-proxy-remote-user":"xxxx","sec-ch-ua":"\"Chromium\";v=\"128\", "Not;A=Brand";v="24", "Google Chrome";v="128"","dnt":"1","sec-ch-ua-mobile":"?0","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36","sec-ch-ua-platform":""macOS"","accept":"/","sec-fetch-site":"same-origin","sec-fetch-mode":"cors","sec-fetch-dest":"font","accept-encoding":"gzip, deflate, zstd","accept-language":"en-GB,en;q=0.9,en-US;q=0.8,bn;q=0.7","x-ray-id":"17306355174587786316","x-ray-path":"IPADDRESS,unix:/var/run/nginx/dialin99_1,127.0.0.1"},"remoteAddress":"IPADDRESS","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36","referer":"https://opensearch-1-dashboard.access.com/app/home"},"res":{"statusCode":200,"responseTime":8,"contentLength":9},"message":"GET /ui/fonts/roboto_mono/RobotoMono-Regular.ttf 200 8ms - 9.0B"}

Note: both clusters receive the same data from the Logstash nodes.
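To compare the two clusters beyond individual log lines, the Dashboards access logs are one JSON document per line with the latency under res.responseTime, so they can be summarized per endpoint. A minimal sketch in Python, assuming the log lines are piped in on stdin (the script name and pod name below are placeholders):

```python
import json
import sys
from collections import defaultdict

# Summarize res.responseTime per request URL from OpenSearch Dashboards JSON access logs.
# Example usage (pod name and script name are placeholders):
#   kubectl logs <dashboards-pod> | python summarize_response_times.py
times = defaultdict(list)
for line in sys.stdin:
    line = line.strip()
    if not line.startswith("{"):
        continue  # skip non-JSON lines
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        continue
    if entry.get("type") != "response":
        continue  # only "response" entries carry responseTime
    url = entry.get("req", {}).get("url", "unknown")
    times[url].append(entry.get("res", {}).get("responseTime", 0))

# Print the slowest endpoints first.
for url, values in sorted(times.items(), key=lambda kv: -max(kv[1])):
    print(f"{url}: count={len(values)} max={max(values)}ms avg={sum(values) / len(values):.0f}ms")
```

Running this against both clusters should show whether the slow responses are concentrated on /internal/search/opensearch-with-long-numerals or spread across all endpoints.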

narendraalla added the bug and untriaged labels on Dec 17, 2024
@Hailong-am
Collaborator

Looks very similar to issue #8417.

@narendraalla
Author

@Hailong-am Thank you! Are there any workarounds until a fix is released?

@narendraalla
Author

@Hailong-am Quick question: is this issue confined to the UI only, or do we also face it when we run queries from the API/CLI?

@kinseii

kinseii commented Dec 20, 2024

It seems to me that opensearch-with-long-numerals is the problem, since the logs provided for version 2.6.0 do not have this endpoint.

@Hailong-am
Collaborator

> @Hailong-am Quick question: is this issue confined to the UI only, or do we also face it when we run queries from the API/CLI?

This is UI only; opensearch-with-long-numerals is an API that exists in OpenSearch Dashboards.
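One way to verify this is to time a comparable query directly against the OpenSearch REST API and compare it with the responseTime values the Dashboards logs report for /internal/search/opensearch-with-long-numerals. A minimal sketch in Python, where the host, credentials, and index pattern are placeholders for your environment:

```python
import time

import requests

# Placeholders: adjust the endpoint, credentials, and index pattern to your environment.
OPENSEARCH_URL = "https://opensearch.example.com:9200"
AUTH = ("admin", "changeme")
INDEX_PATTERN = "logs-*"

# Roughly what Discover issues: latest documents from the last 15 minutes.
# Assumes the index's time field is @timestamp.
query = {
    "size": 500,
    "sort": [{"@timestamp": {"order": "desc"}}],
    "query": {"range": {"@timestamp": {"gte": "now-15m"}}},
}

start = time.monotonic()
resp = requests.post(
    f"{OPENSEARCH_URL}/{INDEX_PATTERN}/_search",
    json=query,
    auth=AUTH,
    verify=False,  # only if the cluster uses self-signed certificates
    timeout=60,
)
elapsed_ms = (time.monotonic() - start) * 1000
print(f"HTTP {resp.status_code}, took={resp.json().get('took')}ms, round trip={elapsed_ms:.0f}ms")
```

If the direct query comes back in milliseconds while the Dashboards call takes several seconds, the extra time is being spent in the Dashboards layer rather than in the cluster itself.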

dblock removed the untriaged label on Jan 6, 2025
@dblock
Member

dblock commented Jan 6, 2025

[Catch All Triage - 1, 2, 3, 4, 5, 6]

@narendraalla
Author

@Hailong-am When can we expect a fix release for this? Would it be with version 2.18.1 or 2.17.2?

@Hailong-am
Collaborator

> @Hailong-am When can we expect a fix release for this? Would it be with version 2.18.1 or 2.17.2?

@narendraalla Can you try the RC4 build for 2.19 to see whether the issue is fixed? opensearch-project/opensearch-build#5152 (comment)

@narendraalla
Author

@Hailong-am Sorry for the delay in my response. I have upgraded the lower-environment clusters and the UI is loading better now. Once I upgrade our production cluster we will know the real response times; the lower clusters do not have the same amount of data as the production clusters.

@Hailong-am
Collaborator

> @Hailong-am Sorry for the delay in my response. I have upgraded the lower-environment clusters and the UI is loading better now. Once I upgrade our production cluster we will know the real response times; the lower clusters do not have the same amount of data as the production clusters.

@narendraalla Sounds great. By the way, 2.19 has been released, so you can use the release version instead of RC4.
