VPN-6854: fix data throughput Glean issues #10296

Open
wants to merge 3 commits into
base: main

mcleinman
Collaborator

@mcleinman mcleinman commented Feb 22, 2025

Description

We record data transfer telemetry from WireGuard. The WireGuard counters reset only when there is a new peer (VPN activation, silent server switch, location switch, etc.). We were sending them in every timer ping, which means we were counting data multiple times: each subsequent timer or end ping reported the cumulative data since the user connected to the peer.

While there are probably ways to build fancy data science dashboards that use only the latest reported value, it seems easiest to fix this by only reporting data transfer when the peer changes (or the VPN is turned off). (This also has the benefit of making ad hoc metrics easier to run.)
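To illustrate the overcounting, here is a minimal sketch with made-up byte counts. The event names and the `total_reported` helper are hypothetical and only model the behavior described above; they are not taken from the client code.

```python
# Hypothetical cumulative WireGuard byte counts, read at each event.
# The counter resets only when the peer changes.
events = [
    ("timer", 100),        # peer A: 100 bytes so far
    ("timer", 300),        # peer A: cumulative value, not a delta
    ("peer_change", 400),  # peer A: final reading before the counter resets
    ("timer", 50),         # peer B: 50 bytes so far
    ("end", 120),          # peer B: final reading when the VPN is turned off
]


def total_reported(events, reporting_kinds):
    """Sum the readings from the events that carry data transfer metrics."""
    return sum(count for kind, count in events if kind in reporting_kinds)


# Bytes actually transferred: the final reading of each peer.
actual = 400 + 120

# Simplified old behavior: every timer and end ping carries the cumulative value.
old_total = total_reported(events, {"timer", "end"})

# New behavior: report only on peer change and when the VPN is turned off.
new_total = total_reported(events, {"peer_change", "end"})

print(actual, old_total, new_total)
```

In this toy session the old scheme sums overlapping cumulative readings and also misses peer A's final value, while the new scheme adds up exactly one final reading per peer.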

Desktop - We just need to remove the data transfer metrics from the timer ping.

Android - This one was a bit harder, as there were some underlying data issues I noticed (also fixed in this PR):

  1. First, same as desktop - needed to stop recording data transfer metrics on the timer ping.
  2. On the debug timer, we sometimes had an extra Timer ping right after activation. I added a 1-second grace period to wasTimerJustStarted to fix this.
  3. We were sometimes missing end and timer metrics. This was because shouldRecordTimerAndEndMetrics was very, very wrong. On every activation in Android (from the app, location switch, silent server switch from the app, silent server switch from the daemon) we run through turnOn. However, turnOff is only hit when the actual VPN is coming down, not when doing a silent server switch or location switch. Thus, we always want to record ending metrics in turnOff, and we want to skip start metrics in turnOn if it's a silent server switch or location switch. Ultimately I removed shouldRecordTimerAndEndMetrics, as it was no longer useful.
  4. We weren't recording data metrics when switching locations or doing a silent server switch, so we needed to capture that data.
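The Android flow in the list above can be sketched roughly as follows. This is a toy model in Python, not the actual Android code: only the names `turn_on`, `turn_off`, and `was_timer_just_started` mirror identifiers from this PR, and the class, parameters, event labels, and the assumption that the timer starts only on a full activation are all hypothetical.

```python
from dataclasses import dataclass, field

GRACE_PERIOD_S = 1.0  # mirrors the 1-second grace period described above


@dataclass
class SessionTelemetry:
    """Toy model of the described flow; event labels are hypothetical."""

    events: list = field(default_factory=list)  # (label, bytes or None)
    active: bool = False
    timer_started_at: float = 0.0

    def turn_on(self, now, is_switch=False, previous_peer_bytes=0):
        if is_switch:
            # Location or silent server switch: the old peer's counter is
            # about to reset, so capture its final value and skip start metrics.
            self.events.append(("data_metrics", previous_peer_bytes))
        else:
            self.events.append(("start_metrics", None))
            # Assumption: the timer starts only on a full activation.
            self.timer_started_at = now
        self.active = True

    def was_timer_just_started(self, now):
        return now - self.timer_started_at <= GRACE_PERIOD_S

    def on_timer(self, now):
        # Timer pings no longer carry data transfer metrics.
        if self.active and not self.was_timer_just_started(now):
            self.events.append(("timer_ping", None))

    def turn_off(self, peer_bytes):
        # Only reached when the VPN actually comes down, so always record.
        self.events.append(("end_metrics", peer_bytes))
        self.active = False


t = SessionTelemetry()
t.turn_on(now=0.0)
t.on_timer(now=0.5)   # inside the grace period, so no timer ping
t.on_timer(now=10.0)
t.turn_on(now=12.0, is_switch=True, previous_peer_bytes=400)
t.on_timer(now=20.0)
t.turn_off(peer_bytes=120)
print(t.events)
```

Each peer's byte count appears exactly once in the recorded events: at the switch for the first peer and at turn-off for the second.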

iOS - We do not record this data on iOS, so there is nothing to do here.

I've run through a bunch of tests on macOS and Android:

  1. Turn on, download some data / visit a webpage, turn off.
  2. Turn on, download some data, wait for timer ping, download some data / visit a webpage, wait for another timer ping, turn off.
  3. Turn on, download some data, wait for timer ping, location switch, download some data / visit a webpage, wait for another timer ping, turn off.
  4. Turn on, download some data, wait for timer ping, silent server switch via app, download some data / visit a webpage, wait for another timer ping, turn off.
  5. (Android only) Turn on, download some data, wait for timer ping, silent server switch via daemon, download some data / visit a webpage, wait for another timer ping, turn off.

If there are any other scenarios to consider, please let me know.

Reference

VPN-6854

Checklist

  • My code follows the style guidelines for this project
  • I have not added any packages that contain high risk or unknown licenses (GPL, LGPL, MPL, etc. consult with DevOps if in question)
  • I have performed a self review of my own code
  • I have commented my code PARTICULARLY in hard to understand areas
  • I have added thorough tests where needed

@mcleinman mcleinman requested a review from strseb February 22, 2025 00:39