Native re-implementation of Web Discovery (part 1) #39439

Closed
DJAndries opened this issue Jun 27, 2024 · 11 comments · Fixed by brave/brave-core#24969, brave/brave-core#24970, brave/brave-core#24971 or brave/brave-core#27368

Comments

@DJAndries
Collaborator

Currently, the Web Discovery Project exists as part of the embedded Brave extension. For Android/iOS users to opt in and participate in Web Discovery, the client must be re-implemented natively.

Web Discovery sends three types of payloads:

  • alive: a ping to indicate to the Web Discovery servers that the user is opted in
  • query: search engine results, if the query is deemed to not contain private information
  • page: page interaction events, to measure user engagement for non-private pages

The initial re-implementation will only cover the first two payloads. The third payload will be covered in a separate issue. In addition, a separate issue will be needed for iOS in order to cover some WebKit renderer necessities.

A feature controlled by variations (BraveWebDiscoveryNative) will be used to handle the rollout to mobile users. Once the page payload is implemented, we can begin to deprecate the extension code by rolling the feature out to desktop users.

cc @anthonypkeane @rebron @bbondy @aekeus @remusao

@LaurenWags
Member

@DJAndries is this ready for QA? If so, can you please add a test plan? Thanks!

@DJAndries
Collaborator Author

> @DJAndries is this ready for QA? If so, can you please add a test plan? Thanks!

Hey @LaurenWags , we are still waiting for the Android scraping rules to be published. Once that is complete (hopefully over the next couple of days), I will add a test plan, thank you.

cc @remusao

@LaurenWags
Member

@DJAndries can we get an update on this please? Thanks.

cc @rebron @kjozwiak @bsclifton

@DJAndries
Collaborator Author

> @DJAndries can we get an update on this please? Thanks.
>
> cc @rebron @kjozwiak @bsclifton

The Search team are still in the process of updating their scraping patterns, but they confirmed that they should have it wrapped up by the end of the week. I hope to have a test plan ready tomorrow or early next week.

@DJAndries
Collaborator Author

DJAndries commented Feb 25, 2025

Here is a test plan, please hold until brave/brave-core#27788 is merged and uplifted:

  1. Start with a fresh profile and configure a MITM proxy
  2. Proceed with onboarding, do NOT enable WDP when prompted
  3. Enter QA settings, enter the following command line string: --enable-logging=stderr --vmodule="*/web_discovery/*=2" --wdp-collector-host=https://collector.wdp.brave.software --enable-features="BraveWebDiscoveryNative" --wdp-patterns-url=https://djandries.github.io/patterns.gz
  4. Upon restart, wait a minute, ensure that there are no requests to collector.wdp.brave.software, quorum.wdp.brave.com or djandries.github.io logged in the proxy
  5. Go to privacy settings, enable Web Discovery
  6. Ensure three /join requests are sent to collector.wdp.brave.software, each with an HTTP 200 response.
  7. Ensure a request is sent to https://quorum.wdp.brave.com
  8. Wait a minute, ensure a request to https://djandries.github.io/patterns.gz is made
  9. Go to google.com, query "best cookie recipes"
  10. Watch adb for relevant logs. Ensure the following log appears: Double fetching search page: https://www.google.com/search?q=best+cookie+recipes
  11. Open a new tab, close the Google tab.
  12. Wait a minute. Ensure two requests to https://www.google.com/search?q=best+cookie+recipes are sent around a minute after the original request.
  13. Wait another minute, ensure another request is sent to https://collector.wdp.brave.software/ with an HTTP 200 response.
  14. Look for a log prefixed with "Preparing to report payload". Ensure that the payload structure and content are similar to the payload provided below these steps.
  15. Make another query for "best cookie recipes". Wait a few minutes. Ensure a POST request is NOT sent to https://collector.wdp.brave.software/.
  16. Make another query for "best cake recipes". Wait a few minutes. Ensure a POST request IS sent to https://collector.wdp.brave.software/. Ensure that there is another "preparing" log with contents relevant to the query.

At some point, a "preparing" log with the action field set to alive will appear, accompanied by a POST request to https://collector.wdp.brave.software/.
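Steps 4 and 6 can also be checked mechanically if the proxy capture is exported. The following is a minimal sketch, not part of the official test plan. It assumes a hypothetical export format: a list of dicts with `time` (seconds since launch), `url` and `status` keys. The function names and the `opt_in_time` parameter are likewise made up for illustration; only the hostnames and the /join path come from the steps above.

```python
from urllib.parse import urlsplit

# Hosts named in step 4 of the test plan.
WDP_HOSTS = {
    "collector.wdp.brave.software",
    "quorum.wdp.brave.com",
    "djandries.github.io",
}


def requests_before_opt_in(entries, opt_in_time):
    """Step 4: return WDP-related requests logged before opt-in (should be empty)."""
    return [
        e for e in entries
        if e["time"] < opt_in_time and urlsplit(e["url"]).hostname in WDP_HOSTS
    ]


def join_requests_ok(entries, opt_in_time):
    """Step 6: exactly three /join requests to the collector, each with HTTP 200."""
    joins = []
    for e in entries:
        parts = urlsplit(e["url"])
        if (e["time"] >= opt_in_time
                and parts.hostname == "collector.wdp.brave.software"
                and parts.path == "/join"):
            joins.append(e)
    return len(joins) == 3 and all(e["status"] == 200 for e in joins)
```

An empty list from `requests_before_opt_in` and `True` from `join_requests_ok` would correspond to steps 4 and 6 passing; the remaining steps still need the adb logs.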

Query payload:

{
  "action": "query",
  "anti-duplicates": 7554282,
  "channel": "brave-native-desktop",
  "payload": {
    "ctry": "ca",
    "q": "best cookie recipes",
    "qurl": "https://www.google.com/search?q=best+cookie+recipes",
    "r": {
      "0": {
        "age": null,
        "m": null,
        "t": "The Best Chocolate Chip Cookie Recipe Ever",
        "u": "https://joyfoodsunshine.com/the-most-amazing-chocolate-chip-cookies/"
      },
      "1": {
        "age": null,
        "m": null,
        "t": "The Best Soft Chocolate Chip Cookies - Pinch of Yum",
        "u": "https://pinchofyum.com/the-best-soft-chocolate-chip-cookies"
      },
      "2": {
        "age": null,
        "m": null,
        "t": "The 44 Best Cookie Recipes to Make in 2024",
        "u": "https://www.tasteofhome.com/collection/the-best-cookie-recipes/?srsltid=AfmBOopcHcF4nbQRJjmPFwJTo38-B_Q7_hkyf1azF1ku31-qbPCWbXOq"
      },
      "3": {
        "age": null,
        "m": null,
        "t": "Best Cookie Recipes of All Time",
        "u": "https://www.allrecipes.com/gallery/best-cookie-recipes-of-all-time/"
      },
      "4": {
        "age": null,
        "m": null,
        "t": "Cookie Recipes",
        "u": "https://sallysbakingaddiction.com/category/desserts/cookies/"
      },
      "5": {
        "age": null,
        "m": null,
        "t": "40 Epic Cookie Recipes From Brown Butter to Chocolate ...",
        "u": "https://www.foodandwine.com/best-cookie-recipes-6400978"
      },
      "6": {
        "age": null,
        "m": null,
        "t": "The World's Best Cookie Recipe, According to Redditors",
        "u": "https://www.simplyrecipes.com/reddit-greatest-recipe-ever-8754051"
      },
      "7": {
        "age": null,
        "m": null,
        "t": "The Best Chocolate Chip Cookie Recipe",
        "u": "https://cookiesfordays.com/chocolate-chip-cookie-recipe/"
      },
      "8": {
        "age": null,
        "m": null,
        "t": "Easy Cookie Recipes To Bake Year-Round",
        "u": "https://www.delish.com/cooking/g1956/best-cookies/"
      }
    }
  },
  "sender": "hpnv2",
  "ts": "20250225",
  "type": "wdp",
  "ver": "1.0"
}
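
For step 14, "similar structure" can be made a little more concrete. The sketch below is only an illustration based on the sample payload above, not a schema published by the Web Discovery Project. The key sets are copied from the sample, and two things are assumptions: that `ts` is always a YYYYMMDD date, and that the keys of `r` are consecutive indices starting at "0". The function name `check_query_payload` is hypothetical.

```python
from datetime import datetime

# Key sets copied from the sample query payload; not an official schema.
TOP_LEVEL_KEYS = {"action", "anti-duplicates", "channel", "payload",
                  "sender", "ts", "type", "ver"}
QUERY_KEYS = {"ctry", "q", "qurl", "r"}
RESULT_KEYS = {"age", "m", "t", "u"}


def check_query_payload(msg):
    """Return a list of structural problems; an empty list means it looks similar."""
    problems = []
    missing = TOP_LEVEL_KEYS - set(msg)
    if missing:
        problems.append(f"missing top-level keys: {sorted(missing)}")
    if msg.get("action") != "query":
        problems.append("action is not 'query'")

    # Assumption: ts is a YYYYMMDD date string, as in the sample.
    ts = str(msg.get("ts", ""))
    try:
        if len(ts) != 8:
            raise ValueError(ts)
        datetime.strptime(ts, "%Y%m%d")
    except ValueError:
        problems.append(f"ts is not a YYYYMMDD date: {ts!r}")

    payload = msg.get("payload")
    if not isinstance(payload, dict):
        problems.append("payload is not an object")
        return problems
    missing = QUERY_KEYS - set(payload)
    if missing:
        problems.append(f"missing payload keys: {sorted(missing)}")

    results = payload.get("r")
    if not isinstance(results, dict) or not results:
        problems.append("r is empty or not an object")
        return problems
    # Assumption: result keys are "0", "1", ... with no gaps.
    if set(results) != {str(i) for i in range(len(results))}:
        problems.append("r keys are not consecutive indices starting at 0")
    for key, item in results.items():
        if not isinstance(item, dict) or not RESULT_KEYS <= set(item):
            problems.append(f"result {key} is missing fields")
        elif not item["t"] or not item["u"]:
            problems.append(f"result {key} has an empty title or URL")
    return problems
```

The actual titles and URLs will differ from run to run, so the check deliberately looks only at shape, not content.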

@LaurenWags
Member

Thanks @DJAndries - looks like this test plan is Android specific though. What should be checked for desktop since this issue has OS/Desktop label? Or are checks on desktop not needed?

cc @kjozwiak

@DJAndries
Collaborator Author

> Thanks @DJAndries - looks like this test plan is Android specific though. What should be checked for desktop since this issue has OS/Desktop label? Or are checks on desktop not needed?
>
> cc @kjozwiak

We won't be using the native implementation on Desktop just yet. I went ahead and removed the OS/Desktop label; no QA is needed for Desktop.

@kjozwiak
Member

So brave/brave-core#27788 was merged into master and uplifted into 1.76.x via brave/brave-core#27806. However, I'm not removing the QA/Blocked label, as @DJAndries mentioned that we need one more issue/uplift before QA can run through #39439 (comment). @DJAndries, mind listing the PR/issue so we can keep an eye on it?

@DJAndries
Collaborator Author

QA is blocked until brave/brave-core#27814 is merged

@kjozwiak
Member

> QA is blocked until brave/brave-core#27814 is merged

@Uni-verse @hffvld unblocked as the above was merged/uplifted into 1.76.x. Requires 1.76.70 or higher for #39439 (comment) to be verified.

@hffvld added the QA/In-Progress label Feb 27, 2025
@hffvld
Contributor

hffvld commented Feb 27, 2025

Verified on Galaxy Tab S8 and Pixel 7 using version(s):

Device/OS: 
- Galaxy Tab S8 / gts8wifixx-user 14 UP1A.231005.007 release-keys
- Pixel 7 / panther_beta-user 16 BP22.250103.008 release-keys
Brave build: 1.76.70 
Chromium: 134.0.6998.39 (Official Build) (64-bit) 

STEPS:

  1. Follow the STR/TP from Native re-implementation of Web Discovery (part 1) #39439 (comment)
  2. Verify

ACTUAL RESULTS:

  • Verified the above behavior is correctly reproduced.

Galaxy Tab S8 (screenshots, in order):

  • Web Discovery is off
  • Web Discovery is on: /join request 1
  • Web Discovery is on: /join request 2
  • Web Discovery is on: /join request 3
  • https://quorum.wdp.brave.com request
  • https://djandries.github.io/patterns.gz request
  • https://www.google.com/search?q=best+cookie+recipes original request 1
  • https://www.google.com/search?q=best+cookie+recipes original request 2
  • "Double fetching search page:" log
  • https://www.google.com/search?q=best+cookie+recipes request 1 after ~1 min
  • https://www.google.com/search?q=best+cookie+recipes request 2 after ~1 min
  • https://collector.wdp.brave.software/ request after ~1 min
  • "Preparing to report payload" log
  • "Preparing to report payload" log
  • "Submission result: 1" log
  • "best cookie recipes" repeated request
  • "best cake recipes" request
  • https://collector.wdp.brave.software/ request
  • "Preparing to report payload" log
  • "Preparing to report payload" log

Pixel 7 (screenshots, in order):

  • Web Discovery is off
  • Web Discovery is on: /join request 1
  • Web Discovery is on: /join request 2
  • Web Discovery is on: /join request 3
  • https://quorum.wdp.brave.com request
  • https://djandries.github.io/patterns.gz request
  • https://www.google.com/search?q=best+cookie+recipes original request 1
  • https://www.google.com/search?q=best+cookie+recipes original request 2
  • "Double fetching search page:" log
  • https://www.google.com/search?q=best+cookie+recipes request 1 after ~1 min
  • https://www.google.com/search?q=best+cookie+recipes request 2 after ~1 min
  • https://collector.wdp.brave.software/ request after ~1 min
  • "Preparing to report payload" log
  • "Preparing to report payload" log
  • "Submission result: 1" log
  • "best cookie recipes" repeated request
  • "best cake recipes" request
  • https://collector.wdp.brave.software/ request
  • "Preparing to report payload" log
  • "Preparing to report payload" log

@hffvld added the QA Pass - Android ARM and QA Pass - Android Tab labels and removed the QA/In-Progress label Feb 28, 2025