-
Notifications
You must be signed in to change notification settings - Fork 26
[ci rerun-skip] replace pocket interactions dataset with newtab visits and update exposure_signal #1239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
✅ Jetstream Validation is complete. Check the CI logs for this step for Query SQL and data processing estimates. |
mikewilli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add [ci rerun-skip] to the beginning of the PR title before merging so that analysis is not rerun.
I left a comment on the data source definition about the client ID, but the affected experiment here uses the profile_group_id as its analysis unit (which you can find in the Experimenter API -- this is the default for the past ~6 months on desktop). This means that you don't have to worry about the fact that the new table doesn't have a legacy ID. If this data source were used in other older experiments, you would want to consider this, but since it's only used for this one experiment it's not a problem.
definitions/firefox_desktop.toml
Outdated
| client_id_column = "legacy_telemetry_client_id" | ||
| description = "Visits-level table or Newtab Homepage Data" | ||
| friendly_name = "Newtab Visits Daily" | ||
| client_id_column = "client_id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"client_id" is the default value for client_id_column (the default values are derived from this enum), so you don't need to specify it if you don't want. If you want to be explicit then feel free to leave it in.
However, you may want to specify that this is to be used as the Glean client ID like so:
glean_client_id_column = "client_id"
This is separate from the client_id_column and was added as a convenience for validating analysis before migrating from legacy to glean identifiers as the default. This parameter is null by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think it can be removed. Pushed a fix
land-edi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's ![]()
|
✅ Jetstream Validation is complete. Check the CI logs for this step for Query SQL and data processing estimates. |
The pocket interactions dataset is huge and makes experiments fragile. This replaces it with the much smaller newtab_visits datatset and updates the exposure signal accordingly