|
| 1 | +# Telemetry Channel Behavior |
| 2 | + |
| 3 | +Every ping reports two channel fields: |
| 4 | +- App Update Channel |
| 5 | +- Normalized Channel |
| 6 | + |
| 7 | +## Expected Channels |
| 8 | +The traditional channels we expect are: |
| 9 | +- `release` |
| 10 | +- `beta` |
| 11 | +- `aurora` (this is `dev-edition`, and [is just a beta repack](https://developer.mozilla.org/en-US/Firefox/Developer_Edition)) |
| 12 | +- `nightly` |
| 13 | +- `esr` |
| 14 | + |
| 15 | +## App Update Channel |
| 16 | +This is the channel reported directly by the application. It can be any string, but is usually one of the |
| 17 | +expected channels listed above. |
| 18 | + |
| 19 | +### Accessing App Update Channel |
| 20 | + |
| 21 | +#### Main Summary |
| 22 | +The field here is called `channel`, e.g. |
| 23 | +``` |
| 24 | +SELECT channel |
| 25 | +FROM main_summary |
| 26 | +WHERE submission_date_s3 = '20180823' |
| 27 | +LIMIT 10 |
| 28 | +``` |
| 29 | + |
| 30 | +#### Other SQL Tables |
| 31 | +This is only available if `appUpdateChannel` is present in the table's schema; [see here for an example](https://github.com/mozilla-services/mozilla-pipeline-schemas/blob/master/schemas/telemetry/anonymous/anonymous.4.parquetmr.txt#L10). |
| 32 | + |
| 33 | +The data will be available as follows: |
| 34 | +``` |
| 35 | +SELECT metadata.app_update_channel |
| 36 | +FROM telemetry_anonymous_parquet |
| 37 | +WHERE submission_date_s3 = '20180823' |
| 38 | +LIMIT 10 |
| 39 | +``` |
| 40 | + |
| 41 | +#### In Raw Pings (Using the Dataset API) |
| 42 | +NOTE: The Dataset API's `appUpdateChannel` query dimension maps any channel that is not |
| 43 | +in the traditional channels list above to `OTHER`. For example, the following returns |
| 44 | +no pings: |
| 45 | +``` |
| 46 | +Dataset.from_source("telemetry").where(appUpdateChannel = "non-normalized-channel-name") |
| 47 | +``` |
| 48 | + |
| 49 | +This would return any non-traditional channels: |
| 50 | +``` |
| 51 | +Dataset.from_source("telemetry").where(appUpdateChannel = "OTHER") |
| 52 | +``` |
| 53 | + |
| 54 | +The full, unbucketed value is available in each raw ping's metadata (`sc` below is assumed to be the active SparkContext): |
| 55 | +``` |
| 56 | +pings = Dataset.from_source("telemetry").where(docType = "main", submissionDate = "20180823").records(sc) |
| 57 | +pings.map(lambda x: x.get("meta", {}).get("appUpdateChannel")) |
| 58 | +``` |
| 59 | + |
| 60 | +## Normalized Channel |
| 61 | +This field is a normalization of `appUpdateChannel`. If the channel doesn't match one of those above, |
| 62 | +it is set to `Other`. The only exception is variations on `nightly-cck-*`, which become `nightly`. [See the relevant code here](https://github.com/mozilla-services/lua_sandbox_extensions/blob/14ecde17b118d6734fd70e2dd920d8a91ecf5393/moz_telemetry/modules/moz_telemetry/normalize.lua#L175-L178). |
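The rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this document, not a port of the linked Lua code, which remains authoritative. The names `EXPECTED_CHANNELS` and `normalize_channel` are hypothetical, and mapping a missing value to `Other` is an assumption.

```python
# Hypothetical helper illustrating the normalization rule described above.
EXPECTED_CHANNELS = {"release", "beta", "aurora", "nightly", "esr"}


def normalize_channel(app_update_channel):
    """Return the normalized channel for a reported appUpdateChannel value."""
    # Expected channels pass through unchanged.
    if app_update_channel in EXPECTED_CHANNELS:
        return app_update_channel
    # The one documented exception: nightly-cck-* variants become nightly.
    if app_update_channel is not None and app_update_channel.startswith("nightly-cck-"):
        return "nightly"
    # Everything else (including a missing value, by assumption) is Other.
    return "Other"


print(normalize_channel("nightly-cck-test"))
print(normalize_channel("my-custom-build"))
```

Under these assumptions the first call prints `nightly` and the second prints `Other`.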
| 63 | + |
| 64 | +### Accessing Normalized Channel |
| 65 | + |
| 66 | +#### Main Summary |
| 67 | +The field here is called `normalized_channel`, e.g. |
| 68 | +``` |
| 69 | +SELECT normalized_channel |
| 70 | +FROM main_summary |
| 71 | +WHERE submission_date_s3 = '20180823' |
| 72 | +LIMIT 10 |
| 73 | +``` |
| 74 | + |
| 75 | +#### Other SQL Tables |
| 76 | +This is only available if `normalizedChannel` is present in the table's schema; [see here for an example](https://github.com/mozilla-services/mozilla-pipeline-schemas/blob/master/schemas/telemetry/anonymous/anonymous.4.parquetmr.txt#L11). |
| 77 | + |
| 78 | +The data will be available as follows: |
| 79 | +``` |
| 80 | +SELECT metadata.normalized_channel |
| 81 | +FROM telemetry_anonymous_parquet |
| 82 | +WHERE submission_date_s3 = '20180823' |
| 83 | +LIMIT 10 |
| 84 | +``` |
| 85 | + |
| 86 | +#### In Raw Pings (Using the Dataset API) |
| 87 | +This field is available in each raw ping's metadata (`sc` below is assumed to be the active SparkContext): |
| 88 | + |
| 89 | +``` |
| 90 | +pings = Dataset.from_source("telemetry").where(docType = "main", submissionDate = "20180823").records(sc) |
| 91 | +pings.map(lambda x: x.get("meta", {}).get("normalizedChannel")) |
| 92 | +``` |
| 93 | + |
| 94 | +## Raw Ping Example |
| 95 | +To find pings from the channel `nightly-cck-test`, do the following: |
| 96 | + |
| 97 | +1. Filter for channel `OTHER` and load the records as an RDD (`sc` is assumed to be the active SparkContext): `pings = Dataset.from_source("telemetry").where(docType = "main", appUpdateChannel = "OTHER").records(sc)` |
| 98 | +2. Filter the RDD for the full channel: `pings = pings.filter(lambda x: x.get("meta", {}).get("appUpdateChannel") == "nightly-cck-test")` |
| 99 | +3. See that the normalized channel is `nightly`: `pings.map(lambda x: x.get("meta", {}).get("normalizedChannel")).distinct()` |
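The three steps above can be tried without a cluster. The sketch below uses a plain Python list in place of the RDD. The sample pings, the `TRADITIONAL` set, and the `query_dimension` helper are all made up for illustration. In particular, returning traditional channel names unchanged from `query_dimension` is an assumption about how the dimension reports them.

```python
# Hypothetical sample pings; only the "meta" fields used above are modeled.
sample_pings = [
    {"meta": {"appUpdateChannel": "nightly-cck-test", "normalizedChannel": "nightly"}},
    {"meta": {"appUpdateChannel": "nightly-cck-test", "normalizedChannel": "nightly"}},
    {"meta": {"appUpdateChannel": "my-custom-build", "normalizedChannel": "Other"}},
    {"meta": {"appUpdateChannel": "release", "normalizedChannel": "release"}},
]

TRADITIONAL = {"release", "beta", "aurora", "nightly", "esr"}


def query_dimension(ping):
    """Mimic the query dimension: non-traditional channels are bucketed as OTHER."""
    channel = ping.get("meta", {}).get("appUpdateChannel")
    return channel if channel in TRADITIONAL else "OTHER"


# Step 1: filter on the bucketed dimension value.
other = [p for p in sample_pings if query_dimension(p) == "OTHER"]

# Step 2: filter on the full channel name stored in the ping metadata.
cck = [p for p in other if p.get("meta", {}).get("appUpdateChannel") == "nightly-cck-test"]

# Step 3: collect the distinct normalized channels of the remaining pings.
normalized = {p.get("meta", {}).get("normalizedChannel") for p in cck}
print(normalized)
```

With this sample data, step 1 keeps three pings, step 2 keeps two, and step 3 leaves the single value `nightly`.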