Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: How to deal with agencies with calitp ids #301

Open
vevetron opened this issue Aug 22, 2024 · 0 comments
Open

Bug: How to deal with agencies with calitp ids #301

vevetron opened this issue Aug 22, 2024 · 0 comments

Comments

@vevetron
Copy link
Contributor

vevetron commented Aug 22, 2024

A few orgs like Dumbarton Express, banning pass (City of Banning), don't have cal-itp ids. Thus they don't show up on the cal-itp-data-infra.staging.int_gtfs_quality__organization_dataset_map, the queries don't pull them down and this repo can't handle them.

To Reproduce

int_gtfs__organization_dataset_map AS (

    SELECT *
    -- FROM {{ ref('int_gtfs_quality__organization_dataset_map') }}
    FROM cal-itp-data-infra.staging.int_gtfs_quality__organization_dataset_map
    WHERE public_customer_facing_or_regional_subfeed_fixed_route

)
SELECT
name,
  organization_name,
  organization_itp_id,
  organization_source_record_id,
  organization_key,
  feed_key,
  date_trunc(
    `mart_gtfs.fct_scheduled_trips`.`service_date`,
    day
  ) AS `service_date`,
  (
    (
      CAST(
        sum(
          CASE
            WHEN `Fct_Observed_Trips`.`tu_num_distinct_message_ids` > 0 THEN 1
            ELSE 0.0
          END
        ) AS float64
      ) / CASE
        WHEN count(*) = 0 THEN NULL
        ELSE count(*)
      END
    ) * 100
  ) AS `__of_trips_with_TU_messages`
FROM
  `mart_gtfs.fct_scheduled_trips`
 
LEFT JOIN `mart_gtfs.fct_observed_trips` `Fct_Observed_Trips` ON `mart_gtfs.fct_scheduled_trips`.`trip_instance_key` = `Fct_Observed_Trips`.`trip_instance_key`
LEFT JOIN int_gtfs__organization_dataset_map
        ON int_gtfs__organization_dataset_map.schedule_feed_key = `mart_gtfs.fct_scheduled_trips`.feed_key
WHERE
  (
    `mart_gtfs.fct_scheduled_trips`.`service_date` >= date_trunc(
      date_add(
        current_date('America/Los_Angeles'),
        -- INTERVAL -30 day
        INTERVAL -1 day
      ),
      day
    )
   
   AND `mart_gtfs.fct_scheduled_trips`.`service_date` < date_trunc(current_date('America/Los_Angeles'), day)
    -- AND `mart_gtfs.fct_scheduled_trips`.`name` = 'Bay Area 511 Santa Clara Transit Schedule'
  )
GROUP BY
  1,2,3,4,5,6,7
ORDER BY
-- organization_name, `service_date` ASC
name, `service_date` ASC

Look for the nulls
Expected behavior
Somehow we can handle these cases.

Additional context
There is some long history about how we stopped using calitp ids.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant