Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the HUD JSON files to pull from machine readable excel file provided by HUD #4088

Open
wants to merge 208 commits into
base: stable
Choose a base branch
from

Conversation

dtgreiner
Copy link
Contributor

@dtgreiner dtgreiner commented Mar 5, 2024

Please squash merge this PR

Description

Adding a generation process to read the new machine readable excel files for the HUD Lists and generate the JSON file used in the rest of the process.

Type of change

  • New feature (adds functionality)

Checklist before requesting review

  • I have performed a self-review of my code
  • I have run the code that is being changed under ideal conditions, and it doesn't fail
  • My code includes comments and/or descriptive variable names to help other engineers understand the intent (or not applicable)
  • My code follows the style guidelines of this project (rubocop)
  • I have updated the documentation (or not applicable)
  • If it's not obvious how to test this change, I have provided testing instructions in this PR or the related issue

@dtgreiner dtgreiner requested review from eanders and gigxz March 5, 2024 18:57
@dtgreiner
Copy link
Contributor Author

dtgreiner commented Mar 5, 2024

@gigxz
I've added a process for generating the HUD list json files used in the HUD List generation process. I've got a couple things I wanted to run by you to make sure it won't break anything in HMIS. The post-processing json file is included in this PR.

  1. Does capitalization matter (e.g. "Path Services" vs "Path services")? There are a good amount of descriptions with slightly varied names. in the excel file vs what we have in the current JSON file
  2. There are some character differences in the list items (e.g. FundingSource (2.06.1) Item 21 has the encoded & "\u0026" instead of the &, or using ’ instead of ' ) If these are going to break things down the line, I can make adjustments in the JSON generator

Thanks!

Comment on lines 151 to 152
# TODO: is there a way to better skip these?
ignore_codes = ['List', '*', '(see note)']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me.

Suggested change
# TODO: is there a way to better skip these?
ignore_codes = ['List', '*', '(see note)']
# Some codes are irrelevant to this transformation, we'll ignore them based on string comparison
ignore_codes = ['List', '*', '(see note)']

Comment on lines 166 to 167
codes[code] ||= {}
(codes[code][:list_name] ||= []) << list_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's some hairy syntax, maybe?

Suggested change
codes[code] ||= {}
(codes[code][:list_name] ||= []) << list_name
codes[code] ||= { list_name: [] }
codes[code][:list_name] << list_name

end
end
# Sheet 1: List|Value|Text
# This sheet does not have names for the lists, but it may include some that are note referenced in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# This sheet does not have names for the lists, but it may include some that are note referenced in
# This sheet does not have names for the lists, but it may include some that are not referenced in

Comment on lines 183 to 184
codes[code] ||= { list_name: ['Unknown'] }
(codes[code][:values] ||= {})[value] = text
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing something, but ?

Suggested change
codes[code] ||= { list_name: ['Unknown'] }
(codes[code][:values] ||= {})[value] = text
codes[code] ||= { list_name: ['Unknown'], values: {} }
codes[code][:values][value] = text


This will generate the JSON files. To account for some inconsistencies with the excel data, some names will be generated as "Unknown" and will need to be verified manually with the lists specified in the HUD HMIS CSV specifications.

After changes to the JSON files have been confirmed, please run the following tasks to re-generate dependent code:

```
rails code:generate_hud_lists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!!

Running this (and the line below it) shows the impact of this change. We do use the labels in the HMIS UI (as does the warehouse) so the casing changes will show up in the interface. Can you run those scripts and re-push @dtgreiner so we can review the effects of this change?

I need to take time to go through it more closely. A few items of concern jump out to me, though:

  • label change Bed Night => BedNight
  • label changes where the apostrophe becomes a different type of apostrophe. specifically Client doesn’t know I want to avoid churn on the 1.8 list (yes/no/dk/r/dnc) because we use that one for custom data elements which store the string label value). would prefer to avoid this churn
  • label changes where they got more verbose (relationships_to_hoh) - I can/will add overrides to the frontend if we want to keep it this way in the backend
  • labels where n-dash is replaced with m-dash -- feels like unnecessary churn but is probably fine
  • record_types label casing became lowercase
  • labels where instructional parenthesized text is ok -- maybe fine, but I want to go through them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also hitting this -- I had separated out all the different living situation categories in hud_lists.json so we need to keep that somehow. (probably as additional_lists)

82c816eeaedb:/app$ rails driver:hmis:dump_graphql_schema
rails aborted!
NoMethodError: undefined method `destinations' for HudUtility2024:Module

    hud_enum HudUtility2024.destinations
                           ^^^^^^^^^^^^^
Did you mean?  destination_type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still get errors when running rails driver:hmis:dump_graphql_schema due to some missing enums. Sorry I'm assuming this probably wasn't part of the original scope of this work @dtgreiner @eanders. I'm happy to take on the graphql parts of things if we want to defer that, but we probalby won't schedule work for that for several weeks, and I don't love the idea of the schema being out-of-sync for so long from the source files. Maybe we can check in about this today?

@dtgreiner
Copy link
Contributor Author

@eanders @gigxz I've updated the hud list json generator to clean the apostrophes and dashes to match what is currently in the system and included the destinations list that was missing.

I've included the updated ruby generation in this PR but I believe the goal is to get this json file close enough and then do a manual check to clean up any discrepancies before generating the ruby lists. With that said, if there are any changes needed, I think we will want to do a manual adjustment of the JSON files and then regenerate.


private def clean_json_text(str)
str = str.strip.delete("\u00A0").
gsub(/\u2019/, "'"). # replace the character U+2019 "’" could be confused with the ASCII character U+0027 "'",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now we're having fun!

Comment on lines +929 to +931
2 => "Head of household's child",
3 => "Head of household's spouse or partner",
4 => "Head of household's other relation member",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that this is the language HUD provides. @gigxz didn't we recently override these because they didn't really make sense in the app?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah can we change these back to the old language? If you want to keep them like this in the warehouse, I can add a custom Enum for HMIS that has the labels we want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I like the old language better. We should override.

# 4.14
def bed_night_options
{
200 => 'Bed Night',
200 => 'BedNight',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add an explicit replacement for this BedNight => Bed Night

Comment on lines +1757 to +1759
1 => '44929',
2 => '45023',
3 => '45149',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a failure. Is this wrong in the source file? We should report it if it is.
Screenshot 2024-03-11 at 4 45 54 PM

Comment on lines +1787 to +1792
1 => 'Criminal activity/destruction of property/violence',
2 => 'Non-compliance with project rules',
3 => 'Non-payment of rent/occupancy charge',
4 => 'Reached maximum time allowed by project',
5 => 'Project terminated',
6 => 'Unknown/disappeared',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh, this looks like we just had the wrong set. And this is why we wanted to automate it.

Comment on lines +2366 to +2367
2 => "Youngest child is 1 to 6 years old and/or one or more children (any age) require significant
care",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is there a \n in there somewhere. Maybe we should strip newlines from the content, I don't think any of the values should contain them.

Comment on lines -473 to -477
class NoPointsYes < Types::BaseEnum
description 'V7.1'
graphql_name 'NoPointsYes'
hud_enum HudUtility2024.no_points_yes_options
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats ok it is not used in the graphql schema. IDK what this was but question V7.1 uses the standard 1.7 list.

Copy link
Contributor

@eanders eanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dtgreiner dropped a couple notes, but generally I think this looks right. @gigxz will probably have some detailed feedback around the overall json file.

47 => 'HUD: ESG - CV',
48 => 'HUD: HOPWA - CV',
49 => 'HUD: CoC - Joint Component RRH/PSH [Deprecated]',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this is fine, i.e. won't have any impact in hmis. I just want to flag that we had a convo around intentionally keeping deprecated options around, so that they show up in the HMIS as their real value, rather than just "invalid value". @eanders I think that was still the right decision. Will there be a way to do that going forward?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I think we should figure out a mechanism to preserve removed values (where we think it important, like this one.) Maybe we can provide another set of optional lists that get merged into (reverse merged maybe) the provided lists.

@@ -1194,7 +1199,7 @@ def data_collection_stages
2 => 'Update',
3 => 'Project exit',
5 => 'Annual assessment',
6 => 'Post-exit',
6 => 'Post-exit (not used in CSV)',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this extra parenthesized text?

Comment on lines -1209 to +1215
2 => 'Financial assistance for Moving On (e.g., security deposit, moving expenses)',
3 => 'Non-financial assistance for Moving On (e.g., housing navigation, transition support)',
2 => 'Financial assistance for Moving On',
3 => 'Non-financial assistance for Moving On',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self @gigxz add this label back custom on the frontend. I think its helpful, but I can see why it might not be needed in the warehouse. @eanders

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need it in the warehouse, but I also don't think it hurts anything if we want to keep it.

@@ -1675,6 +1680,7 @@ def reason_not_enrolleds
{
1 => 'Client was found ineligible for PATH',
2 => 'Client was not enrolled for other reason(s)',
3 => 'Unable to locate client',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops we were missing an option here. nice


class EarlyExitReason < Types::BaseEnum
description 'R17.A'
graphql_name 'EarlyExitReason'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welp, it looks like we are not using this enum in the place that we should. hud_field :early_exit_reason, Types::HmisSchema::Enums::Hud::ExpelledReason should be using this enum. I can make a note to myself to fix after.

@@ -560,18 +554,6 @@ class SubsidyInformation < Types::BaseEnum
hud_enum HudUtility2024.subsidy_informations
end

class SubsidyInformationA < Types::BaseEnum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is removed but we need it in the schema

Copy link
Contributor

@gigxz gigxz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eanders and others added 25 commits April 19, 2024 09:15
* Possible fix for non-dev assets

* Another attempt at fixing js-bundling JS in publishing to S3

* Cleanup file name calculation
* aws_s3_fix_delete_method

bugfix for delete method used by external form submission processing,
add test

* aws_s3_fix_delete_method

fix content type

---------

Co-authored-by: Gig <[email protected]>
* Enforce exit date when calculating CH at entry

* Add task to rebuild closed CH Enrollment data
* Add VA Dashboard

* Additional cleanup
* Additional timeliness options

* Remove the 11 days or more option, not sure why we have it
* Pull CAS information from client contact records and CE assessment assessor

* Wrap default shelter contact email
Refactor how HMIS handles 'wip' project/enrollment relationship. Adds a direct relationship between enrollments and projects.

* replace conditional code and extra associations with simpler project/enrollment relation
* project.enrollment and project.households now include WIP enrollments
* simplify hmis_households db view, add direct relationship to households
* remove hmis_client_projects db view
* enhance project/enrollment validation
* move `belongs_to enrollment` to ClientProjectEnrollmentRelated mixin
* disables old importer specs
…erride (#4263)

* Apply active_homeless_status_override to active_in_cas calc

* Update language

* Add test cases

* Adding a note about CAS availability, completely unrelated to this PR

---------

Co-authored-by: Elliot Anders <[email protected]>
* PIT DV calculation

* PIT link shoud be a button to match other HUD reports

* Change DV calculation in 2024, not 2023
* Add task for one-off date updates

* Respond to PR comments

* Scope projects to hmis only

Co-authored-by: Gig <[email protected]>

* Update drivers/hmis/lib/hmis_data_cleanup/fix_enrollment_dates_20240416.rb

---------

Co-authored-by: Gig <[email protected]>
* Bed Nights dont error if entry date has a warning

* Adjust logic to keep some warnings
@dtgreiner dtgreiner changed the base branch from stable to release-113 April 19, 2024 13:33
@dtgreiner
Copy link
Contributor Author

There were some issue merging the latest data into this PR, so I've created a new PR based on the up to date release branch without the generated JSON or HudList ruby files. We are planning on manually adjusting the JSON after generation to account for any text formatting changes or other overrides from the CSV file.

Base automatically changed from release-113 to stable May 1, 2024 18:04
@eanders eanders changed the base branch from stable to release-120 June 6, 2024 18:10
@eanders eanders changed the base branch from release-120 to stable June 6, 2024 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants