NRI dataset and initial score YAML configuration (#1534)

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* update be staging gha

* checkpoint

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* checkpoint

* PR Review

* removing source url

* tests

* stop execution of ETL if there's a YAML schema issue

* update be staging gha

* adding source url as class var again

* clean up

* force cache bust

* gha cache bust

* dynamically set score vars from YAML

* docstrings

* removing last updated year - optional reverse percentile

* passing tests

* sort order

* column ordering

* PR review

* class level vars

* Updating DatasetsConfig

* fix pylint errors

* moving metadata hint back to code

Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
Jorge Escobar 2022-08-09 16:37:10 -04:00 committed by GitHub
commit 1c448a77f9
15 changed files with 272 additions and 3485 deletions

@@ -8,6 +8,7 @@ import shutil
 import uuid
 import zipfile
 from pathlib import Path
+from marshmallow import ValidationError
 import urllib3
 import requests
 import yaml
@@ -350,7 +351,13 @@ def load_yaml_dict_from_file(
     # validate YAML
     yaml_config_schema = class_schema(schema_class)
-    yaml_config_schema().load(yaml_dict)
+    try:
+        yaml_config_schema().load(yaml_dict)
+    except ValidationError as e:
+        logger.error(f"Invalid YAML config file {yaml_file_path}")
+        logger.error(e.normalized_messages())
+        sys.exit()
     return yaml_dict
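
The hunk above stops the ETL when the YAML config fails schema validation. Below is a minimal, standard-library-only sketch of that stop-on-invalid-config pattern. It is not the project's code: `ConfigError`, `validate_config`, and `load_config_or_exit` are hypothetical names, and `ConfigError` only stands in for marshmallow's `ValidationError`. The sketch passes an explicit non-zero status to `sys.exit`, since a bare `sys.exit()` exits with status 0; whether the commit intends a zero exit status is not stated in the source.

```python
import logging
import sys

logger = logging.getLogger(__name__)


class ConfigError(Exception):
    """Hypothetical stand-in for marshmallow's ValidationError."""

    def __init__(self, messages: dict):
        super().__init__(messages)
        self.messages = messages


def validate_config(config: dict, required_keys: tuple) -> dict:
    """Hypothetical validator: collect an error for every missing required key."""
    missing = {
        key: ["Missing data for required field."]
        for key in required_keys
        if key not in config
    }
    if missing:
        raise ConfigError(missing)
    return config


def load_config_or_exit(config: dict, source: str, required_keys: tuple) -> dict:
    """Return the config if it is valid; otherwise log the errors and stop."""
    try:
        return validate_config(config, required_keys)
    except ConfigError as e:
        logger.error("Invalid YAML config file %s", source)
        logger.error(e.messages)
        # A bare sys.exit() exits with status 0; passing 1 lets a CI job
        # or calling script detect that validation failed.
        sys.exit(1)
```

With a valid dictionary the function returns it unchanged. With a missing key it logs both messages and raises `SystemExit` with code 1, which a GitHub Actions step would report as a failure.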