* Initial draft for data provenance
We want to make the data usable/available at every step of our data
pipeline. This starts with an addition to the README that spells out the
data provenance and where each version of the data lives as it goes
through our pipeline.
* Update README with placeholders for next steps in data provenance
* Add coming soon placeholders for remaining data locations
Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
* Fixes #456 - Our data directory should adopt standard Python package structure
* a few missed references
* updating readme
* updating requirements
* Running Black
* Fixes for flake8
* updating pylint
* Minor documentation updates, plus calenvironscreen S3 URL fix
* Update score comparison docs and code
* Add steps for running the comparison tool
* Update HUD recap ETL to ensure GEOID is imported as a string (if it is
imported as an integer by default, the leading "0" is stripped from
many IDs)
* Add note about execution time
* Move step from paragraph to list
* Update output dir in README for comp tool
Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
* initial checkin
* gitignore and docker-compose update
* readme update and error on hud
* encoding issue
* one more small README change
* data roadmap restructure
* pyproject sort
* small update to score output folders
* checkpoint
* couple of last fixes
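
The GEOID note in the HUD recap ETL item above can be illustrated with a minimal sketch. It assumes the ETL reads a CSV with pandas, which this log does not confirm. The sample rows, the `value` column, and the helper name `read_geoid_as_string` are hypothetical and are not taken from the actual ETL code.

```python
import io

import pandas as pd

# Hypothetical sample data; the real HUD recap file and its columns are not shown in this log.
SAMPLE_CSV = "GEOID,value\n01001020100,1.5\n06037101110,2.0\n"


def read_geoid_as_string(csv_text: str) -> pd.DataFrame:
    """Read CSV text, forcing the GEOID column to be parsed as a string."""
    return pd.read_csv(io.StringIO(csv_text), dtype={"GEOID": str})


# Default parsing infers an integer column, which drops the leading "0".
default_df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Passing an explicit dtype keeps every ID exactly as written in the file.
string_df = read_geoid_as_string(SAMPLE_CSV)

print(default_df["GEOID"].tolist())
print(string_df["GEOID"].tolist())
```

Fixing the dtype at read time is likely simpler than re-padding the IDs afterwards, since padding requires knowing the intended width of each ID.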