Commit graph

678 commits

Author SHA1 Message Date
Emma Nechamkin
0115239e50 updated with some matt comments 2022-09-07 15:22:13 -04:00
Emma Nechamkin
6d9e11d081 welp, i've hit the census api too many times today 2022-09-06 16:29:21 -04:00
Emma Nechamkin
b7af13b2a6 Merge branch 'emma-nechamkin/release/score-narwhal' of github.com:usds/justice40-tool into emma-nechamkin/release/score-narwhal 2022-08-31 14:29:45 -04:00
Emma Nechamkin
5201f9e457
Adding tests to ensure proper calculations (#1871)
* just testing that the boolean is preserved on gha
* checking drop tracts works
* adding a check to the agvalue calculation for nri
* updated with error messages
2022-08-31 14:26:55 -04:00
Emma Nechamkin
b0b7ff0eec
just testing that the boolean is preserved on gha (#1867)
* updated with hopefully a fix; coercing aml, fuds, hrs to booleans for the raw value to preserve null character.
2022-08-31 12:55:03 -04:00
Emma Nechamkin
7c6a9078e3 Merge branch 'emma-nechamkin/1849-calculation-tests' of github.com:usds/justice40-tool into emma-nechamkin/release/score-narwhal 2022-08-31 10:25:55 -04:00
Emma Nechamkin
6e575c6110 Merge branch 'emma-nechamkin/release/score-narwhal' of github.com:usds/justice40-tool into emma-nechamkin/release/score-narwhal 2022-08-30 14:16:00 -04:00
Emma Nechamkin
1c4d3e4142
Score tests (#1847)
* update Python version on README; tuple typing fix

* Alaska tribal points fix (#1821)

* Bump mistune from 0.8.4 to 2.0.3 in /data/data-pipeline (#1777)

Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* poetry update

* initial pass of score tests

* add threshold tests

* added ses threshold (not donut, not island)

* testing suite -- stopping for the day

* added test for lead proxy indicator

* Refactor score tests to make them less verbose and more direct (#1865)

* Cleanup tests slightly before refactor (#1846)

* Refactor score calculations tests

* Feedback from review

* Refactor output tests like calculation tests (#1846) (#1870)

* Reorganize files (#1846)

* Switch from lru_cache to fixture scopes (#1846)

* Add tests for all factors (#1846)

* Mark smoketests and run as part of be deploy (#1846)

* Update renamed var (#1846)

* Switch from named tuple to dataclass (#1846)

This is annoying, but pylint in python3.8 was crashing parsing the named
tuple. We weren't using any namedtuple-specific features, so I made the
type a dataclass just to get pylint to behave.

* Add default timeout to requests (#1846)

* Fix type (#1846)

* Fix merge mistake on poetry.lock (#1846)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matt Bowen <83967628+mattbowen-usds@users.noreply.github.com>
Co-authored-by: matt bowen <matthew.r.bowen@omb.eop.gov>
2022-08-26 15:23:20 -04:00
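The named tuple to dataclass switch described in the commit above can be illustrated with a minimal sketch. The class name `ThresholdCase`, its fields, and the `passes` helper are hypothetical and are not taken from the repository; the sketch only shows that a frozen dataclass can stand in for a named tuple when no tuple-specific features (indexing, unpacking) are in use.

```python
from dataclasses import dataclass


# Hypothetical test-case record; the name and fields are illustrative only.
# frozen=True keeps instances immutable, as a named tuple would be.
@dataclass(frozen=True)
class ThresholdCase:
    column: str
    threshold: float
    expected: bool


def passes(value: float, case: ThresholdCase) -> bool:
    """Return True when the value meets or exceeds the case's threshold."""
    return value >= case.threshold


case = ThresholdCase(column="example_percentile", threshold=0.9, expected=True)
print(passes(0.95, case) == case.expected)
```

Attribute access (`case.threshold`) reads the same as it would on a named tuple, so call sites that never index or unpack the record should not need to change.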
Jorge Escobar
e539db86ab tuple type 2022-08-26 13:11:51 -04:00
Emma Nechamkin
15b4f5b617 updated error message 2022-08-26 10:12:45 -04:00
Emma Nechamkin
c5244470ed updated with error messages 2022-08-25 18:38:22 -04:00
Emma Nechamkin
16209cc029 Merge branch 'emma-nechamkin/1849-calculation-tests' of github.com:usds/justice40-tool into emma-nechamkin/1849-calculation-tests 2022-08-25 18:32:46 -04:00
Emma Nechamkin
cb13388889 Merge branch 'emma-nechamkin/1849-calculation-tests' of github.com:usds/justice40-tool into emma-nechamkin/1849-calculation-tests 2022-08-25 18:32:21 -04:00
Emma Nechamkin
41c3766a13 Merge branch 'emma-nechamkin/1849-calculation-tests' of github.com:usds/justice40-tool into emma-nechamkin/1849-calculation-tests 2022-08-25 17:15:50 -04:00
Emma Nechamkin
b63c465885 adding a check to the agvalue calculation for nri 2022-08-25 17:15:33 -04:00
Emma Nechamkin
d16d0109a4
OOPS!
Old changes persisted
2022-08-25 16:48:42 -04:00
Emma Nechamkin
9a2193d1a4 checking drop tracts works 2022-08-25 16:37:23 -04:00
Emma Nechamkin
4a25a28b0e just testing that the boolean is preserved on gha 2022-08-25 10:54:13 -04:00
Jorge Escobar
d3efcbdeb3 fix markdown 2022-08-25 10:51:19 -04:00
Emma Nechamkin
637b8c305c
updated to show T/F/null vs T/F for AML and FUDS (#1866) 2022-08-24 20:22:59 -04:00
Emma Nechamkin
6418335219
Updates backend constants to N (#1854) 2022-08-23 16:19:00 -04:00
Lucas Merrill Brown
4bf7773797
Issue 1827: Add demographics to tiles and download files (#1833)
* Adding demographics for use in sidebar and download files
2022-08-22 10:05:23 -04:00
Emma Nechamkin
e6385c172f
Update etl_score_geo.py 2022-08-19 15:07:02 -04:00
Emma Nechamkin
ad1ce2bf7f
Tiles fix (#1845)
Fixes score-geo and adds flags
2022-08-19 14:43:46 -04:00
Emma Nechamkin
d892bce6cf
Fast flag update (#1844)
Added additional flags for the front end based on our conversation in stand up this morning.
2022-08-19 13:14:44 -04:00
Emma Nechamkin
1ee26bf30d
Quick fix to kitchen or plumbing indicator
Yikes! I think I messed something up and dropped the pctile field suffix from when the KP score gets calculated. Fixing right quick.
2022-08-18 17:47:34 -04:00
Emma Nechamkin
3ba1c620f5
Update to use new FSF files (#1838)
backend is partially done!
2022-08-18 15:54:44 -04:00
Emma Nechamkin
cb4866b93f
Adding eamlis and fuds data to legacy pollution in score (#1832)
Update to add EAMLIS and FUDS data to score
2022-08-18 13:32:29 -04:00
Matt Bowen
6e41e0d9f0
Add donut hole calculation to score (#1828)
Adds adjacency index to the pipeline. Requires thorough QA
2022-08-18 12:04:46 -04:00
Emma Nechamkin
88dc2e5a8e updating to avoid conflicts 2022-08-17 14:28:02 -04:00
Emma Nechamkin
7d89d41e49
Adding NLCD data (#1826)
Adding NLCD's natural space indicator end to end to the score.
2022-08-17 14:21:28 -04:00
Emma Nechamkin
2e05b1d60c Merge branch 'emma-nechamkin/release/score-narwhal' of github.com:usds/justice40-tool into emma-nechamkin/release/score-narwhal 2022-08-17 11:34:37 -04:00
Matt Bowen
49623e4da0
Add abandoned mine lands data (#1824)
* Add notebook to generate test data (#1780)

* Add Abandoned Mine Land data (#1780)

Using a similar structure but simpler approach compared to FUDS, add an
indicator for whether a tract has an abandoned mine.

* Adding some detail to dataset readmes

Just a thought!

* Apply feedback from review (#1780)

* Fixup bad string that broke test (#1780)

* Update a string that I should have renamed (#1780)

* Reduce number of threads to reduce memory pressure (#1780)

* Try not running geo data (#1780)

* Run the high-memory sets separately (#1780)

* Actually deduplicate (#1780)

* Add flag for memory intensive ETLs (#1780)

* Document new flag for datasets (#1780)

* Add flag for new datasets from rebase (#1780)

Co-authored-by: Emma Nechamkin <97977170+emma-nechamkin@users.noreply.github.com>
2022-08-17 11:33:59 -04:00
Emma Nechamkin
981a36cfa3 first run -- adding NLCD data to the ETL, but not yet to the score 2022-08-17 11:11:11 -04:00
Emma Nechamkin
5e378aea81
Adding first street foundation data (#1823)
Adding FSF flood and wildfire risk datasets to the score.
2022-08-17 10:14:23 -04:00
Emma Nechamkin
ebac552d75
Adding DOT composite to travel score (#1820)
This adds the DOT dataset to the ETL and to the score. Note that currently we take a percentile of an average of percentiles.
2022-08-16 14:44:39 -04:00
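The "percentile of an average of percentiles" noted in the commit above can be sketched as follows. This is a rough illustration in plain Python, not the pipeline's code: the function names are hypothetical, and the percentile definition used here (share of values less than or equal to each value) is an assumption, since the log does not say which ranking method the score uses.

```python
from statistics import mean
from typing import Dict, List


def percentile_rank(values: List[float]) -> List[float]:
    """Share of values that are less than or equal to each value."""
    n = len(values)
    return [sum(other <= v for other in values) / n for v in values]


def composite_percentile(indicators: Dict[str, List[float]]) -> List[float]:
    """Rank each indicator, average the ranks per row, then rank the averages."""
    per_indicator = [percentile_rank(column) for column in indicators.values()]
    averaged = [mean(row) for row in zip(*per_indicator)]
    return percentile_rank(averaged)


# Hypothetical inputs: two indicators measured on the same four tracts.
example = {"a": [1, 2, 3, 4], "b": [40, 10, 30, 20]}
print(composite_percentile(example))
```

The final ranking step puts the composite back on the same 0 to 1 scale as the individual indicators, at the cost of discarding the spacing between the averaged values.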
Vim USDS
932179841f Merge branch 'emma-nechamkin/release/score-narwhal' of https://github.com/usds/justice40-tool into emma-nechamkin/release/score-narwhal 2022-08-16 10:36:04 -07:00
Vim USDS
d6c04b1308 Disable markdown check for link 2022-08-16 10:35:57 -07:00
Matt Bowen
d5fbb802e8
Add FUDS ETL (#1817)
* Add spatial join method (#1871)

Since we'll need to figure out the tracts for a large number of points
in future tickets, add a utility to handle grabbing the tract geometries
and adding tract data to a point dataset.

* Add FUDS, also jupyter lab (#1871)

* Add YAML configs for FUDS (#1871)

* Allow input geoid to be optional (#1871)

* Add FUDS ETL, tests, test-data notebook (#1871)

This adds the ETL class for Formerly Used Defense Sites (FUDS). This is
different from most other ETLs since these FUDS are not provided by
tract, but instead by geographic point, so we need to assign FUDS to
tracts and then do calculations from there.

* Floats -> Ints, as I intended (#1871)

* Floats -> Ints, as I intended (#1871)

* Formatting fixes (#1871)

* Add test false positive GEOIDs (#1871)

* Add gdal binaries (#1871)

* Refactor pandas code to be more idiomatic (#1871)

Per Emma, the more pandas-y way of doing my counts is using np.where to
add the values i need, then groupby and size. It is definitely more
compact, and also I think more correct!

* Update configs per Emma suggestions (#1871)

* Type fixed! (#1871)

* Remove spurious import from vscode (#1871)

* Snapshot update after changing col name (#1871)

* Move up GDAL (#1871)

* Adjust geojson strategy (#1871)

* Try running census separately first (#1871)

* Fix import order (#1871)

* Cleanup cache strategy (#1871)

* Download census data from S3 instead of re-calculating (#1871)

* Clarify pandas code per Emma (#1871)
2022-08-16 13:28:39 -04:00
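The counting approach mentioned in the commit above (label rows with `np.where`, then `groupby` and `size`) might look roughly like this. The column names and sample rows are made up for illustration and are not the FUDS ETL's actual schema; the sketch assumes the points have already been assigned to tracts.

```python
import numpy as np
import pandas as pd

# Hypothetical point dataset already joined to tracts; names are illustrative.
sites = pd.DataFrame(
    {
        "tract_id": ["01", "01", "02", "03"],
        "eligibility": ["Eligible", "Ineligible", "Eligible", "Eligible"],
    }
)

# np.where labels every row in one vectorized pass (1 = eligible, 0 = not).
sites["is_eligible"] = np.where(sites["eligibility"] == "Eligible", 1, 0)

# groupby + size counts rows per (tract, label) pair; unstack turns the
# labels into columns and fills missing combinations with 0.
counts = (
    sites.groupby(["tract_id", "is_eligible"])
    .size()
    .unstack(fill_value=0)
)
print(counts)
```

Compared with looping over tracts, this keeps the work inside pandas and yields one row per tract with a count for each label.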
Vim USDS
13e79087d1 Adding back MapComparison video 2022-08-16 10:14:32 -07:00
Emma Nechamkin
481a2a05f7
updated to fix linting errors (#1818)
Cleans and updates base branch
2022-08-11 16:34:56 -04:00
Emma Nechamkin
dcda155c95 fixing rebase 2022-08-11 12:39:54 -04:00
Emma Nechamkin
94cdc47cce Update etl_score_geo.py
Yikes! Fixing merge messup!
2022-08-11 12:38:32 -04:00
Matt Bowen
97e17546cc Refactor DOE Energy Burden and COI to use YAML (#1796)
* added tribalId for Supplemental dataset (#1804)

* Setting zoom levels for tribal map (#1810)

* NRI dataset and initial score YAML configuration (#1534)

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* update be staging gha

* checkpoint

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* checkpoint

* PR Review

* removing source url

* tests

* stop execution of ETL if there's a YAML schema issue

* update be staging gha

* adding source url as class var again

* clean up

* force cache bust

* gha cache bust

* dynamically set score vars from YAML

* docstrings

* removing last updated year - optional reverse percentile

* passing tests

* sort order

* column ordering

* PR review

* class level vars

* Updating DatasetsConfig

* fix pylint errors

* moving metadata hint back to code

Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>

* Correct copy typo (#1809)

* Add basic test suite for COI (#1518)

* Update COI to use new yaml (#1518)

* Add tests for DOE energy burden (#1518)

* Add dataset config for energy burden (#1518)

* Refactor ETL to use datasets.yml (#1518)

* Add fake GEOIDs to COI tests (#1518)

* Refactor _setup_etl_instance_and_run_extract to base (#1518)

For the three classes we've done so far, a generic
_setup_etl_instance_and_run_extract will work fine. For the moment we
can reuse the same setup method until we decide future classes need more
flexibility, though they can also always subclass.

* Add output-path tests (#1518)

* Update YAML to match constant (#1518)

* Don't blindly set float format (#1518)

* Add defaults for extract (#1518)

* Run YAML load on all subclasses (#1518)

* Update description fields (#1518)

* Update YAML per final format (#1518)

* Update fixture tract IDs (#1518)

* Update base class refactor (#1518)

Now that NRI is final I needed to make a small number of updates to my
refactored code.

* Remove old comment (#1518)

* Fix type signature and return (#1518)

* Update per code review (#1518)

Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
Co-authored-by: Vim <86254807+vim-usds@users.noreply.github.com>
2022-08-11 12:38:28 -04:00
Emma Nechamkin
baa591a6c6 first run through 2022-08-11 12:33:46 -04:00
Emma Nechamkin
4f6a1b5286 added indoor plumbing to score housing burden 2022-08-11 12:33:46 -04:00
Emma Nechamkin
15450cf91f added indoor plumbing to score housing burden 2022-08-11 12:33:46 -04:00
Emma Nechamkin
8c7519063a added indoor plumbing to chas 2022-08-11 12:33:46 -04:00
Emma Nechamkin
0d90ae563a Changing LHE in tiles to a boolean (#1767)
also includes merging / clean up of the release
2022-08-11 12:33:46 -04:00
Emma Nechamkin
b0a728437c adds UST indicator (#1786)
adds leaky underground storage tanks
2022-08-11 12:33:46 -04:00