Commit graph

232 commits

Author SHA1 Message Date
Emma Nechamkin
5c41c95764 Revert "Fast flag update (#1844)"
This reverts commit d892bce6cf.
2022-08-19 14:05:45 -04:00
Emma Nechamkin
d892bce6cf
Fast flag update (#1844)
Added additional flags for the front end based on our conversation in stand up this morning.
2022-08-19 13:14:44 -04:00
Emma Nechamkin
1ee26bf30d
Quick fix to kitchen or plumbing indicator
Yikes! I think I messed something up and dropped the pctile field suffix from when the KP score gets calculated. Fixing right quick.
2022-08-18 17:47:34 -04:00
Emma Nechamkin
3ba1c620f5
Update to use new FSF files (#1838)
backend is partially done!
2022-08-18 15:54:44 -04:00
Emma Nechamkin
cb4866b93f
Adding eamlis and fuds data to legacy pollution in score (#1832)
Update to add EAMLIS and FUDS data to score
2022-08-18 13:32:29 -04:00
Matt Bowen
6e41e0d9f0
Add donut hole calculation to score (#1828)
Adds adjacency index to the pipeline. Requires thorough QA
2022-08-18 12:04:46 -04:00
Emma Nechamkin
88dc2e5a8e updating to avoid conflicts 2022-08-17 14:28:02 -04:00
Emma Nechamkin
7d89d41e49
Adding NLCD data (#1826)
Adding NLCD's natural space indicator end to end to the score.
2022-08-17 14:21:28 -04:00
Emma Nechamkin
2e05b1d60c Merge branch 'emma-nechamkin/release/score-narwhal' of github.com:usds/justice40-tool into emma-nechamkin/release/score-narwhal 2022-08-17 11:34:37 -04:00
Matt Bowen
49623e4da0
Add abandoned mine lands data (#1824)
* Add notebook to generate test data (#1780)

* Add Abandoned Mine Land data (#1780)

Using a similar structure but simpler approach compared to FUDS, add an
indicator for whether a tract has an abandoned mine.

* Adding some detail to dataset readmes

Just a thought!

* Apply feedback from review (#1780)

* Fixup bad string that broke test (#1780)

* Update a string that I should have renamed (#1780)

* Reduce number of threads to reduce memory pressure (#1780)

* Try not running geo data (#1780)

* Run the high-memory sets separately (#1780)

* Actually deduplicate (#1780)

* Add flag for memory intensive ETLs (#1780)

* Document new flag for datasets (#1780)

* Add flag for new datasets from rebase (#1780)

Co-authored-by: Emma Nechamkin <97977170+emma-nechamkin@users.noreply.github.com>
2022-08-17 11:33:59 -04:00
Emma Nechamkin
981a36cfa3 first run -- adding NLCD data to the ETL, but not yet to the score 2022-08-17 11:11:11 -04:00
Emma Nechamkin
5e378aea81
Adding first street foundation data (#1823)
Adding FSF flood and wildfire risk datasets to the score.
2022-08-17 10:14:23 -04:00
Emma Nechamkin
ebac552d75
Adding DOT composite to travel score (#1820)
This adds the DOT dataset to the ETL and to the score. Note that currently we take a percentile of an average of percentiles.
2022-08-16 14:44:39 -04:00
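
The note in ebac552d75 about taking "a percentile of an average of percentiles" can be sketched roughly as follows. This is a minimal illustration in pandas, not the pipeline's actual code: the DataFrame, the column names (`metric_one`, `metric_two`, `avg_pct`, `composite_pct`), and the values are all hypothetical.

```python
import pandas as pd

# Hypothetical tract-level inputs; names and values are illustrative only.
df = pd.DataFrame(
    {
        "tract": ["A", "B", "C", "D"],
        "metric_one": [10.0, 20.0, 30.0, 40.0],
        "metric_two": [1.0, 4.0, 2.0, 3.0],
    }
)

# Step 1: convert each raw metric to a percentile rank in (0, 1].
pct = df[["metric_one", "metric_two"]].rank(pct=True)

# Step 2: average the percentile ranks row by row.
df["avg_pct"] = pct.mean(axis=1)

# Step 3: rank the averages again, giving a percentile of an average of percentiles.
df["composite_pct"] = df["avg_pct"].rank(pct=True)
```

The second ranking matters because an average of percentiles is no longer uniformly distributed; re-ranking puts the composite back on a percentile scale.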
Vim USDS
932179841f Merge branch 'emma-nechamkin/release/score-narwhal' of https://github.com/usds/justice40-tool into emma-nechamkin/release/score-narwhal 2022-08-16 10:36:04 -07:00
Vim USDS
d6c04b1308 Disable markdown check for link 2022-08-16 10:35:57 -07:00
Matt Bowen
d5fbb802e8
Add FUDS ETL (#1817)
* Add spatial join method (#1871)

Since we'll need to figure out the tracts for a large number of points
in future tickets, add a utility to handle grabbing the tract geometries
and adding tract data to a point dataset.

* Add FUDS, also jupyter lab (#1871)

* Add YAML configs for FUDS (#1871)

* Allow input geoid to be optional (#1871)

* Add FUDS ETL, tests, test-data notebook (#1871)

This adds the ETL class for Formerly Used Defense Sites (FUDS). This is
different from most other ETLs since these FUDS are not provided by
tract, but instead by geographic point, so we need to assign FUDS to
tracts and then do calculations from there.

* Floats -> Ints, as I intended (#1871)

* Floats -> Ints, as I intended (#1871)

* Formatting fixes (#1871)

* Add test false positive GEOIDs (#1871)

* Add gdal binaries (#1871)

* Refactor pandas code to be more idiomatic (#1871)

Per Emma, the more pandas-y way of doing my counts is using np.where to
add the values I need, then groupby and size. It is definitely more
compact, and also I think more correct!

* Update configs per Emma suggestions (#1871)

* Type fixed! (#1871)

* Remove spurious import from vscode (#1871)

* Snapshot update after changing col name (#1871)

* Move up GDAL (#1871)

* Adjust geojson strategy (#1871)

* Try running census separately first (#1871)

* Fix import order (#1871)

* Cleanup cache strategy (#1871)

* Download census data from S3 instead of re-calculating (#1871)

* Clarify pandas code per Emma (#1871)
2022-08-16 13:28:39 -04:00
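
The "np.where, then groupby and size" pattern mentioned in d5fbb802e8 might look something like the sketch below. It assumes points have already been assigned to tracts; the column names (`GEOID`, `status`, `kind`) and the status values are hypothetical and are not taken from the FUDS ETL itself.

```python
import numpy as np
import pandas as pd

# Hypothetical point records already joined to tracts; names are illustrative.
points = pd.DataFrame(
    {
        "GEOID": ["01", "01", "02", "03", "03", "03"],
        "status": [
            "Eligible", "Ineligible", "Eligible",
            "Eligible", "Eligible", "Unknown",
        ],
    }
)

# np.where labels every row in one vectorised pass.
points["kind"] = np.where(
    points["status"] == "Eligible", "eligible", "not_eligible"
)

# groupby + size counts rows per (tract, kind); unstack pivots kinds to columns,
# filling combinations that never occur with 0.
counts = points.groupby(["GEOID", "kind"]).size().unstack(fill_value=0)
```

Here `counts` has one row per tract and one column per label, which avoids looping over tracts by hand.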
Emma Nechamkin
481a2a05f7
updated to fix linting errors (#1818)
Cleans and updates base branch
2022-08-11 16:34:56 -04:00
Emma Nechamkin
94cdc47cce Update etl_score_geo.py
Yikes! Fixing merge messup!
2022-08-11 12:38:32 -04:00
Matt Bowen
97e17546cc Refactor DOE Energy Burden and COI to use YAML (#1796)
* added tribalId for Supplemental dataset (#1804)

* Setting zoom levels for tribal map (#1810)

* NRI dataset and initial score YAML configuration (#1534)

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* update be staging gha

* checkpoint

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* checkpoint

* PR Review

* removing source url

* tests

* stop execution of ETL if there's a YAML schema issue

* update be staging gha

* adding source url as class var again

* clean up

* force cache bust

* gha cache bust

* dynamically set score vars from YAML

* docstrings

* removing last updated year - optional reverse percentile

* passing tests

* sort order

* column ordering

* PR review

* class level vars

* Updating DatasetsConfig

* fix pylint errors

* moving metadata hint back to code

Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>

* Correct copy typo (#1809)

* Add basic test suite for COI (#1518)

* Update COI to use new yaml (#1518)

* Add tests for DOE energy burden (#1518)

* Add dataset config for energy burden (#1518)

* Refactor ETL to use datasets.yml (#1518)

* Add fake GEOIDs to COI tests (#1518)

* Refactor _setup_etl_instance_and_run_extract to base (#1518)

For the three classes we've done so far, a generic
_setup_etl_instance_and_run_extract will work fine, for the moment we
can reuse the same setup method until we decide future classes need more
flexibility --- but they can also always subclass so...

* Add output-path tests (#1518)

* Update YAML to match constant (#1518)

* Don't blindly set float format (#1518)

* Add defaults for extract (#1518)

* Run YAML load on all subclasses (#1518)

* Update description fields (#1518)

* Update YAML per final format (#1518)

* Update fixture tract IDs (#1518)

* Update base class refactor (#1518)

Now that NRI is final I needed to make a small number of updates to my
refactored code.

* Remove old comment (#1518)

* Fix type signature and return (#1518)

* Update per code review (#1518)

Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
Co-authored-by: Vim <86254807+vim-usds@users.noreply.github.com>
2022-08-11 12:38:28 -04:00
Emma Nechamkin
baa591a6c6 first run through 2022-08-11 12:33:46 -04:00
Emma Nechamkin
4f6a1b5286 added indoor plumbing to score housing burden 2022-08-11 12:33:46 -04:00
Emma Nechamkin
15450cf91f added indoor plumbing to score housing burden 2022-08-11 12:33:46 -04:00
Emma Nechamkin
8c7519063a added indoor plumbing to chas 2022-08-11 12:33:46 -04:00
Emma Nechamkin
0d90ae563a Changing LHE in tiles to a boolean (#1767)
also includes merging / clean up of the release
2022-08-11 12:33:46 -04:00
Emma Nechamkin
b0a728437c adds UST indicator (#1786)
adds leaky underground storage tanks
2022-08-11 12:33:46 -04:00
Emma Nechamkin
f6efdd4e14 Rescaling linguistic isolation (#1750)
Rescales linguistic isolation to drop puerto rico
2022-08-11 12:33:46 -04:00
Emma Nechamkin
2ab24c60fa updating ejscreen data, try two (#1747) 2022-08-11 12:33:46 -04:00
Emma Nechamkin
7559cf46f6 Emma nechamkin/holc patch (#1742)
Removing HOLC calculation from score narwhal.
2022-08-11 12:33:46 -04:00
Shelby Switzer
3071815158 Do not drop Guam and USVI from ETL (#1681)
* Remove code that drops Guam and USVI from ETL

* Add back code for dropping rows by FIPS code

We may want this functionality, so let's keep it and just make the constant currently be an empty array.

Co-authored-by: Shelby Switzer <shelbyswitzer@gmail.com>
2022-08-11 12:33:46 -04:00
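
The approach described in 3071815158, keeping the drop-by-FIPS logic but leaving the constant empty, can be illustrated with the sketch below. The constant name `DROP_FIPS_CODES`, the helper `drop_rows_by_fips`, and the `GEOID` column are hypothetical stand-ins, not the repository's real identifiers.

```python
import pandas as pd

# Hypothetical constant: state FIPS prefixes to drop. Empty means "drop nothing".
DROP_FIPS_CODES = []


def drop_rows_by_fips(df, codes, geoid_col="GEOID"):
    """Return a copy of df without rows whose GEOID starts with a listed FIPS code."""
    # isin([]) is False for every row, so an empty list keeps all rows.
    mask = df[geoid_col].str[:2].isin(codes)
    return df.loc[~mask].copy()


# Illustrative tract IDs: 66 is Guam, 78 is the US Virgin Islands, 72 is Puerto Rico.
tracts = pd.DataFrame({"GEOID": ["66010950100", "78010970100", "72001956300"]})

kept_all = drop_rows_by_fips(tracts, DROP_FIPS_CODES)
kept_some = drop_rows_by_fips(tracts, ["66", "78"])
```

With the empty constant every row survives, and restoring the old behavior only requires repopulating the list.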
Emma Nechamkin
b41a2870f3 updating 2022-08-11 12:33:46 -04:00
Shelby Switzer
05748c9fa2 Update backend for Puerto Rico (#1686)
* Update PR threshold count to 10

We now show 10 indicators for PR. See the discussion on the github issue for more info: https://github.com/usds/justice40-tool/issues/1621

* Do not use linguistic iso for Puerto Rico

Closes 1350.

Co-authored-by: Shelby Switzer <shelbyswitzer@gmail.com>
2022-08-11 12:33:46 -04:00
Emma Nechamkin
1782d022a9 Adding HOLC indicator (#1579)
Added HOLC indicator (Historic Redlining Score) from NCRC work; included 3.25 cutoff and low income as part of the housing burden category.
2022-08-11 12:33:46 -04:00
Emma Nechamkin
f047ca9d83 Imputing income using geographic neighbors (#1559)
Imputes income field with a light refactor. Needs more refactoring and more tests (I spot-checked). Next ticket will check and address, but a lot of "narwhal" architecture is here.
2022-08-11 12:33:45 -04:00
Jorge Escobar
1c448a77f9
NRI dataset and initial score YAML configuration (#1534)
* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* update be staging gha

* checkpoint

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* checkpoint

* PR Review

* removing source url

* tests

* stop execution of ETL if there's a YAML schema issue

* update be staging gha

* adding source url as class var again

* clean up

* force cache bust

* gha cache bust

* dynamically set score vars from YAML

* docstrings

* removing last updated year - optional reverse percentile

* passing tests

* sort order

* column ordering

* PR review

* class level vars

* Updating DatasetsConfig

* fix pylint errors

* moving metadata hint back to code

Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2022-08-09 16:37:10 -04:00
Jorge Escobar
1833e3e794
Setting zoom levels for tribal map (#1810) 2022-08-09 13:56:03 -04:00
Jorge Escobar
781e08f559
added tribalId for Supplemental dataset (#1804) 2022-08-08 17:42:14 -04:00
Jorge Escobar
8149ac31c5
Starting Tribal Boundaries Work (#1736)
* starting tribal pr

* further pipeline work

* bia merge working

* alaska villages and tribal geo generate

* tribal folders

* adding data full run

* tile generation

* tribal tile deploy
2022-07-30 01:13:10 -04:00
Vim
e1a61faf5d
Add a react component generator (#1745)
* Add a react component generator

* Update markdown links

* Change commented code to block comment
2022-07-15 09:54:58 -07:00
Vim
eb3004c0d5
Fix on large AK tracts that are off screen (#1740)
* Change low to high transition and global zoom

- change the low to high transition from 7 to 5. This cannot go any lower, as high tiles on AWS only go to zoom level 5
- reduce the zoom level globally on all census tracts

* Remove geolocation from feature flag

- geolocation is now available to all

* Add python notebook that sorts all tracts by area

- add a column of the required zoom level for the tract to be fully contained in the viewport

* Place geolocation back to behind a feature flag

* Differentiate zoom levels b/w shortcuts and tracts
2022-07-13 19:01:43 -07:00
dependabot[bot]
2992f8df0b
Bump notebook from 6.4.10 to 6.4.12 in /data/data-pipeline (#1685)
Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12.

---
updated-dependencies:
- dependency-name: notebook
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-07 17:10:03 -04:00
dependabot[bot]
0555d896fd
Bump lxml from 4.8.0 to 4.9.1 in /data/data-pipeline (#1719)
Bumps [lxml](https://github.com/lxml/lxml) from 4.8.0 to 4.9.1.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-4.8.0...lxml-4.9.1)

---
updated-dependencies:
- dependency-name: lxml
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-07 17:09:49 -04:00
Kameron Kerger
7c808eb2e0
Add files via upload (#1656)
updated TSD (new - naming convention) and new TSD-es
2022-05-31 13:19:01 -04:00
Jorge Escobar
ce89214a60
Adding Technical Training Slides (#1638)
* Adding Technical Training Slides

* small update on CI/CD map staging URL
2022-05-12 15:01:26 -04:00
Jorge Escobar
2af6fca98d
Column headers update (#1618)
* Column headers update

* passing tests

* updated date stamp

* js tests
2022-05-06 14:10:15 -04:00
Kameron Kerger
303c200fbe
Add files via upload (#1612)
updated pdf
2022-05-04 10:47:54 -04:00
Jorge Escobar
eb1cb8884e
Adding a note about Scipy installation on newer MacOS 2022-05-03 17:26:05 -04:00
Emma Nechamkin
ae725f0a3e
arcgis column name fix (#1581)
eliminates duplicate column and ensures all column names are unique.
2022-04-22 14:09:12 -04:00
Jorge Escobar
fbd56e3bd5
Put the pdf back in the package and add TSD to pipeline (#1580)
* Put the pdf back in the package and add TSD to pipeline

* updated pdf with logo

* wrong path
2022-04-21 13:42:04 -04:00
Kameron Kerger
72e6dbc1dd
/1354-update-pdf (#1568)
updated pdf for the put the pdf back in the package issue
2022-04-19 11:07:31 -04:00
Emma Nechamkin
2ce4cfe80e
updated with codebook (#1573) 2022-04-18 18:12:18 -04:00