j40-cejst-2/data/data-pipeline/data_pipeline/etl/sources/national_risk_index
Matt Bowen 876655d2b2
Add tests for all non-census sources (#1899)
* Refactor CDC life-expectancy (#1554)

* Update to new tract list (#1554)

* Adjust for tests (#1848)

* Add tests for cdc_places (#1848)

* Add EJScreen tests (#1848)

* Add tests for HUD housing (#1848)

* Add tests for GeoCorr (#1848)

* Add persistent poverty tests (#1848)

* Update for sources without zips, for new validation (#1848)

* Update tests for new multi-CSV bug (#1848)

Lucas updated the CDC life expectancy data to handle a bug where two
states are missing from the US Overall download. Since virtually none of
our other ETL classes download multiple CSVs directly like this, it
required a pretty invasive new mocking strategy.

* Add basic tests for nature deprived (#1848)

* Add wildfire tests (#1848)

* Add flood risk tests (#1848)

* Add DOT travel tests (#1848)

* Add historic redlining tests (#1848)

* Add tests for ME and WI (#1848)

* Update now that validation exists (#1848)

* Adjust for validation (#1848)

* Add health insurance back to cdc places (#1848)

Ooops

* Update tests with new field (#1848)

* Test for blank tract removal (#1848)

* Add tracts for clipping behavior

* Test clipping and zfill behavior (#1848)

* Fix bad test assumption (#1848)

* Simplify class, add test for tract padding (#1848)

* Fix percentage inversion, update tests (#1848)

Looking through the transformations, I noticed that we were subtracting
a percentage that is usually between 0-100 from 1 instead of 100, and so
were ending up with some surprising results. Confirmed with lucasmbrown-usds.

* Add note about first street data (#1848)
2022-09-19 15:17:00 -04:00
__init__.py Adds National Risk Index data to ETL pipeline (#549) 2021-09-07 20:51:34 -04:00
etl.py Add tests for all non-census sources (#1899) 2022-09-19 15:17:00 -04:00
README.md Adds National Risk Index data to ETL pipeline (#549) 2021-09-07 20:51:34 -04:00

FEMA National Risk Index

Description

The National Risk Index is a new online mapping application from FEMA that identifies the communities most at risk from 18 natural hazards. The application visualizes natural hazard risk metrics and includes data about expected annual losses from natural hazards, social vulnerability, and community resilience.

The National Risk Index's interactive web maps cover the county and Census tract levels and are made available via geographic information system (GIS) services for custom analyses. For this project, we use the NRI data collected at the Census tract level.

Data Transformation Summary

The following transformations were applied to the NRI data during the ETL process:

  • The TRACTFIPS column was renamed to GEOID10_TRACT to match the name of the columns that hold the Census tract FIPS code in other datasets.
  • The NRI score values for each Census tract were applied to each of the Census block groups inside that tract, so that the unit of analysis matches that of other datasets such as the American Community Survey.
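
The two transformations above can be sketched in plain Python. This is a minimal illustration, not the pipeline's actual etl.py code: the sample rows, the RISK_SCORE field, the GEOID10 block group column name, and both helper functions are assumptions made for the example. It relies on the fact that a 12-digit Census block group GEOID begins with the 11-digit GEOID of its tract.

```python
# Hypothetical sample rows; the real NRI file has many more columns.
nri_rows = [
    {"TRACTFIPS": "01001020100", "RISK_SCORE": 12.5},
    {"TRACTFIPS": "01001020200", "RISK_SCORE": 30.1},
]

# Hypothetical block group IDs: 11-digit tract GEOID + 1 block group digit.
block_group_ids = ["010010201001", "010010201002", "010010202001"]


def rename_tract_column(rows):
    """Return copies of the rows with TRACTFIPS renamed to GEOID10_TRACT."""
    renamed = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != "TRACTFIPS"}
        new_row["GEOID10_TRACT"] = row["TRACTFIPS"]
        renamed.append(new_row)
    return renamed


def apply_tract_scores_to_block_groups(tract_rows, block_groups):
    """Copy each tract's values onto every block group inside that tract."""
    by_tract = {row["GEOID10_TRACT"]: row for row in tract_rows}
    result = []
    for block_group in block_groups:
        tract_id = block_group[:11]  # the first 11 digits identify the tract
        tract_row = by_tract.get(tract_id)
        if tract_row is None:
            continue  # no NRI data for this block group's tract
        result.append({"GEOID10": block_group, **tract_row})
    return result


tracts = rename_tract_column(nri_rows)
block_group_rows = apply_tract_scores_to_block_groups(tracts, block_group_ids)
```

Every block group in a tract receives the same score as its parent tract, so this step changes the unit of analysis without adding any finer-grained information.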