j40-cejst-2/data/data-pipeline/data_pipeline/etl/sources
Matt Bowen d5fbb802e8
Add FUDS ETL (#1817)
* Add spatial join method (#1871)

Since we'll need to figure out the tracts for a large number of points
in future tickets, add a utility to handle grabbing the tract geometries
and adding tract data to a point dataset.
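
As a rough illustration of what assigning points to tracts involves, here is a minimal pure-Python sketch using ray casting for the point-in-polygon test. It is a stand-in for the idea, not the utility this commit adds: a real pipeline would more likely rely on a geospatial library's spatial join. The names `point_in_polygon` and `assign_tracts`, the tract IDs, and the coordinates are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order; the last vertex connects back to the first


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count how many edges a rightward ray from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_tracts(
    points: Dict[str, Point], tracts: Dict[str, Polygon]
) -> Dict[str, Optional[str]]:
    """Map each point name to the first tract containing it, or None if no tract does."""
    return {
        name: next(
            (geoid for geoid, poly in tracts.items() if point_in_polygon(pt, poly)),
            None,
        )
        for name, pt in points.items()
    }


# Hypothetical data: two adjacent unit-square "tracts" and three sites.
tracts = {
    "TRACT_A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "TRACT_B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
points = {"site_1": (0.5, 0.5), "site_2": (1.5, 0.25), "site_3": (5.0, 5.0)}
assigned = assign_tracts(points, tracts)
```

Points that fall outside every tract come back as `None`, which mirrors the left-join behavior one would typically want when some points cannot be matched.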

* Add FUDS, also jupyter lab (#1871)

* Add YAML configs for FUDS (#1871)

* Allow input geoid to be optional (#1871)

* Add FUDS ETL, tests, test-data notebook (#1871)

This adds the ETL class for Formerly Used Defense Sites (FUDS). This is
different from most other ETLs since these FUDS are not provided by
tract, but instead by geographic point, so we need to assign FUDS to
tracts and then do calculations from there.

* Floats -> Ints, as I intended (#1871)

* Formatting fixes (#1871)

* Add test false positive GEOIDs (#1871)

* Add gdal binaries (#1871)

* Refactor pandas code to be more idiomatic (#1871)

Per Emma, the more pandas-y way of doing my counts is to use np.where to
add the values I need, then groupby and size. It is definitely more
compact, and I think also more correct!
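
A minimal sketch of the np.where, groupby, size pattern described above, assuming each row is one site already assigned to a tract. The column names (`GEOID10_TRACT`, `ELIGIBILITY`, `status`) and the values are hypothetical and are not taken from the actual ETL.

```python
import numpy as np
import pandas as pd

# Hypothetical input: one row per site, with its tract and an eligibility flag.
sites = pd.DataFrame(
    {
        "GEOID10_TRACT": ["01001020100", "01001020100", "01001020200",
                          "01001020200", "01001020200"],
        "ELIGIBILITY": ["Eligible", "Ineligible", "Eligible",
                        "Eligible", "Ineligible"],
    }
)

# np.where labels every row in one vectorized step.
sites["status"] = np.where(
    sites["ELIGIBILITY"] == "Eligible", "eligible_count", "ineligible_count"
)

# groupby + size counts rows per (tract, status) pair; unstack pivots the
# status labels into columns, filling missing combinations with 0.
counts = (
    sites.groupby(["GEOID10_TRACT", "status"])
    .size()
    .unstack(fill_value=0)
)
```

The result has one row per tract and one integer column per label, so no per-row Python loop or manual counter is needed.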

* Update configs per Emma suggestions (#1871)

* Type fixed! (#1871)

* Remove spurious import from vscode (#1871)

* Snapshot update after changing col name (#1871)

* Move up GDAL (#1871)

* Adjust geojson strategy (#1871)

* Try running census separately first (#1871)

* Fix import order (#1871)

* Cleanup cache strategy (#1871)

* Download census data from S3 instead of re-calculating (#1871)

* Clarify pandas code per Emma (#1871)
2022-08-16 13:28:39 -04:00
calenviroscreen Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
cdc_life_expectancy Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
cdc_places Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
cdc_svi_index Issue 1141: Definition M (#1151) 2022-01-18 14:56:55 -05:00
census Add FUDS ETL (#1817) 2022-08-16 13:28:39 -04:00
census_acs updated to fix linting errors (#1818) 2022-08-11 16:34:56 -04:00
census_acs_2010 Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
census_acs_median_income Cleaning up quick code (#1349) 2022-03-02 16:50:04 -05:00
census_decennial Issue 1075: Add refactored ETL tests to NRI (#1088) 2022-02-08 19:05:32 -05:00
child_opportunity_index Refactor DOE Energy Burden and COI to use YAML (#1796) 2022-08-11 12:38:28 -04:00
doe_energy_burden Refactor DOE Energy Burden and COI to use YAML (#1796) 2022-08-11 12:38:28 -04:00
ejscreen updating ejscreen data, try two (#1747) 2022-08-11 12:33:46 -04:00
ejscreen_areas_of_concern Issue 838: Update comparison tool to use tracts (#934) 2021-11-30 18:46:29 -05:00
energy_definition_alternative_draft Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
epa_rsei Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
geocorr Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
historic_redlining Adding HOLC indicator (#1579) 2022-08-11 12:33:46 -04:00
housing_and_transportation Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
hud_housing added indoor plumbing to score housing burden 2022-08-11 12:33:46 -04:00
hud_recap Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
mapping_for_ej Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
mapping_inequality Adding HOLC indicator (#1579) 2022-08-11 12:33:46 -04:00
maryland_ejscreen Add a react component generator (#1745) 2022-07-15 09:54:58 -07:00
michigan_ejscreen Add Michigan EJ Screen into data-pipeline's ETL and provide automated scoring and statistics outputs (#1091) 2021-12-31 15:38:52 -05:00
national_risk_index Refactor DOE Energy Burden and COI to use YAML (#1796) 2022-08-11 12:38:28 -04:00
persistent_poverty Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
tree_equity_score Run ETL processes in parallel (#1253) 2022-02-11 14:04:53 -05:00
tribal added tribalId for Supplemental dataset (#1804) 2022-08-08 17:42:14 -04:00
us_army_fuds Add FUDS ETL (#1817) 2022-08-16 13:28:39 -04:00
__init__.py Data directory should adopt standard Poetry-suggested python package structure (#457) 2021-08-05 15:35:54 -04:00
geo_utils.py Add FUDS ETL (#1817) 2022-08-16 13:28:39 -04:00