Add tests for all non-census sources (#1899)

* Refactor CDC life-expectancy (#1554)

* Update to new tract list (#1554)

* Adjust for tests (#1848)

* Add tests for cdc_places (#1848)

* Add EJScreen tests (#1848)

* Add tests for HUD housing (#1848)

* Add tests for GeoCorr (#1848)

* Add persistent poverty tests (#1848)

* Update for sources without zips, for new validation (#1848)

* Update tests for new multi-CSV bug (#1848)

Lucas updated the CDC life expectancy data to handle a bug where two
states are missing from the US Overall download. Since virtually none of
our other ETL classes download multiple CSVs directly like this, it
required a pretty invasive new mocking strategy.

* Add basic tests for nature deprived (#1848)

* Add wildfire tests (#1848)

* Add flood risk tests (#1848)

* Add DOT travel tests (#1848)

* Add historic redlining tests (#1848)

* Add tests for ME and WI (#1848)

* Update now that validation exists (#1848)

* Adjust for validation (#1848)

* Add health insurance back to cdc places (#1848)

Ooops

* Update tests with new field (#1848)

* Test for blank tract removal (#1848)

* Add tracts for clipping behavior

* Test clipping and zfill behavior (#1848)

* Fix bad test assumption (#1848)

* Simplify class, add test for tract padding (#1848)

* Fix percentage inversion, update tests (#1848)

Looking through the transformations, I noticed that we were subtracting
a percentage that is usually between 0-100 from 1 instead of 100, and so
were ending up with some surprising results. Confirmed with lucasmbrown-usds
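
To illustrate the inversion described above, here is a minimal sketch. The function names and the sample value are hypothetical and do not come from the codebase; it only assumes the input is a percentage on a 0-100 scale.

```python
def invert_percentage_buggy(pct):
    """Old behavior as described: treats a 0-100 percentage like a 0-1 fraction."""
    return 1 - pct


def invert_percentage_fixed(pct):
    """Fixed behavior: takes the complement on the same 0-100 scale as the input."""
    return 100 - pct


if __name__ == "__main__":
    sample = 85  # hypothetical percentage on a 0-100 scale
    print(invert_percentage_buggy(sample))  # negative, which is the surprising result
    print(invert_percentage_fixed(sample))
```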

* Add note about first street data (#1848)
Matt Bowen 2022-09-19 15:17:00 -04:00 committed by GitHub
commit 876655d2b2
88 changed files with 2032 additions and 178 deletions


@ -0,0 +1,17 @@
GEOID,count_properties,mid_depth_100_year00,mid_depth_100_year30
6027000800,942,214,215
6069000802,1131,283,292
6061021322,1483,100,108
15001021010,1888,179,186
15001021101,3463,130,137
15007040603,1557,152,181
15007040700,1533,177,191
15009030100,1658,232,242
15009030201,6144,431,447
15001021402,4118,321,329
15001021800,2813,350,356
15009030402,3374,852,888
15009030800,4847,1003,1019
15003010201,2335,220,227
15007040604,5364,630,641
2290000400,1,1,1

@ -0,0 +1,17 @@
GEOID10_TRACT,Count of properties eligible for flood risk calculation within tract (floor of 250),Count of properties at risk of flood today,Count of properties at risk of flood in 30 years,Share of properties at risk of flood today,Share of properties at risk of flood in 30 years
06027000800,942,214,215,0.2271762208,0.2282377919
06069000802,1131,283,292,0.2502210433,0.2581786030
06061021322,1483,100,108,0.0674308833,0.0728253540
15001021010,1888,179,186,0.0948093220,0.0985169492
15001021101,3463,130,137,0.0375397055,0.0395610742
15007040603,1557,152,181,0.0976236352,0.1162491972
15007040700,1533,177,191,0.1154598826,0.1245923027
15009030100,1658,232,242,0.1399276236,0.1459589867
15009030201,6144,431,447,0.0701497396,0.0727539062
15001021402,4118,321,329,0.0779504614,0.0798931520
15001021800,2813,350,356,0.1244223249,0.1265552791
15009030402,3374,852,888,0.2525192650,0.2631890931
15009030800,4847,1003,1019,0.2069321230,0.2102331339
15003010201,2335,220,227,0.0942184154,0.0972162741
15007040604,5364,630,641,0.1174496644,0.1195003729
02290000400,250,1,1,0.0040000000,0.0040000000

@ -0,0 +1,17 @@
GEOID,count_properties,Count of properties at risk of flood today,Count of properties at risk of flood in 30 years,GEOID10_TRACT,Count of properties eligible for flood risk calculation within tract (floor of 250),Share of properties at risk of flood today,Share of properties at risk of flood in 30 years
06027000800,942,214,215,06027000800,942,0.2271762208,0.2282377919
06069000802,1131,283,292,06069000802,1131,0.2502210433,0.2581786030
06061021322,1483,100,108,06061021322,1483,0.0674308833,0.0728253540
15001021010,1888,179,186,15001021010,1888,0.0948093220,0.0985169492
15001021101,3463,130,137,15001021101,3463,0.0375397055,0.0395610742
15007040603,1557,152,181,15007040603,1557,0.0976236352,0.1162491972
15007040700,1533,177,191,15007040700,1533,0.1154598826,0.1245923027
15009030100,1658,232,242,15009030100,1658,0.1399276236,0.1459589867
15009030201,6144,431,447,15009030201,6144,0.0701497396,0.0727539062
15001021402,4118,321,329,15001021402,4118,0.0779504614,0.0798931520
15001021800,2813,350,356,15001021800,2813,0.1244223249,0.1265552791
15009030402,3374,852,888,15009030402,3374,0.2525192650,0.2631890931
15009030800,4847,1003,1019,15009030800,4847,0.2069321230,0.2102331339
15003010201,2335,220,227,15003010201,2335,0.0942184154,0.0972162741
15007040604,5364,630,641,15007040604,5364,0.1174496644,0.1195003729
2290000400,1,1,1,02290000400,250,0.0040000000,0.0040000000

@ -0,0 +1,22 @@
import pathlib

from data_pipeline.tests.sources.example.test_etl import TestETL
from data_pipeline.etl.sources.fsf_flood_risk.etl import FloodRiskETL


class TestFloodRiskETL(TestETL):
    _ETL_CLASS = FloodRiskETL
    _SAMPLE_DATA_PATH = pathlib.Path(__file__).parents[0] / "data"
    _SAMPLE_DATA_FILE_NAME = "fsf_flood/flood-tract2010.csv"
    _SAMPLE_DATA_ZIP_FILE_NAME = "fsf_flood.zip"
    _EXTRACT_TMP_FOLDER_NAME = "FloodRiskETL"
    _FIXTURES_SHARED_TRACT_IDS = TestETL._FIXTURES_SHARED_TRACT_IDS + [
        "02290000400"  # A tract with 1 property
    ]

    def setup_method(self, _method, filename=__file__):
        """Invoke `setup_method` from Parent, but using the current file name.

        This code can be copied identically between all child classes.
        """
        super().setup_method(_method=_method, filename=filename)
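
The fixtures in this commit suggest what the flood risk transform is expected to do: left-pad tract IDs to 11 digits, clip the eligible property count at a floor of 250, and divide the at-risk counts by that clipped count. The sketch below reproduces that behavior on two rows from the sample input using only the standard library. It is not the actual `FloodRiskETL` code; the function names, output keys, and constants are assumptions made for illustration.

```python
import csv
import io

# Assumed from the output column name "(floor of 250)".
PROPERTY_FLOOR = 250
# A census tract ID is 11 digits: 2 state + 3 county + 6 tract.
TRACT_ID_LENGTH = 11

# Two rows copied from the sample input fixture; both lost a leading zero.
SAMPLE_INPUT = """GEOID,count_properties,mid_depth_100_year00,mid_depth_100_year30
6027000800,942,214,215
2290000400,1,1,1
"""


def transform_row(row):
    """Pad the tract ID, clip the property count, and compute risk shares."""
    eligible = max(int(row["count_properties"]), PROPERTY_FLOOR)
    return {
        "tract_id": row["GEOID"].zfill(TRACT_ID_LENGTH),
        "eligible_properties": eligible,
        "share_today": int(row["mid_depth_100_year00"]) / eligible,
        "share_30_years": int(row["mid_depth_100_year30"]) / eligible,
    }


def transform(csv_text):
    """Apply transform_row to every data row of a CSV string."""
    return [transform_row(r) for r in csv.DictReader(io.StringIO(csv_text))]


if __name__ == "__main__":
    for result in transform(SAMPLE_INPUT):
        print(result)
```

The single-property tract `02290000400` exercises both the padding and the clipping: its count is raised to 250, so its share is 1/250 = 0.004, matching the expected output fixture.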