Add FUDS ETL (#1817)

* Add spatial join method (#1871)

Since we'll need to figure out the tracts for a large number of points
in future tickets, add a utility to handle grabbing the tract geometries
and adding tract data to a point dataset.

* Add FUDS, also jupyter lab (#1871)

* Add YAML configs for FUDS (#1871)

* Allow input geoid to be optional (#1871)

* Add FUDS ETL, tests, test-data notebook (#1871)

This adds the ETL class for Formerly Used Defense Sites (FUDS). This is
different from most other ETLs since these FUDS are not provided by
tract, but instead by geographic point, so we need to assign FUDS to
tracts and then do calculations from there.
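
The point-to-tract assignment described above can be illustrated with a
small, dependency-free sketch. The commit's own utility is not shown here,
and a real pipeline would more likely rely on a GIS library's spatial join;
the ray-casting test, the function names, the tract IDs, and the coordinates
below are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_tract(point: Point, tracts: Dict[str, Polygon]) -> Optional[str]:
    """Return the ID of the first tract containing the point, or None if no tract does."""
    for geoid, polygon in tracts.items():
        if point_in_polygon(point, polygon):
            return geoid
    return None


# Hypothetical tracts (two adjacent squares) and hypothetical site locations.
tracts = {
    "TRACT_A": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "TRACT_B": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}
sites = {"site_1": (1.0, 1.0), "site_2": (3.0, 0.5), "site_3": (5.0, 5.0)}

site_tracts = {name: assign_tract(pt, tracts) for name, pt in sites.items()}
```

Once every site carries a tract ID, per-tract calculations reduce to an
ordinary group-and-count over that ID.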

* Floats -> Ints, as I intended (#1871)

* Floats -> Ints, as I intended (#1871)

* Formatting fixes (#1871)

* Add test false positive GEOIDs (#1871)

* Add gdal binaries (#1871)

* Refactor pandas code to be more idiomatic (#1871)

Per Emma, the more pandas-y way of doing my counts is using np.where to
add the values I need, then groupby and size. It is definitely more
compact, and I think also more correct!
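
A minimal sketch of the np.where, groupby, and size pattern mentioned above.
The ETL's actual code is not part of this excerpt, so the column names
(`GEOID10_TRACT`, `ELIGIBILITY`, `HASPROJECTS`), the `status` labels, and the
sample rows are assumptions; only the "Eligible and Has Project" rule comes
from the config descriptions in this commit.

```python
import numpy as np
import pandas as pd

# Hypothetical point-level data: one row per site, already tagged with a tract.
sites = pd.DataFrame(
    {
        "GEOID10_TRACT": ["T100", "T100", "T200", "T200", "T300"],
        "ELIGIBILITY": ["Eligible", "Ineligible", "Eligible", "Eligible", "Ineligible"],
        "HASPROJECTS": ["Yes", "No", "Yes", "No", "No"],
    }
)

# np.where labels each site; it counts as eligible only if both conditions hold.
sites["status"] = np.where(
    (sites["ELIGIBILITY"] == "Eligible") & (sites["HASPROJECTS"] == "Yes"),
    "eligible",
    "ineligible",
)

# groupby + size counts sites per (tract, status); unstack turns statuses into columns.
counts = (
    sites.groupby(["GEOID10_TRACT", "status"])
    .size()
    .unstack(fill_value=0)
    .astype("int64")
)

# Binary flag: does the tract contain at least one eligible site?
counts["has_fuds"] = counts["eligible"] > 0
```

Using `fill_value=0` keeps tracts that have only one kind of site from ending
up with missing values, which also keeps the counts as integers.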

* Update configs per Emma suggestions (#1871)

* Type fixed! (#1871)

* Remove spurious import from vscode (#1871)

* Snapshot update after changing col name (#1871)

* Move up GDAL (#1871)

* Adjust geojson strategy (#1871)

* Try running census separately first (#1871)

* Fix import order (#1871)

* Cleanup cache strategy (#1871)

* Download census data from S3 instead of re-calculating (#1871)

* Clarify pandas code per Emma (#1871)
Matt Bowen 2022-08-16 13:28:39 -04:00 committed by GitHub
commit d5fbb802e8
22 changed files with 2534 additions and 416 deletions


@@ -117,6 +117,34 @@ datasets:
         field_type: float
         include_in_downloadable_files: true
         include_in_tiles: true
+  - long_name: "Formerly Used Defense Sites"
+    short_name: "FUDS"
+    module_name: "us_army_fuds"
+    load_fields:
+      - short_name: "fuds_count"
+        df_field_name: "ELIGIBLE_FUDS_COUNT_FIELD_NAME"
+        long_name: "Count of eligible Formerly Used Defense Site (FUDS) properties centroids"
+        description_short:
+          "The number of FUDS marked as Eligible and Has Project in the tract."
+        field_type: int64
+        include_in_tiles: false
+        include_in_downloadable_files: false
+      - short_name: "not_fuds_ct"
+        df_field_name: "INELIGIBLE_FUDS_COUNT_FIELD_NAME"
+        long_name: "Count of ineligible Formerly Used Defense Site (FUDS) properties centroids"
+        description_short:
+          "The number of FUDS marked as Ineligible or Project in the tract."
+        field_type: int64
+        include_in_tiles: false
+        include_in_downloadable_files: false
+      - short_name: "has_fuds"
+        df_field_name: "ELIGIBLE_FUDS_BINARY_FIELD_NAME"
+        long_name: "Is there at least one Formerly Used Defense Site (FUDS) in the tract?"
+        description_short:
+          "Whether the tract has a FUDS"
+        field_type: bool
+        include_in_tiles: false
+        include_in_downloadable_files: false
   - long_name: "Example ETL"
     short_name: "Example"
     module_name: "example_dataset"
@@ -128,4 +156,3 @@ datasets:
         field_type: float
         include_in_tiles: true
         include_in_downloadable_files: true


@@ -77,7 +77,7 @@ class DatasetsConfig:
         long_name: str
         short_name: str
         module_name: str
-        input_geoid_tract_field_name: str
         load_fields: List[LoadField]
+        input_geoid_tract_field_name: Optional[str] = None
     datasets: List[Dataset]
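
The field appears to move below `load_fields` because, in a Python dataclass,
fields with defaults must come after fields without them. The sketch below
illustrates the resulting behaviour, assuming `Dataset` is a standard
`dataclasses.dataclass`; the trimmed-down `LoadField` and the sample values
are assumptions for illustration, not the project's real definitions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoadField:
    # Simplified stand-in; the real class has more fields.
    short_name: str
    df_field_name: str


@dataclass
class Dataset:
    long_name: str
    short_name: str
    module_name: str
    load_fields: List[LoadField]
    # Defaulted fields must follow non-default ones, hence the new position.
    input_geoid_tract_field_name: Optional[str] = None


# A point-based dataset such as FUDS can omit the input GEOID field name.
fuds = Dataset(
    long_name="Formerly Used Defense Sites",
    short_name="FUDS",
    module_name="us_army_fuds",
    load_fields=[LoadField("fuds_count", "ELIGIBLE_FUDS_COUNT_FIELD_NAME")],
)

# A tract-based dataset can still supply it (field name value is hypothetical).
example = Dataset(
    long_name="Example ETL",
    short_name="Example",
    module_name="example_dataset",
    load_fields=[],
    input_geoid_tract_field_name="GEOID10_TRACT",
)
```

Callers that read `input_geoid_tract_field_name` need to handle `None` for
datasets that are not keyed by tract in their source data.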