j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-02-24 02:24:20 -08:00

Author	SHA1	Message	Date
Matt Bowen	d5fbb802e8	Add FUDS ETL (#1817) * Add spatial join method (#1871) Since we'll need to figure out the tracts for a large number of points in future tickets, add a utility to handle grabbing the tract geometries and adding tract data to a point dataset. * Add FUDS, also jupyter lab (#1871) * Add YAML configs for FUDS (#1871) * Allow input geoid to be optional (#1871) * Add FUDS ETL, tests, test-data notebook (#1871) This adds the ETL class for Formerly Used Defense Sites (FUDS). This is different from most other ETLs since these FUDS are not provided by tract, but instead by geographic point, so we need to assign FUDS to tracts and then do calculations from there. * Floats -> Ints, as I intended (#1871) * Floats -> Ints, as I intended (#1871) * Formatting fixes (#1871) * Add test false positive GEOIDs (#1871) * Add gdal binaries (#1871) * Refactor pandas code to be more idiomatic (#1871) Per Emma, the more pandas-y way of doing my counts is using np.where to add the values I need, then groupby and size. It is definitely more compact, and also I think more correct! * Update configs per Emma suggestions (#1871) * Type fixed! (#1871) * Remove spurious import from vscode (#1871) * Snapshot update after changing col name (#1871) * Move up GDAL (#1871) * Adjust geojson strategy (#1871) * Try running census separately first (#1871) * Fix import order (#1871) * Cleanup cache strategy (#1871) * Download census data from S3 instead of re-calculating (#1871) * Clarify pandas code per Emma (#1871)	2022-08-16 13:28:39 -04:00
Jorge Escobar	8149ac31c5	Starting Tribal Boundaries Work (#1736) * starting tribal pr * further pipeline work * bia merge working * alaska villages and tribal geo generate * tribal folders * adding data full run * tile generation * tribal tile deploy	2022-07-30 01:13:10 -04:00
Jorge Escobar	1730572aa6	Reducing Docker start up and adding ArcGIS URL (#1386) * Reducing Docker start up and adding ArcGIS URL * Updating ArcGIS URLs	2022-03-09 08:55:17 -05:00
Jorge Escobar	053dde0d40	Display score L on map (#849) * updates to first docker run * tile constants * frontend changes * updating pickles instructions * pickles	2021-11-05 16:26:14 -04:00
Jorge Escobar	1b17af84c8	Combine + Tilefy (#806) * init * score-post * added score csv s3 download; remove poetry cmds from readme * working census tile fetch * PR review * Github Actions Work	2021-11-01 18:05:05 -04:00
Jorge Escobar	a94b8e2761	final census GHA	2021-10-14 13:50:56 -04:00
Jorge Escobar	8ddfc6b305	Update application.py	2021-10-14 13:31:37 -04:00
Jorge Escobar	3b04356fb3	Data sources from S3 (#769) * Started 535 * Data sources from S3 * lint * remove breakpoints * PR comments * lint * census data completed * lint * renaming data source	2021-10-13 16:00:33 -04:00
Billy Daly	d1273b63c5	Add ETL Contract Checks (#619) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to national_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline * Adds new methods to ExtractTransformLoad base class: - __init__() Initializes class attributes - _get_census_fips_codes() Loads a dataframe with the fips codes for census block group and tract - validate_init() Checks that the class was initialized correctly - validate_output() Checks that the output was loaded correctly * Adds test for ExtractTransformLoad.__init__() and base.py * Fixes failing flake8 test * Changes geo_col to geoid_col and changes is_dataset to is_census in yaml * Adds test for validate_output() * Adds remaining tests * Removes is_dataset from init method * Makes CENSUS_CSV a class attribute instead of a class global: This ensures that CENSUS_CSV is only set when the ETL class is for a non-census dataset and removes the need to overwrite the value in mock_etl fixture * Re-formats files with black and fixes broken tox tests	2021-10-13 15:54:15 -04:00
Shelby Switzer	d8c73e6a02	Change downloadable file names (#708) * Change downloadable file names * Remove constants because we're dynamically creating these * Update to "communities" for the descriptor word based on team convo * Add timestamp in 2020-09-20-0930 format because I personally think this is the best ^.^ * Add a CLI command to run ETL Score Post so that we don't have to run the score generation just to get new downloadable files. * Also make sure the old downloadable files are cleaned up on the run of this command. * Remove unused library, thanks pylint! Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-10-01 15:04:37 -04:00
Jorge Escobar	5bd63c083b	Run all Census, ETL, Score, Combine and Tilefy in one command (#662) * Run all Census, ETL, Score, Combine and Tilefy in one command * docker cmd * some docker improvements * feedback updates * lint	2021-09-14 14:15:34 -04:00
Nat Hillard	6fb36ded9c	adding additional missed import (#477)	2021-08-06 11:48:11 -04:00
Nat Hillard	9d962eb5d9	Moving from relative imports to absolute to enable poetry run python data-pipeline/application.py [command] (#476)	2021-08-06 11:41:28 -04:00
Nat Hillard	45a8b1c026	Census ETL should use standard ETL form (#474) * Fixes #473 Census ETL should use standard ETL form * linter fixes	2021-08-06 11:01:51 -04:00
Nat Hillard	c1568e87c0	Data directory should adopt standard Poetry-suggested python package structure (#457) * Fixes #456 - Our data directory should adopt standard python package structure * a few missed references * updating readme * updating requirements * Running Black * Fixes for flake8 * updating pylint	2021-08-05 15:35:54 -04:00

15 commits