Commit graph

811 commits

Author SHA1 Message Date
Shaun Verch
2aa79a334c
Add worst case scenario rollback plan (#1069)
Fixes: https://github.com/usds/justice40-tool/issues/947

This is ensuring we at least have some way to rollback in the worst case
scenario of the data being corrupted somehow, and our upstream data
sources being down in a way that prevents us from regenerating any
datasets. See
https://github.com/usds/justice40-tool/issues/946#issuecomment-989155252
for an analysis of the different ways the pipeline can fail and their
impacts.

This guide is very brittle to pipeline changes, and should be considered
a temporary solution for the worst case. I think that someday the github
actions should run more frequently and write to different paths, perhaps
with a timestamp. That would make rolling back more straightforward, as
a previous version of the data would already exist at some path in s3.
2022-01-05 11:17:43 -05:00
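The idea at the end of the commit message above, writing each pipeline run to its own timestamped path so that an earlier copy of the data already exists in s3, could be sketched roughly as follows. This is a minimal illustration only: the prefix `data-pipeline/data/score` and both helper names are hypothetical, and nothing here reflects how the project's GitHub Actions are actually configured.

```python
from datetime import datetime, timezone
from typing import List, Optional


def timestamped_prefix(base_prefix: str, run_time: datetime) -> str:
    """Return a key prefix unique to one pipeline run (hypothetical layout).

    The UTC stamp is zero-padded, so prefixes sort chronologically as strings.
    """
    stamp = run_time.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{base_prefix.rstrip('/')}/{stamp}"


def previous_run(prefixes: List[str], current: str) -> Optional[str]:
    """Pick the most recent run prefix that sorts before the current one."""
    older = sorted(p for p in prefixes if p < current)
    return older[-1] if older else None
```

With a layout like this, rolling back would amount to pointing consumers at the prefix returned by `previous_run` rather than regenerating datasets from upstream sources.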
Saran Ahluwalia
a4137fdc98
Add Michigan EJ Screen into data-pipeline's ETL and provide automated scoring and statistics outputs (#1091)
* draft wip

* initial commit

* clear output from notebook

* revert to 65ceb7900f

* draft wip

* initial commit

* clear output from notebook

* revert to 65ceb7900f

* make michigan prefix more readable

* standardize Michigan names and move all constants from class into field names module

* standardize Michigan names and move all constants from class into field names module

* include only pertinent columns for scoring comparison tool

* michigan EJSCREEN standardization

* final PR feedback

* added exposition and summary of Michigan EJSCREEN

* added exposition and summary of Michigan EJSCREEN

* fix typo

Co-authored-by: Saran Ahluwalia <ahlusar.ahluwalia@gmail.com>
2021-12-31 15:38:52 -05:00
Saran Ahluwalia
24f8eb93c4
Tree Equity Output: Change output from Geojson to CSV format for easier analysis (#1089)
Added Tree Equity

* draft wip

* revised documentation

* revised documentation

* revised documentation and defer to super

* change word in logger

* fix flake 8

* address nit

Co-authored-by: Saran Ahluwalia <ahlusar.ahluwalia@gmail.com>
2021-12-30 17:17:28 -05:00
Vim
356e16950f
Fix territory shortcuts when census tract is selected (#1082)
* Refactor map click event architecture

- combine territory map clickHandlers
- centers AS on the map

* Center US on the map

- make the east and west coast both viewable
- make clicking on the 48, show the same zoom/lat/long as initial map
- centers Hawaii on map

* Update link to map performance

* Explicitly show links as the links return a 403

* Removes link and spells link out
2021-12-28 15:30:22 -08:00
Lucas Merrill Brown
beb0eea5cc
Alternative definition of DACs for comparison (#1068)
* Alternative energy-related definition of DACs
2021-12-27 12:05:59 -05:00
Kameron Kerger
e15bb52bad
548-update-pdf (#1081)
latest pdf copy with links now added for each data source
2021-12-21 14:12:20 -05:00
Vim
409c7238ae
Make latest copy changes from Living Copy (#1055)
* Make latest copy changes

- update snapshots

* Update cypress test on feedback link

- update snapshot

* Update side panel and copy

- update snapshots

* Make 2nd EO link open in new tab

* Add latest changes from Living copy

* Add back HS indicator to map

* Add "X of Y thresholds exceed" to side panel

- update snapshots

* Update with latest copy

* Update to latest copy

- make BETA pill in logo bold
- correct exceed to exceeded
- update snapshots
- update page title to Meth & data

* Update total indicators to 21

* Update snapshot
2021-12-17 13:17:57 -08:00
Lucas Merrill Brown
0d57dd572b
Stop swallowing Census API errors (#1051) 2021-12-16 10:54:41 -05:00
Shaun Verch
d90e028c1b
Update documentation to make it easier for users to find the right content for them (#1016)
* First pass of updating documentation for new users

Trying to look at this from the perspective of someone new to the
project, and create some pathways to make it easier for people to get to
the content they are looking for.

* Make it clear that docker is doing the setup

* Link installation again from the main README

* Add some docs about the github actions

* Add markdown link check

* Move git installation first

* Add config for markdown link checker

* Fix some links

* Correct handling of repo root relative links

* Fix broken links in data roadmap

* Fix more broken links

* Fix more links

* Ignore link that's returning a 403 to the checker

It actually works if you go in a browser.

* Fix another broken link

* Ignore more urls that don't work

* Update the readme under docs

* Add some more dataset links

* More strongly call out the quickstart

* Try to call out even more the quickstart link

* Fix dead links

* Add note about initialization time

* Remove broken link from spanish install guide

These will be updated later with a full translation
2021-12-16 10:16:28 -05:00
Lucas Merrill Brown
0d10534725
Issue 1044: Add low HS education fields to tiles and download (#1046) 2021-12-14 15:41:06 -05:00
Vim
000da0f3ac
Place spanish content on feature ?flags=sp (#1027)
- update snapshots
2021-12-14 10:11:35 -08:00
Vim
c9caa97ce3
Remove GU and VI (#1028)
- comment out GU and VI code
- remove search from feature flag
- keep comment on search production bug when wrapping div
- add note on territories
- add territories copy to constants
2021-12-14 08:02:05 -08:00
dependabot[bot]
9dc70d48a4
Bump lxml from 4.6.3 to 4.6.5 in /data/data-pipeline (#1043)
Bumps [lxml](https://github.com/lxml/lxml) from 4.6.3 to 4.6.5.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-4.6.3...lxml-4.6.5)

---
updated-dependencies:
- dependency-name: lxml
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 16:41:50 -05:00
Vim
8a0e3a1293
Updated side panel for score L launch (#1022)
* Remove un-needed state and useEffect vars

* Add initial Accordion UI to side panel

- abstract out Indicators to separate component
- add snapshot test
- define new indicators in EXPLORE copy
- intl copy to AreaDetail component

* Make side panel indicators styling match design

* Rename export IndicatorCategory -> CategoryCard

* Add disadvantaged dots to category and indicators

- add new Category component
- add new DisadvantageDot component
- make copy corrections
- comment out send feedback link in side panel

* Integrate MapLegend's dot into component

- change color to 'blue-warm-70v'
- update map stroke to 'blue-warm-70v'

* Add new indicator names from BE

- add abbreviations and use key in json file to decode
2021-12-13 15:52:27 -05:00
Jorge Escobar
9709d08ca3
Update Side Panel Tile Data (#866)
* Update Side Panel Tile Data

* Update Side Panel Tile Data

* Correct indicator names to match csv

* Replace Score with Rate

* Comment out FEMA Loss Rate to troubleshoot

* Removes all "FEMA Loss Rate" array elements

* Revert FEMA to Score

* Remove expected loss rate

* Remove RMP and NPL from BASIC array

* Attempt to make shape mismatch align

- update README typo

* Add Score L indicators to TILE_SCORE_FLOAT_COLUMNS

* removing cbg references

* completes the ticket

* Update side panel fields

* Update index file writing to create parent dir

* Updates from linting

* fixing missing field_names for island territories 90th percentile fields

* Update downloadable fields and fix field name

* Update file fields and tests

* Update ordering of fields and leave TODO

* Update pickle after re-ordering of file

* fixing bugs in etl_score_geo

* Repeating index for diesel fix

* passing tests

* adding pytest.ini

Co-authored-by: Vim USDS <vimal.k.shah@omb.eop.gov>
Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2021-12-13 14:53:50 -05:00
Shaun Verch
83eb7b0982
Silence dev only vulnerabilities (#1041)
Showing obscure vulnerabilities that only exist in the dev setup creates
more noise and means that they just get ignored (because they are
probably low priority). Silencing them means when we get a vulnerable
dependency alert we know to pay attention to it.

Comes from https://github.com/dependabot/dependabot-core/issues/2521 and
501bbef578.
2021-12-13 13:54:59 -05:00
Saran Ahluwalia
ad6dbf9709
remove data roadmap directory from repository (#1034)
Removed data roadmap
2021-12-10 13:54:46 -05:00
Lucas Merrill Brown
7fcecaee42
Issue 970: reverse percentiles for AMI and life expectancy (#1018)
* switching to low

* fixing score-etl-post

* updating comments

* fixing comparison

* create separate field for clarity

* comment fix

* removing healthy food

* fixing bug in score post

* running black and adding comment

* Update pickles and add helpful notes to README

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
2021-12-10 10:16:22 -05:00
Vim
24bac56d9e
Methodology page update for Score L (#1010)
* Add first column of Methodology score L

- create a new component MethodologyFormula (MF)
- MF component contains text on formula calculation
- add snapshots

* Add 2nd column to Methodology page

- add LowIncome component
- add USWDS styles on download component
- add margin-top styles to global
- enable all font-sizes to theme file

* Add Categories to Methodology page

- create CategoryCard component
- create Categories component
- add snapshots

* Update datasets

- update styling to match mock
- add additional indicators
- remove additional indicators
- update snapshots

* Add links to categories to datasets

- update snapshots

* Remove additional indicator test as they are now N/A

* ensure each DOM ID is unique for a11y

- update snapshots

* Add Category heading for a11y

- removes ScoreSteps tests
- comment out ScoreStep component
- update snapshots
- cypress passes all a11y

* Update to methodology copy

- based on PDF and spreadsheet and Living Copy
- updates snapshots

* Add comments around using IF, AND, ELSE constants

- make indicator constant names more explicit

* Update copy based on living doc

- update snapshots
2021-12-09 10:42:37 -08:00
Shelby Switzer
123fbf6254
Remove "infrastructure" directory (#996)
We originally used the approach in the infrastructure directory when we
thought we would be using Amazon Lambda for different parts of our
deployment pipeline. We have since then moved to using Github Actions
and no longer need this code, and keeping it in `main` has caused
confusion for onboarding new folks. This commit removes the directory
(although this will still be around in version control so we can always
view it or bring it back in the future if we want to).

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
2021-12-09 11:25:19 -05:00
Lucas Merrill Brown
f91de51a75
Issue 1007 continued: Re-ordering fields for clarity (#1014) 2021-12-09 11:07:37 -05:00
Vim
1f5742bc5b
Modify copy on About and Explore Tool pages (#974)
* Modify copy

- update snapshots

* Fix failing cypress tests

- commented out lat/lng in URL test as it is intermittent

* Removes test on EO link

* Update copy for launch

- adds 404 page verbiage
- fixes survey button to be bottom sticky

* Update copy
2021-12-08 10:15:31 -08:00
Saran Ahluwalia
df675b231a
Update HUD Housing Burden (#1005)
* update paths

* size information added in extract function

Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-08 11:57:52 -05:00
Lucas Merrill Brown
524b822651
Issue 1007: remove some recent additions to Definition L (#1008) 2021-12-08 10:26:52 -05:00
Lucas Merrill Brown
1a61026ecf
Issue 967: Calculate urban/rural percentiles (#1006) 2021-12-07 17:28:36 -05:00
Lucas Merrill Brown
780d1126ff
Creating notebook to compare two score files for differences (#984) 2021-12-07 16:20:41 -05:00
Vim
9d28f5a4c4
Add a case if data is not present in tiles (#998)
- will check each property and display N/A if null
- update snapshot
2021-12-07 09:52:50 -08:00
Lucas Merrill Brown
5706837956
Add NATA cancer risk and respiratory hazard to definition L (#1001) 2021-12-07 12:45:45 -05:00
Lucas Merrill Brown
5a6d6d8557
Issue 954: Add various data sources from Child Opportunity Index (#986)
* Adds four fields:
    * Summer days above 90F
    * Percent low access to healthy food
    * Percent impenetrable surface areas
    * Low third grade reading proficiency

* Each of these four gets added into Definition L in various factors.

* Additionally, I add college attendance fields to the ETL for Census ACS.

* This PR also introduces the notion of "reverse percentiles", relevant to ticket #970.
2021-12-07 11:33:49 -05:00
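The commit above introduces "reverse percentiles" without defining them. One plausible reading, given the later use for AMI and life expectancy, is a percentile rank in which lower raw values score higher. The sketch below assumes that reading; the function name and the ranking rule (share of values at or above each value) are illustrative assumptions, not the pipeline's actual implementation.

```python
from typing import List


def percentile_ranks(values: List[float], reverse: bool = False) -> List[float]:
    """Return a percentile rank in (0, 1] for each value.

    Forward: share of values at or below the value.
    Reverse: share of values at or above it, so low values rank high.
    """
    n = len(values)
    if reverse:
        return [sum(1 for other in values if other >= v) / n for v in values]
    return [sum(1 for other in values if other <= v) / n for v in values]


# Hypothetical life expectancy figures for four tracts.
life_expectancy = [70.0, 75.0, 80.0, 85.0]
reverse_ranks = percentile_ranks(life_expectancy, reverse=True)
# The tract with the lowest life expectancy gets the highest reverse percentile.
```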
Vim
df564658a5
Add 4 additional territory buttons (#956)
* Add 4 additional territory buttons

* Fix bug where territories fires multiple times

- move territory handler from J40Map to component

* Update SVGs for all territory buttons
2021-12-06 19:58:04 -08:00
Shelby Switzer
819f3ff478
Update etl constants to use score field_names and put strings around tract IDs in downloadable CSV (#985)
* Update etl constants to use score field_names

Put strings around tract IDs in downloadable CSV

No need to modify the xls file creation because the string type is
preserved and interpreted correctly in Excel already.

One note is that this does cause the ID in the CSV to have quotes
around it, which might be annoying. Maybe we don't want this behavior?

* Update based on PR feedback and lint needs

* Change field we're using in downloadable

This reverts the downloadable csv field list to use
MEDIAN_INCOME_AS_PERCENT_OF_STATE_FIELD instead of
MEDIAN_INCOME_AS_PERCENT_OF_AMI_FIELD in order to get the test to pass.
The point of this PR is a refactor (and a small change to the CSV
quotations), not to change the output. That will be a different PR
later.

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
2021-12-06 13:17:17 -05:00
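The commit above does not say how the quoting is done. One way to get tract IDs written as quoted strings, so that leading zeros survive in tools that would otherwise parse them as numbers, is the standard library's `csv.QUOTE_NONNUMERIC` mode. The sketch below assumes that approach; the column name `GEOID10_TRACT`, the sample row, and the helper name are hypothetical.

```python
import csv
import io
from typing import Dict, List


def write_tracts_csv(rows: List[Dict[str, object]], fieldnames: List[str]) -> str:
    """Serialize rows to CSV, quoting every non-numeric field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        quoting=csv.QUOTE_NONNUMERIC,  # str values are quoted, int/float are not
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# The tract ID is kept as a str so its leading zero is preserved and quoted.
sample = write_tracts_csv(
    [{"GEOID10_TRACT": "01001020100", "Total population": 1993}],
    ["GEOID10_TRACT", "Total population"],
)
```

As the commit message notes, the quotes then appear around the ID in the raw file, which is the trade-off of this mode.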
Jorge Escobar
bbc4a4dec0
Fix bug that dropped rows from island territories (#981)
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2021-12-05 20:00:53 -05:00
Saran Ahluwalia
07ee4165b4
New Create indicators for all thresholds exceeded by a community in Definition L (#980)
* added fieldnames

* todo pollution, water, health & workforce

* workforce

* work in progress

* add utility function to replace duplicate summation logic

* move fpl series into add columns - run black .

* added revisions - still a wip

* added fieldnames

* todo pollution, water, health & workforce

* workforce

* work in progress

* add utility function to replace duplicate summation logic

* move fpl series into add columns - run black .

* added revisions - still a wip

* revise workforce and water

* revise housing and add incremental counter for workforce

* last PR nit

* revise workforce

* more PR feedback in score l

* more PR feedback in score l

* more PR feedback in score l

* add FPL_SERIES and update references in score 1

* fix bugs

* reparameterize function

* final revisions in fieldnames

* make computations all consistent so we assign with FPL_200_SERIES

* fieldnames refactor after clarification and PR review

* finalize

* finalize with no typos

* fix length

* added median income var

* swap thresholds

* remove iteration

* remove stray '

* address flake 8

* added f string formatting and fixed typos

* added f string formatting and fixed typos

* move up

* remove dupes

* reformat

* fix bugs

* fix bugs

* initialize

Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-05 19:51:19 -05:00
Lucas Merrill Brown
d705a8244c
adding demographics information to ETL source data (#982) 2021-12-05 17:56:45 -05:00
Saran Ahluwalia
610343a1e3
DoE fix to address #975 (#979)
* Fixed input field name

Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-05 08:26:39 -05:00
Saran Ahluwalia
19efdfeb4a
DoE LEAD References Need to be Updated (#976)
* replace temporary fieldnames that are not found and indexed

* fixed field names

* PR review

* PR review - revert

Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-04 12:38:50 -05:00
Lucas Merrill Brown
c5dff6e5f7
Issue 242: Add HOLC Grades to data inputs (#978)
* Add mapping inequality data to data inputs

* Add mapping inequality data to comparison tool
2021-12-04 12:23:01 -05:00
Lucas Merrill Brown
1d101c93d2
Issue 844: Add island areas to Definition L (#957)
This ended up being a pretty large task. Here's what this PR does:

1. Pulls in Vincent's data from island areas into the score ETL. This is from the 2010 decennial census, the last census of any kind in the island areas.
2. Grabs a few new fields from 2010 island areas decennial census.
3. Calculates area median income for island areas.
4. Stops using EJSCREEN as the source of our high school education data and directly pulls that from census (this was related to this project so I went ahead and fixed it).
5. Grabs a bunch of data from the 2010 ACS in the states/Puerto Rico/DC, so that we can create percentiles comparing apples-to-apples (ish) from 2010 island areas decennial census data to 2010 ACS data. This required creating a new class because all the ACS fields are different between 2010 and 2019, so it wasn't as simple as looping over a year parameter.
6. Creates a combined population field of island areas and mainland so we can use those stats in our comparison tool, and updates the comparison tool accordingly.
2021-12-03 15:46:10 -05:00
Saran Ahluwalia
8cb9d197df
updated doe energy link and changed fieldnames - removed computation step as BURDEN is already ratio (#963)
Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-03 13:33:19 -05:00
Saran Ahluwalia
8cb1070d1e
Integrate proximity to waste sites into pollution factors (#959)
* add tsdf proximity into predicate to determine thresholds

* strict inequality --> inclusive

Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-03 12:43:48 -05:00
Saran Ahluwalia
fdba1eb171
Revisions to FEMA measure and new link for FEMA data (#952)
* per tract collect all disaster total annual expected loss - numerator

* add updated numerators

* EALP columns are missing on tox check - this will ensure only EALP columns that exist are subset on

* EALB columns are missing on tox check - this will ensure only EALP columns that exist are subset on

* reverted to incorporate megatracts

* updated unit tests

* fix tests

* add transform

* remove print statement

* input reflects input from FEMA risks for tracts

* revise tests and update fixtures - clean up tests and main transform function

* added more records

* remove references to Blocks in keyword args in tests

* linting

* addressed latest PR feedback

* remove imports and update arguments to be compatible for 1.1.0

* remove block reference in test

* change precision to 10 digits - refactor tests to accommodate this

Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-03 12:42:07 -05:00
Vim
8e31ca032c
[Draft] Adds Nominatim search behind a feature flag (#935)
* Add initial search component

* Add Nominatim simple

* Connect search field to Nominatim API

- remove react-query
- remove react-query logic from J40Map
- move searchHandler to MapSearch

* Adjust zoom and territory focus

- adjust zoom buttons in CSS to allow for search field

* Place search behind a feature flag

* Add cors to fetch and error handling

- this is to test on OMB machines

* Add error messaging and bound search results to US

- adjust controls to add error message to search
- add MapSearchMessage component for error message
- add unit tests
- add state to track if API results are empty
- add intl on two strings, placeholder and error message

* Remove wrapper around MapSearch component

- reorder component import in J40Map
- remove unused CSS in MapSearch.module.scss
- remove and comment on wrapper error on MapSearch
- rename isSearchEmpty to isSearchResultsEmpty
- update snapshot

* Add error message

- if the search query returns null, show an error message
2021-12-03 07:56:15 -08:00
Vincent La
84874ee4a5
[ISS-751] Updating comments for geocorr ETL (#913) 2021-12-03 10:10:05 -05:00
Saran Ahluwalia
0873d79254
change threshold for workforce factor (#950)
Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-12-02 16:43:22 -05:00
Lucas Merrill Brown
4c04494da9
Merge pull request #924 from usds/mega-tracts
Merging mega-tracts to main
2021-12-01 11:12:38 -05:00
Lucas Merrill Brown
5c65eed28f
Issue 838: Update comparison tool to use tracts (#934)
* Updating comparison tool to use tracts, and rely more heavily on `field_names`
2021-11-30 18:46:29 -05:00
Jorge Escobar
49ce0f5911
Update combine-tilefy.yml 2021-11-30 15:03:07 -05:00
Saran Ahluwalia
6d0cb29dcd
create new copy and remove chained assignment (#939)
Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
2021-11-30 14:20:29 -05:00
Lucas Merrill Brown
d2352c6217
Fix too many tracts in join error in ACS (#933) 2021-11-30 13:49:21 -05:00
Jorge Escobar
384cfa5d70
Update generate-census.yml 2021-11-30 13:49:21 -05:00