j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-07-29 19:31:16 -07:00

Emma Nechamkin 9c0e1993f6 Pipeline tile tests (#1864)

* temp update
* updating with fips check
* adding check on pfs
* updating with pfs test
* Update test_tiles_smoketests.py
* Fix lint errors (#1848)
* Add column names test (#1848)
* Mark tests as smoketests (#1848)
* Move to other score-related tests (#1848)
* Recast Total threshold criteria exceeded to int (#1848)

  In writing tests to verify the output of the tiles csv matches the final score CSV, I noticed TC/Total threshold criteria exceeded was getting cast from an int64 to a float64 in the process of PostScoreETL. I tracked it down to the line where we merge the score dataframe with constants.DATA_CENSUS_CSV_FILE_PATH: there were more than 100 tracts in the national census CSV that don't exist in the score, so those ended up with a Total threshold count of np.nan, which is a float, and thereby cast those columns to float. For the moment I just cast it back.
* No need for low memory (#1848)
* Add additional tests of tiles.csv (#1848)
* Drop pre-2010 rows before computing score (#1848)

  Note this is probably NOT the optimal place for this change; it might make more sense for each source to filter its own tracts down to the acceptable tract list. However, that would be a pretty invasive change, whereas this is central and plenty of other things are happening in score transform that could be moved to sources, so for today, here's where the change will live.
* Fix typo (#1848)
* Switch from filter to inner join (#1848)
* Remove no-op lines from tiles (#1848)
* Apply feedback from review, linter (#1848)
* Check the values of everything in the frame (#1848)
* Refactor checker class (#1848)
* Add test for state names (#1848)
* cleanup from reviewing my own code (#1848)
* Fix lint error (#1858)
* Apply Emma's feedback from review (#1848)
* Remove refs to national_df (#1848)
* Account for new, fake nullable bools in tiles (#1848)

  To handle a geojson limitation, Emma converted some nullable boolean columns to float64 in the tiles export with the values {0.0, 1.0, nan}, giving us the same expressiveness. Sadly, this broke my assumption that all columns between the score and tiles csvs would have the same dtypes, so I need to account for these new, fake bools in my test.
* Use equals instead of my worse version (#1848)
* Missed a spot where we called _create_score_data (#1848)
* Update per safety (#1848)

Co-authored-by: matt bowen <matthew.r.bowen@omb.eop.gov>
2022-09-01 13:07:14 -04:00
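The int64-to-float64 cast described in the commit message above can be reproduced with a minimal pandas sketch. The frame names, the `GEOID10_TRACT` key, and the sample tract IDs here are hypothetical stand-ins, not the pipeline's actual data, and filling missing counts with 0 before casting back is an assumption: the commit message only says the column was cast back.

```python
import pandas as pd

# Hypothetical stand-in for the score dataframe; "TC" holds integer counts.
score_df = pd.DataFrame(
    {"GEOID10_TRACT": ["01001020100", "01001020200"], "TC": [3, 0]}
)

# Hypothetical stand-in for the national census tract list, which
# contains one tract that is absent from the score.
census_df = pd.DataFrame(
    {"GEOID10_TRACT": ["01001020100", "01001020200", "01001020300"]}
)

# A left merge keeps the unmatched tract and fills its "TC" with NaN.
# NaN is a float, so the whole column is upcast from int64 to float64.
merged = census_df.merge(score_df, on="GEOID10_TRACT", how="left")
print(score_df["TC"].dtype, "->", merged["TC"].dtype)

# One way to cast back (assumption): treat a missing count as 0.
restored = merged["TC"].fillna(0).astype("int64")
print(restored.dtype)
```

If missing counts should stay distinguishable from zero, pandas' nullable `Int64` dtype (`merged["TC"].astype("Int64")`) is an alternative that keeps integers alongside missing values.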
comparison_tool	Imputing income using geographic neighbors (#1559)	2022-08-11 12:33:45 -04:00
content	updated to show T/F/null vs T/F for AML and FUDS (#1866)	2022-08-24 20:22:59 -04:00
data	Starting Tribal Boundaries Work (#1736)	2022-07-30 01:13:10 -04:00
etl	Pipeline tile tests (#1864)	2022-09-01 13:07:14 -04:00
files	Add files via upload (#1656)	2022-05-31 13:19:01 -04:00
ipython	just testing that the boolean is preserved on gha (#1867)	2022-08-31 12:55:03 -04:00
score	tribal tiles fix (#1874)	2022-09-01 10:19:13 -04:00
tests	Pipeline tile tests (#1864)	2022-09-01 13:07:14 -04:00
tile	Score tests (#1847)	2022-08-26 15:23:20 -04:00
__init__.py	Data directory should adopt standard Poetry-suggested python package structure (#457)	2021-08-05 15:35:54 -04:00
application.py	Add FUDS ETL (#1817)	2022-08-16 13:28:39 -04:00
config.py	Score tests (#1847)	2022-08-26 15:23:20 -04:00
utils.py	Score tests (#1847)	2022-08-26 15:23:20 -04:00