Imputing income using geographic neighbors (#1559)

Imputes the income field, with a light refactor. Needs further refactoring and more tests (I only spot-checked). The next ticket will check and address this, but much of the "narwhal" architecture is here.
Emma Nechamkin 2022-04-27 15:59:10 -04:00 committed by Emma Nechamkin
commit f047ca9d83
16 changed files with 1245 additions and 81 deletions

@@ -40,7 +40,7 @@ def validate_new_data(
     assert (
         checking_df[score_col].nunique() <= 3
     ), f"Error: there are too many values possible in {score_col}"
-    assert (True in checking_df[score_col].unique()) & (
+    assert (True in checking_df[score_col].unique()) | (
         False in checking_df[score_col].unique()
     ), f"Error: {score_col} should be a boolean"
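
The effect of switching `&` to `|` in this hunk can be illustrated with a minimal sketch. It uses plain Python lists in place of the pandas column, and `check_boolean_column` and `require_both` are hypothetical names introduced for illustration only; they are not part of the repository.

```python
def check_boolean_column(values, require_both=False):
    """Return True when the column passes the boolean check.

    require_both=True mimics the old `&` condition (both True and False
    must appear); require_both=False mimics the new `|` condition (at
    least one of them must appear).
    """
    unique_values = set(values)
    has_true = True in unique_values
    has_false = False in unique_values
    if require_both:
        return has_true and has_false
    return has_true or has_false


# Hypothetical sample columns, not data from the commit.
all_true = [True, True, True]
mixed = [True, False, True]

print(check_boolean_column(all_true, require_both=True))
print(check_boolean_column(all_true))
print(check_boolean_column(mixed))
```

Under the old condition a score column whose rows are all `True` (or all `False`) fails the check even though it is a valid boolean column; under the new condition it passes.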