Score tests (#1847)

* update Python version on README; tuple typing fix

* Alaska tribal points fix (#1821)

* Bump mistune from 0.8.4 to 2.0.3 in /data/data-pipeline (#1777)

Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* poetry update

* initial pass of score tests

* add threshold tests

* added ses threshold (not donut, not island)

* testing suite -- stopping for the day

* added test for lead proxy indicator

* Refactor score tests to make them less verbose and more direct (#1865)

* Cleanup tests slightly before refactor (#1846)

* Refactor score calculations tests

* Feedback from review

* Refactor output tests like calculation tests (#1846) (#1870)

* Reorganize files (#1846)

* Switch from lru_cache to fixture scopes (#1846)

* Add tests for all factors (#1846)

* Mark smoketests and run as part of be deploy (#1846)

* Update renamed var (#1846)

* Switch from named tuple to dataclass (#1846)

This is annoying, but pylint in python3.8 was crashing parsing the named
tuple. We weren't using any namedtuple-specific features, so I made the
type a dataclass just to get pylint to behave.

* Add default timeout to requests (#1846)

* Fix type (#1846)

* Fix merge mistake on poetry.lock (#1846)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matt Bowen <83967628+mattbowen-usds@users.noreply.github.com>
Co-authored-by: matt bowen <matthew.r.bowen@omb.eop.gov>
Emma Nechamkin 2022-08-26 15:23:20 -04:00 committed by GitHub
commit 1c4d3e4142
GPG key ID: 4AEE18F83AFDEB23
19 changed files with 1425 additions and 29 deletions
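
The commit message notes that a named tuple was replaced with a dataclass because pylint under Python 3.8 crashed while parsing it, and that no namedtuple-specific features were in use. A minimal sketch of that kind of change follows; `ThresholdCase` and its fields are hypothetical names chosen for illustration and are not taken from the repository.

```python
from dataclasses import FrozenInstanceError, dataclass


# Hypothetical record type. Before the change it might have been declared
# with typing.NamedTuple; the field names here are illustrative only.
@dataclass(frozen=True)
class ThresholdCase:
    """One test case: a column name and the threshold applied to it."""

    column_name: str
    threshold: float


case = ThresholdCase(column_name="example_percentile", threshold=0.9)

# Attribute access works the same way it does on a named tuple.
print(case.column_name, case.threshold)

# frozen=True keeps instances read-only, as named tuple fields are.
try:
    case.threshold = 0.5
except FrozenInstanceError:
    print("fields are read-only")
```

Unlike a named tuple, a dataclass instance cannot be indexed or unpacked positionally, so a swap like this is only safe when callers use attribute access alone.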


@ -0,0 +1,354 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "c9fab286",
"metadata": {},
"outputs": [],
"source": [
"# %load_ext lab_black\n",
"import json\n",
"import pandas as pd\n",
"import geopandas as gpd"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "dbd84e10",
"metadata": {},
"outputs": [
{
"ename": "DriverError",
"evalue": "/mnt/e/opt/justice40-tool/data/data-pipeline/data_pipeline/data/score/csv/tiles/usa.csv: No such file or directory",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mCPLE_OpenFailedError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32mfiona/_shim.pyx\u001b[0m in \u001b[0;36mfiona._shim.gdal_open_vector\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mfiona/_err.pyx\u001b[0m in \u001b[0;36mfiona._err.exc_wrap_pointer\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mCPLE_OpenFailedError\u001b[0m: /mnt/e/opt/justice40-tool/data/data-pipeline/data_pipeline/data/score/csv/tiles/usa.csv: No such file or directory",
"\nDuring handling of the above exception, another exception occurred:\n",
"\u001b[0;31mDriverError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m/tmp/ipykernel_10603/1449522338.py\u001b[0m in \u001b[0;36m<cell line: 3>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# Read in the score geojson file\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mdata_pipeline\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0metl\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mscore\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconstants\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mDATA_SCORE_CSV_TILES_FILE_PATH\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mnation\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgpd\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mread_file\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mDATA_SCORE_CSV_TILES_FILE_PATH\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m~/.cache/pypoetry/virtualenvs/data-pipeline-WziHKidv-py3.8/lib/python3.8/site-packages/geopandas/io/file.py\u001b[0m in \u001b[0;36m_read_file\u001b[0;34m(filename, bbox, mask, rows, **kwargs)\u001b[0m\n\u001b[1;32m 158\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 159\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mfiona_env\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 160\u001b[0;31m \u001b[0;32mwith\u001b[0m \u001b[0mreader\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpath_or_bytes\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mfeatures\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 161\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[0;31m# In a future Fiona release the crs attribute of features will\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.cache/pypoetry/virtualenvs/data-pipeline-WziHKidv-py3.8/lib/python3.8/site-packages/fiona/env.py\u001b[0m in \u001b[0;36mwrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 406\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 407\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mlocal\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_env\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 408\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 409\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 410\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.cache/pypoetry/virtualenvs/data-pipeline-WziHKidv-py3.8/lib/python3.8/site-packages/fiona/__init__.py\u001b[0m in \u001b[0;36mopen\u001b[0;34m(fp, mode, driver, schema, crs, encoding, layer, vfs, enabled_drivers, crs_wkt, **kwargs)\u001b[0m\n\u001b[1;32m 262\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 263\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mmode\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m'a'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'r'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 264\u001b[0;31m c = Collection(path, mode, driver=driver, encoding=encoding,\n\u001b[0m\u001b[1;32m 265\u001b[0m layer=layer, enabled_drivers=enabled_drivers, **kwargs)\n\u001b[1;32m 266\u001b[0m \u001b[0;32melif\u001b[0m \u001b[0mmode\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;34m'w'\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.cache/pypoetry/virtualenvs/data-pipeline-WziHKidv-py3.8/lib/python3.8/site-packages/fiona/collection.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, path, mode, driver, schema, crs, encoding, layer, vsi, archive, enabled_drivers, crs_wkt, ignore_fields, ignore_geometry, **kwargs)\u001b[0m\n\u001b[1;32m 160\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmode\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;34m'r'\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 161\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msession\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mSession\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 162\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msession\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstart\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 163\u001b[0m \u001b[0;32melif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmode\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;34m'a'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'w'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 164\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msession\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mWritingSession\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32mfiona/ogrext.pyx\u001b[0m in \u001b[0;36mfiona.ogrext.Session.start\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mfiona/_shim.pyx\u001b[0m in \u001b[0;36mfiona._shim.gdal_open_vector\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mDriverError\u001b[0m: /mnt/e/opt/justice40-tool/data/data-pipeline/data_pipeline/data/score/csv/tiles/usa.csv: No such file or directory"
]
}
],
"source": [
"# Read in the score tiles CSV file\n",
"from data_pipeline.etl.score.constants import DATA_SCORE_CSV_TILES_FILE_PATH\n",
"nation = gpd.read_file(DATA_SCORE_CSV_TILES_FILE_PATH)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f850529",
"metadata": {},
"outputs": [],
"source": [
"nation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f342d36",
"metadata": {},
"outputs": [],
"source": [
"# get the columns of the df and sort the list:\n",
"sorted_nation = sorted(nation.columns.to_list())"
]
},
{
"cell_type": "markdown",
"id": "97aac08f",
"metadata": {},
"source": [
"CLI to convert a pbf into a json file (requires tippecanoe and jq to be installed)\n",
"\n",
"```bash\n",
"curl https://justice40-data.s3.amazonaws.com/data-pipeline-staging/1822/e6385c172f1d2adf588050375b7c0985035cfb24/data/score/tiles/high/8/67/101.pbf -o uh-1822-e638-8-67-101.pbf && tippecanoe-decode uh-1822-e638-8-67-101.pbf 8 67 101 | jq . > uh-1822-e638-8-67-101.json\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cbe37ccb",
"metadata": {},
"outputs": [],
"source": [
"# load a random high-tile json (after decoding a pbf) file using json.loads()\n",
"with open(\"/Users/vims/Downloads/uh-1822-e638-8-67-101.json\", \"r\") as f:\n",
"    random_tile_features = json.loads(f.read())\n",
"\n",
"# Flatten data around the features key:\n",
"flatten_features = pd.json_normalize(random_tile_features, record_path=[\"features\"])\n",
"\n",
"# index into the feature properties, get keys and turn into a sorted list\n",
"random_tile = sorted(list(flatten_features[\"features\"][0][0][\"properties\"].keys()))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a33f5126",
"metadata": {},
"outputs": [],
"source": [
"set_dif = set(sorted_nation).symmetric_difference(set(random_tile))\n",
"list(set_dif)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d228360b",
"metadata": {},
"outputs": [],
"source": [
"nation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b6925138",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f2d7ba0",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GEOID10</th>\n",
" <th>SF</th>\n",
" <th>CF</th>\n",
" <th>HRS_ET</th>\n",
" <th>AML_ET</th>\n",
" <th>FUDS_ET</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>71</th>\n",
" <td>27061480300</td>\n",
" <td>Minnesota</td>\n",
" <td>Itasca County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75</th>\n",
" <td>27061940000</td>\n",
" <td>Minnesota</td>\n",
" <td>Itasca County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>115</th>\n",
" <td>27077460400</td>\n",
" <td>Minnesota</td>\n",
" <td>Lake of the Woods County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>127</th>\n",
" <td>27123042001</td>\n",
" <td>Minnesota</td>\n",
" <td>Ramsey County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>160</th>\n",
" <td>27123033400</td>\n",
" <td>Minnesota</td>\n",
" <td>Ramsey County</td>\n",
" <td>0</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74047</th>\n",
" <td>16055000200</td>\n",
" <td>Idaho</td>\n",
" <td>Kootenai County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74068</th>\n",
" <td>16011950500</td>\n",
" <td>Idaho</td>\n",
" <td>Bingham County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74076</th>\n",
" <td>16001010503</td>\n",
" <td>Idaho</td>\n",
" <td>Ada County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74107</th>\n",
" <td>16001001000</td>\n",
" <td>Idaho</td>\n",
" <td>Ada County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74123</th>\n",
" <td>16001002100</td>\n",
" <td>Idaho</td>\n",
" <td>Ada County</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>3170 rows × 6 columns</p>\n",
"</div>"
],
"text/plain": [
" GEOID10 SF CF HRS_ET AML_ET FUDS_ET\n",
"71 27061480300 Minnesota Itasca County None None 0\n",
"75 27061940000 Minnesota Itasca County None None 0\n",
"115 27077460400 Minnesota Lake of the Woods County None None 0\n",
"127 27123042001 Minnesota Ramsey County None None 0\n",
"160 27123033400 Minnesota Ramsey County 0 None 0\n",
"... ... ... ... ... ... ...\n",
"74047 16055000200 Idaho Kootenai County None None 0\n",
"74068 16011950500 Idaho Bingham County None None 0\n",
"74076 16001010503 Idaho Ada County None None 0\n",
"74107 16001001000 Idaho Ada County None None 0\n",
"74123 16001002100 Idaho Ada County None None 0\n",
"\n",
"[3170 rows x 6 columns]"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_HRS_GEO = nation[['GEOID10', 'SF', 'CF', 'HRS_ET', 'AML_ET', 'FUDS_ET']]\n",
"nation_HRS_GEO.loc[nation_HRS_GEO['FUDS_ET'] == '0']"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "02eef4b5",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "678bea72",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([None, '0', '1'], dtype=object)"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation['HRS_ET'].unique()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.10 ('data-pipeline-WziHKidv-py3.8')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"vscode": {
"interpreter": {
"hash": "c28609757c27a373a12dad8bc3a2aec46aa91130799a09665fba7d386f9c3756"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
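
The notebook above compares the score CSV's column names with the property keys of the first feature in a decoded tile, using a symmetric difference. A minimal sketch of a slightly more explicit comparison follows: it collects keys from every feature and reports each direction separately. The inline JSON is a made-up, heavily trimmed stand-in that assumes the decoded tile is an outer collection of layers, each with its own `features` list; the layer name, the helper names, and the column subset are hypothetical.

```python
import json

# Hypothetical stand-in for a decoded tile: an outer collection whose
# "features" are layers, each holding its own list of features.
decoded_tile = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "FeatureCollection",
   "properties": {"layer": "example_layer"},
   "features": [
     {"type": "Feature",
      "properties": {"GEOID10": "27061480300", "SF": "Minnesota"}},
     {"type": "Feature",
      "properties": {"GEOID10": "27061940000", "CF": "Itasca County"}}
   ]}
]}
""")


def tile_property_names(tile):
    """Collect the property keys used by any feature in any layer."""
    names = set()
    for layer in tile["features"]:
        for feature in layer["features"]:
            names.update(feature["properties"])
    return names


def compare_columns(csv_columns, tile_columns):
    """Report which names appear on only one side of the comparison."""
    csv_columns, tile_columns = set(csv_columns), set(tile_columns)
    return {
        "only_in_csv": sorted(csv_columns - tile_columns),
        "only_in_tile": sorted(tile_columns - csv_columns),
    }


csv_columns = ["GEOID10", "SF", "CF", "HRS_ET"]  # hypothetical subset
report = compare_columns(csv_columns, tile_property_names(decoded_tile))
print(report)
```

Splitting the difference by direction shows whether a column was dropped when the tiles were built or appeared only in the tiles, which a single symmetric difference does not distinguish.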


@ -0,0 +1,496 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "27da604f",
"metadata": {},
"outputs": [],
"source": [
"# %load_ext lab_black\n",
"import json\n",
"import pandas as pd\n",
"import geopandas as gpd\n",
"\n",
"# Read in the above json file\n",
"nation = gpd.read_file(\"/Users/vims/Downloads/usa-high-1822-637b.json\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7b7083fd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 None\n",
"1 None\n",
"2 None\n",
"3 None\n",
"4 None\n",
" ... \n",
"74129 None\n",
"74130 None\n",
"74131 None\n",
"74132 None\n",
"74133 None\n",
"Name: FUDS_RAW, Length: 74134, dtype: object"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation['FUDS_RAW']"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "117477e6",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GEOID10</th>\n",
" <th>SF</th>\n",
" <th>CF</th>\n",
" <th>HRS_ET</th>\n",
" <th>AML_ET</th>\n",
" <th>AML_RAW</th>\n",
" <th>FUDS_ET</th>\n",
" <th>FUDS_RAW</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>27139080202</td>\n",
" <td>Minnesota</td>\n",
" <td>Scott County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>27139080204</td>\n",
" <td>Minnesota</td>\n",
" <td>Scott County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>27139080100</td>\n",
" <td>Minnesota</td>\n",
" <td>Scott County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>27139080302</td>\n",
" <td>Minnesota</td>\n",
" <td>Scott County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>27139080400</td>\n",
" <td>Minnesota</td>\n",
" <td>Scott County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74129</th>\n",
" <td>16005001601</td>\n",
" <td>Idaho</td>\n",
" <td>Bannock County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74130</th>\n",
" <td>16005001300</td>\n",
" <td>Idaho</td>\n",
" <td>Bannock County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74131</th>\n",
" <td>16005001000</td>\n",
" <td>Idaho</td>\n",
" <td>Bannock County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74132</th>\n",
" <td>16005000900</td>\n",
" <td>Idaho</td>\n",
" <td>Bannock County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>74133</th>\n",
" <td>16005000800</td>\n",
" <td>Idaho</td>\n",
" <td>Bannock County</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" <td>False</td>\n",
" <td>None</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>74134 rows × 8 columns</p>\n",
"</div>"
],
"text/plain": [
" GEOID10 SF CF HRS_ET AML_ET AML_RAW FUDS_ET \\\n",
"0 27139080202 Minnesota Scott County None False None False \n",
"1 27139080204 Minnesota Scott County None False None False \n",
"2 27139080100 Minnesota Scott County None False None False \n",
"3 27139080302 Minnesota Scott County None False None False \n",
"4 27139080400 Minnesota Scott County None False None False \n",
"... ... ... ... ... ... ... ... \n",
"74129 16005001601 Idaho Bannock County None False None False \n",
"74130 16005001300 Idaho Bannock County None False None False \n",
"74131 16005001000 Idaho Bannock County None False None False \n",
"74132 16005000900 Idaho Bannock County None False None False \n",
"74133 16005000800 Idaho Bannock County None False None False \n",
"\n",
" FUDS_RAW \n",
"0 None \n",
"1 None \n",
"2 None \n",
"3 None \n",
"4 None \n",
"... ... \n",
"74129 None \n",
"74130 None \n",
"74131 None \n",
"74132 None \n",
"74133 None \n",
"\n",
"[74134 rows x 8 columns]"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind = nation[['GEOID10', 'SF', 'CF', 'HRS_ET', 'AML_ET', 'AML_RAW', 'FUDS_ET', 'FUDS_RAW']]\n",
"nation_new_ind"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "0f37acf4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([None, '0', '1'], dtype=object)"
]
},
"execution_count": 68,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['HRS_ET'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 69,
"id": "4ae865ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 8843\n",
"1 4045\n",
"Name: HRS_ET, dtype: int64"
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['HRS_ET'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "2f0d29db",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([False, True])"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['AML_ET'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "646b3754",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False 72100\n",
"True 2034\n",
"Name: AML_ET, dtype: int64"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['AML_ET'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "0571df6d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([None, '1'], dtype=object)"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['AML_RAW'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "171fa3c9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 2034\n",
"Name: AML_RAW, dtype: int64"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['AML_RAW'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "370b0769",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([False, True])"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['FUDS_ET'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "f8afb668",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False 72056\n",
"True 2078\n",
"Name: FUDS_ET, dtype: int64"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['FUDS_ET'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "f2e3b78a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([None, '0', '1'], dtype=object)"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['FUDS_RAW'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "b722e802",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 3170\n",
"1 2078\n",
"Name: FUDS_RAW, dtype: int64"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nation_new_ind['FUDS_RAW'].value_counts()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
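
In the outputs above, `None` appears in `unique()` for the raw columns but not in `value_counts()`, because pandas leaves missing values out of `value_counts()` by default. The counts of `True` in `AML_ET` and `FUDS_ET` (2034 and 2078) also match the counts of `'1'` in `AML_RAW` and `FUDS_RAW`. A minimal plain-Python sketch of both checks follows, run on made-up rows. The rule that a flag should equal `raw == '1'` is an assumption consistent with those counts, not something the notebook states, and the helper names and tract IDs are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows shaped like the notebook's columns (not real data).
rows = [
    {"GEOID10": "00000000001", "FUDS_ET": False, "FUDS_RAW": None},
    {"GEOID10": "00000000002", "FUDS_ET": False, "FUDS_RAW": "0"},
    {"GEOID10": "00000000003", "FUDS_ET": True, "FUDS_RAW": "1"},
    {"GEOID10": "00000000004", "FUDS_ET": False, "FUDS_RAW": None},
]


def tally(rows, column):
    """Count every value in a column, keeping None as its own bucket."""
    return Counter(row[column] for row in rows)


def flag_mismatches(rows, flag_column, raw_column):
    """Return tract IDs where the boolean flag disagrees with raw == '1'."""
    return [
        row["GEOID10"]
        for row in rows
        if row[flag_column] != (row[raw_column] == "1")
    ]


raw_counts = tally(rows, "FUDS_RAW")
mismatches = flag_mismatches(rows, "FUDS_ET", "FUDS_RAW")
print(raw_counts)
print(mismatches)
```

Keeping `None` as its own bucket makes it visible how many tracts have no raw value at all, which is the part the default `value_counts()` output hides.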