{ "cells": [ { "cell_type": "markdown", "id": "75d2705f", "metadata": {}, "source": [ "# EXPLORATION OF DONUT HOLE WATER BOUNDARY ISSUE" ] }, { "cell_type": "markdown", "id": "dea57328", "metadata": {}, "source": [ "## Imports, constants, and data load" ] }, { "cell_type": "code", "execution_count": 1, "id": "7ac0d5d9", "metadata": {}, "outputs": [], "source": [ "import geopandas as gpd\n", "import pandas as pd\n", "import numpy as np\n", "import os\n", "import sys\n", "\n", "module_path = os.path.abspath(os.path.join(\"../..\"))\n", "if module_path not in sys.path:\n", " sys.path.append(module_path)\n", "\n", "from data_pipeline.config import settings\n", "from data_pipeline.etl.sources.geo_utils import (\n", " add_tracts_for_geometries,\n", " get_tract_geojson,\n", ")" ] }, { "cell_type": "code", "execution_count": 2, "id": "16b9ffd5", "metadata": {}, "outputs": [], "source": [ "# pull out necessary definitions from field_names.py\n", "ORIGINAL_TRACT = \"ORIGINAL_TRACT\"\n", "SCORE_N_COMMUNITIES = \"Definition N (communities)\"\n", "GEOID_TRACT_FIELD = \"GEOID10_TRACT\"\n", "ADJACENT_MEAN_SUFFIX = \" (based on adjacency index and low income alone)\"\n", "ADJACENCY_INDEX_SUFFIX = \" (average of neighbors)\"\n", "CAM_ID = '48061012305'" ] }, { "cell_type": "code", "execution_count": 3, "id": "b3c256f9", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-11-19 19:29:38,131 [ data_pipeline.etl.sources.geo_utils] DEBUG Loading tract geometry data from census ETL\n" ] }, { "data": { "text/plain": [ "Index(['STATEFP10', 'COUNTYFP10', 'TRACTCE10', 'GEOID10_TRACT', 'NAME10',\n", " 'NAMELSAD10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10',\n", " 'INTPTLAT10', 'INTPTLON10', 'geometry'],\n", " dtype='object')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load Census data\n", "tract_data = get_tract_geojson()\n", "tract_data.columns" ] }, { "cell_type": "markdown", "id": "f829c206", 
"metadata": {}, "source": [ "## Look at example tract in Cameron, TX" ] }, { "cell_type": "markdown", "id": "f6d133ca", "metadata": {}, "source": [ "Tract no. 48061012305 in Cameron, TX is an example of a tract that displays the donut hole water boundary issue: all the tracts that it borders are classified as disadvantaged, and it meets the poverty threshold, yet it is being incorrectly categorized as non-disadvantaged due to its water boundary. Here we pull all adjacent tracts, compare to what we see on the map, and then look for solutions." ] }, { "cell_type": "code", "execution_count": 4, "id": "106425b6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
STATEFP10COUNTYFP10TRACTCE10ORIGINAL_TRACTNAME10NAMELSAD10MTFCC10FUNCSTAT10ALAND10AWATER10INTPTLAT10INTPTLON10geometryDefinition N (communities)
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211+26.2732070-097.2763703POLYGON ((-97.22490 26.41153, -97.22436 26.411...False
\n", "
" ], "text/plain": [ " STATEFP10 COUNTYFP10 TRACTCE10 ORIGINAL_TRACT NAME10 NAMELSAD10 \\\n", "0 48 061 012305 48061012305 123.05 Census Tract 123.05 \n", "\n", " MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10 \\\n", "0 G5020 S 25920881 324996211 +26.2732070 -097.2763703 \n", "\n", " geometry \\\n", "0 POLYGON ((-97.22490 26.41153, -97.22436 26.411... \n", "\n", " Definition N (communities) \n", "0 False " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## look at census data for Cameron\n", "cam_only_df = pd.DataFrame({GEOID_TRACT_FIELD:[CAM_ID], SCORE_N_COMMUNITIES:[False]})\n", "\n", "cam_only_df: gpd.GeoDataFrame = tract_data.merge(\n", " cam_only_df, on=GEOID_TRACT_FIELD\n", " )\n", "cam_only_df = cam_only_df.rename(columns={GEOID_TRACT_FIELD: ORIGINAL_TRACT})\n", "cam_only_df" ] }, { "cell_type": "code", "execution_count": 5, "id": "87cdbbaf", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
STATEFP10_leftCOUNTYFP10_leftTRACTCE10_leftORIGINAL_TRACTNAME10_leftNAMELSAD10_leftMTFCC10_leftFUNCSTAT10_leftALAND10_leftAWATER10_left...TRACTCE10_rightGEOID10_TRACTNAME10_rightNAMELSAD10_rightMTFCC10_rightFUNCSTAT10_rightALAND10_rightAWATER10_rightINTPTLAT10_rightINTPTLON10_right
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211...01010048061010100101Census Tract 101G5020S365458278113041841+26.2713946-097.4414470
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211...950700484899507009507Census Tract 9507G5020S1025415174377000048+26.5154317-097.5790835
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211...990000484899900009900Census Tract 9900G5020S0121414926+26.5064260-097.2240134
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211...01270048061012700127Census Tract 127G5020S12962279967725054+25.9786218-097.2580863
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211...01230448061012304123.04Census Tract 123.04G5020S57612996286773+26.0631407-097.2184868
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211...01230148061012301123.01Census Tract 123.01G5020S137622651119430273+26.1581170-097.3180348
04806101230548061012305123.05Census Tract 123.05G5020S25920881324996211...990000480619900009900Census Tract 9900G5020S0268272237+26.1902408-097.1473235
\n", "

7 rows × 27 columns

\n", "
" ], "text/plain": [ " STATEFP10_left COUNTYFP10_left TRACTCE10_left ORIGINAL_TRACT NAME10_left \\\n", "0 48 061 012305 48061012305 123.05 \n", "0 48 061 012305 48061012305 123.05 \n", "0 48 061 012305 48061012305 123.05 \n", "0 48 061 012305 48061012305 123.05 \n", "0 48 061 012305 48061012305 123.05 \n", "0 48 061 012305 48061012305 123.05 \n", "0 48 061 012305 48061012305 123.05 \n", "\n", " NAMELSAD10_left MTFCC10_left FUNCSTAT10_left ALAND10_left \\\n", "0 Census Tract 123.05 G5020 S 25920881 \n", "0 Census Tract 123.05 G5020 S 25920881 \n", "0 Census Tract 123.05 G5020 S 25920881 \n", "0 Census Tract 123.05 G5020 S 25920881 \n", "0 Census Tract 123.05 G5020 S 25920881 \n", "0 Census Tract 123.05 G5020 S 25920881 \n", "0 Census Tract 123.05 G5020 S 25920881 \n", "\n", " AWATER10_left ... TRACTCE10_right GEOID10_TRACT NAME10_right \\\n", "0 324996211 ... 010100 48061010100 101 \n", "0 324996211 ... 950700 48489950700 9507 \n", "0 324996211 ... 990000 48489990000 9900 \n", "0 324996211 ... 012700 48061012700 127 \n", "0 324996211 ... 012304 48061012304 123.04 \n", "0 324996211 ... 012301 48061012301 123.01 \n", "0 324996211 ... 
990000 48061990000 9900 \n", "\n", " NAMELSAD10_right MTFCC10_right FUNCSTAT10_right ALAND10_right \\\n", "0 Census Tract 101 G5020 S 365458278 \n", "0 Census Tract 9507 G5020 S 1025415174 \n", "0 Census Tract 9900 G5020 S 0 \n", "0 Census Tract 127 G5020 S 129622799 \n", "0 Census Tract 123.04 G5020 S 5761299 \n", "0 Census Tract 123.01 G5020 S 137622651 \n", "0 Census Tract 9900 G5020 S 0 \n", "\n", " AWATER10_right INTPTLAT10_right INTPTLON10_right \n", "0 113041841 +26.2713946 -097.4414470 \n", "0 377000048 +26.5154317 -097.5790835 \n", "0 121414926 +26.5064260 -097.2240134 \n", "0 67725054 +25.9786218 -097.2580863 \n", "0 6286773 +26.0631407 -097.2184868 \n", "0 119430273 +26.1581170 -097.3180348 \n", "0 268272237 +26.1902408 -097.1473235 \n", "\n", "[7 rows x 27 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# get all the tracts that are adjacent to cameron\n", "cam_adjacent_tracts: gpd.GeoDataFrame = cam_only_df.sjoin(\n", " tract_data, predicate=\"touches\"\n", " )\n", "cam_adjacent_tracts" ] }, { "cell_type": "code", "execution_count": 6, "id": "af3aa581", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['STATEFP10_left', 'COUNTYFP10_left', 'TRACTCE10_left', 'ORIGINAL_TRACT',\n", " 'NAME10_left', 'NAMELSAD10_left', 'MTFCC10_left', 'FUNCSTAT10_left',\n", " 'ALAND10_left', 'AWATER10_left', 'INTPTLAT10_left', 'INTPTLON10_left',\n", " 'geometry', 'Definition N (communities)', 'index_right',\n", " 'STATEFP10_right', 'COUNTYFP10_right', 'TRACTCE10_right',\n", " 'GEOID10_TRACT', 'NAME10_right', 'NAMELSAD10_right', 'MTFCC10_right',\n", " 'FUNCSTAT10_right', 'ALAND10_right', 'AWATER10_right',\n", " 'INTPTLAT10_right', 'INTPTLON10_right'],\n", " dtype='object')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cam_adjacent_tracts.columns" ] }, { "cell_type": "code", "execution_count": 7, "id": "85559ed4", "metadata": {}, "outputs": [ { "data": { 
"text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ORIGINAL_TRACTGEOID10_TRACTAWATER10_leftALAND10_leftAWATER10_rightALAND10_right
0480610123054806101010032499621125920881113041841365458278
04806101230548489950700324996211259208813770000481025415174
04806101230548489990000324996211259208811214149260
048061012305480610127003249962112592088167725054129622799
048061012305480610123043249962112592088162867735761299
0480610123054806101230132499621125920881119430273137622651
04806101230548061990000324996211259208812682722370
\n", "
" ], "text/plain": [ " ORIGINAL_TRACT GEOID10_TRACT AWATER10_left ALAND10_left AWATER10_right \\\n", "0 48061012305 48061010100 324996211 25920881 113041841 \n", "0 48061012305 48489950700 324996211 25920881 377000048 \n", "0 48061012305 48489990000 324996211 25920881 121414926 \n", "0 48061012305 48061012700 324996211 25920881 67725054 \n", "0 48061012305 48061012304 324996211 25920881 6286773 \n", "0 48061012305 48061012301 324996211 25920881 119430273 \n", "0 48061012305 48061990000 324996211 25920881 268272237 \n", "\n", " ALAND10_right \n", "0 365458278 \n", "0 1025415174 \n", "0 0 \n", "0 129622799 \n", "0 5761299 \n", "0 137622651 \n", "0 0 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cam_adjacent_tracts[['ORIGINAL_TRACT', 'GEOID10_TRACT', \n", " 'AWATER10_left', 'ALAND10_left', 'AWATER10_right', 'ALAND10_right']]" ] }, { "cell_type": "code", "execution_count": 8, "id": "60d0801f", "metadata": {}, "outputs": [], "source": [ "# # compare to what I see on the map for Cameron:\n", "\n", "# 48489950700 Willacy County\n", "# 48061010100 Cameron County (contains Laguna Atascosa)\n", "# 48061012301 Cameron County (contains KPIL)\n", "# 48061012304 Cameron County (contains Port Isabel)\n", "# 48061012700 Cameron County (contains Boca Chica Village)\n", "\n", "# # what about 48489990000 and 48061990000?\n", "# both have ALAND10 = 0\n", "# and both tract IDs end in 9990000" ] }, { "cell_type": "markdown", "id": "5201e422", "metadata": {}, "source": [ "Proposed solution: remove tracts where ALAND10=10 (ie water-only tracts)" ] }, { "cell_type": "markdown", "id": "43dc09e8", "metadata": {}, "source": [ "## Compare water-only tracts to tracts with land area" ] }, { "cell_type": "code", "execution_count": 9, "id": "4045bb4f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "367" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# How many of these water-only tracts 
exist?\n", "len(tract_data[tract_data.ALAND10==0])" ] }, { "cell_type": "code", "execution_count": 10, "id": "37fdf29b", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
STATEFP10COUNTYFP10TRACTCE10GEOID10_TRACTNAME10NAMELSAD10MTFCC10FUNCSTAT10ALAND10AWATER10INTPTLAT10INTPTLON10geometrywater_only
80836063990000360639900009900Census Tract 9900G5020S01550634668+43.4569085-078.7926530POLYGON ((-78.46554 43.37337, -78.47053 43.372...True
1015360850089003608500890089Census Tract 89G5020S0100710+40.6469814-074.1024106POLYGON ((-74.10504 40.64808, -74.10025 40.648...True
114936085990100360859901009901Census Tract 9901G5020S080255151+40.5255512-074.1085829POLYGON ((-74.25482 40.49390, -74.25308 40.491...True
121936013990000360139900009900Census Tract 9900G5020S01015062614+42.5006582-079.5141161POLYGON ((-79.24982 42.53745, -79.24982 42.537...True
122636075990000360759900009900Census Tract 9900G5020S0724224875+43.5955454-076.4442549POLYGON ((-76.55001 43.45923, -76.55064 43.458...True
.............................................
7337923015990000230159900009900Census Tract 9900G5020S0375299033+43.8029690-069.4652520POLYGON ((-69.68695 43.81613, -69.68695 43.815...True
7339923031990100230319901009901Census Tract 9901G5020S0567248819+43.1483531-070.4998364POLYGON ((-70.22209 43.46688, -70.22432 43.464...True
7350923023990000230239900009900Census Tract 9900G5020S0161372101+43.7084463-069.7721513POLYGON ((-69.69756 43.81594, -69.69755 43.813...True
7352723013990000230139900009900Census Tract 9900G5020S01636919697+43.9538894-068.9327255POLYGON ((-68.73563 44.11917, -68.73179 44.117...True
7362923005990000230059900009900Census Tract 9900G5020S0361952519+43.6058367-070.0776892POLYGON ((-70.07870 43.66881, -70.06517 43.673...True
\n", "

367 rows × 14 columns

\n", "
" ], "text/plain": [ " STATEFP10 COUNTYFP10 TRACTCE10 GEOID10_TRACT NAME10 NAMELSAD10 \\\n", "808 36 063 990000 36063990000 9900 Census Tract 9900 \n", "1015 36 085 008900 36085008900 89 Census Tract 89 \n", "1149 36 085 990100 36085990100 9901 Census Tract 9901 \n", "1219 36 013 990000 36013990000 9900 Census Tract 9900 \n", "1226 36 075 990000 36075990000 9900 Census Tract 9900 \n", "... ... ... ... ... ... ... \n", "73379 23 015 990000 23015990000 9900 Census Tract 9900 \n", "73399 23 031 990100 23031990100 9901 Census Tract 9901 \n", "73509 23 023 990000 23023990000 9900 Census Tract 9900 \n", "73527 23 013 990000 23013990000 9900 Census Tract 9900 \n", "73629 23 005 990000 23005990000 9900 Census Tract 9900 \n", "\n", " MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10 \\\n", "808 G5020 S 0 1550634668 +43.4569085 -078.7926530 \n", "1015 G5020 S 0 100710 +40.6469814 -074.1024106 \n", "1149 G5020 S 0 80255151 +40.5255512 -074.1085829 \n", "1219 G5020 S 0 1015062614 +42.5006582 -079.5141161 \n", "1226 G5020 S 0 724224875 +43.5955454 -076.4442549 \n", "... ... ... ... ... ... ... \n", "73379 G5020 S 0 375299033 +43.8029690 -069.4652520 \n", "73399 G5020 S 0 567248819 +43.1483531 -070.4998364 \n", "73509 G5020 S 0 161372101 +43.7084463 -069.7721513 \n", "73527 G5020 S 0 1636919697 +43.9538894 -068.9327255 \n", "73629 G5020 S 0 361952519 +43.6058367 -070.0776892 \n", "\n", " geometry water_only \n", "808 POLYGON ((-78.46554 43.37337, -78.47053 43.372... True \n", "1015 POLYGON ((-74.10504 40.64808, -74.10025 40.648... True \n", "1149 POLYGON ((-74.25482 40.49390, -74.25308 40.491... True \n", "1219 POLYGON ((-79.24982 42.53745, -79.24982 42.537... True \n", "1226 POLYGON ((-76.55001 43.45923, -76.55064 43.458... True \n", "... ... ... \n", "73379 POLYGON ((-69.68695 43.81613, -69.68695 43.815... True \n", "73399 POLYGON ((-70.22209 43.46688, -70.22432 43.464... True \n", "73509 POLYGON ((-69.69756 43.81594, -69.69755 43.813... 
True \n", "73527 POLYGON ((-68.73563 44.11917, -68.73179 44.117... True \n", "73629 POLYGON ((-70.07870 43.66881, -70.06517 43.673... True \n", "\n", "[367 rows x 14 columns]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# make a copy of the tract data\n", "tract_explore = tract_data.copy()\n", "\n", "# add bool for whether tract has land area\n", "tract_explore['water_only'] = tract_explore.ALAND10==0\n", "\n", "# preview water tracts\n", "tract_explore[tract_explore.water_only]" ] }, { "cell_type": "code", "execution_count": 11, "id": "d0c30aef", "metadata": {}, "outputs": [], "source": [ "# confirm we can get the tract code stored in TRACTCE10 by taking last 6 chars of GEOID10_TRACT\n", "# (because input df may not contain the isolated tract code column)\n", "tract_explore['tract_code'] = tract_explore.GEOID10_TRACT.apply(lambda x: x[-6:])\n", "assert all(tract_explore.tract_code==tract_explore.TRACTCE10)" ] }, { "cell_type": "code", "execution_count": 12, "id": "5436fdd6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GEOID10_TRACTTRACTCE10tract_codetract_code_num
020071958100958100958100958100
120175965600965600965600965600
220175965700965700965700965700
32004302030002030002030020300
42004302020002020002020020200
\n", "
" ], "text/plain": [ " GEOID10_TRACT TRACTCE10 tract_code tract_code_num\n", "0 20071958100 958100 958100 958100\n", "1 20175965600 965600 965600 965600\n", "2 20175965700 965700 965700 965700\n", "3 20043020300 020300 020300 20300\n", "4 20043020200 020200 020200 20200" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# convert tract codes to ints to check ranges\n", "tract_explore['tract_code_num'] = tract_explore['tract_code'].apply(lambda x: int(x))\n", "tract_explore[['GEOID10_TRACT', 'TRACTCE10', 'tract_code', 'tract_code_num']].head()" ] }, { "cell_type": "markdown", "id": "a2e1814b", "metadata": {}, "source": [ "Per Census documentation:\n", "\n", "- 000100 to 989900—Basic number range for census tracts\n", "- 990000 to 990099—Basic number for census tracts in water areas\n", "- 990100 to 998900—Basic number range for census tracts\n", "\n", "source: https://www2.census.gov/geo/pdfs/maps-data/data/tiger/tgrshp2017/TGRSHP2017_TechDoc_Ch3.pdf (page 3-26)" ] }, { "cell_type": "code", "execution_count": 13, "id": "cf09b1b6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GEOID10_TRACTTRACTCE10tract_codetract_code_numtract_in_water_range
020071958100958100958100958100False
120175965600965600965600965600False
220175965700965700965700965700False
32004302030002030002030020300False
42004302020002020002020020200False
\n", "
" ], "text/plain": [ " GEOID10_TRACT TRACTCE10 tract_code tract_code_num tract_in_water_range\n", "0 20071958100 958100 958100 958100 False\n", "1 20175965600 965600 965600 965600 False\n", "2 20175965700 965700 965700 965700 False\n", "3 20043020300 020300 020300 20300 False\n", "4 20043020200 020200 020200 20200 False" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def in_water_range(x):\n", " if x >= 990000 and x <= 990099:\n", " return True\n", " return False\n", "\n", "tract_explore['tract_in_water_range'] = tract_explore['tract_code_num'].apply(in_water_range)\n", "tract_explore[['GEOID10_TRACT', 'TRACTCE10', 'tract_code', 'tract_code_num', 'tract_in_water_range']].head()" ] }, { "cell_type": "code", "execution_count": 14, "id": "1c0fab10", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GEOID10_TRACTTRACTCE10tract_codetract_code_numtract_in_water_range
80836063990000990000990000990000True
1015360850089000089000089008900False
114936085990100990100990100990100False
121936013990000990000990000990000True
122636075990000990000990000990000True
..................
7337923015990000990000990000990000True
7339923031990100990100990100990100False
7350923023990000990000990000990000True
7352723013990000990000990000990000True
7362923005990000990000990000990000True
\n", "

367 rows × 5 columns

\n", "
" ], "text/plain": [ " GEOID10_TRACT TRACTCE10 tract_code tract_code_num tract_in_water_range\n", "808 36063990000 990000 990000 990000 True\n", "1015 36085008900 008900 008900 8900 False\n", "1149 36085990100 990100 990100 990100 False\n", "1219 36013990000 990000 990000 990000 True\n", "1226 36075990000 990000 990000 990000 True\n", "... ... ... ... ... ...\n", "73379 23015990000 990000 990000 990000 True\n", "73399 23031990100 990100 990100 990100 False\n", "73509 23023990000 990000 990000 990000 True\n", "73527 23013990000 990000 990000 990000 True\n", "73629 23005990000 990000 990000 990000 True\n", "\n", "[367 rows x 5 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# look at water-only tracts\n", "tract_explore[tract_explore.water_only]\\\n", " [['GEOID10_TRACT', 'TRACTCE10', 'tract_code', 'tract_code_num', 'tract_in_water_range']]" ] }, { "cell_type": "markdown", "id": "4ba84ff5", "metadata": {}, "source": [ "Some of the tracts with ALAND10=0 have tract numbers not in the water-only range." ] }, { "cell_type": "code", "execution_count": 15, "id": "320bd044", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GEOID10_TRACTtract_in_water_range
countsummean
water_only
False7376700.000000
True3672270.618529
\n", "
" ], "text/plain": [ " GEOID10_TRACT tract_in_water_range \n", " count sum mean\n", "water_only \n", "False 73767 0 0.000000\n", "True 367 227 0.618529" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tract_explore.groupby('water_only').agg({'GEOID10_TRACT': 'count', \n", " 'tract_in_water_range': ['sum', 'mean']})" ] }, { "cell_type": "markdown", "id": "612c72b8", "metadata": {}, "source": [ "There are no tracts with land area that have tract codes in the water-only range. But about 38% of tracts with 0 land area have normal tract codes outside the water range." ] }, { "cell_type": "code", "execution_count": 16, "id": "e0d5fb77", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
STATEFP10COUNTYFP10TRACTCE10GEOID10_TRACTNAME10NAMELSAD10MTFCC10FUNCSTAT10ALAND10AWATER10INTPTLAT10INTPTLON10geometrywater_onlytract_codetract_code_numtract_in_water_range
1015360850089003608500890089Census Tract 89G5020S0100710+40.6469814-074.1024106POLYGON ((-74.10504 40.64808, -74.10025 40.648...True0089008900False
114936085990100360859901009901Census Tract 9901G5020S080255151+40.5255512-074.1085829POLYGON ((-74.25482 40.49390, -74.25308 40.491...True990100990100False
284036047990100360479901009901Census Tract 9901G5020S017793514+40.5649933-074.0148865POLYGON ((-74.00620 40.59437, -74.00619 40.594...True990100990100False
309536059990100360599901009901Census Tract 9901G5020S013672168+40.8790568-073.7020503POLYGON ((-73.76050 40.84184, -73.76102 40.841...True990100990100False
309636059990200360599902009902Census Tract 9902G5020S027086403+40.9078457-073.6526612POLYGON ((-73.65151 40.88524, -73.66481 40.875...True990200990200False
......................................................
6894615009990200150099902009902Census Tract 9902G5020S01643042883+20.6128423-156.5486800POLYGON ((-156.70541 20.82632, -156.70658 20.8...True990200990200False
6894915009991200150099912009912Census Tract 9912G5020S0482573725+20.9476085-156.9795251POLYGON ((-157.11121 20.87500, -157.11725 20.8...True991200991200False
7033939093990200390939902009902Census Tract 9902G5020S01094173093+41.6347746-082.1526215POLYGON ((-82.16712 41.47534, -82.16943 41.474...True990200990200False
7097939043990100390439901009901Census Tract 9901G5020S0853315204+41.5559059-082.5241364POLYGON ((-82.51199 41.39475, -82.51472 41.395...True990100990100False
7339923031990100230319901009901Census Tract 9901G5020S0567248819+43.1483531-070.4998364POLYGON ((-70.22209 43.46688, -70.22432 43.464...True990100990100False
\n", "

140 rows × 17 columns

\n", "
" ], "text/plain": [ " STATEFP10 COUNTYFP10 TRACTCE10 GEOID10_TRACT NAME10 NAMELSAD10 \\\n", "1015 36 085 008900 36085008900 89 Census Tract 89 \n", "1149 36 085 990100 36085990100 9901 Census Tract 9901 \n", "2840 36 047 990100 36047990100 9901 Census Tract 9901 \n", "3095 36 059 990100 36059990100 9901 Census Tract 9901 \n", "3096 36 059 990200 36059990200 9902 Census Tract 9902 \n", "... ... ... ... ... ... ... \n", "68946 15 009 990200 15009990200 9902 Census Tract 9902 \n", "68949 15 009 991200 15009991200 9912 Census Tract 9912 \n", "70339 39 093 990200 39093990200 9902 Census Tract 9902 \n", "70979 39 043 990100 39043990100 9901 Census Tract 9901 \n", "73399 23 031 990100 23031990100 9901 Census Tract 9901 \n", "\n", " MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10 \\\n", "1015 G5020 S 0 100710 +40.6469814 -074.1024106 \n", "1149 G5020 S 0 80255151 +40.5255512 -074.1085829 \n", "2840 G5020 S 0 17793514 +40.5649933 -074.0148865 \n", "3095 G5020 S 0 13672168 +40.8790568 -073.7020503 \n", "3096 G5020 S 0 27086403 +40.9078457 -073.6526612 \n", "... ... ... ... ... ... ... \n", "68946 G5020 S 0 1643042883 +20.6128423 -156.5486800 \n", "68949 G5020 S 0 482573725 +20.9476085 -156.9795251 \n", "70339 G5020 S 0 1094173093 +41.6347746 -082.1526215 \n", "70979 G5020 S 0 853315204 +41.5559059 -082.5241364 \n", "73399 G5020 S 0 567248819 +43.1483531 -070.4998364 \n", "\n", " geometry water_only \\\n", "1015 POLYGON ((-74.10504 40.64808, -74.10025 40.648... True \n", "1149 POLYGON ((-74.25482 40.49390, -74.25308 40.491... True \n", "2840 POLYGON ((-74.00620 40.59437, -74.00619 40.594... True \n", "3095 POLYGON ((-73.76050 40.84184, -73.76102 40.841... True \n", "3096 POLYGON ((-73.65151 40.88524, -73.66481 40.875... True \n", "... ... ... \n", "68946 POLYGON ((-156.70541 20.82632, -156.70658 20.8... True \n", "68949 POLYGON ((-157.11121 20.87500, -157.11725 20.8... True \n", "70339 POLYGON ((-82.16712 41.47534, -82.16943 41.474... 
True \n", "70979 POLYGON ((-82.51199 41.39475, -82.51472 41.395... True \n", "73399 POLYGON ((-70.22209 43.46688, -70.22432 43.464... True \n", "\n", " tract_code tract_code_num tract_in_water_range \n", "1015 008900 8900 False \n", "1149 990100 990100 False \n", "2840 990100 990100 False \n", "3095 990100 990100 False \n", "3096 990200 990200 False \n", "... ... ... ... \n", "68946 990200 990200 False \n", "68949 991200 991200 False \n", "70339 990200 990200 False \n", "70979 990100 990100 False \n", "73399 990100 990100 False \n", "\n", "[140 rows x 17 columns]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# view the exceptions (no land area, but tract code not in water range)\n", "water_exceptions = tract_explore[(tract_explore.water_only==True) & (tract_explore.tract_in_water_range==False)]\n", "water_exceptions" ] }, { "cell_type": "code", "execution_count": 17, "id": "b36fdc0d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "36063990000, 36013990000, 36075990000, 36055990000, 36045990001, 36029990000, 36073990000, 41007990000, 41039990000, 41019990000, 01003990000, 01097990000, 17031990000, 17097990000, 10001990000, 10005990000, 06083990000, 06053990000, 06061990000, 06015990000, 06079990000, 06013990000, 06017990000, 06001990000, 51710990000, 26157990000, 26011990000, 26001990000, 26141990000, 26019990000, 26069990000, 26053990000, 26147990000, 26153990000, 26063990000, 26159990000, 26033990000, 26005990000, 26103990000, 26061990000, 26003990000, 26095990000, 26041990000, 26031990000, 26009990000, 26139990000, 26047990000, 26109990000, 26007990000, 26021990000, 26089990000, 26017990000, 26115990000, 26151990000, 26055990000, 26127990000, 26097990000, 26121990000, 26105990000, 26101990000, 26029990000, 66010990000, 27031990000, 69100990000, 69110990000, 69120990000, 28045990000, 28059990000, 28047990000, 53031990000, 53061990002, 53027990000, 12009990000, 12017990000, 
12037990000, 12011990000, 12087990000, 12081990000, 12093990000, 12029990000, 12031990000, 12071990000, 12065990000, 12033990000, 12045990000, 12086990000, 12129990000, 12103990000, 12057990000, 12115990000, 12101990000, 12123990000, 12085990000, 12051990000, 12005990000, 12035990000, 12053990000, 12043990000, 12131990000, 12099990000, 12075990000, 12015990000, 12061990000, 12095990000, 12021990000, 12113990000, 12089990000, 12111990000, 12127990000, 32031990000, 32005990000, 32510990000, 24029990000, 24017990000, 24047990000, 24041990000, 24037990000, 24035990000, 24019990000, 24003990000, 48039990000, 48489990000, 48261990000, 48355990000, 48273990000, 48321990000, 48167990000, 48061990000, 48071990000, 48007990000, 48245990000, 48057990000, 09001990000, 09009990000, 25001990000, 25023990003, 25005990000, 25007990000, 25019990000, 72059990001, 72027990000, 72005990000, 72091990025, 72037990000, 72095990000, 72145990000, 72147990000, 72111990000, 72097990000, 72031990000, 72115990000, 72133990000, 72089990001, 72011990000, 72137990000, 72065990016, 72071990000, 72103990013, 72087990000, 72151990000, 72003990000, 72017990000, 72075990001, 72023990000, 72143990000, 72055990000, 72051990021, 72109990000, 72127990000, 33015990000, 44005990000, 13039990000, 13127990000, 13051990000, 13179990000, 13191990000, 34001990000, 34029990000, 34033990000, 34011990000, 34025990000, 22071990000, 22087990000, 22109990000, 22045990000, 22023990000, 22101990000, 22113990000, 22057990000, 22075990000, 22103990000, 22051990000, 18091990000, 18089990000, 18127990000, 55031990000, 55061990000, 55051990000, 55083990000, 55059990000, 55101990000, 55007990000, 55075990000, 55003990000, 55029990000, 55079990000, 55071990000, 55117990000, 55089990000, 42049990000, 15003990001, 15001990000, 15005990000, 15009990000, 78010990000, 78020990000, 78030990000, 39007990000, 39085990000, 39095990000, 39035990000, 23029990000, 23009990000, 23015990000, 23023990000, 23013990000, 23005990000\n" ] } ], 
"source": [ "# print list of \"normal\" water-only IDs to put into compare tool\n", "water_normals = tract_explore[(tract_explore.water_only==True) & (tract_explore.tract_in_water_range==True)]\n", "print(', '.join(water_normals.GEOID10_TRACT.values))" ] }, { "cell_type": "code", "execution_count": 18, "id": "e9380f21", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "36063990000, 36085008900, 36085990100, 36013990000, 36075990000, 36055990000, 36045990001, 36047990100, 36059990100, 36059990200, 36059990301, 36059990302, 36059990400, 36081990100, 36029990000, 36073990000, 36103990100, 36117990100, 36011990200, 41007990000, 41057990100, 41039990000, 41041990100, 41015990101, 41011990101, 41019990000, 01003990000, 01097990000, 17031990000, 17097990000, 37031990100, 37031990200, 37053990100, 37019990100, 37055990100, 37055990200, 37129990100, 37095990100, 37095990200, 37141990100, 37137990100, 37133990100, 10001990000, 10005990000, 10003990100, 06083990000, 06053990000, 06061990000, 06059990100, 06097990100, 06045990100, 06073990100, 06041990100, 06015990000, 06081990100, 06079990000, 06013990000, 06111990100, 06017990000, 06001990000, 06075990100, 06087990100, 06023990100, 06037990200, 06037990300, 06037990100, 51103990100, 51115990100, 51650990100, 51131990100, 51735990100, 51810990100, 51133990100, 51001990100, 51001990200, 51119990100, 51199990100, 51710990000, 26157990000, 26011990000, 26163990100, 26163990200, 26001990000, 26141990000, 26011990100, 26019990000, 26069990000, 26053990000, 26147990000, 26153990000, 26063990000, 26159990000, 26033990000, 26033990100, 26131990100, 26005990000, 26103990000, 26099990100, 26061990000, 26061990100, 26003990000, 26095990000, 26041990000, 26031990000, 26009990000, 26083990100, 26139990000, 26041990100, 26047990000, 26109990000, 26007990000, 26021990000, 26089990000, 26017990000, 26115990000, 26151990000, 26013990100, 26055990000, 26127990000, 26097990000, 26121990000, 26105990000, 
26101990000, 26029990000, 66010990000, 27137990100, 27031990000, 27075990100, 69100990000, 69110990000, 69120990000, 28045990000, 28059990000, 28047990000, 53061990100, 53035990100, 53057990100, 53009990100, 53067990100, 53049990100, 53031990000, 53033990100, 53061990002, 53029992201, 53027990000, 53055990100, 12009990000, 12017990000, 12037990000, 12037990100, 12011990000, 12087990000, 12081990000, 12093990000, 12029990000, 12031990000, 12071990000, 12065990000, 12033990000, 12045990000, 12103990100, 12086990000, 12129990000, 12103990000, 12057990000, 12057990100, 12115990000, 12101990000, 12109990200, 12109990100, 12123990000, 12085990000, 12085990100, 12051990000, 12005990000, 12035990000, 12053990000, 12043990000, 12131990000, 12099990000, 12075990000, 12015990000, 12099990100, 12061990000, 12095990000, 12021990000, 12113990000, 12089990000, 12111990000, 12127990000, 12091990200, 12091990100, 45013990100, 45053990100, 45019990100, 45029990100, 45043990100, 45051990100, 32031990000, 32031990100, 32005990000, 32510990000, 24029990000, 24017990000, 24047990000, 24041990000, 24039990100, 24037990000, 24035990200, 24035990100, 24035990000, 24019990000, 24003990000, 24009990100, 48039990000, 48489990000, 48261990000, 48355990000, 48273990000, 48321990000, 48167990000, 48061990000, 48071990000, 48007990000, 48245990000, 48057990000, 09001990000, 09011990100, 09007990100, 09009990000, 25001990000, 25023990003, 25005990000, 25025990101, 25009990100, 25007990000, 25019990000, 72049990501, 72059990001, 72027990000, 72033990201, 72053990103, 72113993000, 72015991500, 72005990000, 72091990025, 72013992900, 72037990000, 72095990000, 72145990000, 72117990400, 72147990000, 72111990000, 72097990000, 72079991100, 72031990000, 72115990000, 72133990000, 72089990001, 72011990000, 72123992800, 72137990000, 72065990016, 72071990000, 72103990013, 72087990000, 72069991800, 72151990000, 72003990000, 72057992600, 72017990000, 72075990001, 72119992700, 72023990000, 72143990000, 
72055990000, 72051990021, 72109990000, 72127990000, 33015990000, 44009990200, 44009990100, 44005990000, 13039990000, 13127990000, 13051990000, 13179990000, 13191990000, 34001990000, 34029990000, 34033990000, 34011990000, 34009990100, 34025990000, 22071990000, 22087990000, 22109990000, 22045990000, 22023990000, 22101990000, 22113990000, 22057990000, 22051990100, 22075990000, 22103990000, 22051990000, 18091990000, 18089990000, 18127990000, 55031990000, 55061990000, 55051990000, 55083990000, 55059990000, 55101990000, 55007990000, 55025991703, 55025991702, 55075990000, 55003990000, 55029990000, 55079990000, 55071990000, 55117990000, 55089990000, 42049990000, 15003990001, 15001991700, 15001990400, 15001990000, 15001990300, 15001991100, 15001991200, 15001991500, 15001991300, 15001991400, 15001991600, 15001991000, 15001990100, 15001990500, 15001990600, 15001990900, 15001990800, 15001990700, 15005990000, 15007990100, 15007990300, 15007990200, 15009990200, 15009991200, 15009990000, 78010990000, 78020990000, 78030990000, 39007990000, 39085990000, 39093990200, 39043990100, 39095990000, 39035990000, 23029990000, 23009990000, 23015990000, 23031990100, 23023990000, 23013990000, 23005990000\n" ] } ], "source": [ "# print list of ALAND10=0 IDs to put into compare tool\n", "print(', '.join(tract_explore[tract_explore.water_only==True].GEOID10_TRACT.values))" ] }, { "cell_type": "markdown", "id": "8d82e594", "metadata": {}, "source": [ "The tract comparison tool shows that 21 of the 140 \"water exception\" tracts are in Hawaii. California is next highest with 13, followed by Puerto Rico with 12. (No other territories have \"water exception\" tracts.)" ] }, { "cell_type": "markdown", "id": "74544236", "metadata": {}, "source": [ "## Simulate calculate_tract_adjacency_scores()" ] },
{ "cell_type": "code", "execution_count": 19, "id": "3703ffb1", "metadata": {}, "outputs": [], "source": [ "def calculate_tract_adjacency_scores(\n", "    df: pd.DataFrame, score_column: str, tract_data: gpd.GeoDataFrame\n", ") -> tuple:\n", "    \"\"\"Calculate the mean score of each tract in df based on its neighbors.\n", "\n", "    Args:\n", "        df (pandas.DataFrame): A dataframe with at least the following columns:\n", "            * field_names.GEOID_TRACT_FIELD\n", "            * score_column\n", "\n", "        score_column (str): The name of the column that contains the scores\n", "            to average\n", "\n", "        tract_data (GeoDataFrame): tract data, normally loaded in the first line\n", "            of the actual function: tract_data = get_tract_geojson()\n", "\n", "    Returns:\n", "        Tuple containing the df returned by the actual function, plus intermediates:\n", "        - returned_donut_bools (pandas.DataFrame): A dataframe with two columns:\n", "            * field_names.GEOID_TRACT_FIELD\n", "            * {score_column}{ADJACENCY_INDEX_SUFFIX}, which is the average of\n", "              score_column for each tract that touches the tract identified\n", "              in field_names.GEOID_TRACT_FIELD\n", "          NB: this is the df that gets returned in the actual function\n", "        - df (geopandas.GeoDataFrame): input df after merging with Census data\n", "        - adjacent_tracts (geopandas.GeoDataFrame): adjacency df\n", "    \"\"\"\n", "\n", "    # attach tract geometries to the input scores\n", "    df: gpd.GeoDataFrame = tract_data.merge(\n", "        df, on=GEOID_TRACT_FIELD\n", "    )\n", "    df = df.rename(columns={GEOID_TRACT_FIELD: ORIGINAL_TRACT})\n", "\n", "    # pair each input tract with every Census tract whose boundary it touches\n", "    adjacent_tracts: gpd.GeoDataFrame = df.sjoin(\n", "        tract_data, predicate=\"touches\"\n", "    )\n", "\n", "    # average the neighbors' scores for each touched tract\n", "    returned_donut_bools = (\n", "        adjacent_tracts.groupby(GEOID_TRACT_FIELD)[[score_column]]\n", "        .mean()\n", "        .reset_index()\n", "        .rename(\n", "            columns={\n", "                score_column: f\"{score_column}{ADJACENCY_INDEX_SUFFIX}\",\n", "            }\n", "        )\n", "    )\n", "\n", "    return (returned_donut_bools, df, adjacent_tracts)\n" ] },
{ "cell_type": "code", "execution_count": 20, "id": "d4f50253", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['48061010100',\n", " '48489950700',\n", " '48489990000',\n", " '48061012700',\n", " '48061012304',\n", " '48061012301',\n", " '48061990000']" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# save the IDs of tracts adjacent to the Cameron example tract so we can simulate\n", "# passing them into calculate_tract_adjacency_scores() via the df arg\n", "adjacent_track_id_list_cam = list(cam_adjacent_tracts.GEOID10_TRACT.unique())\n", "adjacent_track_id_list_cam" ] }, { "cell_type": "code", "execution_count": 21, "id": "4bac089c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities)\n", "0 48061012305 False\n", "1 48061010100 True\n", "2 48489950700 True\n", "3 48489990000 False\n", "4 48061012700 True\n", "5 48061012304 True\n", "6 48061012301 True\n", "7 48061990000 False" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# df is passed into the function and must contain the score column we're examining\n", "# (in this case, SCORE_N_COMMUNITIES).\n", "# Here we pass in our water boundary example tract in Cameron, TX plus all its adjacent tracts.\n", "# Initialize the disadvantaged bool as False for our example tract and True for the others.\n", "df_cam_plus = pd.DataFrame(\n", "    {\n", "        GEOID_TRACT_FIELD: [CAM_ID] + adjacent_track_id_list_cam,\n", "        SCORE_N_COMMUNITIES: [False] + [True] * len(adjacent_track_id_list_cam),\n", "    }\n", ")\n", "\n", "# set disadvantaged bool to False for the water-only tracts\n", "df_cam_plus.loc[\n", "    df_cam_plus[GEOID_TRACT_FIELD].isin(['48489990000', '48061990000']),\n", "    SCORE_N_COMMUNITIES,\n", "] = False\n", "\n", "df_cam_plus" ] }, { "cell_type": "markdown", "id": "ad9570b0", "metadata": {}, "source": [ "### Simulate v1 run of calculate_tract_adjacency_scores()" ] }, { "cell_type": "code", "execution_count": 22, "id": "74826fc9", "metadata": { "scrolled": true }, "outputs": [], "source": [ "returned_donut_bools__v1, df_census__v1, adjacent_tracts__v1 = calculate_tract_adjacency_scores(\n", "    df=df_cam_plus, score_column=SCORE_N_COMMUNITIES, tract_data=tract_data\n", ")" ] }, { "cell_type": "code", "execution_count": 23, "id": "8e69447f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " STATEFP10_left COUNTYFP10_left TRACTCE10_left ORIGINAL_TRACT NAME10_left \\\n", "0 48 489 990000 48489990000 9900 \n", "1 48 489 950700 48489950700 9507 \n", "2 48 061 990000 48061990000 9900 \n", "4 48 061 012304 48061012304 123.04 \n", "5 48 061 010100 48061010100 101 \n", "6 48 061 012700 48061012700 127 \n", "7 48 061 012301 48061012301 123.01 \n", "0 48 489 990000 48489990000 9900 \n", "3 48 061 012305 48061012305 123.05 \n", "5 48 061 010100 48061010100 101 \n", "0 48 489 990000 48489990000 9900 \n", "1 48 489 950700 48489950700 9507 \n", "0 48 489 990000 48489990000 9900 \n", "1 48 489 950700 48489950700 9507 \n", "0 48 489 990000 48489990000 9900 \n", "3 48 061 012305 48061012305 123.05 \n", "6 48 061 012700 48061012700 127 \n", "1 48 489 950700 48489950700 9507 \n", "5 48 061 010100 48061010100 101 \n", "1 48 489 950700 48489950700 9507 \n", "3 48 061 012305 48061012305 123.05 \n", "7 48 061 012301 48061012301 123.01 \n", "1 48 489 950700 48489950700 9507 \n", "5 48 061 010100 48061010100 101 \n", "1 48 489 950700 48489950700 9507 \n", "1 48 489 950700 48489950700 9507 \n", "1 48 489 950700 48489950700 9507 \n", "2 48 061 990000 48061990000 9900 \n", "3 48 061 012305 48061012305 123.05 \n", "1 48 489 950700 48489950700 9507 \n", "1 48 489 950700 48489950700 9507 \n", "1 48 489 950700 48489950700 9507 \n", "2 48 061 990000 48061990000 9900 \n", "3 48 061 012305 48061012305 123.05 \n", "4 48 061 012304 48061012304 123.04 \n", "3 48 061 012305 48061012305 123.05 \n", "6 48 061 012700 48061012700 127 \n", "7 48 061 012301 48061012301 123.01 \n", "3 48 061 012305 48061012305 123.05 \n", "4 48 061 012304 48061012304 123.04 \n", "5 48 061 010100 48061010100 101 \n", "4 48 061 012304 48061012304 123.04 \n", "6 48 061 012700 48061012700 127 \n", "7 48 061 012301 48061012301 123.01 \n", "5 48 061 010100 48061010100 101 \n", "5 48 061 010100 48061010100 101 \n", "5 48 061 010100 48061010100 101 \n", "7 48 061 012301 48061012301 123.01 \n", "5 
48 061 010100 48061010100 101 \n", "6 48 061 012700 48061012700 127 \n", "6 48 061 012700 48061012700 127 \n", "6 48 061 012700 48061012700 127 \n", "6 48 061 012700 48061012700 127 \n", "\n", " NAMELSAD10_left MTFCC10_left FUNCSTAT10_left ALAND10_left \\\n", "0 Census Tract 9900 G5020 S 0 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "2 Census Tract 9900 G5020 S 0 \n", "4 Census Tract 123.04 G5020 S 5761299 \n", "5 Census Tract 101 G5020 S 365458278 \n", "6 Census Tract 127 G5020 S 129622799 \n", "7 Census Tract 123.01 G5020 S 137622651 \n", "0 Census Tract 9900 G5020 S 0 \n", "3 Census Tract 123.05 G5020 S 25920881 \n", "5 Census Tract 101 G5020 S 365458278 \n", "0 Census Tract 9900 G5020 S 0 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "0 Census Tract 9900 G5020 S 0 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "0 Census Tract 9900 G5020 S 0 \n", "3 Census Tract 123.05 G5020 S 25920881 \n", "6 Census Tract 127 G5020 S 129622799 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "5 Census Tract 101 G5020 S 365458278 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "3 Census Tract 123.05 G5020 S 25920881 \n", "7 Census Tract 123.01 G5020 S 137622651 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "5 Census Tract 101 G5020 S 365458278 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "2 Census Tract 9900 G5020 S 0 \n", "3 Census Tract 123.05 G5020 S 25920881 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "1 Census Tract 9507 G5020 S 1025415174 \n", "2 Census Tract 9900 G5020 S 0 \n", "3 Census Tract 123.05 G5020 S 25920881 \n", "4 Census Tract 123.04 G5020 S 5761299 \n", "3 Census Tract 123.05 G5020 S 25920881 \n", "6 Census Tract 127 G5020 S 129622799 \n", "7 Census Tract 123.01 G5020 S 137622651 \n", "3 Census Tract 123.05 G5020 S 25920881 \n", "4 Census Tract 123.04 G5020 S 5761299 \n", "5 Census Tract 101 G5020 
S 365458278 \n", "4 Census Tract 123.04 G5020 S 5761299 \n", "6 Census Tract 127 G5020 S 129622799 \n", "7 Census Tract 123.01 G5020 S 137622651 \n", "5 Census Tract 101 G5020 S 365458278 \n", "5 Census Tract 101 G5020 S 365458278 \n", "5 Census Tract 101 G5020 S 365458278 \n", "7 Census Tract 123.01 G5020 S 137622651 \n", "5 Census Tract 101 G5020 S 365458278 \n", "6 Census Tract 127 G5020 S 129622799 \n", "6 Census Tract 127 G5020 S 129622799 \n", "6 Census Tract 127 G5020 S 129622799 \n", "6 Census Tract 127 G5020 S 129622799 \n", "\n", " AWATER10_left ... TRACTCE10_right GEOID10_TRACT NAME10_right \\\n", "0 121414926 ... 012305 48061012305 123.05 \n", "1 377000048 ... 012305 48061012305 123.05 \n", "2 268272237 ... 012305 48061012305 123.05 \n", "4 6286773 ... 012305 48061012305 123.05 \n", "5 113041841 ... 012305 48061012305 123.05 \n", "6 67725054 ... 012305 48061012305 123.05 \n", "7 119430273 ... 012305 48061012305 123.05 \n", "0 121414926 ... 950700 48489950700 9507 \n", "3 324996211 ... 950700 48489950700 9507 \n", "5 113041841 ... 950700 48489950700 9507 \n", "0 121414926 ... 990000 48261990000 9900 \n", "1 377000048 ... 990000 48261990000 9900 \n", "0 121414926 ... 950100 48261950100 9501 \n", "1 377000048 ... 950100 48261950100 9501 \n", "0 121414926 ... 990000 48061990000 9900 \n", "3 324996211 ... 990000 48061990000 9900 \n", "6 67725054 ... 990000 48061990000 9900 \n", "1 377000048 ... 010201 48061010201 102.01 \n", "5 113041841 ... 010201 48061010201 102.01 \n", "1 377000048 ... 010100 48061010100 101 \n", "3 324996211 ... 010100 48061010100 101 \n", "7 119430273 ... 010100 48061010100 101 \n", "1 377000048 ... 950600 48489950600 9506 \n", "5 113041841 ... 950600 48489950600 9506 \n", "1 377000048 ... 950500 48489950500 9505 \n", "1 377000048 ... 950400 48489950400 9504 \n", "1 377000048 ... 990000 48489990000 9900 \n", "2 268272237 ... 990000 48489990000 9900 \n", "3 324996211 ... 990000 48489990000 9900 \n", "1 377000048 ... 
024302 48215024302 243.02 \n", "1 377000048 ... 950300 48489950300 9503 \n", "1 377000048 ... 024301 48215024301 243.01 \n", "2 268272237 ... 012700 48061012700 127 \n", "3 324996211 ... 012700 48061012700 127 \n", "4 6286773 ... 012700 48061012700 127 \n", "3 324996211 ... 012304 48061012304 123.04 \n", "6 67725054 ... 012304 48061012304 123.04 \n", "7 119430273 ... 012304 48061012304 123.04 \n", "3 324996211 ... 012301 48061012301 123.01 \n", "4 6286773 ... 012301 48061012301 123.01 \n", "5 113041841 ... 012301 48061012301 123.01 \n", "4 6286773 ... 014200 48061014200 142 \n", "6 67725054 ... 014200 48061014200 142 \n", "7 119430273 ... 014200 48061014200 142 \n", "5 113041841 ... 010800 48061010800 108 \n", "5 113041841 ... 010203 48061010203 102.03 \n", "5 113041841 ... 012200 48061012200 122 \n", "7 119430273 ... 012200 48061012200 122 \n", "5 113041841 ... 011400 48061011400 114 \n", "6 67725054 ... 014100 48061014100 141 \n", "6 67725054 ... 013207 48061013207 132.07 \n", "6 67725054 ... 013203 48061013203 132.03 \n", "6 67725054 ... 
012607 48061012607 126.07 \n", "\n", " NAMELSAD10_right MTFCC10_right FUNCSTAT10_right ALAND10_right \\\n", "0 Census Tract 123.05 G5020 S 25920881 \n", "1 Census Tract 123.05 G5020 S 25920881 \n", "2 Census Tract 123.05 G5020 S 25920881 \n", "4 Census Tract 123.05 G5020 S 25920881 \n", "5 Census Tract 123.05 G5020 S 25920881 \n", "6 Census Tract 123.05 G5020 S 25920881 \n", "7 Census Tract 123.05 G5020 S 25920881 \n", "0 Census Tract 9507 G5020 S 1025415174 \n", "3 Census Tract 9507 G5020 S 1025415174 \n", "5 Census Tract 9507 G5020 S 1025415174 \n", "0 Census Tract 9900 G5020 S 0 \n", "1 Census Tract 9900 G5020 S 0 \n", "0 Census Tract 9501 G5020 S 3777053964 \n", "1 Census Tract 9501 G5020 S 3777053964 \n", "0 Census Tract 9900 G5020 S 0 \n", "3 Census Tract 9900 G5020 S 0 \n", "6 Census Tract 9900 G5020 S 0 \n", "1 Census Tract 102.01 G5020 S 138787038 \n", "5 Census Tract 102.01 G5020 S 138787038 \n", "1 Census Tract 101 G5020 S 365458278 \n", "3 Census Tract 101 G5020 S 365458278 \n", "7 Census Tract 101 G5020 S 365458278 \n", "1 Census Tract 9506 G5020 S 142823217 \n", "5 Census Tract 9506 G5020 S 142823217 \n", "1 Census Tract 9505 G5020 S 198663041 \n", "1 Census Tract 9504 G5020 S 95793488 \n", "1 Census Tract 9900 G5020 S 0 \n", "2 Census Tract 9900 G5020 S 0 \n", "3 Census Tract 9900 G5020 S 0 \n", "1 Census Tract 243.02 G5020 S 152233383 \n", "1 Census Tract 9503 G5020 S 66833755 \n", "1 Census Tract 243.01 G5020 S 1426811393 \n", "2 Census Tract 127 G5020 S 129622799 \n", "3 Census Tract 127 G5020 S 129622799 \n", "4 Census Tract 127 G5020 S 129622799 \n", "3 Census Tract 123.04 G5020 S 5761299 \n", "6 Census Tract 123.04 G5020 S 5761299 \n", "7 Census Tract 123.04 G5020 S 5761299 \n", "3 Census Tract 123.01 G5020 S 137622651 \n", "4 Census Tract 123.01 G5020 S 137622651 \n", "5 Census Tract 123.01 G5020 S 137622651 \n", "4 Census Tract 142 G5020 S 229485446 \n", "6 Census Tract 142 G5020 S 229485446 \n", "7 Census Tract 142 G5020 S 229485446 \n", "5 
Census Tract 108 G5020 S 24214554 \n", "5 Census Tract 102.03 G5020 S 36355923 \n", "5 Census Tract 122 G5020 S 181087087 \n", "7 Census Tract 122 G5020 S 181087087 \n", "5 Census Tract 114 G5020 S 52595069 \n", "6 Census Tract 141 G5020 S 58750542 \n", "6 Census Tract 132.07 G5020 S 3884436 \n", "6 Census Tract 132.03 G5020 S 1103957 \n", "6 Census Tract 126.07 G5020 S 3230089 \n", "\n", " AWATER10_right INTPTLAT10_right INTPTLON10_right \n", "0 324996211 +26.2732070 -097.2763703 \n", "1 324996211 +26.2732070 -097.2763703 \n", "2 324996211 +26.2732070 -097.2763703 \n", "4 324996211 +26.2732070 -097.2763703 \n", "5 324996211 +26.2732070 -097.2763703 \n", "6 324996211 +26.2732070 -097.2763703 \n", "7 324996211 +26.2732070 -097.2763703 \n", "0 377000048 +26.5154317 -097.5790835 \n", "3 377000048 +26.5154317 -097.5790835 \n", "5 377000048 +26.5154317 -097.5790835 \n", "0 394659578 +26.9389899 -097.3234546 \n", "1 394659578 +26.9389899 -097.3234546 \n", "0 867877948 +26.9241932 -097.6694694 \n", "1 867877948 +26.9241932 -097.6694694 \n", "0 268272237 +26.1902408 -097.1473235 \n", "3 268272237 +26.1902408 -097.1473235 \n", "6 268272237 +26.1902408 -097.1473235 \n", "1 773532 +26.2865300 -097.6765589 \n", "5 773532 +26.2865300 -097.6765589 \n", "1 113041841 +26.2713946 -097.4414470 \n", "3 113041841 +26.2713946 -097.4414470 \n", "7 113041841 +26.2713946 -097.4414470 \n", "1 867417 +26.3413027 -097.6980863 \n", "5 867417 +26.3413027 -097.6980863 \n", "1 1916760 +26.4046271 -097.7275193 \n", "1 440761 +26.4857271 -097.7274534 \n", "1 121414926 +26.5064260 -097.2240134 \n", "2 121414926 +26.5064260 -097.2240134 \n", "3 121414926 +26.5064260 -097.2240134 \n", "1 10002679 +26.4228640 -097.9710985 \n", "1 75864 +26.4851686 -097.8236397 \n", "1 527958 +26.6175624 -098.1948738 \n", "2 67725054 +25.9786218 -097.2580863 \n", "3 67725054 +25.9786218 -097.2580863 \n", "4 67725054 +25.9786218 -097.2580863 \n", "3 6286773 +26.0631407 -097.2184868 \n", "6 6286773 +26.0631407 
-097.2184868 \n", "7 6286773 +26.0631407 -097.2184868 \n", "3 119430273 +26.1581170 -097.3180348 \n", "4 119430273 +26.1581170 -097.3180348 \n", "5 119430273 +26.1581170 -097.3180348 \n", "4 52743725 +26.0321232 -097.3592760 \n", "6 52743725 +26.0321232 -097.3592760 \n", "7 52743725 +26.0321232 -097.3592760 \n", "5 372874 +26.2018913 -097.6365382 \n", "5 529676 +26.2392001 -097.7004164 \n", "5 9690866 +26.1431884 -097.4883500 \n", "7 9690866 +26.1431884 -097.4883500 \n", "5 1716432 +26.1576649 -097.6030070 \n", "6 1786306 +25.8820153 -097.4007135 \n", "6 409892 +25.9278897 -097.4159201 \n", "6 3034 +25.9335830 -097.4379216 \n", "6 151343 +25.9479994 -097.4340712 \n", "\n", "[53 rows x 27 columns]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adjacent_tracts__v1" ] }, { "cell_type": "code", "execution_count": 24, "id": "f3033c5d", "metadata": {}, "outputs": [], "source": [ "### check for duplicates\n", "assert len(adjacent_tracts__v1[['ORIGINAL_TRACT', 'GEOID10_TRACT']].drop_duplicates()) == len(adjacent_tracts__v1[['ORIGINAL_TRACT', 'GEOID10_TRACT']])" ] }, { "cell_type": "code", "execution_count": 25, "id": "4b1770de", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities) (average of neighbors)\n", "0 48061010100 0.666667\n", "1 48061010201 1.000000\n", "2 48061010203 1.000000\n", "3 48061010800 1.000000\n", "4 48061011400 1.000000\n", "5 48061012200 1.000000\n", "6 48061012301 0.666667\n", "7 48061012304 0.666667\n", "8 48061012305 0.714286\n", "9 48061012607 1.000000\n", "10 48061012700 0.333333\n", "11 48061013203 1.000000\n", "12 48061013207 1.000000\n", "13 48061014100 1.000000\n", "14 48061014200 1.000000\n", "15 48061990000 0.333333\n", "16 48215024301 1.000000\n", "17 48215024302 1.000000\n", "18 48261950100 0.500000\n", "19 48261990000 0.500000\n", "20 48489950300 1.000000\n", "21 48489950400 1.000000\n", "22 48489950500 1.000000\n", "23 48489950600 1.000000\n", "24 48489950700 0.333333\n", "25 48489990000 0.333333" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "returned_donut_bools__v1" ] }, { "cell_type": "code", "execution_count": 26, "id": "c2aa4682", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities) (average of neighbors)\n", "8 48061012305 0.714286" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# check result for our example tract\n", "returned_donut_bools__v1[returned_donut_bools__v1.GEOID10_TRACT==CAM_ID]" ] }, { "cell_type": "markdown", "id": "936d0ebc", "metadata": {}, "source": [ "This demonstrates that because 2 of the 7 adjacent tracts are water-only tracts, whose non-disadvantaged status is meaningless, the average of neighbors is 5/7 (about 0.714) rather than 1. This is why our example tract isn't classified as disadvantaged in v1." ] }, { "cell_type": "markdown", "id": "a02b6194", "metadata": {}, "source": [ "### Simulate proposed v2 implementations of calculate_tract_adjacency_scores()" ] }, { "cell_type": "markdown", "id": "5688387a", "metadata": {}, "source": [ "We test three ways to do this: filtering tract_data, filtering df, and filtering both. For the first two, we test both the land area (ALAND10) and tract ID range methods."
] }, { "cell_type": "code", "execution_count": 27, "id": "72963a39", "metadata": {}, "outputs": [], "source": [ "def view_results(returned_donut_bools, post_merge_df, adjacent_tracts):\n", " # function to print what we need to see after each experiment\n", " \n", " print(f'Length of input df after Census merge: {len(post_merge_df)}') \n", " print(f'That is {len(df_census__v1) - len(post_merge_df)} less than in v1')\n", " \n", " print(f'\\nLength of adjacency frame: {len(adjacent_tracts)}')\n", " print(f'That is {len(adjacent_tracts__v1) - len(adjacent_tracts)} less than in v1')\n", " \n", " n_dupes = len(adjacent_tracts[['ORIGINAL_TRACT', 'GEOID10_TRACT']])\\\n", " - len(adjacent_tracts[['ORIGINAL_TRACT', 'GEOID10_TRACT']].drop_duplicates())\n", " if n_dupes>0:\n", " print(\"ALERT: duplicates present in adjacency frame!\")\n", "\n", " \n", " print(f'\\nLength of returned frame with donut bools: {len(returned_donut_bools)}')\n", " if len(returned_donut_bools__v1) == len(returned_donut_bools):\n", " print('Returning same number of final bools as in v1')\n", " else:\n", " print(f'ALERT: returning {len(returned_donut_bools__v1) - len(returned_donut_bools)} less than in v1')\n", "\n", " cam_return = returned_donut_bools[returned_donut_bools.GEOID10_TRACT==CAM_ID]\n", " display(cam_return)\n", "\n", " if cam_return['Definition N (communities) (average of neighbors)'].values[0]==1:\n", " print('SUCCESSFULLY RE-ASSIGNING CAMERON, TX EXAMPLE TO DISADVANTAGED STATUS')\n", " else:\n", " print('ALERT: CAMERON, TX EXAMPLE IS STILL BEING MARKED NON-DISADVANTAGED')" ] }, { "cell_type": "markdown", "id": "c1712378", "metadata": {}, "source": [ "#### Implementation 1a: Filter water-only tracts out of tract_data using ALAND10" ] }, { "cell_type": "code", "execution_count": 28, "id": "9e0401e1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Length of input df after Census merge: 6\n", "That is 2 less than in v1\n", "\n", "Length of adjacency 
frame: 40\n", "That is 13 less than in v1\n", "\n", "Length of returned frame with donut bools: 23\n", "ALERT: returning 3 less than in v1\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities) (average of neighbors)\n", "8 48061012305 1.0" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "SUCCESSFULLY RE-ASSIGNING CAMERON, TX EXAMPLE TO DISADVANTAGED STATUS\n" ] } ], "source": [ "# keep only tracts that have some land area\n", "tract_data_land_area_only = tract_data.copy()\n", "tract_data_land_area_only = tract_data_land_area_only[tract_data_land_area_only.ALAND10>0]\n", "\n", "returned_donut_bools__filter_tracts_aland10,\\\n", "df_census__filter_tracts_aland10,\\\n", "adjacent_tracts__filter_tracts_aland10 = calculate_tract_adjacency_scores(\n", "    df=df_cam_plus, \n", "    score_column=SCORE_N_COMMUNITIES, \n", "    tract_data=tract_data_land_area_only\n", ")\n", "\n", "view_results(returned_donut_bools=returned_donut_bools__filter_tracts_aland10, \n", "    post_merge_df=df_census__filter_tracts_aland10, \n", "    adjacent_tracts=adjacent_tracts__filter_tracts_aland10)" ] }, { "cell_type": "markdown", "id": "f277cc9a", "metadata": {}, "source": [ "#### Implementation 1b: Filter water-only tracts out of tract_data using tract range" ] }, { "cell_type": "code", "execution_count": 29, "id": "0ea48370", "metadata": {}, "outputs": [], "source": [ "def full_geo_id_to_water_range_bool(x: str) -> bool:\n", "    # the last six digits of the full GEOID are the tract code\n", "    num_x = int(x[-6:])\n", "    return in_water_range(num_x)" ] }, { "cell_type": "code", "execution_count": 30, "id": "a2ef8844", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Length of input df after Census merge: 6\n", "That is 2 less than in v1\n", "\n", "Length of adjacency frame: 40\n", "That is 13 less than in v1\n", "\n", "Length of returned frame with donut bools: 23\n", "ALERT: returning 3 less than in v1\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities) (average of neighbors)\n", "8 48061012305 1.0" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "SUCCESSFULLY RE-ASSIGNING CAMERON, TX EXAMPLE TO DISADVANTAGED STATUS\n" ] } ], "source": [ "# keep only tracts whose tract code falls outside the water range\n", "tract_data_non_water_range_only = tract_data.copy()\n", "tract_data_non_water_range_only = tract_data_non_water_range_only[tract_data_non_water_range_only.GEOID10_TRACT\\\n", "    .apply(full_geo_id_to_water_range_bool)\\\n", "    ==False]\n", "\n", "returned_donut_bools__filter_tracts_id_range,\\\n", "df_census__filter_tracts_id_range,\\\n", "adjacent_tracts__filter_tracts_id_range = calculate_tract_adjacency_scores(\n", "    df=df_cam_plus, \n", "    score_column=SCORE_N_COMMUNITIES, \n", "    tract_data=tract_data_non_water_range_only\n", ")\n", "\n", "view_results(returned_donut_bools=returned_donut_bools__filter_tracts_id_range, \n", "    post_merge_df=df_census__filter_tracts_id_range, \n", "    adjacent_tracts=adjacent_tracts__filter_tracts_id_range)" ] }, { "cell_type": "markdown", "id": "1fcee2e2", "metadata": {}, "source": [ "#### Implementation 2a: Filter water-only tracts out of df using ALAND10" ] }, { "cell_type": "markdown", "id": "477a886c", "metadata": {}, "source": [ "Note: we can't filter the input df on ALAND10 using the standard calculate_tract_adjacency_scores() function, because df doesn't have the ALAND10 column until after it is merged with the tract data. We need a new test function for this method."
] }, { "cell_type": "code", "execution_count": 31, "id": "9ad212d3", "metadata": {}, "outputs": [], "source": [ "def calculate_tract_adjacency_scores__filter_df_aland10(\n", " df: pd.DataFrame, score_column: str, tract_data\n", ") -> pd.DataFrame:\n", " \"\"\"Calculate the mean score of each tract in df based on its neighbors\n", "\n", " Args:\n", " df (pandas.DataFrame): A dataframe with at least the following columns:\n", " * field_names.GEOID_TRACT_FIELD\n", " * score_column\n", "\n", " score_column (str): The name of the column that contains the scores\n", " to average\n", " \n", " tract_data (GeoDataFrame): tract data normally loaded in first line of \n", " function: tract_data = get_tract_geojson()\n", " Returns :\n", " tuple containing final returned df in actual function, as well as intermediates:\n", " - returned_donut_bools (pandas.DataFrame): A dataframe with two columns:\n", " * field_names.GEOID_TRACT_FIELD\n", " * {score_column}_ADJACENT_MEAN, which is the average of score_column for\n", " each tract that touches the tract identified\n", " in field_names.GEOID_TRACT_FIELD\n", " NB: this is the df that gets returned in the actual function\n", " - df (pandas.DataFrame): input df after merging with Census data\n", " - adjacent_tracts (pandas.DataFrame): adjacency df\n", " \"\"\"\n", "\n", " df: gpd.GeoDataFrame = tract_data.merge(\n", " df, on=GEOID_TRACT_FIELD\n", " )\n", " df = df.rename(columns={GEOID_TRACT_FIELD: ORIGINAL_TRACT})\n", " \n", " \n", " # remove water areas from input frame\n", " df = df[df['ALAND10']>0]\n", "\n", "\n", " adjacent_tracts: gpd.GeoDataFrame = df.sjoin(\n", " tract_data, predicate=\"touches\"\n", " )\n", "\n", "\n", " returned_donut_bools = (\n", " adjacent_tracts.groupby(GEOID_TRACT_FIELD)[[score_column]]\n", " .mean()\n", " .reset_index()\n", " .rename(\n", " columns={\n", " score_column: f\"{score_column}{ADJACENCY_INDEX_SUFFIX}\",\n", " }\n", " )\n", " )\n", " \n", " return (returned_donut_bools, df, adjacent_tracts)\n" 
] }, { "cell_type": "code", "execution_count": 32, "id": "ff9d13cd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Length of input df after Census merge: 6\n", "That is 2 less than in v1\n", "\n", "Length of adjacency frame: 45\n", "That is 8 less than in v1\n", "\n", "Length of returned frame with donut bools: 26\n", "Returning same number of final bools as in v1\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities) (average of neighbors)\n", "8 48061012305 1.0" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "SUCCESSFULLY RE-ASSIGNING CAMERON, TX EXAMPLE TO DISADVANTAGED STATUS\n" ] } ], "source": [ "returned_donut_bools__filter_df_aland10,\\\n", "df_census__filter_df_aland10,\\\n", "adjacent_tracts__filter_df_aland10 = calculate_tract_adjacency_scores__filter_df_aland10(\n", " df=df_cam_plus, \n", " score_column=SCORE_N_COMMUNITIES, \n", " tract_data=tract_data\n", ")\n", "\n", "view_results(returned_donut_bools=returned_donut_bools__filter_df_aland10, \n", " post_merge_df=df_census__filter_df_aland10, \n", " adjacent_tracts=adjacent_tracts__filter_df_aland10)" ] }, { "cell_type": "markdown", "id": "c7401f01", "metadata": {}, "source": [ "#### Implementation 2b: Filter water-only tracts out of df using tract range" ] }, { "cell_type": "code", "execution_count": 33, "id": "3cb027ab", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Length of input df after Census merge: 6\n", "That is 2 less than in v1\n", "\n", "Length of adjacency frame: 45\n", "That is 8 less than in v1\n", "\n", "Length of returned frame with donut bools: 26\n", "Returning same number of final bools as in v1\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities) (average of neighbors)\n", "8 48061012305 1.0" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "SUCCESSFULLY RE-ASSIGNING CAMERON, TX EXAMPLE TO DISADVANTAGED STATUS\n" ] } ], "source": [ "# drop rows of the input df whose tract code falls in the water range\n", "df_cam_plus_no_water = df_cam_plus.copy()\n", "df_cam_plus_no_water = df_cam_plus_no_water[df_cam_plus_no_water.GEOID10_TRACT\\\n", "    .apply(full_geo_id_to_water_range_bool)\\\n", "    ==False]\n", "\n", "returned_donut_bools__filter_df_id_range,\\\n", "df_census__filter_df_id_range,\\\n", "adjacent_tracts__filter_df_id_range = calculate_tract_adjacency_scores(\n", "    df=df_cam_plus_no_water, \n", "    score_column=SCORE_N_COMMUNITIES, \n", "    tract_data=tract_data\n", ")\n", "\n", "view_results(returned_donut_bools=returned_donut_bools__filter_df_id_range, \n", "    post_merge_df=df_census__filter_df_id_range, \n", "    adjacent_tracts=adjacent_tracts__filter_df_id_range)" ] }, { "cell_type": "markdown", "id": "c0f94a03", "metadata": {}, "source": [ "#### Implementation 3: Filter water-only tracts out of both tract_data and df" ] }, { "cell_type": "markdown", "id": "e96c981f", "metadata": {}, "source": [ "Note: for simplicity, we only test the tract ID range method here." ] }, { "cell_type": "code", "execution_count": 34, "id": "8e385e22", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Length of input df after Census merge: 6\n", "That is 2 less than in v1\n", "\n", "Length of adjacency frame: 40\n", "That is 13 less than in v1\n", "\n", "Length of returned frame with donut bools: 23\n", "ALERT: returning 3 less than in v1\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT Definition N (communities) (average of neighbors)\n", "8 48061012305 1.0" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "SUCCESSFULLY RE-ASSIGNING CAMERON, TX EXAMPLE TO DISADVANTAGED STATUS\n" ] } ], "source": [ "returned_donut_bools__filter_both,\\\n", "df_census__filter_both,\\\n", "adjacent_tracts__filter_both = calculate_tract_adjacency_scores(\n", "    df=df_cam_plus_no_water, \n", "    score_column=SCORE_N_COMMUNITIES, \n", "    tract_data=tract_data_non_water_range_only\n", ")\n", "\n", "view_results(returned_donut_bools=returned_donut_bools__filter_both, \n", "    post_merge_df=df_census__filter_both, \n", "    adjacent_tracts=adjacent_tracts__filter_both)" ] }, { "cell_type": "markdown", "id": "dde21850", "metadata": {}, "source": [ "#### Summary of results\n", "\n", "All five implementations re-assigned our example tract to disadvantaged status. Only implementations 2a and 2b (filtering the input df) also preserved the number of tracts that the function returns (26, as in v1); implementations 1a, 1b, and 3 returned 23." ] }, { "cell_type": "markdown", "id": "04d07bdb", "metadata": {}, "source": [ "## Compare full scoring runs from different methods" ] }, { "cell_type": "code", "execution_count": 35, "id": "aa462404", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " GEOID10_TRACT dac_adj_avg_prod dac_adj_avg_local is_donut_hole_prod \\\n", "332 12087971001 0.67 1.0 False \n", "370 12101030202 0.83 1.0 False \n", "376 12101030303 0.83 1.0 False \n", "419 12127081102 0.75 1.0 False \n", "806 26069000200 0.86 1.0 False \n", "812 26083000100 0.33 1.0 False \n", "912 28047001400 0.88 1.0 False \n", "1138 39007000602 0.75 1.0 False \n", "1654 41041950303 0.75 1.0 False \n", "1739 48061012305 0.71 1.0 False \n", "1766 51001090600 0.83 1.0 False \n", "1780 51131930200 0.67 1.0 False \n", "1944 55101000100 0.75 1.0 False \n", "2012 72013301000 0.71 1.0 False \n", "2037 72037160100 0.83 1.0 False \n", "\n", " is_donut_hole_local final_dac_prod final_dac_local class_change \n", "332 True False True True \n", "370 True False True True \n", "376 True False True True \n", "419 True False True True \n", "806 True False True True \n", "812 True False True True \n", "912 True False True True \n", "1138 True False True True \n", "1654 True False True True \n", "1739 True False True True \n", "1766 True False True True \n", "1780 True False True True \n", "1944 True False True True \n", "2012 True False True True \n", "2037 True False True True " ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read in CSV with deltas from implementation 1a \n", "# (water areas filtered from tract_data via ALAND10)\n", "deltas__tract_area = pd.read_csv('../data/tmp/Comparator/Score/deltas__tract_area.csv')\n", "\n", "# rename key columns\n", "deltas__tract_area.rename(columns = {'Unnamed: 0': 'GEOID10_TRACT',\n", " 'Definition N community, including adjacency index tracts': 'final_dac_prod',\n", " 'Definition N community, including adjacency index tracts.1': 'final_dac_local',\n", " 'Definition N (communities) (average of neighbors)': 'dac_adj_avg_prod',\n", " 'Definition N (communities) (average of neighbors).1': 'dac_adj_avg_local',\n", " 'Is the tract surrounded by disadvantaged 
communities?': 'is_donut_hole_prod',\n", " 'Is the tract surrounded by disadvantaged communities?.1': 'is_donut_hole_local'\n", " },\n", " inplace=True) \n", "\n", "# drop the first two rows, which hold old column information\n", "deltas__tract_area.drop(index=[0,1], inplace=True)\n", "\n", "# create bool to store whether final DAC designation was updated\n", "deltas__tract_area['class_change'] = deltas__tract_area.final_dac_prod!=deltas__tract_area.final_dac_local\n", "\n", "# set class change bool to False where both designations are null\n", "deltas__tract_area.loc[((deltas__tract_area.final_dac_prod.isna())\\\n", " & (deltas__tract_area.final_dac_local.isna())), \n", " 'class_change'] = False\n", "\n", "# view tracts that had their status updated with this method\n", "deltas__tract_area[deltas__tract_area.class_change][['GEOID10_TRACT', \n", " 'dac_adj_avg_prod', 'dac_adj_avg_local',\n", " 'is_donut_hole_prod', 'is_donut_hole_local',\n", " 'final_dac_prod', 'final_dac_local',\n", " 'class_change']]" ] }, { "cell_type": "code", "execution_count": 36, "id": "3dc9500a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>GEOID10_TRACT</th><th>dac_adj_avg_prod</th><th>dac_adj_avg_local</th><th>is_donut_hole_prod</th><th>is_donut_hole_local</th><th>final_dac_prod</th><th>final_dac_local</th><th>class_change</th></tr>\n",
" </thead>\n", " <tbody>\n",
" <tr><th>312</th><td>12087971001</td><td>0.67</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>346</th><td>12101030202</td><td>0.83</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>352</th><td>12101030303</td><td>0.83</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>388</th><td>12127081102</td><td>0.75</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>741</th><td>26069000200</td><td>0.86</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>747</th><td>26083000100</td><td>0.33</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>836</th><td>28047001400</td><td>0.88</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1044</th><td>39007000602</td><td>0.75</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1559</th><td>41041950303</td><td>0.75</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1643</th><td>48061012305</td><td>0.71</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1668</th><td>51001090600</td><td>0.83</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1679</th><td>51131930200</td><td>0.67</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1828</th><td>55101000100</td><td>0.75</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1894</th><td>72013301000</td><td>0.71</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1919</th><td>72037160100</td><td>0.83</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " GEOID10_TRACT dac_adj_avg_prod dac_adj_avg_local is_donut_hole_prod \\\n", "312 12087971001 0.67 1.0 False \n", "346 12101030202 0.83 1.0 False \n", "352 12101030303 0.83 1.0 False \n", "388 12127081102 0.75 1.0 False \n", "741 26069000200 0.86 1.0 False \n", "747 26083000100 0.33 1.0 False \n", "836 28047001400 0.88 1.0 False \n", "1044 39007000602 0.75 1.0 False \n", "1559 41041950303 0.75 1.0 False \n", "1643 48061012305 0.71 1.0 False \n", "1668 51001090600 0.83 1.0 False \n", "1679 51131930200 0.67 1.0 False \n", "1828 55101000100 0.75 1.0 False \n", "1894 72013301000 0.71 1.0 False \n", "1919 72037160100 0.83 1.0 False \n", "\n", " is_donut_hole_local final_dac_prod final_dac_local class_change \n", "312 True False True True \n", "346 True False True True \n", "352 True False True True \n", "388 True False True True \n", "741 True False True True \n", "747 True False True True \n", "836 True False True True \n", "1044 True False True True \n", "1559 True False True True \n", "1643 True False True True \n", "1668 True False True True \n", "1679 True False True True \n", "1828 True False True True \n", "1894 True False True True \n", "1919 True False True True " ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read in CSV with deltas from implementation 2a \n", "# (water areas filtered from input df via ALAND10)\n", "deltas__df_area = pd.read_csv('../data/tmp/Comparator/Score/deltas__df_area.csv')\n", "\n", "# rename key columns\n", "deltas__df_area.rename(columns = {'Unnamed: 0': 'GEOID10_TRACT',\n", " 'Definition N community, including adjacency index tracts': 'final_dac_prod',\n", " 'Definition N community, including adjacency index tracts.1': 'final_dac_local',\n", " 'Definition N (communities) (average of neighbors)': 'dac_adj_avg_prod',\n", " 'Definition N (communities) (average of neighbors).1': 'dac_adj_avg_local',\n", " 'Is the tract surrounded by disadvantaged communities?': 
'is_donut_hole_prod',\n", " 'Is the tract surrounded by disadvantaged communities?.1': 'is_donut_hole_local'\n", " },\n", " inplace=True) \n", "\n", "# drop first two rows, which hold column information\n", "deltas__df_area.drop(index=[0,1], inplace=True)\n", "\n", "# create bool to store whether final DAC designation was updated\n", "deltas__df_area['class_change'] = deltas__df_area.final_dac_prod!=deltas__df_area.final_dac_local\n", "\n", "# set class change bool to false where both designations are null\n", "deltas__df_area.loc[((deltas__df_area.final_dac_prod.isna()==True)\\\n", " & (deltas__df_area.final_dac_local.isna()==True)), \n", " 'class_change'] = False\n", "\n", "# view tracts that had their status updated with this method\n", "deltas__df_area[deltas__df_area.class_change][['GEOID10_TRACT', \n", " 'dac_adj_avg_prod', 'dac_adj_avg_local',\n", " 'is_donut_hole_prod', 'is_donut_hole_local',\n", " 'final_dac_prod', 'final_dac_local',\n", " 'class_change']]" ] }, { "cell_type": "code", "execution_count": 37, "id": "5187abcb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>GEOID10_TRACT</th><th>dac_adj_avg_prod</th><th>dac_adj_avg_local</th><th>is_donut_hole_prod</th><th>is_donut_hole_local</th><th>final_dac_prod</th><th>final_dac_local</th><th>class_change</th></tr>\n",
" </thead>\n", " <tbody>\n",
" <tr><th>228</th><td>12087971001</td><td>0.67</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>261</th><td>12101030202</td><td>0.83</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>267</th><td>12101030303</td><td>0.83</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>292</th><td>12127081102</td><td>0.75</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>570</th><td>26069000200</td><td>0.86</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>648</th><td>28047001400</td><td>0.88</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>755</th><td>39007000602</td><td>0.75</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1281</th><td>48061012305</td><td>0.71</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1388</th><td>55101000100</td><td>0.75</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" <tr><th>1470</th><td>72037160100</td><td>0.83</td><td>1.0</td><td>False</td><td>True</td><td>False</td><td>True</td><td>True</td></tr>\n",
" </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " GEOID10_TRACT dac_adj_avg_prod dac_adj_avg_local is_donut_hole_prod \\\n", "228 12087971001 0.67 1.0 False \n", "261 12101030202 0.83 1.0 False \n", "267 12101030303 0.83 1.0 False \n", "292 12127081102 0.75 1.0 False \n", "570 26069000200 0.86 1.0 False \n", "648 28047001400 0.88 1.0 False \n", "755 39007000602 0.75 1.0 False \n", "1281 48061012305 0.71 1.0 False \n", "1388 55101000100 0.75 1.0 False \n", "1470 72037160100 0.83 1.0 False \n", "\n", " is_donut_hole_local final_dac_prod final_dac_local class_change \n", "228 True False True True \n", "261 True False True True \n", "267 True False True True \n", "292 True False True True \n", "570 True False True True \n", "648 True False True True \n", "755 True False True True \n", "1281 True False True True \n", "1388 True False True True \n", "1470 True False True True " ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read in CSV with deltas from implementation 2b \n", "# (water areas filtered from input df via IDs)\n", "deltas__df_id = pd.read_csv('../data/tmp/Comparator/Score/deltas__df_id.csv')\n", "\n", "# rename key columns\n", "deltas__df_id.rename(columns = {'Unnamed: 0': 'GEOID10_TRACT',\n", " 'Definition N community, including adjacency index tracts': 'final_dac_prod',\n", " 'Definition N community, including adjacency index tracts.1': 'final_dac_local',\n", " 'Definition N (communities) (average of neighbors)': 'dac_adj_avg_prod',\n", " 'Definition N (communities) (average of neighbors).1': 'dac_adj_avg_local',\n", " 'Is the tract surrounded by disadvantaged communities?': 'is_donut_hole_prod',\n", " 'Is the tract surrounded by disadvantaged communities?.1': 'is_donut_hole_local'\n", " },\n", " inplace=True) \n", "\n", "# drop first two rows, which hold column information\n", "deltas__df_id.drop(index=[0,1], inplace=True)\n", "\n", "# create bool to store whether final DAC designation was updated\n", "deltas__df_id['class_change'] = 
deltas__df_id.final_dac_prod!=deltas__df_id.final_dac_local\n", "\n", "# set class change bool to false where both designations are null\n", "deltas__df_id.loc[((deltas__df_id.final_dac_prod.isna()==True)\\\n", " & (deltas__df_id.final_dac_local.isna()==True)), \n", " 'class_change'] = False\n", "\n", "# view tracts that had their status updated with this method\n", "deltas__df_id[deltas__df_id.class_change][['GEOID10_TRACT', \n", " 'dac_adj_avg_prod', 'dac_adj_avg_local',\n", " 'is_donut_hole_prod', 'is_donut_hole_local',\n", " 'final_dac_prod', 'final_dac_local',\n", " 'class_change']]" ] }, { "cell_type": "code", "execution_count": 38, "id": "6d1c7b0e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "12087971001, 12101030202, 12101030303, 12127081102, 26069000200, 26083000100, 28047001400, 39007000602, 41041950303, 48061012305, 51001090600, 51131930200, 55101000100, 72013301000, 72037160100\n" ] } ], "source": [ "# print list of tract IDs where status was updated by method 1a\n", "updated_ids__tract_area = deltas__tract_area[deltas__tract_area.class_change]['GEOID10_TRACT'].values\n", "print(', '.join(updated_ids__tract_area))" ] }, { "cell_type": "code", "execution_count": 39, "id": "a1ea1384", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "12087971001, 12101030202, 12101030303, 12127081102, 26069000200, 26083000100, 28047001400, 39007000602, 41041950303, 48061012305, 51001090600, 51131930200, 55101000100, 72013301000, 72037160100\n" ] } ], "source": [ "# print list of tract IDs where status was updated by method 2a\n", "updated_ids__df_area = deltas__df_area[deltas__df_area.class_change]['GEOID10_TRACT'].values\n", "print(', '.join(updated_ids__df_area))" ] }, { "cell_type": "code", "execution_count": 40, "id": "f6516263", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "12087971001, 12101030202, 12101030303, 12127081102, 26069000200, 28047001400, 
39007000602, 48061012305, 55101000100, 72037160100\n" ] } ], "source": [ "# print list of tract IDs where status was updated by method 2b\n", "updated_ids__df_id = deltas__df_id[deltas__df_id.class_change]['GEOID10_TRACT'].values\n", "print(', '.join(updated_ids__df_id))" ] }, { "cell_type": "code", "execution_count": 41, "id": "f540133f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# how many tracts were updated by method 1a?\n", "len(updated_ids__tract_area)" ] }, { "cell_type": "code", "execution_count": 42, "id": "ffb41759", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# how many tracts were updated by method 2a?\n", "len(updated_ids__df_area)" ] }, { "cell_type": "code", "execution_count": 43, "id": "4e5099ff", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# how many tracts were updated by method 2b?\n", "len(updated_ids__df_id)" ] }, { "cell_type": "code", "execution_count": 44, "id": "8534563c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# how many tracts are changed by both 1a and 2a?\n", "len(set(updated_ids__tract_area) & set(updated_ids__df_area))" ] }, { "cell_type": "code", "execution_count": 45, "id": "b8270c9a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# how many tracts are changed by all three implementations?\n", "len(set(updated_ids__tract_area) & set(updated_ids__df_area) & set(updated_ids__df_id))" ] }, { "cell_type": "markdown", "id": "6494e4c3", "metadata": {}, "source": [ "Implementations 1a & 2a result in the same designations. 
They make all the same changes as implementation 2b, plus 5 more." ] }, { "cell_type": "code", "execution_count": 46, "id": "8c80b1d6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'26083000100', '41041950303', '51001090600', '51131930200', '72013301000'}" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# view the tracts that are changed with 1a/2a but not with 2b\n", "\n", "set(updated_ids__tract_area) - set(updated_ids__df_id)" ] }, { "cell_type": "code", "execution_count": 47, "id": "bb4a3c4b", "metadata": {}, "outputs": [], "source": [ "### VA:\n", "# 51001090600\n", "# https://screeningtool.geoplatform.gov/en/#9.26/37.7027/-75.8891\n", "# 51131930200\n", "# https://screeningtool.geoplatform.gov/en/#9.84/37.3186/-75.9043\n", "\n", "### MI: \n", "# 26083000100\n", "# https://screeningtool.geoplatform.gov/en/#8.87/47.3424/-88.1965\n", "\n", "### OR:\n", "# 41041950303\n", "# https://screeningtool.geoplatform.gov/en/#11.94/45.015/-123.98189\n", "\n", "### PR:\n", "# 72013301000\n", "# https://screeningtool.geoplatform.gov/en/#13.45/18.47558/-66.75585" ] }, { "cell_type": "markdown", "id": "9c3e7297", "metadata": {}, "source": [ "These all look like they should be re-classified. This indicates we should not go with 2b." 
] }, { "cell_type": "code", "execution_count": 48, "id": "67bbb703", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'12087971001',\n", " '12101030202',\n", " '12101030303',\n", " '12127081102',\n", " '26069000200',\n", " '28047001400',\n", " '39007000602',\n", " '48061012305',\n", " '55101000100',\n", " '72037160100'}" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# check the rest of the changed tracts\n", "set(updated_ids__tract_area) & set(updated_ids__df_id)" ] }, { "cell_type": "code", "execution_count": 49, "id": "b51cfa18", "metadata": {}, "outputs": [], "source": [ "### FL:\n", "# 12087971001\n", "# https://screeningtool.geoplatform.gov/en/#12.3/24.73756/-81.0061\n", "# 12101030202\n", "# https://screeningtool.geoplatform.gov/en/#13.28/28.34631/-82.71162\n", "# 12101030303\n", "# https://screeningtool.geoplatform.gov/en/#12.8/28.22897/-82.75138\n", "# 12127081102\n", "# https://screeningtool.geoplatform.gov/en/#13.54/29.24789/-81.02269\n", "\n", "### MI:\n", "# 26069000200\n", "# https://screeningtool.geoplatform.gov/en/#9.71/44.3623/-83.5055\n", "\n", "### MS:\n", "# 28047001400\n", "# https://screeningtool.geoplatform.gov/en/#11.54/30.3705/-89.0511\n", "\n", "### OH:\n", "# 39007000602\n", "# https://screeningtool.geoplatform.gov/en/#12.33/41.89305/-80.82678\n", "\n", "### TX:\n", "# 48061012305\n", "# https://screeningtool.geoplatform.gov/en/#9.76/26.2291/-97.2455\n", "\n", "### WI:\n", "# 55101000100\n", "# https://screeningtool.geoplatform.gov/en/#13.44/42.7299/-87.77935\n", "\n", "### PR:\n", "# 72037160100\n", "# https://screeningtool.geoplatform.gov/en/#11.59/18.2405/-65.6222" ] }, { "cell_type": "markdown", "id": "3410f026", "metadata": {}, "source": [ "These all look like legitimate donut holes." ] }, { "cell_type": "markdown", "id": "327894d4", "metadata": {}, "source": [ "#### Final recommendation: Use implementation 2a (filter the input df using ALAND10). 
This will allow us to update the status of all 15 of the above Census tracts, and will also let the function continue to return rows for the water tracts (as we currently do in prod)." ] }, { "cell_type": "code", "execution_count": null, "id": "4455847a", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }