Update diagrams in architecture docs (#790)

* Update diagrams in architecture docs

* Updates overall diagram to represent current arch and process
(including vision for data selection)
* Updates geo data pipeline arch diagram and removes geoplatform
version since we only have one version of this for the foreseeable
future and we're using geoplatform infrastructure

* Update diagram to remove something we do not yet do

* Updating Diagram

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
Co-authored-by: GitHub Action <action@github.com>
Shelby Switzer 2021-10-08 13:12:03 -04:00 committed by GitHub
commit 1f78920f63
8 changed files with 45 additions and 71 deletions

@@ -4,14 +4,10 @@ The below is a general architecture of our proposed system:
 ![Architecture](architecture-mmd.svg)
-The following is a more detailed diagram of the geo data pipeline architecture (the Data Pipeline and Server boxes in the general architecture diagram above).
+The following is a more detailed diagram of the data pipeline architecture utilizing S3 buckets for file/data hosting on Geoplatform.gov.
 ![Geo Data Pipeline](geodata-pipeline-arch-mmd.svg)
-We are partnering with Geoplatform to turn some of these pieces into open source shared services that they would own. The following is a modified diagram showing which pieces would tentatively be owned by Geoplatform.
-![Geo Data Pipeline](geodata-pipeline-arch-geoplatform-mmd.png)
 ## Updating the Diagram
 **Note: Do Not directly modify the svg file, it is generated automatically!**

File diff suppressed because one or more lines are too long

Image size: 30 KiB before, 30 KiB after.

@@ -3,37 +3,29 @@ graph LR
 input["Community Input"]
 end
-subgraph ds["Data Selection"]
+subgraph ds["Data Selection (vision)"]
 input --> Intake
 input --> Evolution
 input --> Voting
 Intake --> Evolution --> Voting
 end
-subgraph s["Geoplatform.gov"]
+subgraph s["Hosted by Geoplatform.gov"]
 subgraph dp["Data Pipeline (Justice40 Repo)"]
 Voting --> a["Approved Datasets"]
-a --> Properties
-a --> Geometries
-Properties --> Processing
-Geometries --> Processing
-input --> Processing
+a -- ETL --> ncsv["Normalized CSVs"]
+ncsv--"Score Generation"--> ScoreCSV["Full CSV with Data and Score"]
+ScoreCSV-->GeoJSON
+GeoJSON-->MVT["Uncompressed MVT Tiles"]
 end
-subgraph Server
-Processing --> GeoJSON
-GeoJSON --> db[("Feature Database")]
-db --> tileserv["Tile Server"]
-end
-subgraph j40["Justice40 Client"]
-tileserv --> vl
-ts["Tile Styling"] --> vl["Justice40 Visualization Library"]
-vl --> fe["Justice40 Static Site Frontend"]
+subgraph j40["Justice40 Client"]
+MVT --"API (S3 Access)"--> vl["Justice40 Visualization Library (MapLibre)"]
+vl --> fe["Justice40 Static Site Frontend (Gatsby)"]
 end
 end
 subgraph oc["Other Clients"]
-tileserv -- API --> 3p["Third Party Apps"]
-GeoJSON -- API --> 3p
-db -- API --> 3p
+ScoreCSV --"API (S3 Access)" --> DS["Data Scientists"]
+GeoJSON -- "API (S3 Access)" --> 3["Third Party Apps"]
+MVT -- "API (S3 Access)" --> 3["Third Party Apps"]
 end

File diff suppressed because one or more lines are too long

Image size before: 35 KiB.

@@ -1,28 +0,0 @@
-graph TD
-Dataset1["Dataset 1"]-->Score
-Dataset2["Dataset 2"]-->Score
-Census["Census TIGER Data"]-->CGTiger
-subgraph "Owned by Geoplatform"
-CGTiger["Create GeoJSON from Shapefile with osgeo/gdal"]-->TS3
-end
-TS3("TIGER GeoJSON (S3)")-->CGJ
-Score["Create Score CSV"]--Event Notification-->CSV
-CSV("CSV (S3)")--"Event Notification (Geoplatform)"-->CGJ
-subgraph "Owned by Geoplatform"
-CGJ["Combine (ogr2ogr)+ Create GeoJSON"]--Event Notification-->GeoJSON
-end
-GeoJSON("GeoJSON (S3)")-->Tip
-GeoJSON--"Access non-geo data"-->Client
-subgraph "Owned by Geoplatform"
-Tip[/Tippecanoe/]-->CreateMVT["Create and Send MVT"]
-end
-subgraph production
-CreateMVT-->MBTiles
-MBTiles-->Uncompressed("Uncompressed MVT (Geoplatform S3)")
-end
-subgraph development
-CreateMVT-->Compressed("MBTiles (GeoPlatform S3)")-->TS[/Tileserver-GL/]
-end
-TS--"XYZ URL"-->Client
-Uncompressed--"XYZ URL"-->Client["Gatsby+OpenLayers Client"]

Binary file not shown.

Image size before: 69 KiB.

File diff suppressed because one or more lines are too long

Image size after: 34 KiB.

@@ -1,22 +1,36 @@
 graph TD
-Dataset1["Dataset 1"]-->Score
-Dataset2["Dataset 2"]-->Score
-Census["Census TIGER Data"]-->CGTiger
-CGTiger["Create GeoJSON from Shapefile with osgeo/gdal"]-->TS3
-TS3("TIGER GeoJSON (S3)")-->CGJ
-Score["Create Score CSV"]-->CSV
-CSV("CSV (S3)")-->CGJ
-CGJ["Combine (ogr2ogr)+ Create GeoJSON"]-->GeoJSON
-GeoJSON("GeoJSON (S3)")-->Tip
-GeoJSON--"Access non-geo data"-->Client
-subgraph "Generate MVT"
-Tip[/Tippecanoe/]-->CreateMVT["Create and Send MVT"]
+Dataset1["Dataset 1"]-->ETL1
+Dataset2["Dataset 2"]-->ETL2
+subgraph "ETL and Score Generation"
+ETL1["ETL for Dataset 1"]-->ncsv1("Normalized CSV (S3)")
+ETL2["ETL for Dataset 2"]-->ncsv2("Normalized CSV (S3)")
+ncsv1-->Score
+ncsv2-->Score
+Score-->DL("Downloadable zip")
+Score["Generate Score (score-run)"]-->CSV
 end
+DL-->Client
+Census["Census TIGER Data Shapefiles (hosted by Census)"]-->CGTiger
+subgraph "Census Data ETL"
+CGTiger["Create GeoJSON from Shapefile with ogr2ogr"]-->TS3
+TS3("TIGER GeoJSON State Files(S3)")-->CombineCensus["Combine Census State Files with Geopandas"]
+CombineCensus-->NCS3("National Census GeoJSON (S3)")
+end
+CSV("Full CSV (S3)")-->CGJ
+NCS3-->CGJ
+CGJ["Combine with ogr2ogr + Create GeoJSON (score-geo)"]-->GeoJSON
+GeoJSON("GeoJSON files (high and low zoom) (S3)")-->Tip
+Tip["Create and Send Tiles using Tippecanoe"]-->Uncompressed
+Tip-->Compressed
 subgraph production
-CreateMVT-->Uncompressed("Uncompressed MVT (S3)")
+Uncompressed("Uncompressed MVT high and low directories (S3)")
 end
 subgraph development
-CreateMVT-->Compressed("MBTiles (S3)")-->TS[/Tileserver-GL/]
+Local("Locally stored tiles")--"Option 1"-->TS
+Compressed("Compressed high and low .mbtile files (S3)")--"Option 2"-->TS[/Tileserver-GL/]
 end
 TS--"XYZ URL"-->Client
-Uncompressed--"XYZ URL"-->Client["Gatsby+OpenLayers"]
+Uncompressed--"XYZ URL"-->Client["Gatsby+MapLibre"]