mirror of
https://github.com/DOI-DO/j40-cejst-2.git
synced 2025-02-23 01:54:18 -08:00
* initial checkin * gitignore and docker-compose update * readme update and error on hud * encoding issue * one more small README change * data roadmap re-strcuture * pyproject sort * small update to score output folders * checkpoint * couple of last fixes
39 lines
2.4 KiB
YAML
# There is no method for adding field descriptions to `yamale` schemas.
# Therefore, we've created a dictionary here of fields and their descriptions.
name: A short name of the data set.
source: The URL pointing to the data set itself or to more information about the
  data set.
relevance_to_environmental_justice: It's useful to spell out why this data is
  relevant to EJ issues and/or can be used to identify EJ communities.
spatial_resolution: The dev team needs to know whether the resolution is granular enough to be useful.
public_status: Whether a dataset has already gone through a public release process
  (like Census data) or may need a lengthy review process (like Medicaid data).
sponsor: Whether there's a federal agency or non-governmental agency that is working
  to provide and maintain this data.
subjective_rating_of_data_quality: Sometimes we don't have statistics on data
  quality, but we know whether it is likely to be accurate. How much has it been
  vetted by an agency; is this the de facto data set for the topic?
estimated_margin_of_error: Estimated margin of error on measurement, if known. Narrower
  geographic measures often have a higher margin of error due to a smaller sample
  for each measurement.
known_data_quality_issues: It can be helpful to write out known problems.
geographic_coverage_percent: We want to think about data that is comprehensive across
  America.
geographic_coverage_description: A verbal description of geographic coverage.
data_formats: Developers need to know what formats the data is available in.
last_updated_date: When was the data last updated / refreshed? (In format YYYY-MM-DD.
  If the exact date is not known, use YYYY-01-01.)
frequency_of_updates: How often is this data updated? Is it updated on a reliable
  cadence?
documentation: Link to docs. Also, is the documentation good enough? Can we get the
  info we need?
data_can_go_in_cloud: Some datasets cannot legally go in the cloud.

discussion: Review of other topics, such as
  peer review (Overview or links out to peer review done on this dataset),
  where and how data is available (e.g., Geoplatform.gov? Is it available from multiple
  sources?),
  risk assessment of the data (e.g. a vendor-processed version of the dataset might not
  be open or good enough),
  legal considerations (Legal disclaimers, assumption of risk, proprietary?),
  accreditation (Is this source accredited?)