UW-CALMA-datarescue/how-to-start/track-2-data-assessment.md
2025-01-22 07:02:03 +00:00

🔍 Track 2 (Data Assessment)

This track focuses on finding and evaluating valuable, relevant at-risk data. Your “peer reviewed” assessment gives volunteers working on capture tasks something they can depend on.

Tech Skill Level: Intermediate

Time Commitment: ~2 hours

Tasks Include:

  1. Identify & submit at-risk web pages
  2. Collect at-risk individual web pages

Tools Required (vary across tasks):

  1. Wayback Machine extension/add-on
    1. Chrome Extension
    2. Firefox add-on
    3. Safari Extension
    4. iOS app
    5. Android app
  2. Spreadsheet editor (Excel, Google Sheets)

Breakdown of Task Sections
🚁 (helicopter emoji) gives summary of task
🗂️ (index dividers) outlines specific steps needed to complete task
🛠️ (hammer & wrench emoji) details skills & tools needed for task

SUGGESTED TASKS & INSTRUCTIONS

1. Identify & suggest at-risk web pages

🚁Summary: Volunteers will search through the web for web pages, single files, and other online information that may be considered at-risk data from sources like federal agencies, state and regional offices, or national or local environmental organizations and groups.

🗂️Workflow:

  • Review established collecting criteria to see if the webpage/website/dataset falls within this Data Rescue scope
    • NOTE: Suggestions will be captured and/or deposited to one of the following repositories:
      • End of Term (EoT): Mostly collects webpages and files related to specific presidential administrations.
      • UW Web Archives: More focused on Northwest regional history, University of Washington related work and studies, and labor relations.
      • Internet Archive: Broader collecting scope. Tends to capture web content (meaning webpages or parts of websites); uploading actual files is harder.
  • Research, name, and document web pages (individual pages or a small batch of pages within a website).
    • For a large quantity of web pages or complex large websites, see Track 3.
  • Submit basic information about the web page on this Data Tracking Form.
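
One way to gauge how at-risk a page is before submitting it is to check whether the Wayback Machine already holds a copy. The sketch below is only an illustration and not part of the official workflow: it assumes the Internet Archive's public availability endpoint and its usual JSON layout, and the helper names (`availability_query`, `closest_snapshot`, `lookup`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint (assumed to be reachable).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url):
    """Build the lookup URL for a candidate web page."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(response):
    """Return (timestamp, snapshot_url) for the closest capture, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(page_url, timeout=30):
    """Network call: fetch the availability response and parse it."""
    with urlopen(availability_query(page_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

A page with no snapshot, or only a very old one, may be a stronger candidate for the Data Tracking Form, though the collecting criteria above still decide whether it is in scope.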

🛠️Skills Needed: Be able to browse web pages to assess the value or significance of their content, then use a Google Form to submit your assessment and basic metadata (details about the webpage/website).

2. Describe collected webpages/records

🚁Summary: Description (at times also called metadata) helps with the management and access to digital records. Information about the content and context helps with identification, assessment, and verification of authenticity.

🗂️Workflow:

  • Read the Dublin Core basic manual to learn about the use of metadata for digital records (for an explanation of selected metadata fields, see section 4. Elements)
  • Navigate to the Data Tracking List sheet (stay on the first tab “EDIT TO THIS TABLE”)
  • Select a row with an item that is marked as "Needs Metadata" (this status is found in column K)
  • Fill in descriptive text for your selected row in all 5 metadata columns (these are columns P through U, colored in teal blue)
  • Use the title and URL in the Data Tracking sheet to find other information for the description
  • When metadata has been entered, change the status of the row in column K to "Needs Checksum"
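
The status change in the last step can be pictured as a small check on a row. This is a sketch under stated assumptions: the field names below are five Dublin Core elements chosen for illustration, and the sheet's actual column headers may differ. The functions `missing_fields` and `advance_status` are hypothetical helpers, not part of the tracking sheet.

```python
# Hypothetical field names (Dublin Core elements picked for illustration);
# the real sheet's metadata columns may use different headers.
METADATA_FIELDS = ("title", "creator", "date", "description", "subject")


def missing_fields(row):
    """List the metadata fields that are absent or blank in a row."""
    return [f for f in METADATA_FIELDS if not str(row.get(f, "")).strip()]


def advance_status(row):
    """Move a fully described row from "Needs Metadata" to "Needs Checksum"."""
    if row.get("status") == "Needs Metadata" and not missing_fields(row):
        return {**row, "status": "Needs Checksum"}
    return row  # leave incomplete rows (or rows in another status) unchanged
```

The point of the check is that a row only moves forward once every metadata field has text in it, which mirrors doing the same review by eye in the spreadsheet.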

🛠️Skills Needed: Reading and writing literacy in English to create metadata in English. Ability to distinguish between different types of metadata fields (who, what, when, where, how).

3. Contribute capture suggestions for select repositories (End of Term (EoT) or Internet Archive) collection

🚁Summary: Participants in this track will browse a specific set of suggested at-risk federal webpages to search for ones that need to be preserved.

🗂️Workflow:

  • Reference the rows in this Data Tracking List - Data Rescue 2025 (Responses) that are ready for archiving
  • Claim a row
  • Change row Status to “In-progress”
  • To capture, decide if it goes to the Internet Archive or the End of Term project
    • Internet Archive - use the Chrome, Firefox, or Safari browser extension to capture the webpage OR enter the URL directly via the Wayback Machine (navigate to the "Save Page Now" option)
    • End of Term - use the project's nomination form to add URL, title, & government data type
  • Update row status to “Captured” on the spreadsheet
  • Repeat process for a new submission
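
For volunteers comfortable with a little scripting, the "Save Page Now" step can also be approximated without the browser extension. The sketch below assumes the Wayback Machine's `https://web.archive.org/save/` URL prefix accepts a plain request for the page to capture; rate limits or sign-in requirements may apply, and the helper names and User-Agent string are hypothetical.

```python
from urllib.request import Request, urlopen

# "Save Page Now" URL prefix (assumed; the extension remains the documented route).
SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_url(page_url):
    """Build the capture-request URL for a single web page."""
    return SAVE_PREFIX + page_url.strip()


def request_capture(page_url, timeout=120):
    """Network call: ask the Wayback Machine to capture the page."""
    req = Request(
        save_page_now_url(page_url),
        headers={"User-Agent": "data-rescue-volunteer-sketch"},  # hypothetical
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status  # HTTP status code of the capture request
```

Whichever route is used, the row status on the spreadsheet still needs to be updated by hand after the capture.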

🛠️Skills Needed: Be able to browse web pages and use a browser extension button (listed in Tools Required above) to notify the Internet Archive or the End of Term project, which has been preserving federal webpages since 2008. Once the URL of the web page has been submitted to the Internet Archive, the EoT will automatically process the webpage for long-term preservation in their repository.