mirror of
https://github.com/UW-CALMA/datarescue.git
synced 2025-02-22 09:41:30 -08:00
GITBOOK-4: No subject
parent
c631f43177
commit
353c90cf17
11 changed files with 439 additions and 1 deletions
32
README.md
@ -1 +1,31 @
# datarescue
---
description: WELCOME!
---

# Data Rescues 2025

In response to political threats to social, environmental, health, and personal data, the [University of Washington Center for Advances in Libraries, Museums, and Archives (CALMA)](https://calma.ischool.uw.edu/), in collaboration with Seattle-based BKS Studio, is hosting a series of DATA RESCUE efforts. Modeled after the 2016-2017 DataRescue movement that responded to hostile conditions towards environmental and climate science data, these ad hoc digital archiving volunteer events invite community members to apply their technical skills and social values in response to new threats to important data.

While Data Rescues can focus on any type of at-risk public and government data, many have focused on environmental issues given [documented threats and modifications](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0246450) by recent political agendas.

This Gitbook is a knowledge base and living document for the 2025 Seattle-based Data Rescues. It is open to public suggestions and contributions, and it is informed by documentation and literature from the 2017-2018 Data Rescues and by community brainstorming discussions.

The following pages include:

* [Background](https://docs.google.com/document/d/1WzwaEl0BReGwFT-sQW_DM5lD9bKVtw7duoNbCMBvvsw/edit?tab=t.0#heading=h.mjrh0wfiipbo) information on the history of Data Rescue efforts
* Guide on [How To Start](https://docs.google.com/document/d/1WzwaEl0BReGwFT-sQW_DM5lD9bKVtw7duoNbCMBvvsw/edit?tab=t.0#heading=h.c4ageapgvmi0) contributing to Data Rescues
* More detailed instruction on contributing to each [Task Workflow](https://docs.google.com/document/d/1WzwaEl0BReGwFT-sQW_DM5lD9bKVtw7duoNbCMBvvsw/edit?tab=t.0#heading=h.ww1afpx0mzsl)
* A growing list of [Learning Resources](https://docs.google.com/document/d/1WzwaEl0BReGwFT-sQW_DM5lD9bKVtw7duoNbCMBvvsw/edit?tab=t.0#heading=h.5uoktqwf0zoj) (readings and tools)
* [Contact information](https://docs.google.com/document/d/1WzwaEl0BReGwFT-sQW_DM5lD9bKVtw7duoNbCMBvvsw/edit?tab=t.0#heading=h.kh20zm7am2l2) to connect with the wider community on local, national, and international levels

Community members are invited to comment on this site and suggest ideas and updates by submitting Change Requests through the Gitbook interface. More on how to submit updates here.

***

Let’s get started! How would you like to begin?

**I’m new to Data Rescues** → Read a bit more about the [background](what-are-data-rescues.md) to understand the why, when, where, and how of this movement

**I’ve rescued data before** → First, look through our [Collecting Scope](collecting-scope.md) page for the Seattle Data Rescues vision, then move on to contribute to community efforts

\
12
SUMMARY.md
Normal file
@ -0,0 +1,12 @
# Table of contents

* [Data Rescues 2025](README.md)
* [What are Data Rescues](what-are-data-rescues.md)
* [Community Agreements](community-agreements.md)
* [Collecting Scope](collecting-scope.md)
* [How To Start](how-to-start/README.md)
  * [Track 1 (Communications)](how-to-start/track-1-communications.md)
  * [Track 2 (Data Assessment)](how-to-start/track-2-data-assessment.md)
  * [Track 3 (Technical)](how-to-start/track-3-technical.md)
* [Resources & Tools](resources-and-tools.md)
* [Stay in Touch](stay-in-touch.md)
32
collecting-scope.md
Normal file
@ -0,0 +1,32 @
---
description: What are we collecting? Why?
---

# Collecting Scope

In an effort to continue advancing environmental protection and action, the Data Rescues in Seattle will focus on climate data with the following data themes and types:

1. Geographic Parameters
   1. Pacific Northwest, Alaska, and Antarctic regions
2. Timeframe
   1. 2020 - present day
3. Record Types
   1. Public Resources (example: PSA about cooling centers)
   2. Agency Reports and assessments (example: reports of pollutants in water systems)
   3. Grant funded project datasets from researchers or consultants (grant reports, funding proposals, or publication samples)
4. Data types
   1. Image files (JPG, TIFF, etc.)
   2. Digital documents (DOCX, PDF)
   3. Web files (recordings of web content or WARC files)
5. Rights & Restrictions
   1. Only public records (information created or maintained by a government agency that is available to the public upon request) created via public funding
6. Languages
   1. English primary; other languages as volunteer skills are available

#### Rationale

We, the hosts of the 2025 Seattle Data Rescue, acknowledge the various threats that many types of government data face today and have faced over time. To keep the intention and scope manageable and relevant to our immediate communities, we selected the above criteria. In particular, many researchers, students, and educators in Washington place a heavy emphasis on environmental and climate data due to our local histories, including agriculture, trade, cultural beliefs, and social activities. Through this scope we hope to appeal to local volunteers and to use local knowledge to the best of our abilities. We are not focusing on topics like government medical records, but we see clear connections across environmental conditions, public health, community wellbeing, and environmental and economic prosperity. We have also chosen to prioritize federal government data due to the highly vocal apprehension from federal officials. We also assume that, given Washington State’s more supportive attitude towards environmental and climate science initiatives, there are fewer immediate threats to state data than to data held by the federal government or by less supportive states.

We encourage others to branch off from this effort to organize data rescues for other highly vulnerable and crucial topics, including but not limited to gender-affirming care and public health.

\
16
community-agreements.md
Normal file
@ -0,0 +1,16 @
---
description: READ BEFORE STARTING YOUR CONTRIBUTION
---

# Community Agreements

At Data Rescue events, we strive to create and support an environment that encourages all participants (organizers, supporters, and any other stakeholders) to hold safe, engaging, and respectful spaces where each person can focus as much of their energy as possible on the Data Rescue’s intent.

The goal is to have community members collectively contribute a list of agreements, expectations, and a code of conduct that we can all agree upon with mutual respect. Please reference the links below for sample Community Agreements that can inspire the ways we hold space for and with each other during Data Rescue events.

We invite community members to suggest or share any other resources or samples that might be relevant and helpful as we build a welcoming and respectful environment in support of our collective goals.

* [Code of Conduct and Community Agreement – Collective Responsibility](https://laborforum.diglib.org/code-of-conduct-and-community-agreement/)
* [Community Agreements and Code of Conduct | Lighting the Way - Spotlight at Stanford](https://exhibits.stanford.edu/lightingtheway/about/community-agreements-and-code-of-conduct)
* [Social rules - Recurse Center](https://www.recurse.com/social-rules)
* [Group Agreements - Seeds for Change](https://www.seedsforchange.org.uk/groupagree)
39
how-to-start/README.md
Normal file
@ -0,0 +1,39 @
---
description: Step-by-step instructions
---

# How To Start

Hello & welcome!

Thank you for volunteering your time for this year’s Data Rescue (2025)! We appreciate your energy, enthusiasm, or whatever motivator brought you here. Now that you’ve read through the purpose, goals, and background of past and present Data Rescues, we can get you started on contributing.

Whether you are just now thinking about responding to current data risks and threats or you are involved in long-term data preservation work, the possibilities for digital archival preservation abound. No matter your skill set, field of expertise, or interests, there is a digital archiving task for you!

Now that you’ve reviewed the general expectations and background information, we ask all volunteers to follow these steps to start:

1. Read the community agreements page (LINK)
2. Review the track listing (below) to pick a path
3. Review the track’s task options and pick 1 to start
4. Add your info to an empty row in the selected track sheet of the [Running Tasks spreadsheet](https://docs.google.com/spreadsheets/d/1fSQuVpfgralyFP0eZXnqrJiNmhP0umm7fL7IbmUOx8M/edit?usp=sharing)
5. Work on the selected task (ask for clarification or help from Coordinators in purple💜)
6. Once the task has been completed, add its completion status and relevant notes
7. REPEAT steps 2-6 if you want to take on another task (or just say bye 👋🏼 if you need to leave)

### Track List Options

The following options can be considered a sort of “choose your own adventure,” with each area focused on a different stage of digital preservation. As Data Rescue events are set up to welcome all skills, each track has a core focus but includes various tasks that can usually be completed within an hour.

#### [Track 1 (Communications)](track-1-communications.md)

This track focuses on creating, revising, and sharing information about at-risk data. This includes adding context about collected data in order to improve management and findability of data for future purposes.

#### [Track 2 (Data Assessment)](track-2-data-assessment.md)

This track focuses on finding and evaluating valuable and relevant at-risk data. These tasks help others complete more time-intensive capturing tasks, as they can rely on your “peer reviewed” assessment.

#### [Track 3 (Technical)](track-3-technical.md)

This track focuses on the actual capture of at-risk data in a variety of formats. As these tasks require the most technical knowledge, skills, and equipment, volunteers are encouraged to take this track when they are able to dedicate more time.

\
79
how-to-start/track-1-communications.md
Normal file
@ -0,0 +1,79 @
# Track 1 (Communications)

This track focuses on creating, revising, and sharing information about at-risk data. This includes adding context about collected data in order to improve management and findability of data for future purposes. The goals below are our initial ones, coming from CALMA and Seattle-based efforts, but all participants are free to suggest and pursue other contributions.

**Tech Skill Level:** Beginner

**Time Commitment:** minimal (\~1 hour)

**Tasks Include:**

1. Write letters of support for relevant proposed legislation (federal, state, or local level)
2. Share updates, news, or general takeaways with close friends, family, and peers
3. Contribute description to captured or identified datasets
4. Provide translation of existing metadata in different languages
5. Contribute to research on public records laws & legislation

**Tools Required (vary across tasks)**

1. Spreadsheet editor (Excel, Google Sheets)
2. Word processor (Word, Google Docs, Notepad)
3. Image repository (Wikimedia, library/archives public domain digital collections)
4. Social media/communication portals (Bluesky, Facebook, Instagram, TikTok, YouTube, etc.)

### **TASKS BREAKDOWN (see below for summary of tasks)**

#### 1. Legislation Letters of Support

**Summary:** Help support the passing of the “[Public Archives Resiliency Act](https://www.markey.senate.gov/news/press-releases/sens-markey-hirono-and-rep-adams-introduce-legislation-to-promote-conservation-and-preservation-of-government-and-historic-records)” proposed by Senator Edward J. Markey (D-Mass.), Senator Mazie Hirono (D-Hawaii), and Congresswoman Alma Adams (NC-12) by contacting your elected representative today.

**Workflow**

1. Customize the template with your details
2. Send it to your representatives and policymakers, and distribute it to community members to send to their own reps
3. Share info with networks, friends, or family

**Skills Needed:** Basic understanding of English and public policy advocacy. You will need to be eligible to register to vote in the United States and able to identify your elected officials at the state and federal levels.

#### 2. Research Team (topic of public records preservation and access)

**Summary:** While the importance and complexities of digital preservation are well documented and researched, there is less information on the relationship between legislation, public awareness, and threats to government data. This task helps surface existing findings and research.

**Workflow**

1. Read the following reflection from the 2016-2017 Data Rescues
   1. Article link: [https://papers.ssrn.com/sol3/papers.cfm?abstract\_id=3163616](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3163616)
2. Conduct research on the following questions or add relevant questions to the shared document
   1. How do public records laws address data destruction or at-risk data?
   2. How does legislation discuss or address data destruction by political parties and systems (across levels of government like local, state, federal)?
   3. What legislation exists that deals with preservation of at-risk data (USA or beyond)?
   4. How does the American public perceive government data loss?
   5. How does this compare with other countries?
3. Write a paragraph on your findings, reflections, or lingering questions in the shared research Google Doc:

**Skills Needed:** Basic command of English to either read or write summaries of findings. Intermediate to advanced reading and writing skills are needed to find, assess, and summarize findings for a general, English-reading United States audience. Additional language skills would be useful for translation or for finding global readings and data.

#### 3. Create promotional text/images

**Summary:** In order to better communicate the importance of digital preservation of at-risk data, we need different ways to communicate complex or obscure ideas and situations. By creating different representations, we can reach audiences across ages, areas of interest, and ways of learning.

**Workflow**

1. Help tell the story of rescuing at-risk data by creating infographics, social media posts, diagrams, or opinion pieces
   1. This is perfect for those who are interested in social media, photography, blogging, and any other form of storytelling.
2. Using your preferred creation software (Word, Google Docs, Canva, Snappa, Lucidchart, etc.), create an informational piece on one of the following topics (or something related)
   1. How can the public assess risks towards data?
   2. What happens when new administrations take over agencies and their data?
   3. Why should the public care about at-risk data?

**Skills Needed:** The creative skills needed will vary with the type of piece and the software used. For visual creations, volunteers will need a beginner to intermediate graphic or art design skillset. For written creations, volunteers are asked to write in English first, with translation as a secondary option.

#### 4. Describe collected webpages/records (Save for Day 2 January 22)

**Summary:** Description (at times also called metadata) helps with the management of and access to digital records. Information about the content and context helps with identification, assessment, and verification of authenticity.

**Workflow**

1. Create a descriptive record for each data set in 1 of the following repositories (hyperlink to come)

**Skills Needed:** Basic command of English for the first layer of information, with additional language skills used for translation of English metadata. Ability to distinguish between different types of information.
53
how-to-start/track-2-data-assessment.md
Normal file
@ -0,0 +1,53 @
# Track 2 (Data Assessment)

This track focuses on finding and evaluating valuable and relevant at-risk data. This helps others complete capturing tasks, as they can depend on your “peer reviewed” assessment.

**Tech Skill Level:** Intermediate

**Time Commitment:** \~2 hours

**Tasks Include:**

1. Identify & submit at-risk web pages
2. Provide brief statement on value for identified at-risk data
3. Collect at-risk individual web pages

**Tools Required (vary across tasks):**

1. Wayback Machine extension/add-on
   1. [Chrome Extension](https://chromewebstore.google.com/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak?pli=1)
   2. [Firefox add-on](https://web.archive.org/web/20230212035050/https://addons.mozilla.org/en-US/firefox/addon/wayback-machine_new/)
   3. [Safari Extension](https://web.archive.org/web/20230212035050/https://apps.apple.com/us/app/wayback-machine/id1472432422)
   4. [iOS app](https://web.archive.org/web/20230212035050/https://itunes.apple.com/us/app/wayback-machine/id1201888313)
   5. [Android app](https://web.archive.org/web/20230212035050/https://play.google.com/store/apps/details?id=com.archive.waybackmachine)
2. Spreadsheet editor (Excel, Google Sheets)

### TASKS BREAKDOWN

#### 1. Identify & submit at-risk web pages

**Summary:** Volunteers will search through federal websites for web pages, single files, and other online information that may be considered at-risk data.

**Workflow:**

* Review the established collecting criteria as an assessment guide
* Research, name, and document web pages (individual pages or a small batch of pages within a website).
  * For a large quantity of web pages or complex large websites, see Track 3.
* Submit basic information about the web page on [this Data Tracking Form](https://docs.google.com/forms/d/e/1FAIpQLSfII-rl4yUcGPJlPWk9knWMhC_qBueJLEPcC7vphPeVisLhHA/viewform?usp=sf_link).

**Skills Needed:** Be able to browse through web pages and use a browser extension button to notify the Internet Archive which pages to save to its End of Term (EoT) project, which has been preserving federal webpages since 2008.

#### 2. Contribute capture suggestions for select repository collections ([EoT](https://eotarchive.org/), [Internet Archive](https://archive.org/), or [UW web archives](https://archive-it.org/organizations/729/?show=Collections))

**Summary:** Participants in this track will browse a specific set of federal webpages to search for ones that need to be preserved.

**Workflow:**

* Reference the [Data Tracking List - Data Rescue 2025 (Responses)](https://docs.google.com/spreadsheets/d/1tOS7B3lgK-8wdgyhY81ntfICMIkGwAiHfeV63hi3UzU/edit?usp=drive_link) for entries that are ready for archiving
* Claim a row
* Change row Status to “In-progress”
* Use the Internet Archive browser extension to grab and save webpages
* Update row Status to “Submitted to EoT”
* Repeat process for a new submission

**Skills Needed:** Be able to browse through web pages and use a browser extension button (listed in Tools Required above) to notify the Internet Archive which pages to save to its End of Term project, which has been preserving federal webpages since 2008. Once the URL of the web page has been submitted to the Internet Archive, the EoT will automatically process the webpage for long-term preservation in its repository.
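For volunteers who would rather script a batch of submissions than click the extension button for each page, the sketch below shows one possible approach. It assumes the Internet Archive's public Save Page Now address (`https://web.archive.org/save/` followed by the page URL) is reachable and is not rate limiting you; the function names and the User-Agent string are illustrative and not part of any Data Rescue tooling. Whether a capture made this way is routed to the End of Term collection is not something this sketch can guarantee, so the browser extension remains the documented path.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: the public Save Page Now address of the Wayback Machine.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Return the Save Page Now address for one web page."""
    if not page_url.startswith(("http://", "https://")):
        raise ValueError(f"not an http(s) URL: {page_url!r}")
    # Leave normal URL punctuation alone; percent-encode spaces and the like.
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=%")


def request_capture(page_url: str, timeout: float = 60.0) -> int:
    """Ask the Wayback Machine to capture the page and return the HTTP status."""
    request = Request(
        build_save_url(page_url),
        headers={"User-Agent": "datarescue-volunteer-sketch"},  # hypothetical label
    )
    with urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Printing the address first lets you review it before sending anything.
    print(build_save_url("https://example.gov/climate report.html"))
```

Calling `request_capture` on each URL from a claimed spreadsheet row, with a pause between calls, keeps the load on the archive modest; the row Status should still be updated by hand.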
83
how-to-start/track-3-technical.md
Normal file
@ -0,0 +1,83 @
# Track 3 (Technical)

This track focuses on the actual capture of at-risk data in a variety of formats. As these tasks require the most technical knowledge, skills, and equipment, volunteers are encouraged to take this track when they are able to dedicate more time.

**Tech Skill Level:** Advanced

**Time Commitment:** \~2-3 hours

**Tools Required (vary across tasks):**

* Web capture tools ([Conifer](https://guide.conifer.rhizome.org/), [Archive-It](https://archive-it.org/), [Webrecorder](https://webrecorder.io/), [wget](https://www.gnu.org/software/wget/)). [More information on web archiving](https://bits.ashleyblewer.com/blog/2017/09/20/how-do-web-archiving-frameworks-work/)
* Data quality check system
* Spreadsheet editor (Excel, Google Sheets)
* Web monitoring tool
* Storage (available internal memory, external hard drive)

**Tasks Include:**

1. Set up website monitoring systems
2. Capture website content
3. Harvest public datasets
4. Review data authenticity and quality
5. Program or conduct comprehensive data/website crawl

### TASKS BREAKDOWN

#### 1. Set up monitoring API tracker to document changes to government websites

**Summary:** Given the previous removal of content and subtle revision to federal government environmental websites, many

**Workflow**

1. Read or skim the following report of website monitoring by EDGI
   1. Report Link: [https://envirodatagov.org/publication/changing-digital-climate/](https://envirodatagov.org/publication/changing-digital-climate/)
2. Download a monitoring tool like:
   1. HTTP API tracker [https://github.com/edgi-govdata-archiving/web-monitoring-db](https://github.com/edgi-govdata-archiving/web-monitoring-db)
   2. Comprehensive list of other tools here: [https://github.com/edgi-govdata-archiving/awesome-website-change-monitoring](https://github.com/edgi-govdata-archiving/awesome-website-change-monitoring)
3. Identify a website to track using [this Data Tracking List](https://docs.google.com/spreadsheets/d/1tOS7B3lgK-8wdgyhY81ntfICMIkGwAiHfeV63hi3UzU/edit?usp=drive_link)
4. Deploy the tracker for the selected website
5. Submit information about the tracked website to [the Data Tracking form](https://docs.google.com/forms/d/e/1FAIpQLSfII-rl4yUcGPJlPWk9knWMhC_qBueJLEPcC7vphPeVisLhHA/viewform?usp=sf_link)

**Skills Needed:** Advanced understanding of software deployment, APIs, and technical git repositories.
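To make the idea behind change monitoring concrete before deploying a full tracker, here is a minimal sketch of the underlying technique: store a fingerprint of a page's text, then compare later snapshots against it and list the lines that differ. This is not how EDGI's tools are implemented; the function names are hypothetical, and the sketch works on text you have already saved rather than fetching pages itself.

```python
import difflib
import hashlib
import re


def fingerprint(page_text: str) -> str:
    """Hash the text after collapsing whitespace, so reflowed markup alone is ignored."""
    normalized = re.sub(r"\s+", " ", page_text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, current_text: str) -> bool:
    """True when the current snapshot no longer matches the stored fingerprint."""
    return fingerprint(current_text) != previous_fingerprint


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return removed ('- ') and added ('+ ') lines between two snapshots."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    before = "Climate change is real\nContact us"
    after = "Contact us"
    if has_changed(fingerprint(before), after):
        for line in changed_lines(before, after):
            print(line)
```

Storing only the fingerprint keeps the monitoring record small; keeping the full earlier snapshot as well is what makes the line-by-line comparison possible, which matters when a revision is subtle rather than a wholesale removal.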
#### 2. Capture web files/data

**Summary:** Collecting web archives (meaning webpages and the content within them) can be complex, but it is necessary. Using more user-friendly software, people who are not digital preservationists can help capture select content from websites without worrying about collecting the entire structure of a website.

**Workflow**

1. Identify a web file that is [ready to be archived](https://docs.google.com/spreadsheets/d/1tOS7B3lgK-8wdgyhY81ntfICMIkGwAiHfeV63hi3UzU/edit?usp=drive_link)
2. Comment on the Status cell that you are working on that row
3. Using web capture software (like [Conifer](https://guide.conifer.rhizome.org/)), capture the at-risk website that includes the at-risk data
4. Comment on the same Status cell that the web file/data has been archived

**Skills Needed:** Intermediate understanding of software deployment and website navigation.
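For volunteers comfortable with the command line, wget (listed in this track's tools) can write a WARC file while it downloads a page. The sketch below only assembles the command so it can be reviewed before running; it assumes GNU Wget is installed, and the function name, example URL, and output name are placeholders.

```python
import shlex


def build_wget_command(url: str, warc_name: str, wait_seconds: int = 1) -> list[str]:
    """Assemble a wget call that saves one page, its assets, and a WARC record."""
    return [
        "wget",
        "--page-requisites",          # also fetch images, CSS, and scripts the page needs
        "--adjust-extension",         # save HTML with an .html suffix
        f"--wait={wait_seconds}",     # pause between requests to be polite to the server
        f"--warc-file={warc_name}",   # wget appends the WARC extension itself
        url,
    ]


if __name__ == "__main__":
    command = build_wget_command("https://example.gov/report.html", "report-capture")
    # Print the command for review; run it with subprocess.run(command, check=True).
    print(shlex.join(command))
```

Passing the command as a list rather than a single string avoids shell quoting problems with URLs that contain `&` or `?`.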
#### 3. Harvest public datasets available online

**Summary:**

**Workflow**

1. Search for public funding project repositories (NIH [RePORTER](https://reporter.nih.gov/), US Government Awards [USASpending](https://www.usaspending.gov/search), Federal Audit Clearinghouse [FAC](https://app.fac.gov/dissemination/search/))
2. Verify that downloadable datasets contain enough descriptive information (data files, interactive maps, etc.)
3. Capture dataset(s) to internal storage (temporary place)
4. Submit and upload the dataset(s) to [this Data Tracking Form](https://docs.google.com/forms/d/e/1FAIpQLSfII-rl4yUcGPJlPWk9knWMhC_qBueJLEPcC7vphPeVisLhHA/viewform?usp=sf_link)
5. You can delete the dataset from temporary storage after successful submission via the form

**Skills Needed:** Intermediate understanding of different dataset types and file formats. Comfort with downloading and saving larger files.
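The capture step in the workflow above can be sketched as a small download helper that also records where and when the file was retrieved, which is useful context when filling in the tracking form. This is an illustrative sketch using only the Python standard library; the function name and the `.provenance.json` sidecar naming are assumptions, not a Data Rescue convention.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.request import urlopen


def harvest(source_url: str, destination: Path, chunk_size: int = 1 << 20) -> dict:
    """Stream a dataset to disk and write a small provenance record next to it."""
    digest = hashlib.sha256()
    size = 0
    with urlopen(source_url) as response, destination.open("wb") as out:
        # Read in chunks so large files never have to fit in memory.
        while chunk := response.read(chunk_size):
            out.write(chunk)
            digest.update(chunk)
            size += len(chunk)
    record = {
        "source_url": source_url,
        "saved_as": destination.name,
        "bytes": size,
        "sha256": digest.hexdigest(),
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
    }
    sidecar = destination.with_name(destination.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

A call such as `harvest("https://example.gov/data.csv", Path("data.csv"))` (placeholder URL) returns the same record it writes to disk, so the byte count and checksum can be pasted into the submission notes.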
#### 4. Create Bag/Create checksum (save for Data Rescue Day 2 - Jan 22)

**Summary:** This helps short- and long-term preservation efforts verify the integrity (fixity) of stored files and datasets. Creating checksums or reviewing them helps detect errors or signs of tampering.

**Workflow**

* Read through the [digital preservation manual chapter on fixity and checksums by the Digital Preservation Coalition](https://www.dpconline.org/handbook/technical-solutions-and-tools/fixity-and-checksums)
* Download a fixity or checksum verification tool like
  * [Md5summer](https://md5summer.org/): An application for Windows machines that will generate and verify md5 checksums.
* Identify a dataset that needs a checksum using this [Data Tracking List - Data Rescue 2025 (Responses)](https://docs.google.com/spreadsheets/d/1tOS7B3lgK-8wdgyhY81ntfICMIkGwAiHfeV63hi3UzU/edit?usp=drive_link)
* Run a check on the selected data to create the supplemental checksum value

**Skills Needed:** Best for those with basic data or web archiving experience, or who have both strong tech skills and attention to detail.
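For volunteers who are not on Windows or who prefer a script to a desktop tool, the fixity check can be sketched with Python's standard `hashlib` module. The example below builds a checksum manifest for a folder and later reports files that are missing or no longer match. The function names are hypothetical, and md5 is used only to mirror the tool named above; passing `"sha256"` as the algorithm works the same way.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Compute a file's checksum in chunks so large datasets do not exhaust memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path, algorithm: str = "md5") -> dict[str, str]:
    """Map each file's path (relative to the folder) to its checksum."""
    return {
        path.relative_to(folder).as_posix(): file_checksum(path, algorithm)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def verify_manifest(folder: Path, manifest: dict[str, str], algorithm: str = "md5") -> list[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    failures = []
    for relative_path, expected in manifest.items():
        target = folder / relative_path
        if not target.is_file() or file_checksum(target, algorithm) != expected:
            failures.append(relative_path)
    return failures
```

An empty list from `verify_manifest` means every file still matches the values recorded at capture time; any path it returns should be re-downloaded or flagged in the tracking list.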
|
49
resources-and-tools.md
Normal file
49
resources-and-tools.md
Normal file
|
@ -0,0 +1,49 @@
|
||||||
|
---
|
||||||
|
description: Readings and tools available online
|
||||||
|
---
|
||||||
|
|
||||||
|
# Resources & Tools
|
||||||
|
|
||||||
|
### Tools
|
||||||
|
|
||||||
|
* Making signed BagIt files: [https://github.com/harvard-lil/bag-nabit](https://github.com/harvard-lil/bag-nabit)
|
||||||
|
* https://github.com/climate-mirror/climate-mirror-tools
|
||||||
|
* [https://www.datalumos.org/](https://www.datalumos.org/) - has simple drag and drop add tags and basic metadata
|
||||||
|
* https://www.sucho.org/ This is another initiative which was focused on Ukrainian Digital Cultural Heritage. It was kind of modelled off of Data rescue v01 but a little more broad in terms of what to “save” and there were different threats, because the physical infrastructure was also in danger
|
||||||
|
* Harvard Library Innovation Lab seeking Government Datasets for Preservation form [https://docs.google.com/forms/d/11qyuKUEkbh0OPNyAyMVCXviiSDTYYA4RLdIlTnWNiTE/edit](https://docs.google.com/forms/d/11qyuKUEkbh0OPNyAyMVCXviiSDTYYA4RLdIlTnWNiTE/edit)
|
||||||
|
|
||||||
|
### References
|
||||||
|
|
||||||
|
ClimateWire, S. W. (n.d.). Climate Web Pages Erased and Obscured under Trump. Scientific American. Retrieved January 3, 2025, from[ https://www.scientificamerican.com/article/climate-web-pages-erased-and-obscured-under-trump/](https://www.scientificamerican.com/article/climate-web-pages-erased-and-obscured-under-trump/)
|
||||||
|
|
||||||
|
Dillon, L., Walker, D., Shapiro, N., Underhill, V., Martenyi, M., Wylie, S., Lave, R., Murphy, M., Brown, P., & Environmental Data and Governance Initiative. (2017). Environmental Data Justice and the Trump Administration: Reflections from the Environmental Data and Governance Initiative. Environmental Justice, 10(6), 186–192.[ https://doi.org/10.1089/env.2017.0020](https://doi.org/10.1089/env.2017.0020)
|
||||||
|
|
||||||
|
Earthjustice. (2024, November 12). What Project 2025 Would Do to the Environment – and How We Will Respond. Earthjustice.[ https://earthjustice.org/article/what-project-2025-would-do-to-the-environment-and-how-we-will-respond](https://earthjustice.org/article/what-project-2025-would-do-to-the-environment-and-how-we-will-respond)
|
||||||
|
|
||||||
|
Environmental Data and Governance Initiative. (n.d.-a). Changing the Digital Climate: How Climate Change Web Content is Being Censored Under the Trump Administration,. Retrieved January 3, 2025, from[ https://envirodatagov.org/publication/changing-digital-climate/](https://envirodatagov.org/publication/changing-digital-climate/)
|
||||||
|
|
||||||
|
Environmental Data and Governance Initiative. (n.d.-b). Federal Environmental Web Tracker. Environmental Data and Governance Initiative. Retrieved January 3, 2025, from[ https://envirodatagov.org/federal-environmental-web-tracker-about-page/](https://envirodatagov.org/federal-environmental-web-tracker-about-page/)
|
||||||
|
|
||||||
|
Harmon, A. (2017, March 6). Activists Rush to Save Government Science Data—If They Can Find It. The New York Times.[ https://www.nytimes.com/2017/03/06/science/donald-trump-data-rescue-science.html](https://www.nytimes.com/2017/03/06/science/donald-trump-data-rescue-science.html)
|
||||||
|
|
||||||
|
Johnson, E., & Kubas, A. (2018, February 7). Spotlight on Digital Government Information Preservation: Examining the Context, Outcomes, Limitations, and Successes of the DataRefuge Movement. In the Library with the Lead Pipe.[ https://www.inthelibrarywiththeleadpipe.org/2018/information-preservation/](https://www.inthelibrarywiththeleadpipe.org/2018/information-preservation/)
|
||||||
|
|
||||||
|
Kosoff, M. (2017, January 25). Trump White House Orders E.P.A. to Delete Climate-Change Web Page. Vanity Fair.[ https://www.vanityfair.com/news/2017/01/trump-white-house-orders-epa-to-delete-climate-change-web-pages](https://www.vanityfair.com/news/2017/01/trump-white-house-orders-epa-to-delete-climate-change-web-pages)
|
||||||
|
|
||||||
|
Lamdan, S. (2018). Lessons from Datarescue: The Limits of Grassroots Climate Change Data Preservation and the Need for Federal Records Law Reform. University of Pennsylvania Law Review, 231.[ https://papers.ssrn.com/abstract=3163616](https://papers.ssrn.com/abstract=3163616)
|
||||||
|
|
||||||
|
Nost, E., Gehrke, G., Poudrier, G., Lemelin, A., Beck, M., Wylie, S., & Environmental Data and Governance Initiative. (2021). Visualizing changes to US federal environmental agency websites, 2016–2020. PLOS ONE, 16(2), e0246450.[ https://doi.org/10.1371/journal.pone.0246450](https://doi.org/10.1371/journal.pone.0246450)
|
||||||
|
|
||||||
|
Sens. Markey, Hirono and Rep. Adams Introduce Legislation to Promote Conservation and Preservation of Government and Historic Records. (n.d.). Retrieved January 3, 2025, from[ https://www.markey.senate.gov/news/press-releases/sens-markey-hirono-and-rep-adams-introduce-legislation-to-promote-conservation-and-preservation-of-government-and-historic-records](https://www.markey.senate.gov/news/press-releases/sens-markey-hirono-and-rep-adams-introduce-legislation-to-promote-conservation-and-preservation-of-government-and-historic-records)
|
||||||
|
|
||||||
|
Sisak, M. R., Colvin, J., & Whitehurst, L. (2023, June 10). A timeline of events leading to Donald Trump’s indictment in the classified documents case. AP News.[ https://apnews.com/article/trump-documents-investigation-timeline-087f0c9a8368bb983a16b67dd31dcd4c](https://apnews.com/article/trump-documents-investigation-timeline-087f0c9a8368bb983a16b67dd31dcd4c)
|
||||||
|
|
||||||
|
Stein, R. (2024, November 12). With Trump coming into power, the NIH is in the crosshairs. NPR.[ https://www.npr.org/2024/11/12/nx-s1-5183014/trump-election-2024-nih-rfk](https://www.npr.org/2024/11/12/nx-s1-5183014/trump-election-2024-nih-rfk)
|
||||||
|
|
||||||
|
Sunlight Foundation. (n.d.). How federal agencies are quietly removing government Web resources, and why it matters. Retrieved January 3, 2025, from[ https://sunlightfoundation.com/2017/11/15/how-federal-agencies-are-quietly-removing-web-resources-and-why-it-matters/](https://sunlightfoundation.com/2017/11/15/how-federal-agencies-are-quietly-removing-web-resources-and-why-it-matters/)
|
||||||
|
|
||||||
|
Tirrell, C., Senier, L., Wylie, S. A., Alder, C., Poudrier, G., DiValli, J., Beck, M., Nost, E., Brackett, R., & Gehrke, G. (2020). Learning in Crisis: Training students to monitor and address irresponsible knowledge construction by U.S. federal agencies under Trump. Engaging Science, Technology, and Society, 6, 81–93.[ https://doi.org/10.17351/ests2020.313](https://doi.org/10.17351/ests2020.313)
|
||||||
|
|
||||||
|
Vinik, D. (2017, July 25). What happened to Trump’s war on data? The Agenda.[ https://www.politico.com/agenda/story/2017/07/25/what-happened-trump-war-data-000481](https://www.politico.com/agenda/story/2017/07/25/what-happened-trump-war-data-000481)
|
||||||
|
|
||||||
|
Williams, R. (2017, January 29). Michigan web developers and archivists join race to back up federal agency data. Michigan Public.[ https://www.michiganpublic.org/environment-science/2017-01-29/michigan-web-developers-and-archivists-join-race-to-back-up-federal-agency-data](https://www.michiganpublic.org/environment-science/2017-01-29/michigan-web-developers-and-archivists-join-race-to-back-up-federal-agency-data)
|
22
stay-in-touch.md
Normal file
|
@ -0,0 +1,22 @@
|
||||||
|
---
|
||||||
|
description: Community Contacts
|
||||||
|
---
|
||||||
|
|
||||||
|
# Stay in Touch
|
||||||
|
|
||||||
|
Because so many existing networks, groups, and institutions are involved, there is no established central communication platform or mailing list. To help connect people and their work, we have compiled a short and growing list of contact information and opportunities to connect with groups.
|
||||||
|
|
||||||
|
Environmental Data Preservation Listservs
|
||||||
|
|
||||||
|
* ProjectARCC (Archivists Responding to Climate Change) - [https://projectarcc.org/get-involved/](https://projectarcc.org/get-involved/)
|
||||||
|
* DLF (Digital Library Federation) Climate Justice Working Group - [https://www.diglib.org/groups/climate-justice-working-group/](https://www.diglib.org/groups/climate-justice-working-group/)
|
||||||
|
* NDSA (National Digital Stewardship Alliance) - [https://ndsa.org/groups/climate-watch/](https://ndsa.org/groups/climate-watch/)
|
||||||
|
|
||||||
|
Discord/Slack Spaces
|
||||||
|
|
||||||
|
* DocNow (Document the Now) Slack: [https://www.docnow.io/](https://www.docnow.io/)
|
||||||
|
|
||||||
|
Community Networks
|
||||||
|
|
||||||
|
* Environmental Data and Governance Initiative (EDGI): [https://envirodatagov.org/volunteer/](https://envirodatagov.org/volunteer/)
|
||||||
|
* DocNow (Archivists Supporting Activists): [https://archivist.docnow.io/#/web/index](https://archivist.docnow.io/#/web/index) 
|
23
what-are-data-rescues.md
Normal file
|
@ -0,0 +1,23 @@
|
||||||
|
---
|
||||||
|
description: General Background
|
||||||
|
---
|
||||||
|
|
||||||
|
# What are Data Rescues
|
||||||
|
|
||||||
|
January 2025 marks a return to uncertainty, with not only the presidential transition but also economic, political, and social tensions at an all-time high. With all these dangers and threats, one might overlook the crucial yet ever-changing situation of at-risk government data. But these threats to information about our government spending, investments, research, education, social support and services, and presidential priorities are not new.
|
||||||
|
|
||||||
|
Let’s think back to 2017, when hundreds of people gathered in rooms small and large, in libraries and classrooms, to sit in front of brightly lit computer screens and race against the political clock. What caused this urgency, you ask? The answer rests on both an [ongoing hazard to data integrity](https://www.politico.com/agenda/story/2017/07/25/what-happened-trump-war-data-000481/) in the form of neglectful data retention practices and a then-new danger in the form of the [unruly data destruction practices](https://www.nytimes.com/2017/03/06/science/donald-trump-data-rescue-science.html?smprod=nytcore-iphone\&smid=nytcore-iphone-share) of the first Trump presidency. After a polarizing 2016 election campaign cycle full of information, misinformation, and disinformation, the incoming administration now had control over countless government data repositories, information resources, and websites.
|
||||||
|
|
||||||
|
On January 20th, 2017, the Trump administration shut down websites, terminated funding sources, and threatened agencies and government workers, confirming what voters recalled as targeted threats against environmental science and climate change programs. For example, in 2017 the Environmental Protection Agency (EPA) received notification that the publicly available data on its website would be removed under the new Trump administration, although the administration backed away from that plan before taking control of the presidency. That same year, EPA staffers were placed under a gag order by the Trump administration and were not allowed to talk to the media. Today we recognize the drastic changes that occurred for environmental data alone.
|
||||||
|
|
||||||
|
Eight years ago, many people felt immense despair, fear, and uncertainty over their public records, many of which aided countless scientific experiments and initiatives aimed at restoring and protecting our natural environments. Luckily, for some records and many people across the world, that same despair and panic turned into action. Across the country, people designed, held, and volunteered their skills and time at data preservation events, many of them titled “[Data Rescues](https://sunlightfoundation.com/2017/02/06/how-data-refuge-works-and-how-you-can-help-save-federal-open-data/).” These in-person gatherings [brought together](https://www.inthelibrarywiththeleadpipe.org/2018/information-preservation/) information professionals, digital archivists, librarians, students, programmers, scientists, community organizers, and volunteers across ages, professions, and backgrounds. Modeled after crowdsourced editing events like the “wiki edit-a-thon” and collaborating with efforts like “Endangered Data Week,” these civic engagement events convened concerned individuals, resulting in more supportive communities.
|
||||||
|
|
||||||
|
Through multiple hour-long sessions held at universities, public libraries, and other public spaces, volunteers participated in web crawling, metadata creation, data backup and uploads, and identification of at-risk federal, state, and local data and resources. While destruction of government records remains illegal at both the federal and state levels, the first Trump administration was frequently noted for intentional destruction of presidential records as well as failure to follow National Archives and Records Administration data retention policies. With the upcoming second term of the Trump administration and repeated threats to many of the same agencies as before, as described in Project 2025 documents and in analyses by outside organizations, the urgency to preserve returns.
|
||||||
|
|
||||||
|
Given these frightening scenarios and risk factors, the Seattle community can work together to triage a potentially disastrous scenario for the United States, the public, and our publicly funded government data. Others like the [Environmental Data and Governance Initiative](https://envirodatagov.org/) (EDGI), [Silencing Science Tracker](https://climate.law.columbia.edu/Silencing-Science-Tracker), and the [End-of-Term Project](https://eotarchive.org/) are continuing the work from 2016 with ongoing monitoring, collecting, describing, and preserving of at-risk data. More recently, federal lawmakers from Massachusetts, Hawai’i, and North Carolina have introduced legislation, the “[Public Archives Resiliency Act](https://www.markey.senate.gov/news/press-releases/sens-markey-hirono-and-rep-adams-introduce-legislation-to-promote-conservation-and-preservation-of-government-and-historic-records),” to safeguard vulnerable government data.
|
||||||
|
|
||||||
|
In an attempt to respond to these risks, the University of Washington Center for Advances in Libraries, Museums, and Archives ([CALMA](https://calma.ischool.uw.edu/)), in collaboration with local Seattle cultural heritage consultants BKS Studio, will host two Data Rescue data preservation events in January. Data Rescues will be held on Wednesday, January 15, from 12:30 to 5:30 pm and Wednesday, January 22, from 12:30 to 5:30 pm at the University of Washington Seattle campus in the [Suzzallo Library](https://lib.uw.edu/suzzallo/) Open Scholarship Commons. All skills, ages, and backgrounds are welcome, and Data Rescue coordinators will be available to assist volunteers with tasks, technology, and questions. Additional information can be found on the [CALMA blog](https://calma.ischool.uw.edu/data-rescue-events-to-preserve-at-risk-government-data/).
|
||||||
|
|
||||||
|
Data Rescue volunteers will participate in the process of preserving online public government data, learn which types of data are most at risk, and begin the work of rapid-response data preservation. Local efforts aim to bring attention to both ongoing and upcoming threats to vital information that will be necessary for continued efforts at restoring and repairing environmental damage and social tension, but the work cannot end there. Even after the inauguration and the Data Rescue events, Data Rescue coordinators will facilitate skill building and commitment to data preservation while opening up dialogue with others in the Seattle community on how our city and state can advocate for data integrity and environmental protection.
|
||||||
|
|
||||||
|
For more information, or for details on how to cover one of the Data Rescue events, please contact [calma@uw.edu](mailto:calma@uw.edu).
|