mirror of
https://github.com/UW-CALMA/datarescue.git
synced 2025-02-21 09:11:31 -08:00
| description |
|---|
| Information about tools needed for tasks or other important digital preservation work |
Tools
For Authenticity and Verification
- Making signed BagIt files: https://github.com/harvard-lil/bag-nabit
- Make Bags: https://github.com/WeAreAVP/fixity or https://github.com/LibraryOfCongress/bagger
- Create checksums: https://corz.org/windows/software/checksum/
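For readers who want to see what checksum creation involves before picking a tool, here is a minimal sketch using only the Python standard library. It is not how any of the tools above work internally. The function names (`sha256_of_file`, `build_manifest`, `write_manifest`) and the choice of SHA-256 are assumptions made for illustration. The output format only loosely resembles a BagIt payload manifest.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (relative POSIX path) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def write_manifest(root, manifest_path):
    """Write one '<digest>  <relative path>' line per file (hypothetical layout)."""
    manifest = build_manifest(root)
    with open(manifest_path, "w", encoding="utf-8") as out:
        for rel_path, digest in manifest.items():
            out.write(f"{digest}  {rel_path}\n")
    return manifest
```

Running `build_manifest` again later and comparing the result with the stored manifest shows whether any file has changed. Keep the manifest file outside the directory being hashed so that it is not picked up on the next run.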
Metadata Creation and Description
- Analyze file & produce basic metadata: https://coptr.digipres.org/index.php/NARA_File_Analyzer_and_Metadata_Harvester
- Index web archive files:
For Data and Web Archive Capturing and Harvesting
- Conifer tool (website interactions): https://conifer.rhizome.org/_faq
- Browser extension web crawl (single page): https://warcreate.com/
- Browser extension to add a webpage to the Internet Archive: https://web.archive.org/
- Copy websites (HTTrack): http://www.httrack.com/
- Crawl website (Heritrix): https://sourceforge.net/projects/heritrix.mirror/
- Capture backend of websites: https://deeparc.sourceforge.net/
For Website Monitoring and Assessments
- Estimate website size: https://github.com/izkreny/website-size
- Monitor websites in bulk (thousands): https://github.com/edgi-govdata-archiving/web-monitoring
- Monitor websites (single or small batch): https://distill.io/
- Detect website changes: https://github.com/openpreserve/pagelyzer
- Assess websites (note differences in stories): https://github.com/DocNow/diffengine
- Assess websites (compare two pages): http://pagelyzer.openpreservation.org/
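The tools above use their own, more capable comparison methods. As a rough illustration of the basic idea of comparing two saved copies of a page, the sketch below does a plain line-by-line text diff with Python's standard `difflib` module. The function name `changed_lines` is hypothetical, and the approach assumes you already have both versions of the page as text.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return the lines that differ between two text snapshots.

    Lines removed from the old snapshot are prefixed with "- ",
    lines added in the new snapshot are prefixed with "+ ".
    """
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


if __name__ == "__main__":
    old_page = "Title\nData last updated: January\nContact us"
    new_page = "Title\nData last updated: February\nContact us"
    for line in changed_lines(old_page, new_page):
        print(line)
```

An empty result means the two snapshots are textually identical. A line-based diff like this is sensitive to trivial markup changes, which is one reason dedicated monitoring tools exist.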
General Lists of Digital Preservation Tools
- Community Owned digital Preservation Tool Registry (COPTR): https://www.digipres.org/tools/by-function/#createorreceive(acquire):webcrawl