mirror of
https://github.com/UW-CALMA/datarescue.git
synced 2025-02-22 01:31:29 -08:00
42 lines
2.5 KiB
Markdown
---
description: >-
  Information about tools needed for tasks or other important digital
  preservation work
---

# Tools

For Authenticity and Verification

* Make signed BagIt files: [https://github.com/harvard-lil/bag-nabit](https://github.com/harvard-lil/bag-nabit)
* Make bags: [https://github.com/WeAreAVP/fixity](https://github.com/WeAreAVP/fixity) or [https://github.com/LibraryOfCongress/bagger](https://github.com/LibraryOfCongress/bagger)
* Create checksums: [https://corz.org/windows/software/checksum/](https://corz.org/windows/software/checksum/)
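
The checksum step in the list above can also be sketched with Python's standard library, for cases where the linked tools are not available. This is a minimal illustration, not a replacement for those tools: the function names `sha256_of_file` and `manifest_lines` and the default chunk size are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_lines(directory):
    """Yield '<digest>  <relative path>' lines for every file under directory."""
    root = Path(directory)
    # Sort the paths so the listing is stable between runs.
    for file_path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = file_path.relative_to(root).as_posix()
        yield f"{sha256_of_file(file_path)}  {relative}"
```

Recomputing a digest later and comparing it with the recorded value shows whether a file has changed. The two-column layout resembles the manifest files used in BagIt bags, but this sketch does not produce a complete bag.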

Metadata Creation and Description

* Analyze files and produce basic metadata: [https://coptr.digipres.org/index.php/NARA\_File\_Analyzer\_and\_Metadata\_Harvester](https://coptr.digipres.org/index.php/NARA_File_Analyzer_and_Metadata_Harvester)
* Index web archive files:

For Data and Web Archive Capturing and Harvesting

* Conifer tool (website interactions): [https://conifer.rhizome.org/\_faq](https://conifer.rhizome.org/_faq)
* Browser extension web crawl (single page): [https://warcreate.com/](https://warcreate.com/)
* Browser extension to add a webpage to the Internet Archive: [https://web.archive.org/](https://web.archive.org/)
* Copy websites (HTTrack): [http://www.httrack.com/](http://www.httrack.com/)
* Crawl website (Heritrix): [https://sourceforge.net/projects/heritrix.mirror/](https://sourceforge.net/projects/heritrix.mirror/)
* Capture backend of websites: [https://deeparc.sourceforge.net/](https://deeparc.sourceforge.net/)

**For Website Monitoring and Assessments**

* Estimate website size: [https://github.com/izkreny/website-size](https://github.com/izkreny/website-size)
* Monitor websites in bulk (thousands): [https://github.com/edgi-govdata-archiving/web-monitoring](https://github.com/edgi-govdata-archiving/web-monitoring)
* Monitor websites (single or small batch): [https://distill.io/](https://distill.io/)
* Detect website changes: [https://github.com/openpreserve/pagelyzer](https://github.com/openpreserve/pagelyzer)
* Assess websites (note differences in stories): [https://github.com/DocNow/diffengine](https://github.com/DocNow/diffengine)
* Assess websites (compare two pages): [http://pagelyzer.openpreservation.org/](http://pagelyzer.openpreservation.org/)
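
As a rough illustration of what comparing two versions of a page involves, the sketch below diffs two saved text snapshots with Python's standard `difflib`. It does not reflect how pagelyzer, diffengine, or the other listed tools work internally. The function name `summarize_changes` is hypothetical, and the snapshots are assumed to be plain text already extracted from the page.

```python
import difflib


def summarize_changes(old_text, new_text):
    """Return a similarity ratio and the added/removed lines between two snapshots."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # ratio() is 1.0 for identical line sequences and falls toward 0.0 as they diverge.
    ratio = difflib.SequenceMatcher(None, old_lines, new_lines).ratio()
    # ndiff prefixes removed lines with "- " and added lines with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are filtered out here.
    changes = [
        line
        for line in difflib.ndiff(old_lines, new_lines)
        if line.startswith(("+ ", "- "))
    ]
    return ratio, changes
```

A low ratio or a long list of changes between two captures of the same URL suggests the page is worth a closer look or a fresh capture.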

General Lists of Digital Preservation Tools

* Community Owned digital Preservation Tool Registry (COPTR): [https://www.digipres.org/tools/by-function/#createorreceive(acquire):webcrawl](https://www.digipres.org/tools/by-function/#createorreceive\(acquire\):webcrawl)