Dupseek
Dupseek groups files by size, then reads a small chunk from each file in a same-size group and compares the chunks. It splits each group into smaller groups according to these comparisons, then repeats the process with progressively larger chunks, up to a hard-coded size limit. It stops reading a file as soon as that file is alone in its group or has been read completely, which only happens when the file is very likely to have duplicates. The program can remove files, but it asks for confirmation first.
Dupseek aims for maximum efficiency by keeping file reads to a minimum, and it performs much better than similar programs when dealing with groups of large files of the same size. It can be interrupted at any moment. The user is then presented with partial results and, on a group-by-group basis, can either intervene manually or let the program resume reading and comparing.
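The grouping strategy described above can be illustrated with a minimal sketch. This is not Dupseek's code (Dupseek is written in Perl); it is a Python approximation under stated assumptions. The names group_by_size, split_group and find_duplicates are hypothetical, as are the initial chunk size, the doubling of the chunk size, and the value of MAX_CHUNK, since the description does not state the actual limit or growth rule.

```python
import os
from collections import defaultdict

# Assumed cap on chunk size; Dupseek's real hard-coded limit is not stated.
MAX_CHUNK = 1 << 20


def group_by_size(paths):
    """Group paths by file size, keeping only sizes shared by 2+ files."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)
    return {size: group for size, group in by_size.items() if len(group) > 1}


def split_group(paths, size, first_chunk=4096):
    """Split same-size files into groups with identical content.

    Reads one chunk from every file in a group, regroups the files by
    chunk content, and stops reading any file left alone in its group.
    """
    duplicates = []
    pending = [(paths, 0, first_chunk)]  # (group, offset, chunk size)
    while pending:
        group, offset, chunk = pending.pop()
        if offset >= size:
            # Every byte has been compared and the files still match.
            duplicates.append(sorted(group))
            continue
        by_content = defaultdict(list)
        for path in group:
            with open(path, "rb") as handle:
                handle.seek(offset)
                by_content[handle.read(chunk)].append(path)
        for subgroup in by_content.values():
            if len(subgroup) > 1:  # single-element groups are dropped here
                next_chunk = min(chunk * 2, MAX_CHUNK)  # assumed growth rule
                pending.append((subgroup, offset + chunk, next_chunk))
    return duplicates


def find_duplicates(paths):
    """Return sorted lists of paths whose contents are identical."""
    result = []
    for size, group in group_by_size(paths).items():
        result.extend(split_group(group, size))
    return sorted(result)
```

Calling find_duplicates with a list of file paths returns one sorted list per set of identical files. The sketch only reports duplicates; it does not remove anything, and it does not model interruption or partial results.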
Last updated 13 Mar, 2005
About
Leadership
- Antonio Bellezza - Maintainer
Requirements
- Perl (Use Requirement)
- File::Find (Use Requirement)
Versions
1.3
1.3 beta released 2005-03-13
- Released: 13 Mar, 2005
- Code Maturity: Beta
- Source Archive: http://www.beautylabs.it/software/dupseek-1.3.tgz
- Licenses: GPLv2
- Interfaces: Command Line



