There are a number of options. One really useful option is "-i"
(inhibit), which causes relink to do no relinking at all, but only
to look for the duplicates. Combined with the "-v" (verbose) option,
it produces a list of all the duplicates. The "-r" (recursive)
option means to descend into directories. The "-s
When identical files are found, a decision must be made as to which
one is preserved as the "real" file, and which is replaced by a link
to the other.
If two files are actually the same file (i.e., they have the same
device and inode numbers), then they are already linked. The
compared list element is removed and the count of files is
decremented.
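That check is just a matter of comparing inode numbers with stat().
Here is a minimal sketch; the helper name same_file is hypothetical,
not from relink's source. Note that the device numbers must match
too, since inode numbers are unique only within a single filesystem:

    #include <sys/stat.h>

    /* Hypothetical helper: nonzero if the two paths already name the
     * same file, i.e. the same inode on the same device, in which
     * case they are already hard links to each other. */
    int same_file(const char *a, const char *b)
    {
        struct stat sa, sb;

        if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
            return 0;   /* can't stat one; treat as different */
        return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
    }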
We then read and compare the two files, and if they are identical,
we unlink one of them and establish a hard link to the "real" one.
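The comparison might look roughly like the sketch below; the helper
name identical is hypothetical, and a production version would also
check ferror() after the read loop rather than treating every short
read as end-of-file:

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: 1 if the files' contents are
     * byte-for-byte identical, 0 otherwise (or if either file
     * can't be opened). */
    int identical(const char *a, const char *b)
    {
        FILE *fa = fopen(a, "rb");
        FILE *fb = fopen(b, "rb");
        int same = 0;

        if (fa && fb) {
            char bufa[8192], bufb[8192];
            size_t na, nb;

            same = 1;
            do {
                na = fread(bufa, 1, sizeof bufa, fa);
                nb = fread(bufb, 1, sizeof bufb, fb);
                if (na != nb || memcmp(bufa, bufb, na) != 0) {
                    same = 0;   /* lengths or contents differ */
                    break;
                }
            } while (na == sizeof bufa);    /* short read: EOF */
        }
        if (fa) fclose(fa);
        if (fb) fclose(fb);
        return same;
    }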
The criteria for deciding which file is unlinked are as follows: if
one is multiply linked and the other is singly linked, the singly
linked file is discarded and the multiply-linked one gains a new
link. If both are multiply linked, the newer one is discarded and
becomes a link to the older. However, if the "-n" (newer) option is
used, the newer one is kept and the older is replaced by a link.
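Taken together, the replacement step might look like this sketch.
The name relink_pair is hypothetical; treating the case where both
files are singly linked with the same older-versus-newer rule is an
assumption; and a more careful version would link() to a temporary
name and rename() it into place, so a failure between unlink() and
link() can't lose the name:

    #include <unistd.h>
    #include <sys/stat.h>

    /* Hypothetical sketch: decide which of two identical files to
     * keep, unlink the other, and replace it with a hard link.
     * "newer_wins" corresponds to the "-n" option. */
    int relink_pair(const char *a, const char *b, int newer_wins)
    {
        struct stat sa, sb;
        const char *keep, *drop;

        if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
            return -1;

        if (sa.st_nlink > 1 && sb.st_nlink == 1) {
            keep = a;   /* the multiply-linked file wins */
            drop = b;
        } else if (sb.st_nlink > 1 && sa.st_nlink == 1) {
            keep = b;
            drop = a;
        } else {
            /* otherwise keep the older file, or the newer with -n */
            int keep_a = sa.st_mtime <= sb.st_mtime;
            if (newer_wins)
                keep_a = !keep_a;
            keep = keep_a ? a : b;
            drop = keep_a ? b : a;
        }

        if (unlink(drop) != 0)
            return -1;
        if (link(keep, drop) != 0)
            return -1;  /* caveat: the old name is already gone */
        return 0;       /* the old name is now a link to "keep" */
    }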
This program isn't guaranteed to be totally successful. If two files
exist, each with several links, it is possible that not all of the
links to one will be replaced by links to the other; this depends on
the order in which the links are discovered and paired. To catch
such cases, repeat the relink command. In actual experience, such
misses happen less than once per 1000 files. Correcting this would
require a much more complicated algorithm, and it is probably not
worth the bother.
This program uses John Chambers' dbg package for verbosity; you'll
need a copy of dbg to compile it. Or you can replace the D*() and
P*() macros with your own favorite wrappers for fprintf().
Recursion added by Gerry Feldman.
Ported to random computers by John and Gerry. Your mileage may vary.