find-mirror - find the best mirror in a list of mirrors
find-mirror [options] [files...]
Options:
--count=N repeat measurement N times per host
--debug debugging output
--domains=LIST comma-separated list of patterns
--extract[=TYPE] print mirrors, but do not contact them
TYPE can be "urls", "hosts", or empty.
--help show this help message
--ignore-case case-insensitive pattern matching
--jobs=N fork N simultaneous children
--method one of: ping,echo,connect
--ping same as --method=ping [default]
--echo same as --method=echo
--connect same as --method=connect
--pattern=PATTERN use custom perl regex to extract urls
--relaxed match anything that looks like a host
--top[=N] find the best N hosts
--verbose verbose output
--version report version and exit
Examples:
$ find-mirror mirrors.html
$ lynx -dump http://foo.com/mirrors.html | find-mirror -j 4
The latest version can be found at: http://sourceforge.net/projects/find-mirror/
find-mirror is a utility that extracts addresses and urls and ranks them by reachability and data rate. It is meant to be used when you are presented with a list of links to mirrors, ftp sites, etc., and you need to select one (or more) of them.
Since it extracts urls directly from html (and from any arbitrary text), you can find your mirror with very little effort.
Do not specify the initial . character for the domain. Example:
find-mirror --domains=com,net,edu
For more powerful control over the url matching, see option --pattern.
This is the default.
You can achieve the same result with perl's ugly (?i:...) syntax, e.g.:
$ find-mirror -i --domains=com,net,org
$ find-mirror --domains='(?i:com|net|org)' # same thing
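The equivalence of the two commands above can be illustrated with a minimal Python sketch. It is not find-mirror's actual code: the helper name build_domain_filter is hypothetical, and anchoring each pattern between a literal dot and the end of the hostname is an assumption based on the advice to omit the initial . character.

```python
import re

def build_domain_filter(domains, ignore_case=False):
    """Compile a comma-separated list of domain patterns into one regex.

    Assumption: each pattern must follow a literal dot and end the hostname.
    """
    patterns = [p for p in domains.split(",") if p]
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile(r"\.(?:" + "|".join(patterns) + r")$", flags)

# Flag-based and inline-modifier forms should accept the same hosts.
with_flag = build_domain_filter("com,net,org", ignore_case=True)
with_inline = build_domain_filter("(?i:com|net|org)")
```

Both compiled filters accept ftp.Example.COM, while a filter built from com,net,org without the flag rejects it.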
This option can greatly reduce execution time.
Caution: Do not specify --jobs=0 unless you know that the list of mirrors is short. Otherwise, you may create so many processes that your system is overloaded and your measurements are adversely affected (e.g., by processing latency or network flooding).
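The effect of limiting the number of simultaneous jobs can be sketched in Python as follows. This is only an illustration: find-mirror forks child processes, whereas the sketch uses a thread pool as a stand-in, and the names measure_all and measure are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def measure_all(hosts, measure, jobs=4):
    """Call measure(host) for every host, with at most `jobs` running at once.

    `measure` is any callable supplied by the caller (hypothetical here).
    """
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map returns results in the same order as the input hosts.
        results = list(pool.map(measure, hosts))
    return dict(zip(hosts, results))
```

Capping the pool size keeps the number of concurrent measurements bounded, which is the reason for the caution above about unlimited jobs.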
METHOD can be one of:
See ping(1) for details.
Choose the method depending upon how much time you have and how accurate you need the results to be. Here are the methods in order of overall execution time, from quickest to slowest:
ping
echo
connect
download
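As a rough illustration of what a connect-style measurement and the subsequent ranking might look like, here is a Python sketch. It assumes that the connect method times how long it takes to open a TCP connection, which the text above does not spell out; the names time_connect and rank_hosts, the default port, and the use of the mean are all hypothetical choices.

```python
import socket
import time

def time_connect(host, port=80, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None

def rank_hosts(timings, top=1):
    """Return the `top` hosts with the lowest mean time.

    `timings` maps each host to a list of samples; None marks a failed attempt.
    Hosts with no successful sample are left out.
    """
    means = {}
    for host, samples in timings.items():
        good = [s for s in samples if s is not None]
        if good:
            means[host] = sum(good) / len(good)
    return sorted(means, key=means.get)[:top]
```

Repeating the measurement several times per host and averaging, as in rank_hosts, trades longer run time for steadier results.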
( (?:ftp|http):// # protocol specifier
(\w+(\.\w+)+(\.\w+)) # hostname in $1
(:\d+)?/(\S+)?) # optional port and path
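To show what this pattern does and does not match, here is a transliteration into Python's verbose regex syntax. It is a sketch rather than the program's code: the outer capture group is dropped and the inner groups are made non-capturing, so the hostname lands in group 1, and the helper name extract_hosts is hypothetical.

```python
import re

URL_RE = re.compile(r"""
    (?:ftp|http)://          # protocol specifier
    (\w+(?:\.\w+)+\.\w+)     # hostname: at least three dot-separated labels
    (?::\d+)?/(?:\S+)?       # optional port, a required slash, optional path
""", re.VERBOSE)

def extract_hosts(text):
    """Return hostnames in order of first appearance, without duplicates."""
    seen = []
    for match in URL_RE.finditer(text):
        host = match.group(1)
        if host not in seen:
            seen.append(host)
    return seen
```

Note that the pattern as written requires a slash after the host and at least three labels in the hostname, so http://www.example.org/ matches while http://foo.com/ does not.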
$ find-mirror mirrors.html
$ find-mirror list1.dat list2.dat
$ find-mirror < mirrors.txt
$ wget -q -O- http://foo.com/mirrors/ | find-mirror
$ lynx -dump http://foo.com/mirrors/ | find-mirror # same thing
$ find-mirror -j 4 mirrors.html
$ find-mirror --extract < mirrors.html
$ find-mirror --ping < mirrors.html
$ find-mirror --echo < mirrors.html
$ find-mirror --connect < mirrors.html
$ find-mirror --http < mirrors.html
$ find-mirror --relaxed < mirrors.html
$ find-mirror --domains='org,edu' < mirrors.html
$ find-mirror --domains='(pa|nj|ny)\.us' < mirrors.html
$ find-mirror --pattern='(ftp(\.[[:alnum:]-]+)+\.kernel\.org)' < mirrors.html
See BUGS file for details.
John Millaway <john43@temple.edu>