NAME

Findtgps - Locate TGPs (Thumbnail Gallery Posts) for later use with fetchgals


SYNOPSIS

findtgps -in infile -out outfile [options]

findtgps [-help|-version]


DESCRIPTION

Given a file infile containing URLs of TGPs, one per line, findtgps will follow links from those TGPs in order to find others. All found TGPs will be stored in outfile. Both infile and outfile have to be specified; they may be identical, but outfile will be overwritten.

Along with the URL, findtgps will record three numbers in outfile: the number of gallery links that expose the URL, the number of potential gallery links that go through a CGI script, and the number of direct links to galleries. The links through CGI scripts are the most inconvenient to deal with.

The purpose of findtgps is to create an input file for fetchgals(1).


OPTIONS

-known_tgps file
This is a file of known TGPs, one per line, which don't need to be checked again.

-link_threshold num
The number of external links a page needs to contain before we classify it as a TGP. Default is 30.
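
The threshold test can be illustrated with a short Python sketch. This is not the findtgps source; the names LinkCollector, external_links and looks_like_tgp are hypothetical, and the sketch assumes that "external" means a link to a different host and that a page qualifies once it has at least link_threshold such links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links(page_url, html):
    """Return absolute http(s) links that point to a host other than page_url's."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).netloc.lower()
    result = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc.lower() != own_host:
            result.append(absolute)
    return result


def looks_like_tgp(page_url, html, link_threshold=30):
    # Assumption: a page qualifies with at least link_threshold external links.
    return len(external_links(page_url, html)) >= link_threshold
```

A higher threshold rejects more ordinary pages at the cost of missing small TGPs; the default of 30 is the one documented above.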

-fraction_followed num
If we have located a TGP, we will follow every num'th link to find new TGPs. Default is 6. One might be inclined to use -fraction_followed 1, but this is not advisable since TGPs usually link (through a CGI script) to only a small number of other TGPs.

-threads num
The number of parallel threads used for the task. Defaults to 8.
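
How link sampling and the thread count interact can be sketched in Python. This is an illustration under assumptions, not the findtgps implementation: links_to_follow and check_in_parallel are hypothetical names, and the sketch assumes sampling starts at the first link on the page, which the text above does not specify.

```python
from concurrent.futures import ThreadPoolExecutor


def links_to_follow(links, fraction_followed=6):
    """Pick every fraction_followed-th link from a page's link list."""
    # Assumption: sampling starts with the first link; the real offset is undocumented.
    return links[::fraction_followed]


def check_in_parallel(urls, check, threads=8):
    """Call check(url) for each URL on a pool of worker threads.

    Results are returned in the same order as the input URLs.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(check, urls))
```

With the default of 6, a page with 60 links yields 10 candidates to check, so larger values trade coverage for fewer requests.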

-help
Prints the manual page and exits.

-version
Prints the version of findtgps and exits.


BUGS AND FEEDBACK

Bug reports, feedback and patches for findtgps, as well as improved TGP lists, can be posted to any Usenet group.


LICENSE

Findtgps is in the public domain.


SEE ALSO

fetchgals(1)

