NAME

Fetchgals - Locate and (optionally) download porn thumbnail galleries


SYNOPSIS

fetchgals [-find_galleries|-create_html|-download_pics] [options]

fetchgals [-help|-version]


DESCRIPTION

A TGP (Thumbnail Gallery Post) is a website that links to many thumbnail galleries, sites with free porn. In one kind of TGP, the so-called CJ sites (for ``circle jerk''), the viewer is sometimes redirected to other TGPs rather than to the gallery they intend to visit. Many of these CJ sites also resize and hijack the browser, try to install adware or dialers on the viewer's computer, open consoles and popups, etc.

Fetchgals is designed to work around these annoyances of CJ sites, in order to achieve a satisfying masturbatory experience. It reads TGP URLs from a configuration file, visits the TGPs one by one, and determines gallery URLs as well as a preview thumbnail for each gallery. This information is saved to a file and can then be used in two ways: to download all the pictures from all the galleries to the local computer, or to create local HTML pages with preview thumbnails linking to the galleries.


OPTIONS

At least one of the options -find_galleries, -create_html or -download_pics must be present.

-find_galleries
Reads a list of TGP URLs from the file specified with -tgp_file (default: ./tgps) and tries to locate all thumbnail galleries that are linked from these TGPs. The URLs of these galleries, together with URLs for preview thumbnails, are then written to the file specified with -gallery_file (default: ./galleries).

-download_pics
Reads a list of galleries from the file specified with -gallery_file (default: ./galleries) and downloads all the pictures linked from those galleries. The pictures are stored in the directory specified with -pics_dir (default: ./porn_pics).

You need to run -find_galleries at least once before you can run -download_pics.

The URLs of the galleries all of whose pictures have been completely downloaded are recorded in the file specified with -processed_downloads_file (default: ./gals_downloaded). Because of this, the program never downloads the same gallery twice, and it is able to resume a download after it was interrupted.

-create_html
Reads a list of galleries and associated preview thumbnails from the file specified with -gallery_file (default: ./galleries) and creates a set of HTML files that link to these galleries. The HTML files are created in the directory specified with -html_dir (default: ./porn_html). Neither the thumbnails nor the galleries are downloaded, so this action is much faster than -download_pics and uses very little local storage space.

You need to run -find_galleries at least once before you can run -create_html.

The URLs of the galleries for which local HTML files have been created are recorded in the file specified with -processed_html_file (default: ./gals_processed_html). This is so that the program only produces HTML files for the most recently found galleries.

The following options affect the behavior of the main actions.

-tgp_file file
The file that contains URLs of TGPs, one per line; lines starting with # are comments and are ignored. Defaults to ./tgps. This option is only needed by the -find_galleries action.

Two TGP files are distributed with fetchgals. Others can be generated with the program findtgps(1).

-extensive_search
This option affects the operation of the -find_galleries action. Without it, only those links from TGPs are investigated whose URL exposes the true location of the gallery. If -extensive_search is switched on, other links are followed as well; they have to be ``tried out'' to find the gallery's location, and often this has to be repeated several times until the gallery's URL is found. The -find_galleries action is much slower when -extensive_search is switched on, and for most users it is probably not worth it.

-attempts_per_link num
The number of times a link to a gallery is followed (default: 5). Many TGPs use CGI scripts as links to the galleries, and these scripts frequently redirect to another TGP rather than to the gallery. If this happens to us, we try the link again, until we find a bona fide gallery or we exceed num attempts. This option only has an effect if -extensive_search is being used.

-sleep_time num
The average time (in seconds) between accesses to the same TGP (default is 7). This is an average; the actual waiting time is randomized. The purpose is to avoid our bot being detected. We cycle through the TGPs so that we don't have to sleep often.

This option only affects the -find_galleries action if the -extensive_search option is switched on. Without -extensive_search, we never have to sleep. If you get many ``bot detected'' messages during the -find_galleries action, you need to increase the sleep time.
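
The randomized waiting time can be illustrated with a minimal Python sketch. It is not fetchgals's own code: the function name randomized_delay and the choice of a uniform distribution between half and one and a half times the average are assumptions made here only to show a delay whose mean equals -sleep_time.

```python
import random


def randomized_delay(average, rng=random):
    """Return a delay in seconds whose expected value is `average`.

    Assumption: the delay is drawn uniformly from
    [0.5 * average, 1.5 * average]; the real distribution used by
    fetchgals is not documented.
    """
    return rng.uniform(0.5 * average, 1.5 * average)


if __name__ == "__main__":
    # With the default -sleep_time of 7, delays fall between 3.5 and 10.5 s.
    print(randomized_delay(7))
```

A caller would pass the result to time.sleep() before accessing the same TGP again.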

-gallery_file file
The file that contains URLs of galleries and preview thumbnails, one gallery and one thumbnail per line, separated by a space. Defaults to ./galleries. Before the -find_galleries action starts to collect galleries, it reads this file so that it won't have to test a previously seen gallery. It then appends newly found galleries. The file is read by the -create_html and -download_pics actions.

If the file doesn't exist, then it will be created.

-use_cookies
With this option, fetchgals will save the cookies it receives and return them to TGPs and galleries. By default, cookies are not used.

-pics_dir dir
The directory where the -download_pics action stores the downloaded images. Defaults to ./porn_pics. The directory will be created if it doesn't exist. The pictures receive names indicating their origin and are stored in subdirectories a-z depending on the first letter of their name.

-processed_downloads_file file
The file where the URLs of downloaded galleries are recorded. Defaults to ./gals_downloaded. This is only used by the -download_pics action, to avoid repeated downloads of the same gallery. The file will be created if it doesn't exist.

-types regexp
A regular expression matching the extensions of files you want to download. Defaults to jpg|jpeg|gif|png|mpg|mpeg|avi|asf|mov|wmv. The match is case-insensitive, so you don't need to list the upper-case variants. Note that if you want to download videos, you need to increase -max_img_size.

-min_img_size num
The minimum size (in bytes) of image files to download (default: 15000). This is only used by the -download_pics action; files smaller than this threshold are taken to be thumbnails or advertising banners and not downloaded.

-max_img_size num
The maximum size (in bytes) of image files to download (default: 250000). This is only used by the -download_pics action; files larger than this threshold are not downloaded. If you are interested in movies, you want to set this to 10000000.
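
How -types, -min_img_size and -max_img_size combine can be sketched in Python as follows. This is an illustration under assumptions, not the program's implementation: the function name wanted is hypothetical, and it is assumed that the regular expression is matched against the extension at the very end of the URL.

```python
import re

# Default values taken from the option descriptions above.
DEFAULT_TYPES = "jpg|jpeg|gif|png|mpg|mpeg|avi|asf|mov|wmv"


def wanted(url, size, types=DEFAULT_TYPES, min_size=15000, max_size=250000):
    """Return True if a file with this URL and size (bytes) passes the filters."""
    # Assumption: the extension is the text after the last period, at the
    # end of the URL. The match is case-insensitive, as documented.
    pattern = re.compile(r"\.(?:%s)$" % types, re.IGNORECASE)
    if not pattern.search(url):
        return False
    # Files smaller than the minimum or larger than the maximum are skipped.
    return min_size <= size <= max_size


if __name__ == "__main__":
    print(wanted("http://host.example/set/01.JPG", 20000))
    # A 5 MB video only passes once the maximum size is raised.
    print(wanted("http://host.example/clip.mpg", 5000000, max_size=10000000))
```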

-max_disk_usage num
The image download is stopped if the disk usage exceeds num% (default: 95). This percentage is based on the total disk space available to non-superusers. Your system may reserve some additional disk space for the superuser.
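
One plausible way to compute a usage percentage relative to the space available to non-superusers is shown below. This is a sketch only; whether fetchgals computes the figure exactly this way is an assumption, and both function names are hypothetical. The formula is the one commonly used by df: used blocks divided by used blocks plus blocks available to unprivileged users.

```python
import os


def usage_percent(total_blocks, free_blocks, avail_blocks):
    """Disk usage in percent, relative to space usable by non-superusers.

    total_blocks: all blocks on the file system
    free_blocks:  free blocks, including those reserved for the superuser
    avail_blocks: free blocks available to unprivileged users
    """
    used = total_blocks - free_blocks
    usable = used + avail_blocks
    if usable == 0:
        return 0.0
    return 100.0 * used / usable


def usage_percent_for_path(path):
    """Same figure for the file system holding `path` (Unix only)."""
    st = os.statvfs(path)
    return usage_percent(st.f_blocks, st.f_bfree, st.f_bavail)


if __name__ == "__main__":
    # A download loop could stop once this exceeds the -max_disk_usage value.
    print(usage_percent_for_path("."))
```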

-pic_file_template template
The template to be used for the filenames of the downloaded images. This is a string containing the following special placeholders:
%h
The host name of the gallery, with leading ``www.'' removed.

%i
The first letter of the host name, or ``z'' if that is a digit.

%p
The path of the gallery's URL, with all slashes replaced by hyphens.

%d
The current date.

%t
The current time.

%n
The number of the image within the gallery.

%c
The filetype: either ``video'' or ``image''. Note that if you want fetchgals to download videos, you need to increase the default maximal file size with -max_img_size.

%e
The filename extension giving the file's type. Starts with a period.

%%
A literal %-sign.

The default template is %i/%h-%p-%n%e. Every template should contain %h, %p and %n, or else some images run the risk of being overwritten.
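
The placeholder expansion can be sketched in Python as follows. This is a hedged illustration rather than the program's code: the function name expand_template is hypothetical, only %h, %i, %p, %n, %e and %% are handled (%d, %t and %c are omitted because their exact formats are not documented here), and it is assumed that %i is taken from the host name after ``www.'' has been removed.

```python
import re
from urllib.parse import urlsplit


def expand_template(template, gallery_url, number, extension):
    """Expand a subset of the -pic_file_template placeholders."""
    parts = urlsplit(gallery_url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    first = host[:1]
    values = {
        "h": host,
        # First letter of the host name, or "z" if that is a digit.
        "i": "z" if first.isdigit() else first,
        # All slashes in the path are replaced by hyphens.
        "p": parts.path.replace("/", "-"),
        "n": str(number),
        "e": extension,  # expected to start with a period
        "%": "%",
    }
    # Placeholders not handled in this sketch are left unchanged.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), template)


if __name__ == "__main__":
    print(expand_template("%i/%h-%p-%n%e",
                          "http://www.example.com/gals/set1", 3, ".jpg"))
```

Because the slash replacement is applied literally, the leading slash of the path also becomes a hyphen in this sketch; the real program may treat it differently.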

-num_download_processes num
The number of parallel processes used to download pictures (default: 8). This only affects the -download_pics action.

We use parallel processes because today's cable and DSL internet connections allow much faster downloads than a single porn host can provide. Experiment with this number until your download speeds approach the bandwidth of your internet connection.

-html_dir dir
The directory where the -create_html action creates the HTML files. Defaults to ./porn_html. The directory will be created if it doesn't exist. The file index.html in that directory links to the other created HTML files.

-thumbs_per_row num
The number of thumbnail pictures that will be put in one horizontal row of the resulting HTML file. Defaults to 6. Only used by the -create_html action.

-rows_per_file num
The number of rows that will be put in each of the resulting HTML files. Defaults to 10. Only used by the -create_html action.
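
The effect of -thumbs_per_row and -rows_per_file on the layout can be sketched as a simple chunking step. The Python below is illustrative only; the function name paginate is hypothetical and the actual HTML generation is not shown.

```python
def paginate(items, thumbs_per_row=6, rows_per_file=10):
    """Split items into pages, each page being a list of rows.

    With the defaults, every HTML file holds at most 6 * 10 = 60 thumbnails.
    """
    per_file = thumbs_per_row * rows_per_file
    pages = []
    for start in range(0, len(items), per_file):
        page_items = items[start:start + per_file]
        # Cut the items of this page into rows of thumbs_per_row each.
        rows = [page_items[i:i + thumbs_per_row]
                for i in range(0, len(page_items), thumbs_per_row)]
        pages.append(rows)
    return pages


if __name__ == "__main__":
    pages = paginate(list(range(130)))
    print(len(pages), [len(page) for page in pages])
```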

-processed_html_file file
The file where the URLs of galleries that have been processed with the -create_html action are recorded. Defaults to ./gals_processed_html. This is only used by the -create_html action, to avoid creating HTML files for the same gallery twice. The file will be created if it doesn't exist.

-help
Prints the manual page and exits.

-version
Prints the version of fetchgals and exits.

All options can be abbreviated to a unique prefix.
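
Unique-prefix matching of this kind can be sketched in Python as follows. The function name resolve_option is hypothetical and the sketch does not claim to mirror the option parser that fetchgals actually uses; the option names in the list are taken from this manual page.

```python
OPTIONS = [
    "-find_galleries", "-create_html", "-download_pics", "-tgp_file",
    "-extensive_search", "-attempts_per_link", "-sleep_time",
    "-gallery_file", "-use_cookies", "-pics_dir",
    "-processed_downloads_file", "-types", "-min_img_size",
    "-max_img_size", "-max_disk_usage", "-pic_file_template",
    "-num_download_processes", "-html_dir", "-thumbs_per_row",
    "-rows_per_file", "-processed_html_file", "-help", "-version",
]


def resolve_option(prefix, options=OPTIONS):
    """Return the single option that starts with `prefix`.

    Raises ValueError if the prefix matches no option or more than one.
    """
    if prefix in options:
        return prefix  # an exact match always wins
    matches = [name for name in options if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown option: " + prefix)
    raise ValueError("ambiguous option %s: %s" % (prefix, ", ".join(matches)))


if __name__ == "__main__":
    print(resolve_option("-find"))
```

Under this scheme a prefix such as -processed_ would be rejected as ambiguous, because it matches both -processed_downloads_file and -processed_html_file.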


FILES

./tgps
The main input file, containing TGP URLs, one per line. The ``http://'' is optional. Lines starting with # are ignored. The default file name can be overridden with the -tgp_file option.
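
The file format described above could be read with a few lines of Python. This is a minimal sketch, not fetchgals's own parser: the function name parse_tgp_lines is hypothetical, and skipping blank lines and accepting an ``https://'' prefix are assumptions that go beyond what this page states.

```python
def parse_tgp_lines(lines):
    """Return the TGP URLs found in an iterable of text lines."""
    urls = []
    for line in lines:
        line = line.strip()
        # Lines starting with '#' are comments; blank lines are skipped
        # as well (an assumption of this sketch).
        if not line or line.startswith("#"):
            continue
        # The "http://" is optional in the file, so add it when missing.
        if not line.startswith(("http://", "https://")):
            line = "http://" + line
        urls.append(line)
    return urls


if __name__ == "__main__":
    sample = ["# my TGP list", "tgp.example/index.html", "", "http://other.example/"]
    print(parse_tgp_lines(sample))
```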

./galleries
This file is created by fetchgals and contains gallery URLs and corresponding preview thumbnail URLs and descriptive text, one gallery and one thumbnail and one text per line, separated by '|'. The purpose of the file is to record the galleries that have already been found, so that they are not processed again during a later invocation of fetchgals. The file's contents may also be useful for other programs. The default file name can be overridden with the -gallery_file option.

./gals_downloaded
This file contains a list of those galleries whose pictures have been completely downloaded to the local computer. This makes it possible to interrupt a download and resume it later. Also, if fetchgals is run again and finds new galleries, only those not contained in this file will be downloaded by the -download_pics action. The default file name can be overridden with the -processed_downloads_file option.

./gals_processed_html
This file contains a list of those galleries for which HTML pages have been created on the local computer. If fetchgals is run again and finds new galleries, only those not contained in this file will be processed by the -create_html action. The default file name can be overridden with the -processed_html_file option.


BUGS AND FEEDBACK

Bug reports, feedback and patches for fetchgals as well as improved TGP lists can be posted to any usenet group.


LICENSE

Fetchgals is in the public domain.


SEE ALSO

findtgps(1)

