Fetchgals - Locate and (optionally) download porn thumbnail galleries
fetchgals [-find_galleries|-create_html|-download_pics] [options]
fetchgals [-help|-version]
A TGP (Thumbnail Gallery Post) is a website that links to many
thumbnail galleries, sites with free porn. In one kind of TGP, the
so-called CJ sites (for ``circle jerk''), the viewer is sometimes
redirected to other TGPs rather than to the gallery they intend to
visit. Many of these CJ sites also resize and hijack the browser, try
to install adware or dialers on the viewer's computer, open consoles
and popups, etc.
Fetchgals is designed to work around these annoyances of CJ sites,
in order to achieve a satisfying masturbatory experience. It reads
TGP URLs from a configuration file, visits the TGPs one by one, and
determines gallery URLs as well as a preview thumbnail for each
gallery. This information is saved to a file and can then be used in
two ways: to download all the pictures from all the galleries to the
local computer, or to create local HTML pages with preview thumbnails
linking to the galleries.
At least one of the options -find_galleries, -create_html or
-download_pics must be present.
- -find_galleries
-
Reads a list of TGP URLs from the file specified with -tgp_file
(default: ./tgps) and tries to locate all thumbnail galleries
that are linked from these TGPs. The URLs of these galleries, together with
URLs for preview thumbnails, are then written to the file specified
with -gallery_file (default: ./galleries).
- -download_pics
-
Reads a list of galleries from the file specified with
-gallery_file (default: ./galleries) and downloads all the
pictures linked from those galleries. The pictures are stored in the
directory specified with -pics_dir (default: ./porn_pics).
-
You need to run -find_galleries at least once before you can run
-download_pics.
-
The URL of each gallery whose pictures have all been downloaded is
recorded in the file specified with -processed_downloads_file
(default: ./gals_downloaded). Because of this, the program never
downloads the same gallery twice, and it can resume a download after
an interruption.
- -create_html
-
Reads a list of galleries and associated preview thumbnails from the
file specified with -gallery_file (default: ./galleries) and
creates a set of HTML files that link to these galleries. The HTML
files are created in the directory specified with -html_dir
(default: ./porn_html). Neither the thumbnails nor the galleries
are downloaded, so this action is much faster than -download_pics
and uses very little local storage space.
-
You need to run -find_galleries at least once before you can run
-create_html.
-
The URLs of the galleries for which local HTML files have been created
are recorded in the file specified with -processed_html_file
(default: ./gals_processed_html). This is so that the program only
produces HTML files for the most recently found galleries.
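The bookkeeping that -download_pics and -create_html do with their
processed-gallery files can be pictured roughly as follows. This is a
minimal sketch in Python, not fetchgals' own code; the function name
pending_galleries is hypothetical, and it assumes the gallery URLs have
already been read from the two files into lists.

```python
def pending_galleries(found, processed):
    """Return the galleries in `found` that are not yet in `processed`.

    Order is preserved, and duplicates inside `found` are dropped
    (an assumption made for this sketch).
    """
    done = set(processed)
    pending = []
    for url in found:
        if url not in done:
            pending.append(url)
            done.add(url)
    return pending


if __name__ == "__main__":
    found = ["http://a.example/g1", "http://b.example/g2", "http://c.example/g3"]
    processed = ["http://b.example/g2"]
    for url in pending_galleries(found, processed):
        print(url)
```

After a gallery has been handled completely, its URL would be appended
to the processed file, so an interrupted run picks up with the
remaining galleries.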
The following options affect the behavior of the main actions.
- -tgp_file file
-
The file that contains URLs of TGPs, one per line; lines starting with
# are comments and are ignored. Defaults to ./tgps. This option is
only needed by the -find_galleries action.
-
Two TGP files are distributed with fetchgals. Others can be
generated with the program findtgps(1).
- -extensive_search
-
This option affects the operation of the -find_galleries action.
Without it, the only links from TGPs that are investigated are those
that expose the true location of the gallery in the URL. If
-extensive_search is switched on, other links are followed as well;
they have to be ``tried out'' to find the gallery's location, and
often this has to be repeated several times until the gallery's URL is
found. The -find_galleries action is much slower when
-extensive_search is switched on, and for most users it is probably
not worth it.
- -attempts_per_link num
-
The number of times a link to a gallery is followed (default: 5). Many
TGPs use CGI scripts as links to the galleries, and these scripts
frequently redirect to another TGP rather than to the gallery. If this
happens to us, we try the link again, until we find a bona fide
gallery or we exceed num attempts. This option only has an effect if
-extensive_search is being used.
- -sleep_time num
-
The average time (in seconds) between accesses to the same TGP
(default is 7). This is an average; the actual waiting time is
randomized. The purpose is to avoid our bot being detected. We cycle
through the TGPs so that we don't have to sleep often.
-
This option only affects the -find_galleries action if the
-extensive_search option is switched on. Without
-extensive_search, we never have to sleep. If you get many ``bot
detected'' messages during the -find_galleries action, you need to
increase the sleep time.
- -gallery_file file
-
The file that contains URLs of galleries and preview thumbnails, one
gallery and one thumbnail per line, separated by a space. Defaults to
./galleries. Before the -find_galleries action starts to collect
galleries, it reads this file so that it won't have to test a
previously seen gallery. It then appends newly found galleries. The
file is read by the -create_html and -download_pics actions.
-
If the file doesn't exist, then it will be created.
- -use_cookies
-
With this option, fetchgals will save the cookies it receives
and return them to TGPs and galleries. By default, cookies are not
used.
- -pics_dir dir
-
The directory where the -download_pics action stores the downloaded
images. Defaults to ./porn_pics. The directory will be created if it
doesn't exist. The pictures receive names indicating their origin and
are stored in subdirectories a-z depending on the first letter of
their name.
- -processed_downloads_file file
-
The file where the URLs of downloaded galleries are recorded. Defaults
to ./gals_downloaded. This is only used by the -download_pics
action, to avoid repeated downloads of the same gallery. The file will
be created if it doesn't exist.
- -types regexp
-
A regular expression matching the extensions of files you want to
download. Defaults to jpg|jpeg|gif|png|mpg|mpeg|avi|asf|mov|wmv.
The match is case-insensitive, so you don't need to list the
upper-case variants. Note that if you want to download videos, you
need to increase -max_img_size.
- -min_img_size num
-
The minimum size (in bytes) of image files to download (default:
15000). This is only used by the -download_pics action; files
smaller than this threshold are taken to be thumbnails or advertising
banners and not downloaded.
- -max_img_size num
-
The maximum size (in bytes) of image files to download (default:
250000). This is only used by the -download_pics action; files
larger than this threshold are not downloaded. If you are interested
in movies, you want to set this to 10000000.
- -max_disk_usage num
-
The image download is stopped if the disk usage exceeds num%
(default: 95). This percentage is based on the total disk space
available to non-superusers. Your system may reserve some additional
disk space for the superuser.
- -pic_file_template template
-
The template to be used for the filenames of the downloaded images.
This is a string containing the following special placeholders:
- %h
-
The host name of the gallery, with leading ``www.'' removed.
- %i
-
The first letter of the host name, or ``z'' if that is a digit.
- %p
-
The path of the gallery's URL, with all slashes replaced by hyphens.
- %d
-
The current date.
- %t
-
The current time.
- %n
-
The number of the image within the gallery.
- %c
-
The filetype: either ``video'' or ``image''. Note that if you want
fetchgals to download videos, you need to increase the default
maximal file size with -max_img_size.
- %e
-
The filename extension giving the file's type. Starts with a period.
- %%
-
A literal %-sign.
The default template is %i/%h-%p-%n%e. Every template should contain
%h, %p and %n, or else some images run the risk of being overwritten.
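To make the placeholder substitution concrete, here is a minimal
sketch in Python. It is not fetchgals' own implementation; the function
name expand_template and all sample values (host, path, image number)
are made up for illustration, and the exact form fetchgals gives to %p
and %n may differ.

```python
import re


def expand_template(template, values):
    """Replace %x placeholders using `values`; '%%' yields a literal '%'."""
    def substitute(match):
        key = match.group(1)
        if key == "%":
            return "%"
        if key not in values:
            raise KeyError(f"unknown placeholder %{key}")
        return values[key]

    return re.sub(r"%(.)", substitute, template)


# Hypothetical values for one image of one gallery.
values = {
    "h": "example.com",    # host name without leading "www."
    "i": "e",              # first letter of the host name
    "p": "galleries-001",  # URL path with slashes turned into hyphens
    "n": "3",              # number of the image within the gallery
    "e": ".jpg",           # extension, starting with a period
}

if __name__ == "__main__":
    print(expand_template("%i/%h-%p-%n%e", values))
```

With these sample values the default template expands to
e/example.com-galleries-001-3.jpg. Leaving out %n would map every image
of the gallery to the same name, which is why the overwriting warning
applies.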
- -num_download_processes num
-
The number of parallel processes used to download pictures (default:
8). This only affects the -download_pics action.
-
We use parallel processes because today's cable and DSL internet
connections allow much faster downloads than a single porn host can
provide. Experiment with this number until your download speeds
approach the bandwidth of your internet connection.
- -html_dir dir
-
The directory where the -create_html action creates the HTML files.
Defaults to ./porn_html. The directory will be created if it
doesn't exist. The file index.html in that directory links to the
other created HTML files.
- -thumbs_per_row num
-
The number of thumbnail pictures that will be put in one horizontal
row of the resulting HTML file. Defaults to 6. Only used by the
-create_html action.
- -rows_per_file num
-
The number of rows that will be put in each of the resulting HTML files.
Defaults to 10. Only used by the -create_html action.
- -processed_html_file file
-
The file that records the URLs of galleries already processed by the
-create_html action. Defaults to ./gals_processed_html. This is only
used by the -create_html action, to avoid creating HTML pages for the
same gallery twice. The file will be created if it doesn't exist.
- -help
-
Prints the manual page and exits.
- -version
-
Prints the version of fetchgals and exits.
All options can be abbreviated to a unique prefix.
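As an illustration of what ``unique prefix'' means, the following
Python sketch resolves an abbreviation against the option names listed
above. How fetchgals itself parses its command line is not stated here;
the function name resolve_option is hypothetical, and treating an exact
match as always valid is an assumption.

```python
OPTIONS = [
    "find_galleries", "create_html", "download_pics", "tgp_file",
    "extensive_search", "attempts_per_link", "sleep_time", "gallery_file",
    "use_cookies", "pics_dir", "processed_downloads_file", "types",
    "min_img_size", "max_img_size", "max_disk_usage", "pic_file_template",
    "num_download_processes", "html_dir", "thumbs_per_row", "rows_per_file",
    "processed_html_file", "help", "version",
]


def resolve_option(prefix, options=OPTIONS):
    """Return the single option name starting with `prefix`.

    Raises ValueError if the prefix matches no option or several.
    """
    if prefix in options:
        return prefix  # an exact name always wins (assumption)
    matches = [name for name in options if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option -{prefix}")
    raise ValueError(f"ambiguous option -{prefix}: {', '.join(matches)}")


if __name__ == "__main__":
    print(resolve_option("ext"))
    print(resolve_option("processed_d"))
```

Under this scheme -ext would select -extensive_search, while -p would
be rejected because it could mean -pics_dir, -pic_file_template or
either of the -processed_..._file options.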
- ./tgps
-
The main input file, containing TGP URLs, one per line. The ``http://''
is optional. Lines starting with # are ignored. The default file name
can be overridden with the -tgp_file option.
- ./galleries
-
This file is created by fetchgals and contains gallery URLs and
corresponding preview thumbnail URLs and descriptive text, one gallery
and one thumbnail and one text per line, separated by '|'. The purpose
of the file is to record the galleries that have already been found,
so that they are not processed again during a later invocation of
fetchgals. The file's contents may also be useful for other
programs. The default file name can be overridden with the
-gallery_file option.
- ./gals_downloaded
-
This file contains a list of those galleries whose pictures have been
completely downloaded to the local computer. This allows a download to
be interrupted and resumed later. Also, if fetchgals is run again and
finds new galleries, only those not contained in this file will be
downloaded by the -download_pics action. The default file name can
be overridden with the -processed_downloads_file option.
- ./gals_processed_html
-
This file contains a list of those galleries for which HTML pages have
been created on the local computer. If fetchgals is run again and
finds new galleries, only those not contained in this file will be
processed by the -create_html action. The default file name can be
overridden with the -processed_html_file option.
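The format of the ./tgps file described above is simple enough to
sketch a reader for it. The Python below is an illustration only, not
fetchgals' own parser; the function name parse_tgp_lines is
hypothetical. Skipping blank lines, ignoring leading whitespace before
a #, and leaving https:// URLs untouched are assumptions that the
description above does not confirm.

```python
def parse_tgp_lines(lines):
    """Turn the lines of a TGP file into a list of full URLs."""
    urls = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line (assumption) or comment
        if not line.startswith(("http://", "https://")):
            line = "http://" + line  # the "http://" prefix is optional
        urls.append(line)
    return urls


if __name__ == "__main__":
    sample = [
        "# my TGP list",
        "tgp.example.com/index.html",
        "",
        "http://other.example.org/",
    ]
    for url in parse_tgp_lines(sample):
        print(url)
```

A real file would be read with something like
parse_tgp_lines(open("tgps")); the sample list here stands in for the
file's contents.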
Bug reports, feedback and patches for fetchgals as well as improved
TGP lists can be posted to any usenet group.
Fetchgals is in the public domain.
findtgps(1)