QPictureDownloader Documentation
Table of contents
- Introduction
- Installation
- Update
- Overview
- Directories, indexes and files
- URL compiler
- Page Analysis
- License
Introduction
QPictureDownloader lets you explore a combinatorial set of web pages and download every image (or any other kind of file) they link to.
The target set may be defined statically, through the URL compiler, and dynamically, through the page analysis facilities.
Installation
QPictureDownloader is written in C++ using the Qt 4.3 toolkit.
Instructions for Windows users
Download and execute the easy-to-use Windows-style installer.
Instructions for Linux users
Linux users may install QPictureDownloader either from an RPM package or from sources.
Installation from RPM
If you perform an installation from RPM, missing dependencies will be downloaded from the right repositories.
- Configure the fioreltech.net repository
- Install the qpicturedownloader package
$ su -c "yum install qpicturedownloader"
Password: <your password>
Installation from sources
If you perform an installation from sources, you must resolve missing dependencies yourself.
- Download the sources of QPictureDownloader
- Unpack the sources
$ tar xvzf qpicturedownloader-2.0.0.tar.gz
- Enter the source tree
$ cd qpicturedownloader-2.0.0
- Create the Makefile
$ chmod +x ./configure
$ ./configure
The script may fail when it cannot detect the qmake tool of the Qt 4.x toolkit. In that case, store the path to qmake in the environment variable QMAKE, export it, and run the script again.
$ export QMAKE=<path-to-qmake>
$ ./configure
- Build QPictureDownloader
$ make
- Install QPictureDownloader
$ su -c "make install"
Password: <your password>
Update
At every startup, QPictureDownloader connects to the www.fioreltech.net server to check whether a newer version of the program is available. If so, a label is shown at the far right of the status bar; clicking it opens a page with the update instructions in the default browser.
Overview
The average user should deal exclusively with the main window.
The common sequence of events is:
- The user enters a URL in the top text field
- The user chooses whether the URLs are interpreted as indexes or as files, through the pair of radio buttons
- If the URLs are interpreted as indexes, the user has to select the extensions of the files to be downloaded, either through the check boxes or through a space-separated list of extensions
- The user clicks the [Load] button to compile the URL into a set of URLs; if they are interpreted as indexes, they are downloaded and filtered to determine the set of files to be downloaded
- The bottom view shows the files to be downloaded and the folders where they are placed in the local machine
- The user chooses the working directory, where the downloaded files and folders are placed
- The user clicks the [Download] button
- The program attempts to download every file in the Default state
Download Manager
A file may be in three different states:
- Default (displayed in black): the file has not been downloaded yet
- Okay (displayed in green): the file was successfully downloaded
- Failure (displayed in red): an error occurred during the file transfer
During the file transfer, a modal dialog shows the progress and lets the user stop the operation by clicking the [Cancel] button.
If the user clicks the [Download] button again, the transfer resumes from the point where it was interrupted.
File Menu
Under the file menu there are the following actions:
- Export transfer list...: export the transfer list to an XML file
- Import transfer list...: import the transfer list from an XML file
You can use these functions to spread the file transfer across different sessions (each time, you continue to download every file in the Default state).
Filter Menu
Under the filter menu there are the following actions:
- Default filter: use the default filter
- Custom filter...: open a dialog where you can choose a custom filter
In the bottom combo box you can select the file encoding (System is an alias for the system default encoding).
Directories, indexes and files
QPictureDownloader works on three kinds of items: indexes, files and directories.
Indexes
An index is a web page which is scanned for available files. By default, the program recognises the href attribute of the <a> tags and filters the files against a given list of extensions: you can either select them among some well-known image extensions (such as .jpeg, .bmp, ...) or enter a space-separated list of extensions (without the leading dot). However, the page analysis tools allow you to fully customize how indexes are parsed.
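The default behaviour described above can be approximated with a short script. This is only a sketch of the idea, not the program's actual implementation: the helper name extractLinks, the regular expression and the sample page are assumptions made for illustration.

```javascript
// Hypothetical helper: collects the href values of <a> tags and keeps those
// whose extension appears in a space-separated list (no leading dots).
function extractLinks(content, extensionList) {
    var extensions = extensionList.toLowerCase().split(/\s+/);
    var pattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']*)["']/gi;
    var matches = [];
    var match;
    while ((match = pattern.exec(content)) !== null) {
        var url = match[1];
        // Ignore any query string or fragment when reading the extension.
        var path = url.split(/[?#]/)[0];
        var dot = path.lastIndexOf(".");
        var extension = dot >= 0 ? path.substring(dot + 1).toLowerCase() : "";
        for (var i = 0; i < extensions.length; i++) {
            if (extensions[i] !== "" && extensions[i] === extension) {
                matches.push(url);
                break;
            }
        }
    }
    return matches;
}

// Made-up index content, used only to try the helper.
var page = '<a href="a.jpeg">A</a> <a href="notes.txt">N</a> <a href=\'img/b.BMP\'>B</a>';
var links = extractLinks(page, "jpeg bmp");
```

With this sample page, links holds the two image URLs and skips notes.txt, because txt is not in the extension list.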
Files
A file is simply a remote resource which is to be downloaded. Files are the main output of the analysis of a page, but they can be edited manually to correct possible errors.
Directories
When exploring the given set of indexes, QPictureDownloader ensures that every file from the same index is downloaded in the same sub-folder of the working directory.
URL compiler
The URL compiler expands a given URL into a combinatorial set of URLs, which may be interpreted either as indexes or as files.
The given URL is parsed against a grammar, which identifies some portions of it as expressions.
Since an expression denotes a sequence of values, the given URL is expanded into a collection of strings, considering each value for each expression.
The following picture describes the expansion mechanism, in the case
that the given URL contains the expressions ["a","b"] and ["c","d"]
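As a rough sketch of the expansion mechanism for this case, the script below combines every value of the first expression with every value of the second. The function name expand, the host www.example.com and the order in which the combinations are produced are assumptions; the program may enumerate them differently.

```javascript
// parts alternates literal strings and arrays of values
// (one array for each expression found in the URL).
function expand(parts) {
    var results = [""];
    for (var i = 0; i < parts.length; i++) {
        var values = typeof parts[i] === "string" ? [parts[i]] : parts[i];
        var next = [];
        // Append every value of this part to every string built so far.
        for (var r = 0; r < results.length; r++) {
            for (var v = 0; v < values.length; v++) {
                next.push(results[r] + values[v]);
            }
        }
        results = next;
    }
    return results;
}

// Corresponds to http://www.example.com/["a","b"]/["c","d"]
var urls = expand(["http://www.example.com/", ["a", "b"], "/", ["c", "d"]]);
// urls: .../a/c, .../a/d, .../b/c, .../b/d
```

Two expressions with two values each yield four URLs; in general the size of the result is the product of the lengths of the sequences.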
Introduction to expressions
An expression is a portion of the given URL which is delimited by a pair of square brackets. It denotes a sequence of values, which are used to expand the given URL into a collection of strings.
A simple expression is a comma-separated list of elements. It denotes the sequence of values obtained by merging the sequences denoted by each element.
Synopsis
[element1, element2, ..., elementN]
Elements may be ranges or strings.
Ranges
A range is a multi-valued element which denotes a sequence of integers.
Synopsis
left [ - right ] [ { step [ , width [ , padding ] ] } ]
Each pair of square brackets encloses an optional part of the syntax.
- left (mandatory): the left extreme of the range
- right: the right extreme of the range
- step: the difference between two consecutive values (default: 1)
- width: the minimum field width (default: 0, meaning that no padding is applied)
- padding: the padding character (default: "0")
1 denotes 1
1{1,5} denotes 00001
999{1,2} denotes 999
1-10 denotes 1, 2, 3, ..., 10
1-10{2} denotes 1, 3, 5, ... 9
1-10{2,2} denotes 01, 03, 05, ... 09
1-10{2,2,"*"} denotes *1, *3, *5, ... *9
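The examples above can be reproduced with a small script. It is a sketch written from the synopsis only, not the program's parser: the function name expandRange is hypothetical, and it assumes non-negative integers, ascending ranges, a single padding character and no whitespace inside the range.

```javascript
// Hypothetical helper: expands a range written as
// left[-right][{step[,width[,"padding"]]}] into an array of strings.
function expandRange(text) {
    var m = /^(\d+)(?:-(\d+))?(?:\{(\d+)(?:,(\d+)(?:,"(.)")?)?\})?$/.exec(text);
    if (m === null) {
        throw new Error("not a range: " + text);
    }
    var left = parseInt(m[1], 10);
    // Assumption taken from the example "1 denotes 1": right defaults to left.
    var right = m[2] !== undefined ? parseInt(m[2], 10) : left;
    var step = m[3] !== undefined ? parseInt(m[3], 10) : 1;
    var width = m[4] !== undefined ? parseInt(m[4], 10) : 0;
    var padding = m[5] !== undefined ? m[5] : "0";
    if (step < 1) {
        throw new Error("step must be positive");
    }
    var values = [];
    for (var n = left; n <= right; n += step) {
        var s = String(n);
        // Pad on the left until the minimum field width is reached.
        while (s.length < width) {
            s = padding + s;
        }
        values.push(s);
    }
    return values;
}

var odd = expandRange('1-10{2,2,"*"}'); // *1, *3, *5, *7, *9
```

Note that a value wider than the field width is left untouched, as in the 999{1,2} example.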
Strings
A string is a sequence of zero or more characters surrounded by double quotation marks ("). It is a single-valued element which denotes the same sequence of characters.
Synopsis
"characters"
"string" denotes string
References
The given URL may be interpreted as a sequence of expressions alternated with literals (a literal is a sequence of characters between two consecutive expressions).
For example, the following URL contains two expressions, both [1-10]:
http://www.fioreltech.net/[1-10]/hello/[1-10]
The expressions are indexed from left to right starting from zero.
A reference to an expression is a token of the form \n, where n is the index of the referenced expression.
A reference cannot point, either directly or indirectly, to the expression that contains it.
Synopsis
[\n]
The expression expands to the same value as the n-th expression.
[\n : sequence]
The expression expands to the i-th value of the sequence when the n-th expression expands to its i-th value.
Sequence is a comma-separated list of elements, which denotes a sequence of as many values as the referenced expression.
In both cases the expansion is conditioned by the referenced expression, so that the referencing expression takes exactly one value for each value of the referenced expression.
Examples
http://www.fioreltech.net/[1-12{1,2}]/[\0 : "january", "february", "march", "april", "may", "june", "july", "august", "september", "october", "november","december"]/index.html
expands to
http://www.fioreltech.net/01/january/index.html
http://www.fioreltech.net/02/february/index.html
http://www.fioreltech.net/03/march/index.html
http://www.fioreltech.net/04/april/index.html
http://www.fioreltech.net/05/may/index.html
http://www.fioreltech.net/06/june/index.html
http://www.fioreltech.net/07/july/index.html
http://www.fioreltech.net/08/august/index.html
http://www.fioreltech.net/09/september/index.html
http://www.fioreltech.net/10/october/index.html
http://www.fioreltech.net/11/november/index.html
http://www.fioreltech.net/12/december/index.html
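The difference from an ordinary combinatorial expansion is that the two expressions advance together instead of being combined freely. A minimal sketch of this pairing follows; the function name expandWithReference and its parameter layout are assumptions chosen for this one-expression, one-reference case.

```javascript
// Hypothetical helper for a URL of the form
// prefix [expression 0] middle [\0 : sequence] suffix
function expandWithReference(prefix, values, middle, sequence, suffix) {
    if (sequence.length !== values.length) {
        throw new Error("the sequence must have as many values as the referenced expression");
    }
    var urls = [];
    // The i-th value of the sequence is used exactly when
    // the referenced expression takes its i-th value.
    for (var i = 0; i < values.length; i++) {
        urls.push(prefix + values[i] + middle + sequence[i] + suffix);
    }
    return urls;
}

// Values of the range 1-12{1,2}: 01, 02, ..., 12
var numbers = [];
for (var n = 1; n <= 12; n++) {
    numbers.push(n < 10 ? "0" + n : String(n));
}
var months = ["january", "february", "march", "april", "may", "june",
              "july", "august", "september", "october", "november", "december"];
var urls = expandWithReference("http://www.fioreltech.net/", numbers, "/", months, "/index.html");
```

The result has 12 URLs, one per month, rather than the 144 that two independent expressions of 12 values would produce.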
Page analysis
Page analysis is a collection of tools that allow you
- to customize how indexes are parsed
- to dynamically expand the set of indexes
Page filtering
The content of indexes is filtered by a function, called filter, defined in a scripting language based on ECMAScript 3.0.
function filter( content, downloadManager )
{
    var matches = new Array(); // list of urls
    return matches;
}
The first parameter is the content of an index. The second parameter is related to the growing indexes facility.
The function returns an array of the URLs to be downloaded. Those URLs may be either absolute or relative to the index.
NB: those URLs are also matched against the extensions chosen in the program main window.
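As an illustration, a custom filter might collect the src attribute of <img> tags instead of the href attribute of <a> tags. This is a sketch under assumptions: the regular expression is deliberately simple (quoted attribute values only) and has not been checked against the program's script engine.

```javascript
// Sketch of a custom filter that returns the sources of embedded images.
function filter(content, downloadManager) {
    var matches = new Array(); // list of urls
    var pattern = /<img\b[^>]*\bsrc\s*=\s*["']([^"']*)["']/gi;
    var match;
    // exec() with the g flag walks through the content one tag at a time.
    while ((match = pattern.exec(content)) !== null) {
        matches.push(match[1]);
    }
    return matches;
}

// Made-up index content, used only to try the filter outside the program.
var found = filter('<img src="a.png"><img alt="x" src=\'b/c.jpg\'>', null);
```

The second parameter is unused here, so any value can be passed when trying the function on its own.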
Growing set of indexes
The second parameter of the filter function may be used to schedule other indexes.
Synopsis
downloadManager.addRequest(url, folder);
- url: URL of the index to be downloaded
- folder: name of the folder where the files linked by this index are downloaded. If this argument is omitted, an automatically generated name is used.
The function returns true if and only if the request is scheduled. It may fail if the folder name is not valid (for example, it contains / or is equal to . or ..).
You must guarantee that infinite recursion cannot occur.
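One possible way to keep the set of indexes from growing forever is to remember which indexes were already scheduled. The sketch below does so with a script-level variable; it assumes that such variables persist between calls to filter and that linked .html pages should be treated as further indexes, neither of which is stated above. The fakeManager object is a stand-in written for this example, not the object supplied by the program.

```javascript
var scheduled = {}; // indexes already requested (assumed to persist between calls)

function filter(content, downloadManager) {
    var matches = new Array(); // list of urls
    var pattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']*)["']/gi;
    var match;
    while ((match = pattern.exec(content)) !== null) {
        var url = match[1];
        if (/\.html?$/i.test(url)) {
            // Treat the link as another index, but schedule it only once.
            var key = "index:" + url;
            if (scheduled[key] !== true) {
                scheduled[key] = true;
                downloadManager.addRequest(url); // folder omitted: automatic name
            }
        } else {
            matches.push(url);
        }
    }
    return matches;
}

// Stand-in for the real downloadManager, only for trying the sketch.
var fakeManager = {
    requests: [],
    addRequest: function (url, folder) {
        this.requests.push(url);
        return true;
    }
};
var files = filter('<a href="http://www.example.com/page2.html">next</a> ' +
                   '<a href="a.jpg">a</a> ' +
                   '<a href="http://www.example.com/page2.html">again</a>', fakeManager);
```

Here page2.html is scheduled once even though it is linked twice, and a.jpg is returned as a file to download.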
License
QPictureDownloader 2.0.0, a free picture grabber
Copyright (C) 2007 2008 Manuel Fiorelli <manuel.fiorelli@gmail.com>
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License (version 2) as published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

