QPictureDownloader Documentation

Introduction

QPictureDownloader lets you explore a combinatorial set of web pages and download every image (or any other kind of file) they link to.

The target set may be defined statically, through the URL compiler, and dynamically, through the page analysis facilities.

Installation

QPictureDownloader is written in C++ using the Qt 4.3 toolkit.

Instructions for Windows users

Download and execute the easy-to-use Windows-style installer.

Instructions for Linux users

Linux users may install QPictureDownloader either from an RPM package or from sources.

Installation from RPM

If you install from RPM, missing dependencies are downloaded automatically from the configured repositories.

  1. Configure the fioreltech.net repository
  2. Install the qpicturedownloader package

    $ su -c "yum install qpicturedownloader"
    Password: <your password>

Installation from sources

If you install from sources, you must resolve any missing dependencies yourself.

  1. Download the sources of QPictureDownloader
  2. Unpack the sources

    $ tar xvzf qpicturedownloader-2.0.0.tar.gz

  3. Enter the source tree

    $ cd qpicturedownloader-2.0.0

  4. Create the Makefile

    $ chmod +x ./configure
    $ ./configure

    The script may fail if it cannot find qmake, the build tool of the Qt 4.x toolkit. In that case, export the path to qmake in the QMAKE environment variable and run the script again.

    $ export QMAKE=<path-to-qmake>
    $ ./configure

  5. Build QPictureDownloader

    $ make

  6. Install QPictureDownloader

    $ su -c "make install"
    Password: <your password>

Update

At every startup, QPictureDownloader connects to the www.fioreltech.net server to check whether a newer version of the program is available. If so, a label appears at the far right of the status bar; clicking it opens a page with the update instructions in the default browser.

Overview

Most users only need to deal with the main window.

Main window of QPictureDownloader

The common sequence of events is:

  1. The user enters a URL in the top text field
  2. The user chooses, through the pair of radio buttons, whether the URLs are interpreted as indexes or as files
  3. If the URLs are interpreted as indexes, the user has to select the extensions of the files to be downloaded, either through the check boxes or through a space-separated list of extensions
  4. The user clicks the [Load] button to compile the URL into a set of URLs: if they are interpreted as indexes, they are downloaded and filtered to determine the set of files to be downloaded
  5. The bottom view shows the files to be downloaded and the folders where they will be placed on the local machine
  6. The user chooses the working directory, where the downloaded files and folders are placed
  7. The user clicks the [Download] button
  8. The program attempts to download every file in the Default state

Download Manager

A file may be in three different states:

Default (displayed in black)
if it has not been downloaded yet
Okay (displayed in green)
if it was successfully downloaded
Failure (displayed in red)
if an error occurred during file transfer

During the file transfer, a modal dialog shows the progress and lets the user stop the operation by clicking the [Cancel] button.

If the user clicks the [Download] button again, the transfer resumes from the point where it was interrupted.

File Menu

Under the file menu there are the following actions:

Export transfer list...
to export the transfer list into an XML file
Import transfer list...
to import the transfer list from an XML file

You can use these functions to spread the file transfer across different sessions: each time, you continue downloading every file still in the Default state.

Filter Menu

Under the filter menu there are the following actions:

Default filter

to use the default filter

Custom filter...

to open a dialog where you can choose a custom filter.

Custom filter dialog

In the bottom combo box you can select the file encoding (System is an alias for the system default encoding).

Directories, indexes and files

QPictureDownloader works on three kinds of items: indexes, files and directories.

Indexes

An index is a web page which is scanned for available files. By default, the program recognises the href attribute of the <a> tags and filters the files against a given list of extensions: you can either select them among some well-known image extensions (such as .jpeg, .bmp, ...) or enter a space-separated list of extensions (without the leading dot). The page analysis tools also allow you to fully customize how indexes are parsed.
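The default behaviour described above can be approximated with a short script. This is only a sketch and not the program's actual implementation: the function name findLinkedFiles, the regular expression and the case-insensitive comparison of extensions are all assumptions made for illustration.

```javascript
// Hypothetical helper: collect the href of every <a> tag whose target
// ends with one of the given extensions (written without the leading dot).
function findLinkedFiles(content, extensions) {
    var wanted = {};
    for (var i = 0; i < extensions.length; i++) {
        wanted[extensions[i].toLowerCase()] = true;
    }
    var links = [];
    var anchor = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
    var match;
    while ((match = anchor.exec(content)) !== null) {
        var url = match[1];
        var dot = url.lastIndexOf(".");
        // Assumption: the extension is whatever follows the last dot.
        var ext = dot >= 0 ? url.substring(dot + 1).toLowerCase() : "";
        if (wanted[ext] === true) {
            links.push(url);
        }
    }
    return links;
}

var sample = '<a href="a.jpeg">A</a> <a href="page.html">B</a> ' +
             "<A HREF='img/b.BMP'>C</A>";
var found = findLinkedFiles(sample, ["jpeg", "bmp"]);
```

Here found holds a.jpeg and img/b.BMP, while page.html is discarded because html is not in the list. URLs with a query string are not handled by this sketch.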

Files

A file is simply a remote resource which is to be downloaded. Files are the main output of the analysis of a page, but they can be edited manually to correct possible errors.

Directories

When exploring the given set of indexes, QPictureDownloader ensures that every file from the same index is downloaded in the same sub-folder of the working directory.

URL compiler

The URL compiler expands a given URL into a combinatorial set of URLs, which may be interpreted either as indexes or as files.

The given URL is parsed against a grammar, which identifies some portions of it as expressions.

Since an expression denotes a sequence of values, the given URL is expanded into a collection of strings, considering each value for each expression.

The following picture describes the expansion mechanism when the given URL contains the expressions ["a","b"] and ["c","d"]:
The URL expansion mechanism
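The same mechanism can be sketched in a few lines of script. This is an illustration under stated assumptions, not the program's code: the function name expandAll, the host example.com and the order in which the combinations are produced (rightmost expression varying fastest) are not taken from the documentation.

```javascript
// Hypothetical helper: `parts` alternates literals (strings) and
// expressions (arrays holding the values each expression denotes).
function expandAll(parts) {
    var results = [""];
    for (var i = 0; i < parts.length; i++) {
        // A literal behaves like an expression with a single value.
        var values = (typeof parts[i] === "string") ? [parts[i]] : parts[i];
        var next = [];
        for (var r = 0; r < results.length; r++) {
            for (var v = 0; v < values.length; v++) {
                next.push(results[r] + values[v]);
            }
        }
        results = next;
    }
    return results;
}

var urls = expandAll(["http://example.com/", ["a", "b"], "/", ["c", "d"], ".jpg"]);
```

Two expressions with two values each yield 2 x 2 = 4 URLs: a/c.jpg, a/d.jpg, b/c.jpg and b/d.jpg under the example host.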

Introduction to expressions

An expression is a portion of the given URL delimited by a pair of square brackets. It denotes a sequence of values, which are used to expand the given URL into a collection of strings.

Synopsis

A simple expression is a comma-separated list of elements. It denotes the sequence of values obtained by merging the sequences denoted by its elements.

[element1, element2, ..., elementN]

An element may be a range or a string, as described below.

Ranges

A range is a multi-valued element which denotes a sequence of integers.

Synopsis

left [ - right ][ { step [ , width [ , padding ] ] } ]

Each pair of square brackets denotes an optional part of the syntax.

left [mandatory]
The left extreme of the range
right
The right extreme of the range
step
The difference between two consecutive values (default: 1)
width
The minimum field width (default: 0, that is to say no padding is applied)
padding
The padding character (default: "0")

Examples

1 denotes 1

1{1,5} denotes 00001

999{1,2} denotes 999

1-10 denotes 1, 2, 3, ..., 10

1-10{2} denotes 1, 3, 5, ..., 9

1-10{2,2} denotes 01, 03, 05, ..., 09

1-10{2,2,"*"} denotes *1, *3, *5, ..., *9
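The examples above can be reproduced with a small script. This is a sketch, not the program's parser: the function name expandRange and its argument order are hypothetical, and it assumes an ascending range with a positive step, since the documentation does not say how other cases are treated.

```javascript
// Hypothetical helper mirroring left [ - right ][ { step [ , width [ , padding ] ] } ].
function expandRange(left, right, step, width, padding) {
    if (right === undefined) right = left;   // a lone value denotes itself
    if (step === undefined) step = 1;        // default step
    if (width === undefined) width = 0;      // 0 means no padding
    if (padding === undefined) padding = "0";
    if (step <= 0) throw new Error("this sketch requires a positive step");
    var values = [];
    for (var n = left; n <= right; n += step) {
        var text = String(n);
        // Pad on the left until the minimum field width is reached.
        while (text.length < width) {
            text = padding + text;
        }
        values.push(text);
    }
    return values;
}

var plain = expandRange(1, 10, 2);           // 1-10{2}
var padded = expandRange(1, 10, 2, 2);       // 1-10{2,2}
var starred = expandRange(1, 10, 2, 2, "*"); // 1-10{2,2,"*"}
```

Values that are already wider than the minimum width are left untouched, which is why 999{1,2} denotes 999.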

Strings

A string is a sequence of zero or more characters surrounded by double quotation marks ("). It is a single-valued element which denotes the same sequence of characters.

Synopsis

"characters"

Examples

"string" denotes string

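Ranges and strings can appear together in one expression. The sketch below shows how an expression such as [1-3, "index"] could be evaluated, assuming that merging means appending the values of each element in the order the elements are written; the documentation does not spell out the order, and the function name expandExpression is hypothetical.

```javascript
// Hypothetical helper: each element is given as the list of values it denotes.
function expandExpression(elements) {
    var values = [];
    for (var i = 0; i < elements.length; i++) {
        values = values.concat(elements[i]);
    }
    return values;
}

// [1-3, "index"]: the range contributes three values, the string one.
var values = expandExpression([["1", "2", "3"], ["index"]]);
```

Under that assumption the expression denotes 1, 2, 3, index.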
References

The given URL may be interpreted as a sequence of expressions alternated with literals, where a literal is the sequence of characters between two consecutive expressions.

For example, the following URL contains two expressions, both written as [1-10]:
http://www.fioreltech.net/[1-10]/hello/[1-10]

The expressions are indexed from left to right starting from zero.

A reference to an expression is a token of the form \n, where n is the index of the referenced expression.

A reference cannot point, directly or indirectly, to the expression that contains it.

Synopsis

[\n]

The expression expands to the same value as the n-th expression.

[\n : sequence]

The expression expands to the i-th value of the sequence when the n-th expression expands to its i-th value.

Sequence is a comma-separated list of elements, which denotes a sequence of as many values as the referenced expression.

In both cases the expansion is tied to the referenced expression: for each value of the referenced expression, the referencing expression takes exactly one value.

Examples

http://www.fioreltech.net/[1-12{1,2}]/[\0 : "january", "february", "march", "april", "may", "june", "july", "august", "september", "october", "november", "december"]/index.html

expands to

http://www.fioreltech.net/01/january/index.html
http://www.fioreltech.net/02/february/index.html
http://www.fioreltech.net/03/march/index.html
http://www.fioreltech.net/04/april/index.html
http://www.fioreltech.net/05/may/index.html
http://www.fioreltech.net/06/june/index.html
http://www.fioreltech.net/07/july/index.html
http://www.fioreltech.net/08/august/index.html
http://www.fioreltech.net/09/september/index.html
http://www.fioreltech.net/10/october/index.html
http://www.fioreltech.net/11/november/index.html
http://www.fioreltech.net/12/december/index.html
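The expansion above yields 12 URLs rather than 12 x 12 = 144, because the second expression follows the first one instead of being combined with it. A rough sketch of that pairing, with a hypothetical helper pad2 standing in for the {1,2} width setting:

```javascript
var months = ["january", "february", "march", "april", "may", "june",
              "july", "august", "september", "october", "november", "december"];

// Hypothetical helper: left-pad a number with "0" to a width of 2.
function pad2(n) {
    var text = String(n);
    return text.length < 2 ? "0" + text : text;
}

var urls = [];
for (var i = 0; i < months.length; i++) {
    // The i-th value of [1-12{1,2}] selects the i-th value of the sequence.
    urls.push("http://www.fioreltech.net/" + pad2(i + 1) + "/" + months[i] + "/index.html");
}
```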

Page analysis

Page analysis is a collection of tools which allow you to customize how indexes are parsed.

Page filtering

The content of indexes is filtered by a function, called filter, defined in a scripting language based on ECMAScript 3.0.

Synopsis

    function filter( content, downloadManager )
    {
        var matches = new Array(); // list of urls

        // scan content here and append each URL found to matches

        return matches;
    }

The first parameter is the content of an index. The second parameter is related to the growing indexes facility.

The function returns an array of the URLs to be downloaded. Those URLs may be either absolute or relative to the index.

Note: those URLs are also matched against the extensions chosen in the program main window.
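As an illustration, a custom filter might collect the src attribute of <img> tags instead of following links. This is a minimal sketch that relies only on the documented signature; the regular expression is an assumption about the pages being scanned and is not a full HTML parser.

```javascript
function filter(content, downloadManager) {
    var matches = new Array(); // list of urls
    // Assumption: attributes are quoted and src appears inside the tag.
    var image = /<img\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
    var match;
    while ((match = image.exec(content)) !== null) {
        matches.push(match[1]); // absolute or relative to the index
    }
    return matches;
}
```

The second parameter is not used here because this filter does not schedule further indexes.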

Growing set of indexes

The second parameter of the filter function may be used to schedule other indexes.

Synopsis

downloadManager.addRequest(url, folder);

url
URL of the index to be downloaded
folder
name of the folder where the files linked by this index are downloaded. If this argument is omitted, an automatically generated name is used.

The function returns true if and only if the request is scheduled. It may fail if the folder name is not valid (for example, it contains / or it is equal to . or ..).

You must ensure that infinite recursion cannot occur.
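One possible way to avoid endless scheduling is to remember which indexes were already requested. The sketch below rests on two assumptions that the documentation does not confirm: that global variables keep their values between calls to filter, and that the links found in the page are URLs which addRequest accepts. The page naming scheme (pageN.html) and the variable scheduled are hypothetical.

```javascript
// Assumption: this global survives from one call of filter() to the next.
var scheduled = {};

function filter(content, downloadManager) {
    var matches = new Array(); // this sketch only schedules other indexes
    var link = /<a\b[^>]*\bhref\s*=\s*["']([^"']*page(\d+)\.html)["']/gi;
    var match;
    while ((match = link.exec(content)) !== null) {
        var url = match[1];
        var key = "u:" + url; // prefix avoids clashes with built-in property names
        if (scheduled[key] !== true) {
            scheduled[key] = true;
            // The folder name contains no "/" and is neither "." nor "..".
            downloadManager.addRequest(url, "page" + match[2]);
        }
    }
    return matches;
}
```

Each page is requested at most once, even when pages link back to each other, so the set of indexes stops growing once every page has been seen.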

License

QPictureDownloader 2.0.0, a free picture grabber
Copyright (C) 2007 2008 Manuel Fiorelli <manuel.fiorelli@gmail.com>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License (version 2) as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

The contents published on the pages of Fioreltech.net may not be reproduced on other websites, mailing lists, newsletters, printed magazines or CD-ROMs without prior authorization from the maintainers of Fioreltech.net, whether or not the purpose is commercial. Derivative works are nevertheless permitted, provided that they have an original form and extend what is reported on Fioreltech.net (NOT the usual anti-copyright paraphrase), which must in any case be visibly cited among the sources. The maintainers of Fioreltech.net accept no responsibility for any damage caused, directly or indirectly, by the gadgets on its pages and/or by applying procedures described in the articles. Registered trademarks may be mentioned on Fioreltech.net for informational purposes, without any rights being held over them.