Page analysis

It is a collection of tools that allow you

  • to customize how indexes are parsed
  • to dynamically expand the set of indexes

Page filtering

The content of each index is filtered by a function named filter, written in a scripting language based on ECMAScript 3.0.

Synopsis

function filter(content, downloadManager)
{
   var matches = new Array(); // list of urls
 
   return matches;
} 

The first parameter is the content of an index. The second parameter is used to schedule further indexes (see "Growing set of indexes").

The function returns an array of the URLs to be downloaded. Those URLs may be either absolute or relative to the index.

Those URLs are also matched against the extensions chosen in the program's main window.
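As an illustration, here is a minimal sketch of a filter that returns the target of every link found in the index. It assumes the index is HTML and that links use double-quoted href attributes; the regular expression is only an example and is not part of the program.

```javascript
// Sketch only: collects the value of every double-quoted href attribute.
function filter(content, downloadManager)
{
   var matches = new Array(); // list of urls
   var pattern = /href\s*=\s*"([^"]+)"/gi; // assumed link syntax
   var m;

   // exec() with the g flag resumes after the previous match
   // and returns null when no further match exists.
   while ((m = pattern.exec(content)) != null) {
      matches.push(m[1]); // absolute or relative URL, both are accepted
   }

   return matches;
}
```

Since the returned URLs are matched against the chosen extensions afterwards, a filter like this may return more links than are finally downloaded.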

Growing set of indexes

The second parameter of the filter function may be used to schedule other indexes.

Synopsis

downloadManager.addRequest(url, folder);
  • url –> URL of the index to be downloaded
  • folder –> name of the folder where the files linked by this index are downloaded to. If this argument is omitted, an automatically generated name is used.

The function returns true if and only if the request is scheduled. It may fail if the folder's name is invalid (e.g. it contains / or it is equal to . or ..).
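The following sketch shows one possible way to schedule sub-indexes and react to the return value of addRequest. The link format, the helper safeFolderName and the decision to treat an empty name as unusable are assumptions made for this example. Only downloadManager.addRequest comes from the program.

```javascript
// Hypothetical helper: derives a folder name from arbitrary text.
// Returns null when no usable name can be derived.
function safeFolderName(text)
{
   var name = text.replace(/\//g, "_"); // "/" is not allowed in folder names
   if (name == "" || name == "." || name == "..") {
      return null; // treating "" as unusable is an assumption
   }
   return name;
}

function filter(content, downloadManager)
{
   var matches = new Array(); // this index itself contributes no files
   // Assumed link format: <a href="something.html">Link text</a>
   var pattern = /<a href="([^"]+\.html)">([^<]*)<\/a>/g;
   var m;

   while ((m = pattern.exec(content)) != null) {
      var folder = safeFolderName(m[2]);
      // Fall back to an automatically generated folder name when no
      // usable name exists or when the request is rejected.
      if (folder == null || !downloadManager.addRequest(m[1], folder)) {
         downloadManager.addRequest(m[1]);
      }
   }

   return matches;
}
```

Falling back to the one-argument form keeps the sub-index scheduled even when the link text cannot be used as a folder name.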

You must ensure that infinite recursion cannot occur (for example, an index that schedules itself, or two indexes that schedule each other).
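One way to keep scheduling finite without relying on any state kept between calls is to follow links only in a strictly increasing order and to stop at a fixed bound. The sketch below assumes a paginated index in which every page carries a title of the form "Page N" and links of the form page_N.html. These formats and the constant MAX_PAGE are hypothetical.

```javascript
var MAX_PAGE = 50; // hypothetical upper bound on the number of pages

function filter(content, downloadManager)
{
   var matches = new Array(); // list of urls

   // Assumed page marker: <title>Page N</title>
   var current = /<title>Page (\d+)<\/title>/.exec(content);
   if (current == null) {
      return matches; // position unknown: schedule nothing
   }
   var currentPage = parseInt(current[1], 10);

   // Assumed link format: href="page_N.html"
   var pattern = /href="(page_(\d+)\.html)"/g;
   var m;
   while ((m = pattern.exec(content)) != null) {
      var target = parseInt(m[2], 10);
      // Only the directly following page is scheduled, never an earlier
      // one, so the chain of requests ends at MAX_PAGE at the latest.
      if (target == currentPage + 1 && target <= MAX_PAGE) {
         downloadManager.addRequest(m[1]);
         break; // schedule the next page once, even if it is linked twice
      }
   }

   return matches;
}
```

Because each scheduled page has a higher number than the page that scheduled it, links pointing backwards can never create a cycle.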