From cc142f41e13d304d00d7b7b068e08bf2e8e60fd2 Mon Sep 17 00:00:00 2001
From: Jaime Perez <jaime.perez@uninett.no>
Date: Fri, 8 Aug 2014 15:57:39 +0200
Subject: [PATCH] aggregator2: update the documentation to reflect the latest
 changes and features.

---
 modules/aggregator2/docs/aggregator2.txt | 151 +++++++++++++++--------
 1 file changed, 102 insertions(+), 49 deletions(-)

diff --git a/modules/aggregator2/docs/aggregator2.txt b/modules/aggregator2/docs/aggregator2.txt
index 3e499cecc..4b7eca7f2 100644
--- a/modules/aggregator2/docs/aggregator2.txt
+++ b/modules/aggregator2/docs/aggregator2.txt
@@ -1,10 +1,15 @@
 aggregator2 Module
 ==================
 
-This is an experimental module for aggregating metadata.
-It is designed to preserve most of the common metadata items, and also attempt to preserve unknown elements.
+This is a module for metadata aggregation. It is designed to preserve most of the common metadata items, and it also
+attempts to preserve unknown elements. Metadata sources are parsed and rebuilt, so small differences between the
+original sources and the generated metadata may occur. More specifically:
 
-*Note*: This aggregator only works on XML metadata, and does its work independently of the of other parts of simpleSAMLphp, such as the `metarefresh` module.
+* Signatures will be removed from every signed metadata source.
+* All sources will be wrapped up in an `EntitiesDescriptor` element.
+
+*Note*: This aggregator works only with XML metadata, and does its work independently of other parts of SimpleSAMLphp,
+such as the `metarefresh` module.
 
 
 Configuration
@@ -13,11 +18,10 @@ Configuration
 This module is configured through the `config/module_aggregator2.php` configuration file.
 An example file is available in `modules/aggregator2/config-templates/`:
 
-    cd /var/simplesaml
     cp modules/aggregator2/config-templates/module_aggregator2.php config/
 
 The configuration file contains one or more aggregators in the configuration array.
-The index in the configuration array gives the identifier of the aggregator.
+The index for each item in the configuration array gives the identifier of the aggregator.
 
 
 ### Aggregator entry configuration
@@ -25,52 +29,87 @@ The index in the configuration array gives the identifier of the aggregator.
 The aggregator can be configured with the following options:
 
 `sources`
-:   Array which describes which metadata we should download.
+:   Array which describes the sources from which metadata should be downloaded.
 
 `cron.tag`
-:   Can be used to periodically run an update.
-    Only useful when you have enabled caching of metadata.
+:   Can be used to run periodic updates. It is only useful when metadata caching is enabled.
 
 `cache.directory`
-:   A path to a directory where the aggregator will cache downloaded and generated metadata.
-    This directory must be writeable by the webserver.
+:   The path to a directory where the aggregator will cache downloaded and generated metadata.
+    This directory must be writable by the web server.
 
 `cache.generated`
-:   The number of seconds generated metadata should be cached.
-    If this option is unset, the generated metadata will not be cached.
+:   The number of seconds the generated metadata will be cached for.
+
+:   *Note*: Generated metadata will not be cached if this option is unset.
 
 `valid.length`
-:   The number of seconds the generated metadata should be valid.
-    This is used to set the validUntil attribute on the generated metadata.
-    The default is one week.
+:   The number of seconds the generated metadata should be valid for.
+    This is used to set the `validUntil` attribute on the generated metadata.
+    Defaults to one week.
 
-:   *Note*: The `cache.generated` option must be smaller than this option, otherwise you will end up returning outdated metadata.
+:   *Note*: The value of the `cache.generated` option must be smaller than the value here, otherwise you would end up
+    returning outdated metadata.
 
 `ssl.cafile`
-:   This option enables validation of the server certificate when fetching metadata over https.
-    It must be set to a path to a PEM-file which contains one or more valid CA certificates.
-    The path can be absolute, or it can be relative to the `cert`-directory.
+:   This option enables validation of the server certificate when fetching metadata over HTTPS. It must be a path
+    pointing to a PEM file which contains one or more valid CA certificates. The path can be either absolute or
+    relative to the `cert` directory.
 
 :   *Note*: This option can be overridden for each metadata source.
 
 `sign.privatekey`
-:   The private key that should be used to sign the metadata, in PEM format.
-    The path to the private key can be absolute, or it can be relative to the `cert`-directory. Skip this option or
-    set it to NULL if you don't want to sign the generated metadata.
+:   The private key that should be used to sign the resulting metadata, in PEM format. The path to the private key can
+    be either absolute or relative to the `cert` directory. Skip this option or set it to `NULL` if you don't want to
+    sign the generated metadata.
 
 `sign.privatekey_pass`
-:   The password for the private key.
-    If this option is unset, the private key is assumed to be unencrypted.
+:   The password used to encrypt the private key. If this option is unset, the private key is assumed to be unencrypted.
 
 `sign.certificate`
-:   The certificate which contains the public key corresponding to the private key, in PEM format.
-    This certificate is included in the generated metadata.
-    The path to the certificate can be absolute, or it can be relative to the `cert`-directory.
+:   The certificate that contains the public key corresponding to the private key, in PEM format. The path to the
+    certificate can be either absolute or relative to the `cert` directory.
+
+:   *Note*: This certificate will be included in the generated metadata.
 
 `RegistrationInfo`
-:   Allows to specify information about the registrar of this metadata. Please refer to the
+:   Allows you to specify information about the registrar of the generated metadata. Please refer to the
     [MDRPI extension](./simplesamlphp-metadata-extensions-rpi) document for further information.
 
+`exclude`
+:   Allows you to exclude one or more entities from the generated metadata, represented by their entity IDs. Can be
+    either a string with the entity ID of a single entity, or an array of strings with all the entity IDs to exclude
+    from the result.
+
+:   *Note*: this option will not exclude the entities from the cached metadata sources. It will only act as a default
+    configuration for the generation of the metadata aggregate, and therefore can be overridden per request.
+
+`filter`
+:   One or more sets representing the types of entities that should be included in the generated metadata. Filtering
+    will be performed depending on the role of the entity, as well as the protocols it supports. Can be either a string
+    with the set of entities desired, or an array of strings with all the different sets to filter by. The following
+    sets are available:
+
+*   `saml2`
+:   All the entities that support the SAML 2.0 protocol.
+*   `shib13`
+:   All the entities that support the SAML 1.1 protocol.
+*   `saml20-idp`
+:   All the identity providers that support the SAML 2.0 protocol.
+*   `saml20-sp`
+:   All the service providers that support the SAML 2.0 protocol.
+*   `saml20-aa`
+:   All the attribute authorities that support the SAML 2.0 protocol.
+*   `shib13-idp`
+:   All the identity providers that support the SAML 1.1 protocol.
+*   `shib13-sp`
+:   All the service providers that support the SAML 1.1 protocol.
+*   `shib13-aa`
+:   All the attribute authorities that support the SAML 1.1 protocol.
+
+:   *Note*: this option will not filter the entities in the cached metadata sources. It will only act as a default
+    configuration for the generation of the metadata aggregate, and therefore can be overridden per request.
+
 
 ### Aggregator source configuration
 
@@ -78,41 +117,55 @@ The aggregator can be configured with the following options:
 :   The URL the metadata should be fetched from.
 
 `ssl.cafile`
-:   This option enables validation of the server certificate when fetching metadata over https.
-    It must be the path to a PEM-file which contains one or more valid CA certificates.
-    The path can be absolute, or it can be relative to the `cert`-directory.
+:   This option enables validation of the server certificate when fetching metadata over HTTPS. It must be a path
+    pointing to a PEM file which contains one or more valid CA certificates. The path can be either absolute or
+    relative to the `cert` directory.
 
-:   *Note*: This option overrides the aggregator option.
+:   *Note*: This option overrides the option with the same name in the root configuration of the aggregator.
 
 `cert`
-:   Check the signature on the metadata against the specified certificate.
-    The path to the certificate can be absolute, or it can be relative to the `cert`-directory.
+:   The certificate that should be used to check the signature of this metadata document, in PEM format. The path to
+    the certificate can be either absolute or relative to the `cert` directory.
 
-:   *Note*: This can not be a CA certificate.
-    Validation against a CA certificate is not supported.
+:   *Note*: This cannot be a CA certificate. Validation against CA certificates (PKI) is not supported.
 
 
 Retrieving aggregated metadata
 ------------------------------
 
-The metadata can be downloaded from the following location:
+You will find a link to the aggregator2 module in the *Federation* tab of SimpleSAMLphp's web interface. There you will
+be able to see a list of all the metadata aggregates you have configured, and view or download them in different
+formats.
+
+In general, metadata aggregates can be downloaded from the following location:
+
+    http://<YOUR HOST>/simplesaml/modules.php/aggregator2/get.php?id=<aggregator id>
+
+where the *aggregator id* is the identifier you used as an index for the aggregator configuration array. Additionally,
+you can use the following parameters to customize the resulting metadata aggregate:
+
+`exclude`
+:   Allows you to exclude one or more entities from the generated metadata, represented by their entity IDs. If you
+    need to specify more than one entity, use a comma-separated list of entity IDs.
 
-    http://<server>/simplesaml/modules.php/aggregator2/get.php?id=<aggregator id>
+`filter`
+:   Allows you to filter by sets specifying the type of entities or the protocols they support. If you need to specify
+    more than one set, use a comma-separated list. See the configuration option with the same name for a list of
+    all the supported sets.
 
 
 Asynchronous metadata updates
 -----------------------------
 
-By default, the `aggregator2` module will update the metadata when receiving a request.
-For performance reasons, it is recommended to run the updates asynchronously.
-By doing this, the aggregated metadata will be generated in the background.
+By default, the `aggregator2` module will update the metadata upon receiving a request. For performance reasons, it is
+recommended to run the updates asynchronously. By doing this, the aggregated metadata will be generated in the
+background.
 
-To enable this, you must configure a cache directory with the `cache.directory` option.
-This directory must be writeable by the web server.
-You can then enable caching of generated metadata by setting the `cache.generated` option to the number of seconds the metadata can be cached.
+To enable this, you must configure a cache directory with the `cache.directory` option. This directory must be writable
+by the web server. You can then enable caching of generated metadata by setting the `cache.generated` option to the
+number of seconds the metadata should be cached.
 
-You will now have a configuration that caches both downloaded and generated metadata.
-It will however still update the metadata when the user accesses the aggregator endpoint
-To update the generated metadata in the background, you must add a `cron.tag` option.
-This option must reference a cron tag entry configured in `module_cron.php`.
-Once this is done, your aggregated metadata will be updated everytime that cron entry is executed.
+You will now have a configuration that caches both downloaded and generated metadata. However, it will still update the
+metadata when the user accesses the aggregator endpoint. To update the generated metadata in the background, you must
+add a `cron.tag` option. This option must reference a cron tag entry configured in `module_cron.php`. Once this is
+done, your aggregated metadata will be updated every time that cron entry is executed.
-- 
GitLab