extensions/filter module documentation
--------------------------------------

The filter extension implements message content filtering using 
solanum's hook framework and Intel's Hyperscan regular expression
matching library.

It requires an x86_64 processor with SSSE3 extensions.

To operate, the filter requires a database of regular expessions
that have been compiled using the Hyperscan library's 
hs_compile_multi() or hs_compile_ext_multi() functions. 

The command SETFILTER is used to manage operation of the filter and to
load compiled Hyperscan databases.

General documenation of SETFILTER is available using the 'HELP SETFILTER'
command.

For each expression in the database, the three least significant bits 
of the expression ID are used to indicate which action the ircd should
take in the event of a match:

001 (1) DROP   - The message will be dropped and the client will be sent
                 an ERR_CANNOTSENDTOCHAN message.
010 (2) KILL   - The connection from which the message was recevied will
                 be closed.
100 (4) ALARM  - A Server Notice will be generated indicating that an
                 expression was matched. The nick, user, hostname and
                 IP address will be reported. For privacy, the expression
                 that has been matched will not be disclosed.

Messages are passed to the filter module in a format similar to an
IRC messages:

0:nick!user@host#1 PRIVMSG #help :hello!

The number at the start of the line indicates the scanning pass:
Messages are scanned twice, once as they were received (0), and once
with any formatting or unprintable characters stripped (1).

By default, 'nick', 'user' and 'host' will contain *. This behaviour
can be changed at build time if filtering on these fields is required.

The number after the # will be 0 or 1 depending on whether the sending
client was identified to a NickServ account.

The process for loading filters is as follows:

1. The Hyperscan database is serialized using hs_serialize_database().
2. A 'SETFILTER NEW' command is sent.
3. The serialized data is split into chunks and base64 encoded. 
   The chunk size needs to be chosen to ensure that the resuliting
   strings are short enough to fit into a 510 byte IRC line, taking 
   into account space needed for the 'SETFILTER +' command, check field, 
   server mask, and base64 overhead.
4. The encoded chunks are sent using 'SETFILTER +' commands
5. Once the entire database has been sent, a 'SETFILTER APPLY' command
   is sent to commit it.
