2 Mail Traffic Information


2.1	How to


How do you customize MTI? The most important variables are

	$MTI_BURST_SOFT_LIMIT
	$MTI_BURST_HARD_LIMIT


If you make these values smaller, even light traffic can be judged to
be a bomb. By default the limit is about 1, which puts the boundary
between a bomb and normal traffic at roughly 12-15 mails arriving in
sequence.
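
As a rough sketch, the two limits might be set in the ML's config.ph
as follows. The numbers are illustrative assumptions, not documented
defaults. Tune them against your own traffic.

```perl
# Illustrative values only (assumptions, not documented defaults).
$USE_MTI              = 1;    # MTI does nothing unless this is set
$MTI_BURST_SOFT_LIMIT = 1;    # lower => lighter traffic is judged a bomb
$MTI_BURST_HARD_LIMIT = 2;    # absolute ceiling for faked headers
```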

2.2	Traffic Monitor


FML 2.2 (after 2.1B) has an internal traffic monitor system called
"MTI". It checks the traffic and detects the intermittency (burst)
effect of traffic as a defense against mail bombing. As a bonus :) it
can also trap mail loops, though the several loop checks should reject
a loop before the MTI routines run.


The first step in trapping mail bombing is to detect the intermittency
effect, that is a burst, of mail traffic. This logic is effective
except when the burst comes from network errors or from hosts served
over UUCP or DIP. We cannot determine whether a burst is caused by
network errors or by bombing. We should therefore observe the burst
while keeping in mind the possibility that we are mistaking network
errors for mail bombing.


In all cases, how effective MTI is depends on how thoroughly the
injected mails are faked. Since SMTP has no authentication now, we
cannot trust anything except the time at which a mail kicks off fml.pl.

2.3	Mail Traffic Information


Mail Traffic Information (MTI) is FML's built-in traffic monitor. It
monitors the traffic and logs (via Berkeley DB) the traffic of the
past 1 hour (default) together with the date and the From:,
Return-Path:, Sender: fields et al.

2.4	MAIL BOMBING Evaluation Function


Here we use TeX statements to show some mathematical concepts.


Consider a time sequence of mails coming in to fml.pl. Let t_i be the
time when mail m_i arrives and runs fml.pl, and let d_i be the Date:
time of that mail. "i" is an index number. It follows the same order
as the article sequence numbers.


MTI caches a pair (t_i, d_i) for each m_i. The keys of the database
are addresses taken from e.g. From:, Sender:, Return-Path:. MTI
discards data entries after $MTI_EXPIRE_UNIT (3600 sec by default).
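
The caching and expiry described above can be sketched as follows.
This is only an illustration: the in-memory hash layout and the
function names cache_add and cache_expire are assumptions, and the
real on-disk format used by MTI is not described here.

```perl
use strict;
use warnings;

my $EXPIRE_UNIT = 3600;   # mirrors the $MTI_EXPIRE_UNIT default quoted above

# Hypothetical in-memory layout: address => [ [t_i, d_i], ... ].
my %cache;

# Record one mail: $t is when fml.pl ran, $d is the parsed Date: time.
sub cache_add {
    my ($addr, $t, $d) = @_;
    push @{ $cache{$addr} }, [ $t, $d ];
}

# Drop entries whose arrival time is older than $EXPIRE_UNIT seconds.
sub cache_expire {
    my ($now) = @_;
    for my $addr (keys %cache) {
        my @fresh = grep { $now - $_->[0] <= $EXPIRE_UNIT } @{ $cache{$addr} };
        if (@fresh) { $cache{$addr} = \@fresh }
        else        { delete $cache{$addr} }
    }
}

cache_add('user@example.com', 1000, 990);
cache_add('user@example.com', 4000, 3995);
cache_expire(5000);   # the entry from t=1000 is now 4000 sec old and is dropped
```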


By default the MTI evaluation function is


	\sum 1 / | t_i - t_j + \epsilon |


where \epsilon prevents divergence. |t_i - t_j| is the difference
between the times at which fml.pl runs. It becomes smaller when mails
arrive in quick succession. We sum up its inverse, so the sum becomes
large when a burst occurs. If the sum exceeds some threshold, we
conclude that our server is under attack!
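
To make the behaviour of this sum concrete, here is a minimal sketch.
It assumes that j is the mail just before i (j = i - 1) and uses a
made-up default for \epsilon. Neither choice is specified above, and
burst_cost is a hypothetical name.

```perl
use strict;
use warnings;

# Sum of 1/|t_i - t_j + epsilon| over adjacent arrivals (j = i - 1).
# Adjacent pairing and the default epsilon are assumptions of this sketch.
# epsilon must be positive, or two mails in the same second divide by zero.
sub burst_cost {
    my ($times, $epsilon) = @_;
    $epsilon = 1 unless defined $epsilon;
    my @t = sort { $a <=> $b } @$times;
    my $sum = 0;
    for my $i (1 .. $#t) {
        $sum += 1 / abs($t[$i] - $t[$i - 1] + $epsilon);
    }
    return $sum;
}

my $sparse = burst_cost([0, 600, 1200, 1800]);   # one mail every 10 minutes
my $burst  = burst_cost([0, 1, 2, 3]);           # one mail every second
printf "sparse=%.4f burst=%.4f\n", $sparse, $burst;
```

The same number of mails gives a far larger cost when they arrive
close together, which is the burst effect the threshold is meant to
catch.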


This threshold is typically 0.2. The choice of this value is sensitive.
For example, we can estimate the sum in the following way

	E{t} = \sum_{i=1}^{N} 1 / | t_i - t_j + \epsilon | > \sum_{i=1}^{N} 1/M \sim N/M

whenever

	| t_i - t_j + \epsilon | < M


With M = 10 and N = 5 this yields N/M = 0.5. That is, 5 mails arriving
continuously with gaps shorter than 10 seconds already push the sum
past the threshold.


This logic has a problem. We cannot distinguish network errors from
bombings by trapping bursts of incoming mail alone.


To distinguish network errors from bombing, we calculate a sum over
the sequence d_i (Date:) in the same way. So consider


	E{d} =	\sum 1 / | d_i - d_j + \epsilon |


Usually we edit a mail draft and post it at random times. Hence the
Date: times are random. When network errors delay mail delivery,
Date: is random (E{d} is small) while fml.pl runs in a burst (E{t} is
large). When a real mail bomb attacks you, both E{d} and E{t} must be
large.


A bad MUA may send several mails with the same Date: ;_; In such a
case, a single mail already gives E{d} >> 1. We avoid this by imposing
a minimum on |d_i - d_j|.


	\sum 1 / ( | d_i - d_j | < 3 ? 3 : | d_i - d_j | )
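
The effect of the minimum can be sketched like this. As before, the
pairing of adjacent mails (j = i - 1) is an assumption of the sketch,
and date_cost is a hypothetical name.

```perl
use strict;
use warnings;

# E{d} with a floor on the Date: difference, as in the expression above.
# Adjacent pairing (j = i - 1) is an assumption of this sketch.
sub date_cost {
    my ($dates, $minimum) = @_;
    $minimum = 3 unless defined $minimum;   # the floor of 3 used above
    my @d = sort { $a <=> $b } @$dates;
    my $sum = 0;
    for my $i (1 .. $#d) {
        my $diff = abs($d[$i] - $d[$i - 1]);
        $diff = $minimum if $diff < $minimum;
        $sum += 1 / $diff;
    }
    return $sum;
}

# Three mails stamped with the identical Date: contribute 1/3 per pair
# instead of an unbounded amount.
my $same_date = date_cost([500, 500, 500]);
```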


2.5	Evaluation Function (default)

Let E{d} and E{t} be the sums described above.
We consider the following evaluation.

1	soft limit


	In the mail bombing case (as described above),
	E{d} > E{t} or E{d} is nearly equal to E{t}.
	If E{d} < E{t}, network errors may have occurred.
	If E{d} > E{t} and E{d} > $MTI_BURST_SOFT_LIMIT,
	a mail bomb attack is in progress, so we reject the mails.

2	hard limit


	Condition 1 is not effective if the mail header is faked.
	So we also use a second, absolute limit.
	If E{d} > $MTI_BURST_HARD_LIMIT or
	   E{t} > $MTI_BURST_HARD_LIMIT,
	we reject in-coming mails.

3 	">" (greater than) 


	The summed values contain some measurement error.
	We need some margin in the ">" evaluation.

	if ( E{d} > E{t} ) {
		if ( E{d} > $MTI_BURST_SOFT_LIMIT) {
			mail bomb
		} 
	}
	else {  # if the mail header is a fake

		if ( E{t} > $MTI_BURST_HARD_LIMIT ) {
			mail bomb
		}
		if ( E{d} > $MTI_BURST_HARD_LIMIT ) {
			mail bomb
		}
	}
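
For reference, the decision above can be written as a small runnable
function. This is a sketch that mirrors the pseudocode exactly. It
adds no margin to the ">" comparison, and is_mail_bomb and its
argument order are hypothetical.

```perl
use strict;
use warnings;

# Returns 1 if the traffic is judged a mail bomb, 0 otherwise.
# $soft and $hard stand in for $MTI_BURST_SOFT_LIMIT / $MTI_BURST_HARD_LIMIT.
sub is_mail_bomb {
    my ($Ed, $Et, $soft, $hard) = @_;

    if ($Ed > $Et) {
        return $Ed > $soft ? 1 : 0;
    }
    # Otherwise the header may be faked: fall back to the absolute limit.
    return 1 if $Et > $hard;
    return 1 if $Ed > $hard;
    return 0;
}

# Limits of 1 and 2 are illustrative values only.
my $bomb    = is_mail_bomb(1.5, 1.2, 1, 2);   # Date: and arrival both bursty
my $delayed = is_mail_bomb(0.1, 0.8, 1, 2);   # random Date:, moderate arrival burst
my $faked   = is_mail_bomb(0.1, 2.5, 1, 2);   # arrival burst over the hard limit
```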

2.6	MTI Configuration Variables

	$USE_MTI


Enables MTI. If not set, MTI does not run.


	$MTI_BURST_SOFT_LIMIT
	$MTI_BURST_HARD_LIMIT


Described above (see 2.5).

	$MTI_BURST_MINIMUM


described above.

	$MTI_COST_EVAL_FUNCTION


The name of an evaluation function you write yourself. If it is not
set, 'MTISimpleBomberP' is used (see 2.10).

	$MTI_COST_EVAL_HOOK

HOOK

2.7	Maximum Limit of Traffic


If the evaluation function is \sum 1, it simply counts the mail
traffic per unit time. This cost does not reflect the burst effect. It
is the usual average. The unit of time is $MTI_EXPIRE_UNIT.

	$MTI_DISTRIBUTE_TRAFFIC_MAX


The maximum traffic of distribution mails from one address.

	$MTI_COMMAND_TRAFFIC_MAX


The maximum traffic of command mails from one address.
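
The "\sum 1" cost amounts to counting the entries that are still
inside the time window. A minimal sketch follows. The numbers are
made up, traffic_count is a hypothetical name, and whether fml
compares with ">" or ">=" is an assumption.

```perl
use strict;
use warnings;

# Count arrivals within the last $unit seconds (the "\sum 1" cost).
sub traffic_count {
    my ($times, $now, $unit) = @_;
    return scalar(grep { $now - $_ <= $unit } @$times);
}

# Hypothetical values; no defaults for the maxima are given here.
my $unit  = 3600;                          # stands in for $MTI_EXPIRE_UNIT
my $max   = 3;                             # stands in for $MTI_DISTRIBUTE_TRAFFIC_MAX
my @seen  = (100, 900, 2000, 3000, 3500);  # arrival times of one address
my $count = traffic_count(\@seen, 3600, $unit);
my $over  = $count > $max ? 1 : 0;         # 1 means the address is over the maximum
```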

2.8	Other variables

	$MTI_EXPIRE_UNIT

Cache lifetime (3600 sec by default).

	$MTI_APPEND_TO_REJECT_ADDR_LIST


If set, an address we have judged to be a bomber is appended to
$REJECT_ADDR_LIST ($DIR/spamlist). However, this may not be effective,
since rejection by $REJECT_ADDR_LIST is based on From: address checks.

2.9	Files

	$MTI_DB
	$MTI_HI_DB

	$MTI_DIST_DB
	$MTI_HI_DIST_DB
	$MTI_HI_COMMAND_DB
	$MTI_COMMAND_DB

cache files.

	$MTI_MAIL_FROM_HINT_LIST


This file ($DIR/mti_mailfrom.hint) lists the addresses we regard as
bombers. It is only a hint. How you apply it to your operation is up
to you.

2.10	Arguments of function


This is a beta-test specification :)

    $fp = $MTI_COST_EVAL_FUNCTION || 'MTISimpleBomberP';
    &$fp(*e, *MTI, *HI, *addrinfo, *hostinfo);

	%Envelope	Envelope
	%MTI		address => time sequence
	%HI		host => time sequence
	%addrinfo	address
	%hostinfo	host evaluated from Received: fields
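
The call above passes typeglobs in the Perl 4 style, so a function of
your own receives them with local(). The following sketch shows only
that calling convention. The function name MyBomberP, its local
variable names, the stand-in data and above all the return value are
assumptions, since the beta specification here does not say what the
function should return.

```perl
# Stand-in data; in real use fml prepares these before the call.
%Envelope = ();
%MTI      = ('a@example.com' => '', 'b@example.com' => '');
%HI       = ();
%addrinfo = ();
%hostinfo = ();
*e = *Envelope;

# Hypothetical evaluation function. It aliases the caller's hashes via
# typeglobs and, purely for illustration, returns how many addresses
# %MTI currently tracks.
sub MyBomberP {
    local(*env, *mti, *hi, *ainfo, *hinfo) = @_;
    return scalar(keys %mti);
}

$MTI_COST_EVAL_FUNCTION = 'MyBomberP';
$fp     = $MTI_COST_EVAL_FUNCTION || 'MTISimpleBomberP';
$result = &$fp(*e, *MTI, *HI, *addrinfo, *hostinfo);
```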

2.11	perl 5 tie 

	$MTI_TIE_TYPE

e.g. DB_File, NDBM_File, ...


If defined, fml uses the "tie" function in the MTI subsystem instead
of dbmopen(). This is used only with perl 5. Which type to use depends
on your operating system. Please see your OS manuals and the perl
documentation.
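
For readers unfamiliar with tie, the following sketch shows what tying
a hash to a DBM file looks like in perl 5. It is not fml's own code.
SDBM_File is chosen only because it ships with perl, and the file name
and the stored value format are made up.

```perl
use strict;
use warnings;
use Fcntl;                        # O_RDWR, O_CREAT
use File::Temp qw(tempdir);

# Stand-in for $MTI_TIE_TYPE; DB_File or NDBM_File may suit your system better.
my $tie_type = 'SDBM_File';
eval "require $tie_type; 1" or die "cannot load $tie_type: $@";

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/mti_demo";       # hypothetical cache file name

my %db;
tie(%db, $tie_type, $path, O_RDWR | O_CREAT, 0644)
    or die "tie failed: $!";
$db{'user@example.com'} = '1000:990';   # made-up value format
untie %db;

# Reopen to confirm the entry was stored on disk.
tie(%db, $tie_type, $path, O_RDWR | O_CREAT, 0644)
    or die "tie failed: $!";
my $stored = $db{'user@example.com'};
untie %db;
```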

2.12	Negative cache of warning mails


A negative cache of the warning mails that report burst traffic to
the maintainers.

2.13	DB type

	$MTI_TIE_TYPE


Selects the DB type used by the "tie" function in the MTI subsystem
(see 2.11).

3	Several size limits

3.1	Limit of size for a posted article


You can restrict a posted article size. The maximum size is defined by

		$INCOMING_MAIL_SIZE_LIMIT


where the unit of this variable is "bytes". If

		$NOTIFY_MAIL_SIZE_OVERFLOW (default 1)


is set, fml notifies the sender of the rejection.


If a message/partial style mail is given, fml estimates the total
size. fml rejects the mail if the estimated total size is over
$INCOMING_MAIL_SIZE_LIMIT.


If 

		$ANNOUNCE_MAIL_SIZE_OVERFLOW (default 0)


is defined, fml announces "somebody sends too big mail to ML" to the ML.
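
Put together, a configuration using these three variables might look
like the following sketch. The size value is an illustrative
assumption, not a recommendation.

```perl
# Illustrative values only.
$INCOMING_MAIL_SIZE_LIMIT    = 512000;  # bytes; larger articles are rejected
$NOTIFY_MAIL_SIZE_OVERFLOW   = 1;       # tell the sender about the rejection
$ANNOUNCE_MAIL_SIZE_OVERFLOW = 0;       # do not announce it to the ML
```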


3.2	Limit of the number of members in a ML

	$MAX_MEMBER_LIMIT


is the maximum number of members. It may be useful in automatic
registration mode.


3.3	Example: to discard an over length mail

This example discards mail longer than 1000 lines. Nowadays, please
use $INCOMING_MAIL_SIZE_LIMIT to set the upper limit on incoming mail
size.

$START_HOOK = q#
    if ($Envelope{'nlines'} > 1000) {
	&Warn("Discarded on the behalf of too Large Mail", &WholeMail);	
	$DO_NOTHING = 1;
    }
#;

3.4	$START_HOOK: limit the number of member 


The file $LIMIT_OVER_FILE holds a message such as "Sorry, I cannot
register you since this ML has reached its member limit".
Ref: START_HOOK => ../hooks 3.1

$START_HOOK = q%

$MAX_MEMBER = 100;

$LIMIT_OVER_FILE = "$DIR/limit.over"; 

sub WC
{
    local($f) = @_;
    local($lines) = 0;

    open(TMP, $f) || return 0;
    while (<TMP>) { 
	next if /^\#/;
        $lines++;
    }
    close(TMP);

    $lines;
}


if (&WC($ACTIVE_LIST) > $MAX_MEMBER) {
    &SendFile($From_address, 
	      "Sorry, the mailing list member exceeds the limit $ML_FN", 
	      $LIMIT_OVER_FILE);
    $DO_NOTHING = 1;
}


%;