On Monday 01 August 2005 22:32, Elliot Finley wrote:
>
> How does DSPAM decide which tokens to use when classifying an email?
I assume that it takes a number of tokens with the highest scores, plus a
number of tokens with the lowest scores, and combines them using the
configured Bayesian algorithm...
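
To make that guess concrete, here is a minimal Python sketch. It is not DSPAM's actual code: the function names, the "furthest from 0.5" ranking, and the Graham-style combination are all assumptions chosen for illustration. The default of 27 simply mirrors the factor count DSPAM reported for this message.

```python
def select_factors(token_probs, n=27):
    """Return the n (token, probability) pairs furthest from the neutral 0.5.

    Assumption: "most interesting" means most extreme in either direction,
    i.e. the highest and the lowest scores together.
    """
    ranked = sorted(token_probs.items(),
                    key=lambda kv: abs(kv[1] - 0.5),
                    reverse=True)
    return ranked[:n]


def combine(probs):
    """Graham-style naive Bayes combination of per-token spam probabilities."""
    spam = 1.0
    ham = 1.0
    for p in probs:
        # Clamp so a single token can never force the result to exactly 0 or 1.
        p = min(max(p, 0.0001), 0.9999)
        spam *= p
        ham *= 1.0 - p
    return spam / (spam + ham)


# Hypothetical token database, for illustration only.
tokens = {"use+when": 0.0001, "Subject*factors": 0.0002,
          "neutral": 0.5, "viagra": 0.99}
chosen = select_factors(tokens, n=3)
score = combine([p for _, p in chosen])
```

With these made-up values the neutral token is dropped and the two very low scores outweigh the single high one, so the combined score stays close to 0.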
> Most of the time when looking at the X-DSPAM-Factors header, it shows
> mostly tokens taken from the email header.
Maybe your database isn't populated enough yet? Here is how my DSPAM factored
your own email; it used tokens from both the headers and the body:
X-DSPAM-Result: Innocent
X-DSPAM-Confidence: 0.9965
X-DSPAM-Probability: 0.0000
X-DSPAM-Signature: 42ee86e274031263121426
X-DSPAM-Factors: 27,
use+when, 0.00010,
the+DSPAM, 0.00010,
Subject*factors, 0.00020,
does+DSPAM, 0.00020,
tokens+to, 0.00020,
decide+which, 0.00020,
From*Elliot, 0.00020,
taken+from, 0.00020,
Sender*owner, 0.00077,
DSPAM, 0.00207,
DSPAM, 0.00207,
Subject*dspam, 0.00317,
Return-Path*owner, 0.00466,
*owner, 0.00467,
To*users, 0.00600,
Sender*nuclearelephant, 0.00600,
*dspam+users, 0.00600,
*owner+dspam, 0.00600,
Sender*dspam+users, 0.00600,
Sender*dspam, 0.00600,
Return-Path*nuclearelephant+com, 0.00600,
Received*dspam-users, 0.00600,
Received*dspam-users, 0.00600,
*nuclearelephant, 0.00600,
To*users+lists, 0.00600,
Sender*owner+dspam, 0.00600,
Sender*lists+nuclearelephant, 0.00600
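
To check the header/body mix on your own messages, a small Python sketch can split an X-DSPAM-Factors value by origin. It assumes the layout shown above (a count followed by comma-separated token, probability pairs) and assumes that any token containing "*" comes from a header; both function names are hypothetical.

```python
def parse_factors(value):
    """Parse an X-DSPAM-Factors value into (count, [(token, prob), ...]).

    Assumption: tokens themselves never contain a comma.
    """
    parts = [p.strip() for p in value.split(",") if p.strip()]
    count = int(parts[0])
    rest = parts[1:]
    pairs = [(rest[i], float(rest[i + 1]))
             for i in range(0, len(rest) - 1, 2)]
    return count, pairs


def split_by_origin(pairs):
    """Separate tokens into header-derived and body-derived lists."""
    header = [t for t, _ in pairs if "*" in t]
    body = [t for t, _ in pairs if "*" not in t]
    return header, body


# A shortened sample built from the first factors listed above.
sample = ("4, use+when, 0.00010, Subject*factors, 0.00020, "
          "DSPAM, 0.00207, *owner, 0.00467")
count, pairs = parse_factors(sample)
header, body = split_by_origin(pairs)
```

By this rule, 8 of the 27 factors listed above come from the body and the other 19 from headers.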
-- Michel Bouissou <michel@bouissou.net> OpenPGP ID 0xDDE8AC6E
Received on Mon Aug 1 17:07:00 2005