Confidence is based on Robinson's geometric mean test, which
basically places the result on a scale according to the
values of all tokens included. So if you have 25% spammy tokens and
75% hammy tokens, your confidence will sit around 75% (innocent).
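
To make that a bit more concrete, here is a rough sketch of Robinson's geometric mean combination as it is usually described. This is not taken from the DSPAM source: the function name and the sample token values are made up for illustration, and how DSPAM turns the result into X-DSPAM-Confidence is not covered here. The exact number you get depends on how strongly each token leans, so treat the 75% above as a ballpark.

```python
import math


def robinson_spamminess(token_probs):
    """Combine per-token spam probabilities into one score in [0, 1].

    Hypothetical helper, not DSPAM's API. Values near 1 lean spam,
    values near 0 lean innocent.
    """
    n = len(token_probs)
    if n == 0:
        raise ValueError("need at least one token probability")
    # P grows as the tokens look spammy, Q grows as they look hammy.
    p = 1.0 - math.prod(1.0 - x for x in token_probs) ** (1.0 / n)
    q = 1.0 - math.prod(token_probs) ** (1.0 / n)
    # S lies in [-1, 1]; rescale it to [0, 1].
    s = (p - q) / (p + q)
    return (1.0 + s) / 2.0


# Assumed example: one strongly spammy token and three strongly hammy ones.
mixed = [0.99, 0.01, 0.01, 0.01]
print(robinson_spamminess(mixed))
```

Because no token value is ever exactly 1.0, the product terms never collapse to zero, which is why the combined score gets close to the extremes without reaching them.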
On Aug 18, 2005, at 3:51 PM, Jonas Widarsson wrote:
> Thursday 18 August 2005 21:14 Jonathan Zdziarski wrote:
>
>> That makes sense, since no actual tokens take on a 1.0 value.
>>
> So you're actually saying it *could* get above that, but it takes a lot
> to get there?
>
> And there isn't any guarantee that new real mail won't get high ratings
> like 0.9996?
>
> If a real mail had the very message content of a spam message, there
> would still be some tokens that differ from the usual spam we've
> learned: the headers that make up the From address and the Received
> headers that describe the path of the mail.
> If those haven't been seen by DSPAM before, could the message still get
> a full rating (X-DSPAM-Confidence >= 0.9997)?
>
> Best regards,
> --
> Jonas Widarsson
>
> tel: +46 271 152 00 - tel: +46 271 121 42 (hemma/home) - gsm: +46
> 70 539 64 79
> MSN: jonas@widarsson.com ICQ: 72016688 jabber: jonas@widarsson.com
>
>
Received on Thu Aug 18 15:54:16 2005