On Mon, 8 Aug 2005 12:20:05 +0200
Andreas Klemm <andreas@klemm.apsfilter.org> wrote:
> On Mon, Aug 08, 2005 at 11:34:35AM +0300, Ion-Mihai Tetcu wrote:
> >
> > [ Now why didn't I receive this on dspam-users as well ? ]
> >
> > On Mon, 8 Aug 2005 08:27:35 +0200
> > Andreas Klemm <andreas@klemm.apsfilter.org> wrote:
> >
> > > Hi,
> > >
> > > it's about this dspam version (dspam-3.4.6.20050523.0845).
> >
> > Can you update to 3.4.8 ? From what I remember there was a training
> > bug that got fixed.
>
> Can try that later, k thanks !
> I hope the database can stay ..
> Or do I need to set up the database anew ?
No. (I try not to miss noting this kind of thing in files/UPDATING.)
> > > This time I tried to set up dspam so that it gets trained,
> > > since with a training corpus I sometimes get strange results,
> > > weighted too much to one side or the other (spam/innoc.).
> >
> > As a rule, I don't like corpus;
>
> haha, well, but even if you don't create one, one will be
> created for you in the long run ;-)
>
> Or do you mean "*foreign* ready to use" corpuses ?
Foreign or home made :) it's all about mail habits and the type of spam
you receive.
> > > But the corpus only is increasing very slowly.
How exactly do you input this corpus ? A spam trap or ?
> > > Seems to me as if training will last a year or even more.
> > > Though I get approx. 2000 mails a day. Mainly mailing list
> > > traffic and ... spam.
> >
> > This is strange, AFAIR tum means teft until the magic border is
> > reached.
>
> Is tum ok for me ? Or should I use teft, see later below.
How many mails do you get / day ?
> > > Look here: Only 157 and 11 for spam and innocent corpus.
> > >
> > > So few in about 10 weeks.
> > > Look at when my db was newly installed:
> > > root@titan[ttyp2]{221} /var/db/pkg/postgresql-server-8.0.3 ll
> > > total 80
> > > -rw-r--r-- 1 root wheel 58 May 21 19:07 +COMMENT
> >
> > I've blown my db some months ago and from what I remember it took
> > about 2 or 3 weeks to get good filtering (but with teft). Same type
> > of emails as you.
>
> Should I use teft then ?
Maybe. If you get a few hundred mails per day it would make sense to
blow away your db, start anew with teft, correct misclassified mails
daily for two weeks to a month, and then switch to tum if you like.
> > > root@titan[ttyp2]{211} ~ dspam_stats andreas
> > > andreas TS: 6204 TI: 23156 SM: 1045 IM: 77 SC: 157 IC: 11
> > >
> > > Part of my config.
> > >
> > > TrainingMode tum
> > > #Feature sbph
> > > Feature chained
> > > Feature tb=4
> >
> > Try going with a smaller tb, eventually; it works for me, but I have
> > 1/30 spam/ham ratio and yours is 1/4 so you might get some more FP.
>
> False positives would be bad for me.
> I cannot review all of my spam daily.
Yeah, same here, I prefer to have some missed spam than a FP.
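For what it's worth, the roughly 1/4 spam/ham ratio mentioned above can be
checked from the dspam_stats line quoted earlier. Below is a minimal Python
sketch, not part of dspam itself. The function name parse_dspam_stats is
made up for illustration, and I am assuming TS and TI are the total spam
and total innocent counters (SC and IC being the corpus counts, as noted
above).

```python
import re


def parse_dspam_stats(line):
    """Split one dspam_stats output line into (user, {label: count}).

    Hypothetical helper: it only assumes the 'LABEL: number' layout
    seen in the quoted output, nothing about dspam internals.
    """
    user = line.split()[0]
    counts = {
        label: int(value)
        for label, value in re.findall(r"([A-Z]{2}):\s*(\d+)", line)
    }
    return user, counts


# The stats line from the quoted output, joined onto a single line.
line = "andreas TS: 6204 TI: 23156 SM: 1045 IM: 77 SC: 157 IC: 11"
user, counts = parse_dspam_stats(line)

# Assumption: TS = total spam, TI = total innocent.
ratio = counts["TS"] / counts["TI"]
print(f"{user}: spam/innocent ratio = {ratio:.2f}")
```

With the numbers above the ratio comes out a little over one quarter, which
is where the 1/4 figure comes from.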
[ ... ]
--
IOnut
Unregistered ;) FreeBSD "user"
"Intellectual Property" is nowhere near as valuable as "Intellect"

Received on Mon Aug 8 08:47:48 2005