Re: [dspam-users] Problem with history (web interface)

From: Aaron Wolfe <aawolfe@gmail.com>
Date: Sat Jan 21 2006 - 21:16:29 EST

Ok, below is the "patch" for dspam.cgi. It's more of a rewrite of the
DisplayHistory sub than a patch.
It works for me, and I hope it works for anyone else who is bothered by the
issues with the current WebUI history page.
If you find errors or know a better way to do this, please let me know.

-Aaron

186,188d185
< if ($CONFIG{'HISTORY_PER_PAGE'} > 0) {
<
< $history_site = 1 if $history_site eq "";
190,196c187
< open(LINES,"wc -l $LOG|");
< while (<LINES>){
< chomp;
< s/^ +//;
< if (/([0-9]*)/) { $all_lines = $1; }
< }
< close (LINES);
---
>   $history_site = 1 if $history_site eq "";
198,199c189
<     $end = $all_lines - (($history_site-1) * $CONFIG{'HISTORY_PER_PAGE'});
<     $begin = $end - $CONFIG{'HISTORY_PER_PAGE'} + 1 ;
---
>   #read through entire log
201,205c191
<     if ($begin < 0) {
<       $begin = 1;
<     } elsif ($begin < $all_lines - $CONFIG{'HISTORY_SIZE'} - $CONFIG{'HISTORY_PER_PAGE'}) {
<       $begin = $all_lines - $CONFIG{'HISTORY_SIZE'}  + 1;
<     }
---
>   open (LOG,"< $LOG");
207,213c193,194
<     if ($all_lines < $CONFIG{'HISTORY_PER_PAGE'}) {
<       $history_pages = 1;
<     } elsif ($all_lines > $CONFIG{'HISTORY_SIZE'}) {
<       $history_pages = ($CONFIG{'HISTORY_SIZE'} + $CONFIG{'HISTORY_PER_PAGE'}) / $CONFIG{'HISTORY_PER_PAGE'};
<     } else {
<       $history_pages = $all_lines / $CONFIG{'HISTORY_PER_PAGE'};
<     }
---
>   while ($line = <LOG>) {
>     my($time, $class, $from, $signature, $subject, $info, $messageid) = split(/\t/, $line);
215,216c196,199
<     open(LOG, "sed -n \'$begin,$end\p\' $LOG|");
<   }
---
>     if ($signature eq "") {
>     warn "no sig?: $line";
>     warn "time: $time  class: $class  from: $from  sig: $signature  subj: $subject  info: $info  msgid: $messageid";
>   }  #when/why/does this happen? am i totally lost?
218,221c201
<   while(<LOG>) {
<     push(@buffer, $_);
<   }
<   close(LOG);
---
>   if ($class eq "M" || $class eq "F" || $class eq "E") {
222a203
>     # process retrains and errors (should always occur after original message)
224,237c205,213
<   # Preseed retraining information and delivery errors
<
<   foreach $line (@buffer) {
<     my($time, $class, $from, $signature, $subject, $info, $messageid)
<       = split(/\t/, $line);
<     next if ($signature eq "");
<     if ($class eq "M" || $class eq "F" || $class eq "E") {
<       if ($class eq "E") {
<         $rec{$signature}->{'info'} = $info;
<       } elsif ($class eq "F" || $class eq "M") {
<         $rec{$signature}->{'class'} = $class;
<         $rec{$signature}->{'info'} = $info
<           if ($rec{$signature}->{'info'} eq "");
<       }
---
>     unless (defined($rec{$signature})) { warn("where is msg for $signature ?"); next; }
>     if ($class eq "E") {
>       $rec{$signature}->{'info'} = $info;
>     } else {
>       $rec{$signature}->{'info'} = $info;
>       $rec{$signature}->{'retrain'} = "<b>Retrained</b>";
>       $rec{$signature}->{'rclass'} = "spam" if ($rec{$signature}->{'class'} eq "I" || $rec{$signature}->{'class'} eq "W");
>       $rec{$signature}->{'rclass'} = "innocent" if ($rec{$signature}->{'class'} eq "S");
>       $rec{$signature}->{'class'} = $class;
239c215
<   }
---
>   } else {
241,261c217
<   while($line = shift(@buffer)) {
<     chomp($line);
<     my($time, $class, $from, $signature, $subject, $info, $messageid)
<       = split(/\t/, $line);
<     next if ($signature eq "");
<     next if ($rec{$signature}->{'displayed'} ne "");
<     next if ($class eq "E");
<     $rec{$signature}->{'displayed'} = 1;
<
<     # Resends of retrained messages will need the original from/subject line
<     if ($messageid ne "") {
<       $from = $rec{$messageid}->{'from'}
<         if ($from eq "<None Specified>");
<       $subject = $rec{$messageid}->{'subject'}
<         if ($subject eq "<None Specified>");
<
<       $rec{$messageid}->{'from'} = $from
<         if ($rec{$messageid}->{'from'} eq "");
<       $rec{$messageid}->{'subject'} = $subject
<         if ($rec{$messageid}->{'subject'} eq "");
<     }
---
>     # process messages
266,294d221
<     my($ctime) = ctime($time);
<     my(@t) = split(/\:/, (split(/\s+/, $ctime))[3]);
<     my($x) = (split(/\s+/, $ctime))[0];
<     my($m) = "a";
<     if ($t[0]>12) { $t[0] -= 12; $m = "p"; }
<     if ($t[0] == 0) { $t[0] = 12; }
<     $ctime = "$x $t[0]:$t[1]$m";
<
<     # Set the appropriate type and label for this message
<
<     my($cl, $cllabel);
<     $class = $rec{$signature}->{'class'} if ($rec{$signature}->{'class'} ne "");
<     if ($class eq "S") { $cl = "spam"; $cllabel="SPAM"; }
<     elsif ($class eq "I") { $cl = "innocent"; $cllabel="Good"; }
<     elsif ($class eq "F") { $cl = "false"; $cllabel="Miss"; }
<     elsif ($class eq "M") { $cl = "missed"; $cllabel="Miss"; }
<     elsif ($class eq "N") { $cl = "inoculation"; $cllabel="Spam"; }
<     elsif ($class eq "C") { $cl = "blacklisted"; $cllabel="RBL"; }
<     elsif ($class eq "W") { $cl = "whitelisted"; $cllabel="Whitelist"; }
<     if ($messageid ne "") {
<       if ($rec{$messageid}->{'resend'} ne "") {
<         $cl = "relay";
<         $cllabel = "Resend";
<       }
<       $rec{$messageid}->{'resend'} = $signature;
<     }
<
<     $info = $rec{$signature}->{'info'} if ($rec{$signature}->{'info'} ne "");
<
302d228
<     $time = sprintf("%01.2f", $time);
304,310c230,266
<     my($retrain);
<     if ($rec{$signature}->{'class'} =~ /^(M|F)$/) {
<       $retrain = "<b>Retrained</b>";
<     } else {
<       my($rclass);
<       $rclass = "spam" if ($class eq "I" || $class eq "W");
<       $rclass = "innocent" if ($class eq "S");
---
>     $rec{$signature}->{'from'} = $from;
>     $rec{$signature}->{'subject'} = $subject;
>     $rec{$signature}->{'time'} = $time;
>     $rec{$signature}->{'class'} = $class;
>     $rec{$signature}->{'info'} = $info;
>     $rec{$signature}->{'rclass'} = "spam" if ($rec{$signature}->{'class'} eq "I" || $rec{$signature}->{'class'} eq "W");
>     $rec{$signature}->{'rclass'} = "innocent" if ($rec{$signature}->{'class'} eq "S");
>     $rec{$signature}->{'retrain'} = qq!<A HREF="$CONFIG{'ME'}?template=$FORM{'template'}&user=$FORM{'user'}&retrain=$rec{$signature}->{'rclass'}&signatureID=$signature&history_site=$history_site">As ! . ucfirst($rec{$signature}->{'rclass'}) . "</A>";
>    }
>  }
>  close(LOG);
>
>  $all_lines = (keys %rec);
>
>  my($key,$dnum);
>  $dnum = 0;
>
>  foreach $key (sort { $rec{$a}->{'time'} cmp $rec{$b}->{'time'} } keys %rec) {
>
>    #only show current page
>    $dnum++;
>    next unless (($dnum <= (keys(%rec) - $CONFIG{'HISTORY_PER_PAGE'} * ($history_site - 1)))
>             && ($dnum > (keys(%rec) - $CONFIG{'HISTORY_PER_PAGE'} * $history_site)));
>
>    my($cl, $cllabel, $class);
>    $class = $rec{$key}->{'class'};
>
>    if ($class eq "S") { $cl = "spam"; $cllabel="SPAM"; }
>       elsif ($class eq "I") { $cl = "innocent"; $cllabel="Good"; }
>       elsif ($class eq "F") { $cl = "false"; $cllabel="Miss"; }
>       elsif ($class eq "M") { $cl = "missed"; $cllabel="Miss"; }
>       elsif ($class eq "N") { $cl = "inoculation"; $cllabel="Spam"; }
>       elsif ($class eq "C") { $cl = "blacklisted"; $cllabel="RBL"; }
>       elsif ($class eq "W") { $cl = "whitelisted"; $cllabel="Whitelist"; }
>
>    my($ctime) = ctime($rec{$key}->{'time'});
>    my(@t) = split(/\:/, (split(/\s+/, $ctime))[3]);
312,316c268,278
<       if ($rclass ne "") {
<       $retrain = qq!<A HREF="$CONFIG{'ME'}?template=$FORM{'template'}&user=$FORM{'user'}&retrain=$rclass&signatureID=$signature">As ! . ucfirst($rclass) . "</
< A>";
<       }
<     }
---
>    #show dates on messages older than 1 week
>    my($x);
>    if ($rec{$key}->{'time'} < (time - 604800)) {
>      $x = (split(/\s+/, $ctime))[1] . " " . (split(/\s+/, $ctime))[2];
>    }
>    else {  $x = (split(/\s+/, $ctime))[0]; }
>
>    my($m) = "a";
>    if ($t[0]>12) { $t[0] -= 12; $m = "p"; }
>    if ($t[0] == 0) { $t[0] = 12; }
>    $ctime = "$x $t[0]:$t[1]$m";
318c280
<     my($entry) = <<_END;
---
>    my($entry) = <<_END;
320,325c282,287
<       <td class="$cl $rowclass" nowrap="true"><small>$cllabel</td>
<         <td class="$rowclass" nowrap="true"><small>$retrain</td>
<       <td class="$rowclass" nowrap="true"><small>$ctime</td>
<       <td class="$rowclass" nowrap="true"><small>$from</td>
<       <td class="$rowclass" nowrap="true"><small>$subject</td>
<       <td class="$rowclass" nowrap="true"><small>$info</td>
---
>         <td class="$cl $rowclass" nowrap="true"><small>$cllabel</td>
>         <td class="$rowclass" nowrap="true"><small>$rec{$key}->{'retrain'}</td>
>         <td class="$rowclass" nowrap="true"><small>$ctime</td>
>         <td class="$rowclass" nowrap="true"><small>$rec{$key}->{'from'}</td>
>         <td class="$rowclass" nowrap="true"><small>$rec{$key}->{'subject'}</td>
>         <td class="$rowclass" nowrap="true"><small>$rec{$key}->{'info'}</td>
328c290
<     push(@history, $entry);
---
>    push(@history, $entry);
330,334c292,296
<     if ($rowclass eq "rowEven") {
<       $rowclass = "rowOdd";
<     } else {
<       $rowclass = "rowEven";
<     }
---
>    if ($rowclass eq "rowEven") {
>      $rowclass = "rowOdd";
>    } else {
>      $rowclass = "rowEven";
>    }
339a302
>   # add page selection html
340a304,312
>
>     if ($all_lines < $CONFIG{'HISTORY_PER_PAGE'}) {
>       $history_pages = 1;
>     } elsif ($all_lines > $CONFIG{'HISTORY_SIZE'}) {
>       $history_pages = ($CONFIG{'HISTORY_SIZE'} + $CONFIG{'HISTORY_PER_PAGE'}) / $CONFIG{'HISTORY_PER_PAGE'};
>     } else {
>       $history_pages = $all_lines / $CONFIG{'HISTORY_PER_PAGE'};
>     }
>
351a324
>
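
[Editorial sketch] The paging test in the patch ("next unless ...") and the
page-count block can be read as plain arithmetic over the number of records.
The snippet below is only an illustration of that arithmetic, not part of the
patch: the helper names page_count and page_window are hypothetical, the
HISTORY_SIZE cap is left out, and the page count uses an integer ceiling
where the patch uses a plain division.

```perl
use strict;
use warnings;

# Hypothetical helper: number of pages needed for $total entries.
# Uses an integer ceiling (an assumption; the patch divides without rounding).
sub page_count {
    my ($total, $per_page) = @_;
    return 1 if $total <= $per_page;
    return int(($total + $per_page - 1) / $per_page);
}

# Hypothetical helper: 1-based positions, counted oldest-first, that belong
# on $page. Page 1 holds the newest entries, matching the patch's test:
#   $dnum <= total - per_page * (page - 1)  and  $dnum > total - per_page * page
sub page_window {
    my ($total, $per_page, $page) = @_;
    my $last  = $total - $per_page * ($page - 1);
    my $first = $total - $per_page * $page + 1;
    $first = 1 if $first < 1;    # the oldest page may be short
    return ($first, $last);
}

my ($first, $last) = page_window(45, 20, 3);
printf "pages=%d, page 3 shows entries %d..%d\n",
    page_count(45, 20), $first, $last;
```

With 45 records and 20 per page, page 1 covers positions 26..45 and the last
page is the short one, which is why the window is computed from the end of
the sorted list rather than from the start.
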
On 1/20/06, Kyle Johnson <kjohnson@fixertec.net> wrote:
>
> This sounds good.  How do we get a hold of it?
>
> Kyle Johnson
> Fixertec <http://www.fixertec.net/> - Dynamic Computer and Technology
> Solutions
> 410-609-4191
>
>
> Aaron Wolfe wrote:
>
> In a nutshell, rather than try to paginate by only viewing portions of the
> log (which is hopeless), it reads the entire log and creates a hash
> structure which identifies the correct current status of each and every
> message once the entire log has been processed.  Then it displays only the
> correct portion of the hash according to the "page" you're on.
>
> The obvious drawback is that more memory and processing time are used, but
> I think that is unavoidable, since theoretically the most recent log
> entry could be a retrain of the oldest message.
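
[Editorial sketch] The idea described above (read the whole log first, fold
retrains into the record of the original message, page afterwards) might look
roughly like this in isolation. It is a minimal sketch, not the code from
dspam.cgi: build_history is a hypothetical name, the sample log lines are
made up, and the only thing taken from the patch is the tab-separated field
order (time, class, from, signature, subject, info, messageid) and the
M/F/E class letters.

```perl
use strict;
use warnings;

# Hypothetical helper: fold log lines into one record per signature.
sub build_history {
    my (@lines) = @_;
    my %rec;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my ($time, $class, $from, $signature, $subject, $info) =
            split /\t/, $copy;
        next unless defined $signature && $signature ne "";
        if ($class eq "M" || $class eq "F" || $class eq "E") {
            # Retrain (M/F) or delivery error (E): update the record created
            # by the original delivery, however far back in the log it was.
            next unless exists $rec{$signature};
            $rec{$signature}{info} = $info;
            if ($class ne "E") {
                $rec{$signature}{class}     = $class;
                $rec{$signature}{retrained} = 1;
            }
        } else {
            # Original delivery: create the record.
            $rec{$signature} = {
                time    => $time,    class => $class, from      => $from,
                subject => $subject, info  => $info,  retrained => 0,
            };
        }
    }
    return \%rec;
}

# Made-up sample data: the last line retrains the first message.
my @log = (
    "1137800000\tS\tspammer\@example.com\tsigA\tCheap pills\tinfo\t",
    "1137800100\tI\tfriend\@example.com\tsigB\tLunch?\tinfo\t",
    "1137890000\tF\t\tsigA\t\tRetrained\t",
);
my $rec = build_history(@log);
print "$_: class=$rec->{$_}{class} retrained=$rec->{$_}{retrained}\n"
    for sort keys %$rec;
```

Because the retrain line is applied to the stored record, a page that only
ever looked at a slice of the log would miss it whenever the retrain and the
original delivery fall in different slices.
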
>
> In practice, it adds no observable load to my server with ~100 users, but
> the increase might become noticeable on a huge installation... dunno :)
>
> -Aaron
>
>
>
>
> On 1/20/06, Kyle Johnson <kjohnson@fixertec.net> wrote:
> >
> > What changes does this make?
> >
> > Kyle Johnson
> > Fixertec <http://www.fixertec.net/> - Dynamic Computer and Technology
> > Solutions
> > 410-609-4191
> >
> >
> >  Aaron Wolfe wrote:
> >
> > I have a fixed version of the cgi that I've given to a few people and
> > sent to Jon... actually a few months ago.  The problem is that the entire
> > logic of the "paging" is broken, so it's not just a patch but rather a
> > replacement of the whole section of code.  I'm not sure that my version is
> > the best way to do things either, but it does take into account the fact
> > that retrain events affecting a particular message are not necessarily found
> > nearby in the log.
> > The fix is based on a 3.6 prerelease version. I'd be happy to integrate
> > it into the current cgi if anyone is interested.
> >
> > -Aaron
> >
> >
> > On 1/20/06, Kyle Johnson <kjohnson@fixertec.net> wrote:
> > >
> > > We hear you :)
> > > This has been discussed a lot in #dspam, as well as a few times here
> > > on the mailing list....  The WebUI does need a rewrite; however, Jon is a
> > > busy man, so it's going to fall into our hands to do this (most likely).
> > > Until then, it's best to keep up on the training, so you won't have to
> > > remember that much, and to open trained links in a new tab (ctrl+click) in
> > > Firefox, as it saves a lot of time.
> > >
> > > Kyle Johnson
> > > Fixertec <http://www.fixertec.net/> - Dynamic Computer and Technology
> > > Solutions
> > > 410-609-4191
> > >
> > >
> > > Pierre Girard wrote:
> > >
> > > Kyle Johnson wrote:
> > >
> > > Right now, the problem with the history of the WebUI is that when you
> > > click retrain on page > 2, you get sent back to page one.  When you go back
> > > to page > 2, the message hasn't changed.
> > > This is only a visual bug, and the retraining should have still taken
> > > place...
> > >
> > >
> > > Yes, that's the problem I have.  I know it's some sort of visual bug,
> > > but given the amount of email and spam I receive every day it's still
> > > confusing not to be able to determine if I trained the message or not, since
> > > I have to rescan the whole thing every time and often have the same spam more
> > > than once.  I often end up retraining the same message multiple times.
Received on Sat Jan 21 21:18:44 2006
