The patch has now been applied and committed. Please let me know of any
issues should they arise.
Cheers
mick
> -----Original Message-----
> From: Kenneth Marshall [mailto:ktm@rice.edu]
> Sent: Friday, February 01, 2008 2:53 PM
> To: Mick Johnson
> Cc: dspam-users@lists.nuclearelephant.com; dspam-
> dev@lists.nuclearelephant.com
> Subject: Re: [dspam-dev] CVS updates
>
> Mick,
>
> I was reading my mail from most recent to less recent. If you
> are the one who should receive patches, I have attached my
> PostgreSQL 8+ performance patch for group performance. It is
> based on the DSPAM 3.6.2 update to the driver. Please consider
> applying it for the 3.8.1 release. I would be happy to answer
> any questions and there is a longish thread in the mailing
> list archives from when I submitted it originally. Thank you
> for your consideration.
>
> Cheers,
> Ken
>
> On Fri, Feb 01, 2008 at 01:47:25PM -0800, Mick Johnson wrote:
> > All
> >
> > A few pending updates have been pushed up to CVS :
> >
> > * Allow users to select multiple rows by clicking
> > on the initial row, holding shift, and clicking on the final row.
> > * Adds a "select 200" button to the quarantine page.
> > * Removed some junk from a previous merge.
> >
> > The Feature Request page has also been updated - the donations button
> and
> > dollar value components have been removed as we're no longer
> accepting
> > donations for this project.
> >
> > A few of the patches came in via this interface, this actually makes
> it
> > harder to patch as a) I don't know who submitted them and can't bring
> up any
> > suggestions or corrections directly, and b) they have less visibility
> on
> > this list. In the future, if you wish to submit patches (and I'm
> always
> > happy when people do) please do so here to ensure the community can
> review.
> >
> > The latest CVS version seems to have been running stably for some
> time now
> > and I'm looking to push this out as a stable 3.8.1 this month unless
> I hear
> > otherwise.
> >
> > Finally, looking forward to a great 2008!
> >
> > Cheers
> > Mick Johnson
> > Sensory Networks
> >
> >
> -----------previous patch posting to dspam-users/dspam-dev--------
> Dear dspam-users:
>
> I sent this patch update to the dspam-dev list, but I never
> saw it. It should be useful to others in the DSPAM community
> that are or would like to use PostgreSQL as the DB backend.
> The patch is against version 3.6.2 but should apply fairly
> easily to any of the 3.6.x series.
>
> Ken Marshall
>
> ----- Forwarded message from Kenneth Marshall <ktm@it.is.rice.edu> ----
> -
>
> Date: Sun, 26 Feb 2006 16:49:16 -0600
> From: Kenneth Marshall <ktm@it.is.rice.edu>
> To: Rustam Aliyev <rustam@azernews.com>
> Cc: Jonathan Zdziarski <jonathan@nuclearelephant.com>,
> dspam-dev@lists.nuclearelephant.com,
> Subject: Re: [dspam-dev] PostgreSQL Performance Patch
>
> Jonathan and Rustam,
>
> We are still having problems with the delicateness of MySQL
> as a backend DB. The problem is not with DSPAM at all but that
> we are having problems scaling MySQL to the number of users and
> still be able to provide a backup DB server, in case the primary
> server has an outage. Also, the database cleaning and optimization
> shutsdown the processing while it is running when using a MYISAM
> DB. With the updated PostgreSQL driver in DSPAM and the release
> of version 8.1, we are again testing the PostgreSQL backend.
>
> I had high expectations for the combo of DSPAM 3.6 and PostgreSQL
> 8.1, but while it easily handled much more than 3.2/3.4 and 8.0
> it still took too long to process messages. The basic token select
> took seconds to perform. I finally tracked the problem down to the
> fact that the updated 8.x query is only used if you are not using
> the merged group. Without the merged/global group the selects took
> approximately 20ms with it they took 3000ms.
>
> I generated the attached patch which uses the same technique to
> speed the selects for both the uid and gid. With this patch, my
> test queries returned identical results, but in 22ms versus almost
> 5900ms for the unpatched code. Could you please take a look at it
> and consider it for inclusion in the next point release.
>
> Yours truly,
> Ken Marshall
>
> On Wed, Aug 03, 2005 at 10:42:57AM +0500, Rustam Aliyev wrote:
> > Here's the patch. Should work, tested.
> >
> > There's also alternative way (just for discussion): maybe it would be
> > better to create two different "lookup_tokens()" functions for 7.x
> and
> > 8.x versions?
> >
> >
> > Jonathan Zdziarski wrote:
> >
> > >so what do you think rustam about checking the version and adjusting
> > >the query accordingly?
> > >
> > >Jonathan
> >
> >
>
> > --- pgsql_drv.c.jz Wed Aug 3 10:09:18 2005
> > +++ pgsql_drv.c Wed Aug 3 10:16:10 2005
> > @@ -498,8 +498,15 @@
> > "FROM dspam_token_data WHERE uid IN ('%d','%d') AND token
> IN (",
> > uid, gid);
> > } else {
> > - snprintf (scratch, sizeof (scratch),
> > - "SELECT * FROM lookup_tokens(%d, '{", uid);
> > + if (PQserverVersion(s->dbh) > 80000) {
> > + snprintf (scratch, sizeof (scratch),
> > + "SELECT * FROM lookup_tokens(%d, '{", uid);
> > + } else {
> > + snprintf (scratch, sizeof (scratch),
> > + "SELECT uid, token, spam_hits, innocent_hits "
> > + "FROM dspam_token_data WHERE uid = '%d' AND token IN
> (",
> > + uid);
> > + }
> > }
> >
> > buffer_cat (query, scratch);
> > @@ -520,10 +527,14 @@
> > }
> > ds_diction_close(ds_c);
> >
> > - if (gid != uid)
> > + if (PQserverVersion(s->dbh) > 80000) {
> > + if (gid != uid)
> > + buffer_cat (query, ")");
> > + else
> > + buffer_cat(query, "}')");
> > + } else {
> > buffer_cat (query, ")");
> > - else
> > - buffer_cat(query, "}')");
> > + }
> >
> > #ifdef VERBOSE
> > LOGDEBUG ("pgsql query length: %ld\n", query->used);
>
>
> --- pgsql_drv.c_362 2006-02-26 16:25:29.266084245 -0600
> +++ pgsql_drv.c 2006-02-24 14:17:24.717817917 -0600
> @@ -490,10 +490,15 @@
> }
>
> if (gid != uid) {
> - snprintf (scratch, sizeof (scratch),
> - "SELECT uid, token, spam_hits, innocent_hits "
> - "FROM dspam_token_data WHERE uid IN ('%d','%d') AND
> token IN (",
> - uid, gid);
> + if (s->pg_major_ver >= 8) {
> + snprintf (scratch, sizeof (scratch),
> + "SELECT * FROM lookup_tokens(%d, %d, '{", uid, gid);
> + } else {
> + snprintf (scratch, sizeof (scratch),
> + "SELECT uid, token, spam_hits, innocent_hits "
> + "FROM dspam_token_data WHERE uid IN ('%d','%d') AND
> token IN (",
> + uid, gid);
> + }
> } else {
> if (s->pg_major_ver >= 8) {
> snprintf (scratch, sizeof (scratch),
> @@ -525,10 +530,7 @@
> ds_diction_close(ds_c);
>
> if (s->pg_major_ver >= 8) {
> - if (gid != uid)
> - buffer_cat (query, ")");
> - else
> - buffer_cat(query, "}')");
> + buffer_cat(query, "}')");
> } else {
> buffer_cat (query, ")");
> }
>
> --- pgsql_objects.sql_362 2006-02-26 16:25:52.636539923 -0600
> +++ pgsql_objects.sql 2006-02-24 12:56:51.523174334 -0600
> @@ -74,3 +74,29 @@
> return;
> end;';
>
> +create function lookup_tokens(integer,integer,bigint[])
> + returns setof dspam_token_data
> + language plpgsql stable
> + as '
> +declare
> + v_rec record;
> +begin
> + for v_rec in select * from dspam_token_data
> + where uid=$1
> + and token in (select $3[i]
> + from
> generate_series(array_lower($3,1),
> +
> array_upper($3,1)) s(i))
> + loop
> + return next v_rec;
> + end loop;
> + for v_rec in select * from dspam_token_data
> + where uid=$2
> + and token in (select $3[i]
> + from
> generate_series(array_lower($3,1),
> +
> array_upper($3,1)) s(i))
> + loop
> + return next v_rec;
> + end loop;
> + return;
> +end;';
> +
>
>
> ----- End forwarded message -----
Received on Sun Feb 3 01:34:24 2008
This archive was generated by hypermail 2.1.8 : Mon Feb 04 2008 - 00:00:12 CET