From m.pohl@gmx.net Thu Jun 21 09:14:17 2001
Date: Thu, 21 Jun 2001 10:05:18 +0100
From: Marc Pohl <m.pohl@gmx.net>
Reply-To: m.pohl@gmx.de
To: htdig-dev@lists.sourceforge.net
Subject: [htdig-dev] 3 small patches for htdig-3.2.0b4

hi,

while testing the b3 release of htdig-3.2.0 i found several errors in 
htdig. (all patches are against the latest snapshot build)

marc



if you init a URL with an empty string "" then the resulting _url is 
"http:///". this happens for example while constructing http-referer 
headers for update-digs. the patch makes constructURL() leave _url empty 
when _host is not set, unless we are constructing a file url.

diff -ur htdig-3.2.0b4-061701-orig/htcommon/URL.cc htdig-3.2.0b4-061701/htcommon/URL.cc
--- htdig-3.2.0b4-061701-orig/htcommon/URL.cc	Sun May 20 09:13:46 2001
+++ htdig-3.2.0b4-061701/htcommon/URL.cc	Thu Jun 21 07:44:59 2001
@@ -701,7 +701,12 @@
// Also ensures the port number is correct for the service
//
void URL::constructURL()
-{
+{    
+    if (strcmp((char*)_service, "file") != 0 && _host.length() == 0) {
+        _url = "";
+        return;
+    }
+
    _url = _service;
    _url << ":";

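to make the effect of the guard easier to see, here is a minimal standalone sketch. it is not the htdig code: construct_url() and its parameters are made-up names, std::string stands in for htdig's own String class, and port handling is left out.

```cpp
#include <string>

// hypothetical, simplified stand-in for URL::constructURL(); not the htdig code
std::string construct_url(const std::string &service,
                          const std::string &host,
                          const std::string &path)
{
    // same idea as the patch: a non-file url without a host stays empty
    if (service != "file" && host.empty())
        return "";

    std::string url = service + "://";
    if (service != "file")
        url += host;            // file urls carry no host part in this sketch
    return url + path;
}
```

with this guard, an empty input yields an empty string instead of a host-less "http:///", so a caller can simply test for an empty url before writing a referer header.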

i'm using alternate work files for my search engine. if i also use md5 
signatures then the second initial dig will fail, because htdig uses the 
existing md5 database instead of an alternate work file. the patch appends 
".work" to md5_db in the same way as is already done for doc_excerpt.

diff -ur htdig-3.2.0b4-061701-orig/htdig/htdig.cc htdig-3.2.0b4-061701/htdig/htdig.cc
--- htdig-3.2.0b4-061701-orig/htdig/htdig.cc	Sun May 20 09:13:51 2001
+++ htdig-3.2.0b4-061701/htdig/htdig.cc	Thu Jun 21 07:53:44 2001
@@ -194,6 +194,13 @@
	    configValue << ".work";
	    config->Add("doc_excerpt", configValue);
	}
+
+        configValue = config->Find("md5_db");
+        if (configValue.length() != 0)
+        {
+            configValue << ".work";
+            config->Add("md5_db", configValue);
+        }
    }
    
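the pattern is the same for every database attribute, so it could also be written as a small helper. this is only a sketch under assumptions: Config is a plain std::map standing in for the htdig configuration object, and use_work_file() / use_work_files() are hypothetical names that do not exist in htdig.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// hypothetical stand-in for the htdig configuration object
using Config = std::map<std::string, std::string>;

// append ".work" to the named attribute, but only if it is set
void use_work_file(Config &config, const std::string &key)
{
    auto it = config.find(key);
    if (it != config.end() && !it->second.empty())
        it->second += ".work";
}

// apply the suffix to the attributes touched in the diff above
void use_work_files(Config &config)
{
    for (const char *key : {"doc_excerpt", "md5_db"})
        use_work_file(config, key);
}
```

attributes that are unset stay unset, which matches the length() != 0 check in the patch.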

my last patch repairs the broken date_factor calculation. the problem is 
that multiplying the timestamp by 1000 is done in integer arithmetic and 
overflows on 32-bit systems, so it should be done in floating point.
it also contains a small performance fix: the call of time(0) is moved 
out of the loop, so time() is no longer called for every match.

diff -ur htdig-3.2.0b4-061701-orig/htsearch/Display.cc htdig-3.2.0b4-061701/htsearch/Display.cc
--- htdig-3.2.0b4-061701-orig/htsearch/Display.cc	Sun Jun 10 09:13:56 2001
+++ htdig-3.2.0b4-061701/htsearch/Display.cc	Thu Jun 21 09:24:13 2001
@@ -1190,12 +1190,13 @@
		     (config->Value("endday"))     ||
		     (config->Value("endyear")));

+    time_t now = time((time_t *)0); 	// fill in all fields for mktime
+
    // find the end of time
    endoftime = gmtime(&eternity);

    if(dategiven)    // user specified some sort of date information
      {
-	time_t now = time((time_t *)0); 	// fill in all fields for mktime
	tm *lt = localtime(&now); 		//  - Gilles's fix
	startdate = *lt; 
	enddate = *lt; 
@@ -1424,7 +1425,7 @@
	if (date_factor != 0.0)
	{
	    date_score =  date_factor * 
-	      ((thisRef->DocTime() * 1000 / (double)time(0)) - 900);
+	      ((thisRef->DocTime() * 1000.0 / (double)now) - 900);
	    score += date_score;
        }

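the overflow can be reproduced outside of htdig with a small sketch. the function names and the sample timestamps are made up for illustration, and int32_t is used to emulate a 32-bit time_t. since signed overflow is undefined in C++, the integer variant does the multiplication in unsigned 32-bit arithmetic, which only shows the low 32 bits that survive; the real result on a given system may differ, but it is wrong either way.

```cpp
#include <cstdint>

// old form: the product is formed in 32-bit integer arithmetic and wraps
// before the division is done in floating point
double date_term_int32(std::int32_t doc_time, std::int32_t now)
{
    std::uint32_t wrapped = static_cast<std::uint32_t>(doc_time) * 1000u;
    return wrapped / static_cast<double>(now) - 900;
}

// patched form: 1000.0 promotes the whole expression to double
double date_term_double(std::int32_t doc_time, std::int32_t now)
{
    return doc_time * 1000.0 / static_cast<double>(now) - 900;
}
```

for a document dated around may 2001 (990000000) and a search in june 2001 (993000000), the floating-point form gives a value a little below 100, while the wrapped integer form ends up far below zero.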


--------------------------------------------------------------
Marc Pohl 
                                 WWF Wort + Ton GmbH e-media
Tel.:  +49 221  6900 112         Kolumbastrasse 3
FAX:   +49 221  6900 150         D-50667 Koeln
Email:   marc.pohl@wwf-gruppe.de
Privat:     m.pohl@gmx.de

