From wessels@COLORADO.EDU
Date: Thu, 02 Nov 1995 01:26:11 -0700
From: Duane Wessels
To: harvest-users-real@burton.cs.colorado.edu
Subject: Harvest beta available

The Harvest version 1.3 patchlevel 3 beta release has been updated.
Both source and binary distributions are available at

    ftp://harvest.cs.colorado.edu/priv/pl3.beta/

Changes between release v1.3 pl3 (November, 1995) and v1.3:

    - Changes to the Gatherer:
        - Added symbolic link loop detection to httpenum.
        - Added a GIF image summarizer (GIFImage.sum), requires netpbm.
          The GIFImage type is still in the Essence stoplist by default.
        - Added 'C' version of ftpget.
        - Added ability to rewrite the SOIF template URL with Essence
          post-processing. Could be used to gather file:// URLs and
          have them exported as http:// URLs.
        - Fixed select() timeouts to POSIX semantics.
        - Fixed SGML summarizer to give error if input is empty.
        - Fixed a Makefile to actually build and install HTML-lax.sum.
        - Fixed liburl problem with AFS. Must *copy* files into the
          cache-liburl directory.
        - Fixed News gathering: if 'newsget.pl' exits non-zero, close
          the NNTP server socket.
        - Fixed newsget.pl with a major rewrite.
        - Fixed 'fileenum' to use URLs and not always return
          file://hostname/.
        - Fixed gatherd bug where child process would remove parent's
          gatherd.pid file.
        - Changed NewsArticle.sum TTL to 7 days by default.
        - Changed Essence unnesting to occur in individual directories.
        - Removed confusing gatherd DNS mismatch warning message.

    - Changes to the Broker:
        - Added #Restart-Index-Server command to broker admin command
          set.
        - Added error logging and debugging in Glimpse inline query
          code.
        - Fixed select() timeouts to POSIX semantics.
        - Fixed Glimpse minor malloc problems.
        - Fixed the broker on Linux; needs unbuffered input from gather
          process.
        - Fixed broker query language bug for high-bit (international)
          characters.
        - Changed Broker to allow specifically setting
          GlimpseServer_Port again; if not set, port is chosen randomly.
        - Changed BrokerAdmin.cgi to use unbuffered output.
        - Changed Glimpse macros CLEANUP and RETURN to be functions.
        - Changed broker admin/LOG to log FQDN instead of IP address.
        - Removed Glimpse version ambiguities in Glimpse/index.c.
        - Removed getpeername() call in the broker; get address from
          accept().

    - Changes to the Cache:
        - The cache has been moved to a separate distribution.

    - Miscellaneous Changes:
        - Don't link with -lmalloc on Solaris.
        - Fixed User Manual and FAQ inconsistencies.

Changes between release v1.3 pl3 beta (26 Oct 1995) and v1.3 pl2:

    - Changes to the Cache:
        - Add behind_firewall mode to cached.conf.
        - Add TCP_DONE logging.
        - Add C version of ftpget.
        - Add rotate_logs(), called upon SIGHUP. Renames log files with
          incremental digits 0 through 9.
        - Add code to disable UDP port if set to -1.
        - Add swap file bitmap to avoid stat(2) calls.
        - Add check for objects in DELETE_BEHIND mode; junk the object
          when client goes away.
        - Add {FTP,HTTP,GOPHER}_EXPIRE logging.
        - Add User-Agent fix to include "via Harvest Cache..."
        - Add domain list to neighbor query algorithm--only ping
          neighbors for URLs of the given domains.
        - Add checks for NULL StoreEntry key.
        - Add 'cache_hot_vm_factor' to cached.conf.
        - Add dead parent/neighbor detection.
        - Add single parent mode. No ping or DNS lookups are done.
          Everything is retrieved from the parent.
        - Add cache_neighbor_obj on/off to cached.conf. If off, objects
          pulled from neighbors are not cached.
        - Add Linux patches from gitelson@chaph.usc.edu
        - Add configure check for IRIX, add -ansi if found.
        - Fix APPEND_DOMAIN coredumps.
        - Fix many things wrong with cachemgr/stats information.
        - Fix asciiProcessInput() to wait for end of HTTP request;
          don't assume entire request comes in one read().
        - Fix having a cached parent/neighbor on the same machine.
        - Fix bindAddressList memory leak.
        - Fix client-goes-away coredump bug.
        - Fix object size limit problems. Now can proxy any size
          object.
        - Fix bogus URL coredump bug (http://aaaaaaa.aaaaaaa.aaaaa....)
        - Fix local_ip to check server IP instead of client IP.
        - Fix gopher off-by-one bug.
        - Fix SECHO bug. Added DECHO for dumb cache pinging.
        - Fix cached.conf inconsistencies and missing tags.
        - Fix getFromSource() coredump bug.
        - Fix cachemgr.c to not use /proxy/.
        - Change select() timeouts to POSIX style.
        - Changed NO_TESTNAME #define to a command line option, -D.
        - Changed stats to show objects in transit.
        - Changed select loop to check for FD timeouts less frequently.
        - Changed cached to stop accepting new connections when 75% of
          available file descriptors have been used.
        - Changed store.c to continue swapfileno where
          storeRebuildFromDisk() left off.
        - Changed HTTP and Gopher to not use lifetime handlers for
          server FDs. Use read timeouts instead.
        - Changed store.c to use multiple swap directories.
        - Changed stat.c to use storeAppend().
        - Changed strncmp()'s to strcmp() in cache_cf.c main loop.
        - Change DO_MALLINFO to work better.
        - Change configure to not link with -lmalloc on Solaris.
        - Remove sys_errlist[] in each file. Use xstrerror() instead.

Known Bugs:

    - cached loops if reading from the server-side is more than
      *_DELETE_GAP bytes ahead of writing to the client.
    - "Server-Push" URLs not detected.

Duane W.

--
Duane Wessels
Harvest Developer
http://harvest.cs.colorado.edu/~wessels/
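
A note on the rotate_logs() entry in the Cache changes: the announcement
only says that log files are renamed with incremental digits 0 through
9. The sketch below shows one plausible reading of that scheme, in
Python rather than the C used by cached. The function name rotate_log,
the max_digit parameter, and the choice to shift older files up by one
and overwrite the oldest are all assumptions for illustration, not the
actual cached implementation.

```python
import os
import tempfile


def rotate_log(path, max_digit=9):
    """Rename path -> path.0 after shifting path.N -> path.N+1.

    Assumed scheme: suffixes run from 0 (newest) to max_digit (oldest),
    and the file already at max_digit is overwritten and lost.
    """
    # Walk from the oldest suffix down so no file is clobbered early.
    for digit in range(max_digit - 1, -1, -1):
        older = f"{path}.{digit}"
        if os.path.exists(older):
            os.replace(older, f"{path}.{digit + 1}")
    if os.path.exists(path):
        os.replace(path, f"{path}.0")
    # Recreate an empty live log so the writer has a file to append to.
    open(path, "a").close()


if __name__ == "__main__":
    # Hypothetical file name, used only for this demonstration.
    workdir = tempfile.mkdtemp()
    log = os.path.join(workdir, "cache.log")
    with open(log, "w") as f:
        f.write("first run\n")
    rotate_log(log)
    print(sorted(os.listdir(workdir)))
```

In a long-running daemon a function like this would typically be called
from the SIGHUP handler, which matches how the changelog says
rotate_logs() is triggered.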