----------------------------------------------------------------------------
                          The Florida SunFlash

                       The Archie Archive Server

SunFLASH Vol 27 #14                                             March 1991
----------------------------------------------------------------------------
This is an article from USENET that describes the Archie archive server.
Many thanks to McGill University and the "Archie Group":

        Bill Heelan    (wheelan@cs.mcgill.ca)
        Peter Deutsch  (peterd@cc.mcgill.ca)
        Alan Emtage    (bajan@cs.mcgill.ca)

for providing this excellent service.

Summary:
  (a) telnet to quiche.cs.mcgill.ca (132.206.2.3 or 132.206.51.1)
      login as user "archie"
  (b) send a mail message with the word "help" as the subject line
      to archie@cs.mcgill.ca

The first part of this article describes the telnet interface and the
second part describes the email interface.  -johnj

----------------------------------------------------------------------------
                      Telnet Interface to Archive
----------------------------------------------------------------------------

                               Archie 2.0
                               ----------

The "Archie Group" of McGill University is pleased to announce Archie, the
"Archive Server Server" Version 2.0.

                  McGill University Operating "archie"
                  ------------------------------------
             - An Internet Archive Server Listing Service
             ----------------------------------------------

Given the number of hosts being used as archive sites nowadays, there can
be great difficulty in finding needed software in a distributed
environment. You may know that the software that you need is out there,
but it can sometimes be difficult to find. The School of Computer Science
at McGill University has one solution to the problem - "archie".

Since the announcement of the dedicated-database version of archie in
November, the popularity of the program has grown by leaps and bounds.
From an average of about 30 logins/day in November we are now averaging
over 500 with our all-time high coming in at 700 for a single day. (Our
ex-boss owes us lunch for the 500+ mark :-).
Archie's email interface averages about 40 requests/day, and anonymous ftp
to quiche (for retrieval of the compressed site listings files in
~ftp/archie/listings) is over 70/day. Needless to say, quiche is a
well-used system right about now :-)

Getting To The Point:
---------------------

So how do you get to use archie? If you are Internet connected, it's easy.
Telnet to

        quiche.cs.mcgill.ca (132.206.2.3 or 132.206.51.1)

and log in as user "archie". You should get a banner message and status
report on our latest additions (there's no password, although we do log
the sessions to provide rudimentary stats). "help" gets a list of valid
commands.

Feedback is welcome and can be sent to

        archie-l@cs.mcgill.ca

NOTE: The following changes only apply to the interactive version of
archie (the one you see when you telnet or rlogin to quiche) and NOT to
the E-Mail interface. We will hopefully be overhauling that interface in
the coming week(s).

Quick Summary
-------------

For those of you who don't want to read the whole thing, here's a quick
summary of what's new in V2.0. If you want the full explanation, skip to
the next section. Otherwise, see the archie online help facility.

  (a) Speed and performance under load should be improved. Feedback (to
      archie-l@cs.mcgill.ca) on this would be appreciated.

  (b) 3 new searching methods added. See help section under "set search".

  (c) Output may now be sorted. See help under "set sortby".

  (d) New Software Description database to help you find the names of
      packages to do what you want done, as well as an RFC index and
      other useful information. See help under "whatis".

  (e) New "mail" command allows you to mail archie results back to you.
      Say goodbye to those hated script sessions :-). See help under
      "mail" and "set mailto".

  (f) "list" command now tells the truth. Help "list".

  (g) A "status" variable allows you to turn on or off search progress
      information. Help "set status".

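As a rough illustration of the four values of the "search" variable (the
three new methods mentioned in (b) plus the original regex method, all
described under "Searching" below), here is a minimal Python sketch. It is
not archie's actual code: the function name matches() and the sample file
names are made up for the example, and Python's re module merely stands in
for ed(1) regular expressions, which differ from it in some details.

```python
import re


def matches(name: str, pattern: str, method: str = "regex") -> bool:
    """Return True if a file/directory name matches under the given method.

    The method names mirror the documented values of archie's "search"
    variable; the implementation is only an illustrative assumption.
    """
    if method == "exact":
        # The whole name must be identical, including case.
        return name == pattern
    if method == "subcase":
        # Case-sensitive substring search.
        return pattern in name
    if method == "sub":
        # Case-insensitive substring search.
        return pattern.lower() in name.lower()
    if method == "regex":
        # Python regex used as a stand-in for ed(1) regular expressions.
        return re.search(pattern, name) is not None
    raise ValueError(f"unknown search method: {method}")


# Hypothetical database entries, for demonstration only.
names = ["xlock.tar.Z", "XLock.README", "emacs-18.55.tar.Z"]

for method, pattern in [
    ("exact", "xlock.tar.Z"),
    ("subcase", "xlock"),
    ("sub", "xlock"),
    ("regex", r"\.tar\.Z$"),
]:
    print(method, [n for n in names if matches(n, pattern, method)])
```

In this sketch "subcase" finds only xlock.tar.Z, while "sub" also finds
XLock.README, which is the practical difference between the two substring
methods.
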
Changes in Version 2.0
----------------------

Thanks to all the feedback we've gotten over the past couple of months, we
have modified archie into what we hope will be a more friendly and
efficient service. The changes in V2.0 are:

(1) Speed & Implementation
    ----------------------

For faster execution, Archie has been rewritten using a shared memory
model which greatly improves execution times, especially when the host on
which archie is running is under load [which, as those of you who use
archie regularly know, quiche has been for some time now :-]. This model
also allows for much faster database updates.

We'd appreciate feedback on what kind of response times you are getting
(subjective rather than objective).

(2) Searching
    ---------

Wider range of search methods. Until this point, archie could only search
using regular expressions (as defined in ed(1)). Since most users don't
require the power of regexes (and many who don't use them regularly
understandably have trouble composing them), 3 new search methods have
been added, bringing the total to 4.

To change the search method, set the "search" variable and use the "prog"
command as usual. Command line options are in the works but have not yet
been incorporated into this version of archie. The value of the search
variable for each method is listed in brackets '[ ]' below. Type "help set
search" at the "archie>" prompt if you want more info.

  (1) Substring (case insensitive) ["sub"]. Like the case-sensitive
      substring search in (2) below, but ignoring the case of the strings
      involved. Speed about on par with the regex equivalent.

  (2) Substring (case sensitive) ["subcase"]. A simple, everyday
      substring search. A match occurs if the file (or directory) name in
      the database contains the user-given substring. Slightly faster
      than the equivalent regex.

  (3) Exact match ["exact"]. The fastest search method of all. The
      restriction is that the user (search) string has to exactly match
      (including case) the string in the database.
      Provided for those of you who know just what you are looking for.
      For example, if you wanted to know where all the xlock.tar.Z files
      were, this is the kind of search to use. [For those of you that are
      interested, the search is O(1) in this case via the magic of dbm].

  (4) Regex ["regex"]. The "old" method. Searches the database with the
      user (search) string which is given in the form of an ed(1) regular
      expression. This is the DEFAULT search method.

Note: The "status" line that used to appear when the "pager" variable was
set and the search was proceeding (showing the number of matches found
and the percentage of the database searched) can be enabled or disabled
by the use of the "status" variable, which can be set or unset depending
on whether you want the line to be displayed or not. In either case, no
search output is displayed until the search is complete or is aborted by
the user.

(3) Sorting
    -------

Ordering the output. Archie V1.X had no concept of sorted output, except
for the fact that we tried to do the updates in lexical order so that the
output would be (mostly) sorted in that order. It didn't work.
Consequently, you may now sort your 'prog' command output in 5 different
ways. For each method, the "natural" sort order (or at least, what we
consider to be the natural order) is the default.

To change the sort method, set the "sortby" variable. The value of the
sortby variable for each method is listed in brackets '[ ]' below. Command
line options are not available at this time. The reverse sorting orders
from those described here are obtained by prepending "r" to the sortby
value given. (E.g. the reverse of hostname order "hostname" is
"rhostname".)

  (1) Hostname order ["hostname"]. Output is sorted on the archive
      hostname in lexical order.

  (2) File/Directory name modification time ["time"]. Output is sorted
      with the most recent modification times of the found file/directory
      names coming first (youngest -> oldest).

  (3) File/Directory size ["size"].
      Output is sorted by the size of the found files/directories,
      largest first.

  (4) File/Directory name lexical order ["filename"].

  (5) Database order ["none"]. In other words, effectively unsorted.
      This is the default order and is the one that most users of archie
      1.X versions will be used to.

Note: Typing the keyboard interrupt character (Ctl-C for most people on
UNIX) during a search will cause the search to be aborted. The results
found up to that time will be sorted (as determined by the value of the
sortby variable) and output. Typing an abort character during the sort
will cause the sort to be aborted as well. Results up to that point will
be output.

(4) PD Software Description Database
    --------------------------------

A new database, similar to the one that the man(1) UNIX command uses when
doing a "keyword" ( -k option ) lookup, has been added to archie. The
database currently contains about 2600 entries that we have gleaned from
various sources (such as the comp.sources.*, alt.sources and RFC indices).
The format is basically the name of a PD program, document, or software
package followed by a short description of said object. The command is
"whatis" and takes a (sub)string as an argument. All lines in the database
containing that substring (case insensitive) will be printed.

I think such a beast would be very useful if it were properly maintained.
These current entries should be considered the mere start of the database
and I'm depending on all you authors and maintainers out there to send me
additions, corrections and updates to the various entries in the database.
All such info should be sent to

        archie-admin@cs.mcgill.ca

All entries are welcome, and I'll endeavour to keep the database up to
date. I have not finalized what will and will not be in it, so send
whatever you have along and I'll make up the policy as we go along.

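The lookup that "whatis" performs, as described above, can be sketched in
a few lines of Python. This is only an illustration under stated
assumptions, not archie's implementation: the function name whatis() is
reused for convenience, the database is assumed to be a list of text lines
of the form "name description", and the sample entries are invented for
the example rather than taken from the real database.

```python
def whatis(database_lines, substring):
    """Return every database line containing substring, ignoring case."""
    needle = substring.lower()
    return [line for line in database_lines if needle in line.lower()]


# Invented sample entries in a "name  description" layout (assumption).
sample = [
    "xlock        Lock an X display until a password is entered",
    "perl         Practical Extraction and Report Language",
    "rfc822       Standard for the format of ARPA Internet text messages",
]

for line in whatis(sample, "LOCK"):
    print(line)
```

Because both the query and each line are lower-cased before comparison,
"LOCK", "Lock" and "lock" all select the same entries.
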
(5) Getting rid of those crummy "script" sessions
    ---------------------------------------------

Your days of typing "script" before every interactive archie session are
now over: archie can now mail you the results of your interactive
sessions. It works like this:

  (a) Set the "mailto" variable to your E-mail address.

  (b) Run archie as you normally would. When you get a result that you
      want to keep a record of (and after you have finished browsing
      through it if you have the pager set on) type "mail". Archie will
      automatically forward the results of the last request (site, prog,
      etc) to the email address set before. If you have not set the
      address in the mailto variable you may specify one on the command
      line to the "mail" command. [If you do neither, and type "mail",
      archie will tell you].

  (c) The mail is sent asynchronously (you don't have to wait for it to
      be sent). You will be informed when it is complete.

If the generated output from archie is greater than 45K bytes, it will
automatically be split into as many parts as required to get it to you in
chunks of this size or less. This is so as to cooperate with certain mail
systems which don't handle 50+ K chunks.

[Many thanks to Mark Crispin's c-client library of mail routines which
made this code SOOOO much easier]

Note: For those of you who have to do source routing for your email,
remember that the mail address given has to be a path from our machines to
yours. Our mail setup here is pretty darned good (if I might say so
myself) so the results should get to you in reasonable time (there's no
queueing on our part unless the load gets abnormally high).

(6) What archive sites does archie know about?
    ------------------------------------------

The "list" command, which has been out for a couple of weeks under version
1.3, is now formally part of archie.
This command allows you to specify a regular expression as an argument and
prints the site names in the database which match that expression, along
with the primary IP address of the site and the date that archie last
updated the site for the database. "list" without an argument prints the
data on all sites that archie knows about.

(7) Getting kicked off for loitering
    --------------------------------

Archie now has an autologout feature (well, actually it has had one for
the past couple of weeks, but we're now telling you about it :-). If you
hang around for too long without doing anything, we'll bump you off and
free up the resources for the next person along. We aren't very strict on
this and, in fact, you can set the autologout period yourself, varying
from 1 minute to 5 hours, with 1 hour being the default. The variable
"autologout" controls this feature.

Things to be done
-----------------

A couple of things on our wishlist that still haven't been done:

  (1) Restricting searches to specific sites (soon hopefully).

  (2) Non UNIX sites aren't in the database (soon, maybe).

  (3) GUI interface (a little further off).

The email interface will have to be brought up to the level of the
interactive interface (as well as fixing some pretty annoying bugs in it),
and hopefully that will be done fairly soon.

That's all for the moment folks. We would really like to see that "whatis"
database get off the ground and all contributions are welcome. If you have
any comments, suggestions or constructive criticism, please don't hesitate
to drop us a line at

        archie-l@cs.mcgill.ca

It was your comments which led to the above improvements and we'd like to
keep hearing from you.

- - The "Archie Group":

        Bill Heelan    (wheelan@cs.mcgill.ca)
        Peter Deutsch  (peterd@cc.mcgill.ca)
        Alan Emtage    (bajan@cs.mcgill.ca)

----------------------------------------------------------------------------
                      Email Interface to Archive
----------------------------------------------------------------------------

For those people who do not have direct Internet access or those who would
prefer 'batching' of their requests, archie provides an email interface
handling a limited subset of the interactive archie commands. The address
of the interface is

        archie@cs.mcgill.ca

The help message for the interface follows:

-----------------------------------------------------------------------------
                         The ARCHIE Mail Server

HELP for the archie mail server, as of 18 December, 1990
(modified from the KISS help file)

Requests to this server should be addressed to archie@cs.mcgill.ca

To contact us humans, mail to archie-l@cs.mcgill.ca

For your information, anonymous FTP may be performed through the mail by
the ftp-mail server. Send a message with the word 'help' in it to:

        bitftp@pucc.princeton.edu

for an explanation on how to use it.

NOTE: The Subject: line is processed as if it were part of the main
message body. No special keywords are required. Note that the "help"
command is exclusive. All other commands in the same message are ignored.

Command lines begin in the first column. All lines that do not match a
valid command are ignored.

The server recognizes six commands. If a message not containing any valid
requests or an empty message is received, it will be considered to be a
'help' request.

path <path>
        This lets the requestor override the address that would normally
        be extracted from the header. If you do not hear from the archive
        server within oh, about 2 days, you might consider adding a
        "path" command to your request. The path describes how to mail a
        message from cs.mcgill.ca to your address. cs.mcgill.ca is fully
        connected to the Internet.

help
        Will send you this message.

prog <reg exp1> [<reg exp2> ...]
        A search of the "archie" database is performed with each
        <reg exp> (a regular expression as defined by ed(1)) in turn, and
        any matches found are returned to the requestor. Note that
        multiple <reg exp> may be placed on one line, in which case the
        results will be mailed back to you in one message. If you have
        multiple "prog" lines, then multiple messages will be returned,
        one for each line [This doesn't work as expected at the moment...
        stay tuned]. Any regular expression containing spaces must be
        quoted with single (') or double (") quotes. ALL OTHER ed(1)
        rules must be followed.

        NOTE: The searches are CASE SENSITIVE. The ability to change this
        will hopefully be added soon.

site <site name> | <site IP address>
        A listing of the given <site name> will be returned. The fully
        qualified domain name or IP address may be used.

compress
        ALL of your files in the current mail message will be
        "compressed" and "uuencoded". When you receive the reply, remove
        everything before the "begin" line and run it through "uudecode".
        This will produce a .Z file. You can then run "uncompress" on
        this file and get the results of your request.

quit
        Nothing past this point is interpreted. This is provided so that
        the occasional lost soul whose signature contains a line that
        looks like a command can still use the server without getting a
        bogus response.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
For information send mail to info-sunflash@sunvice.East.Sun.COM.
Subscription requests should be sent to sunflash-request@sunvice.East.Sun.COM.
Archives are on solar.nova.edu and paris.cs.miami.edu.

All prices, availability, and other statements relating to Sun or third
party products are valid in the U.S. only. Please contact your local
Sales Representative for details of pricing and product availability in
your region. Descriptions of, or references to, products or publications
within SunFlash do not imply an endorsement of that product or publication
by Sun Microsystems.

John McLaughlin, SunFlash editor, flash@sunvice.East.Sun.COM. (305) 776-7770.