The package htgrep.pl and the script htgrep can be obtained by ftp
or http from:
- http://cui_www.unige.ch/ftp/PUBLIC/oscar/scripts/README.html
- ftp://cui.unige.ch:/PUBLIC/oscar/scripts/
Suppose http://site/path is the URL of an HTML document available at
your site. Then you can query this document by simply using the URL:
http://site/cgi-bin/htgrep/file=path
For example:
http://cui_www.unige.ch/cgi-bin/htgrep/file=W3catalog/cat.html
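The URL pattern above can be sketched as a small helper. This is only an
illustration in Python, not part of the htgrep package: the function name
htgrep_url is hypothetical, and it assumes htgrep is installed under
cgi-bin as in the example URL.

```python
def htgrep_url(site, path):
    """Build a query URL of the form http://site/cgi-bin/htgrep/file=path.

    Assumption: htgrep lives under /cgi-bin/ on the given site, as in the
    example above. No URL escaping is attempted in this sketch.
    """
    return "http://" + site + "/cgi-bin/htgrep/file=" + path


print(htgrep_url("cui_www.unige.ch", "W3catalog/cat.html"))
```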
If the file contents are list items rather than standalone HTML blocks,
htgrep can be instructed to bracket the results of the search with
<DL> and </DL>, <OL> and </OL> or <UL> and </UL>.
The tag style=style must be included in the call to htgrep, where
style is one of pre, dl, ol or ul.
For example, we may query a list of titles of HTML documents
and cause the resulting entries to be numbered as follows:
http://cui_www.unige.ch/cgi-bin/htgrep/file=unige-pages.html&style=ol
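The bracketing of results by list style might look roughly like the
following Python sketch. It is not htgrep's own code: the names LIST_TAGS
and bracket_results are hypothetical, and any style other than dl, ol or
ul is simply left unbracketed here.

```python
# Opening and closing tags for each list style named in the text.
LIST_TAGS = {
    "dl": ("<DL>", "</DL>"),
    "ol": ("<OL>", "</OL>"),
    "ul": ("<UL>", "</UL>"),
}


def bracket_results(records, style="none"):
    """Surround matching records with the list tags for the given style.

    Styles that are not list styles (including "none" and "pre") leave
    the records unbracketed in this sketch.
    """
    open_tag, close_tag = LIST_TAGS.get(style, ("", ""))
    lines = [open_tag] + list(records) + [close_tag]
    return "\n".join(line for line in lines if line)


print(bracket_results(["<LI>First page", "<LI>Second page"], "ol"))
```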
If a header file base.hdr exists, htgrep will print that instead
of the default header. In addition, if base.qry exists, it will be
used whenever a non-empty query is given.
(Normally base.hdr will be a cover page with introductory
information, whereas base.qry will only contain the title and main
headline.)
The header pages can also be specified with the tags hdr=file and
qry=file.
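The header selection rules can be summarized in a short Python sketch.
It is an assumption-laden illustration rather than htgrep's actual logic:
the function choose_header is hypothetical, "base" is taken to be the
searched file's name without its extension, and returning None stands for
"use the default header".

```python
import os


def choose_header(base, query, hdr=None, qry=None, exists=os.path.exists):
    """Pick the header file to print before the search results.

    hdr and qry mirror the hdr=file and qry=file tags; when they are not
    given, base.hdr and base.qry are tried (assumed naming).
    """
    hdr_file = hdr or base + ".hdr"
    qry_file = qry or base + ".qry"
    # The query header is only used for a non-empty query.
    if query and exists(qry_file):
        return qry_file
    if exists(hdr_file):
        return hdr_file
    return None  # fall back to the default header
```

The exists parameter is only there so the rule can be tried out without
touching the file system.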
style=pre can be used if the source document is a plain text file.
This will cause special characters to be escaped and each paragraph
to be surrounded by <PRE> and </PRE>.
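As a rough illustration of what style=pre does to a plain text file,
consider the Python sketch below. The function format_pre is hypothetical,
and it assumes that paragraphs are separated by blank lines and that the
special characters in question are &, < and >.

```python
import html


def format_pre(text):
    """Escape each paragraph and surround it with <PRE> and </PRE>.

    Assumption: paragraphs are separated by one blank line.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return "\n".join(
        "<PRE>\n" + html.escape(p, quote=False) + "\n</PRE>"
        for p in paragraphs
    )


print(format_pre("if a < b\n\nfoo & bar"))
```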
The tag grab=yes will cause htgrep to search for URLs and ftp
pointers and convert them into hypertext links. This is most
interesting in combination with the tag style=pre to query plain
text files. An example is the Free Compilers List.
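The effect of grab=yes can be approximated with a regular expression,
as in this Python sketch. It is not the pattern htgrep itself uses: the
names URL_RE and grab_links are hypothetical, and only http:// and ftp://
URLs are recognized here, with trailing punctuation left outside the link.

```python
import re

# Assumed pattern: an http or ftp URL that does not end in punctuation.
URL_RE = re.compile(r'\b(?:http|ftp)://[^\s<>"]*[^\s<>".,;:)]')


def grab_links(text):
    """Turn each URL found in plain text into a hypertext link."""
    return URL_RE.sub(
        lambda m: '<A HREF="%s">%s</A>' % (m.group(0), m.group(0)), text
    )


print(grab_links("Sources are at ftp://cui.unige.ch/PUBLIC/oscar/scripts/ for now."))
```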
Bibliography files in refer format can be queried with the tag
refer=plain.
See, for example, the OO Bibliography Database.
The tag refer=abstract is used internally by htgrep and is
automatically generated when a bibliography entry contains an abstract
(%X field).
A link to a new call to htgrep is then generated, which will cause
the abstract for a given entry to be displayed.
Links to ftpable papers are also generated, if the refer entry
contains a line of the form:
%% ftp: site:file
If the tag ftpstyle=dir is used, the link will be to the
containing directory rather than to the file itself (to facilitate
exploration).
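The mapping from a "%% ftp:" line to a link target might be sketched as
follows in Python. The names FTP_LINE and ftp_link are hypothetical, the
sample file name in the usage line is invented for illustration, and the
exact URL form htgrep emits is an assumption.

```python
import posixpath
import re

# Matches a refer line of the form "%% ftp: site:file".
FTP_LINE = re.compile(r"^%% ftp:\s*([^:\s]+):(\S+)$")


def ftp_link(line, ftpstyle="file"):
    """Return an ftp URL for a "%% ftp: site:file" line, or None.

    With ftpstyle="dir" the URL points at the containing directory
    instead of the file itself.
    """
    match = FTP_LINE.match(line.strip())
    if not match:
        return None
    site, path = match.groups()
    if ftpstyle == "dir":
        path = posixpath.dirname(path) + "/"
    return "ftp://" + site + "/" + path.lstrip("/")


# "paper.ps" is a made-up file name used only for this example.
print(ftp_link("%% ftp: cui.unige.ch:/PUBLIC/oscar/paper.ps", ftpstyle="dir"))
```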
The maximum number of records returned (250 by default) can be set
with the tag max=number.
In some cases, this package is not called by htgrep but by another
script that is responsible for setting the tags. You can inform
the package to use a different URL when generating new requests
by using the tag htgrep=path.
This is used, for example, by W3catalog to hide the actual arguments
to htgrep.
Finally, the tag linemode=yes causes htgrep to retrieve refer
records on a line-by-line basis, if fields are separated by ^A
instead of a "\n". (This is mainly interesting for the CUI library
database.)
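To make the linemode record layout concrete, here is a small Python
sketch that splits one such line back into refer fields. The function
split_line_record is hypothetical and the sample record is invented; ^A
is the control character with code 1.

```python
def split_line_record(line):
    """Split a single-line refer record into its fields.

    Fields are assumed to be separated by ^A (character code 1) rather
    than by newlines.
    """
    return [field for field in line.rstrip("\n").split("\x01") if field]


# An invented record, used only to show the layout.
record = "%A Smith\x01%T A Title\x01%D 1993\n"
print("\n".join(split_line_record(record)))
```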
Summary of htgrep tags
file     -- file to search
isindex  -- query string
hdr      -- header file (to precede output)
qry      -- query file (alternative header for non-empty query)
style    -- [none/pre/ol/ul/dl] format of records
max      -- max records to return (default 250)
grab     -- [no/yes] convert URLs to hypertext (in plain text)
refer    -- [plain/abstract] format
ftpstyle -- [file/dir] make link to ftp file or dir (for refer)
linemode -- [no/yes] each record is a single line
htgrep   -- alternative URL to use for self-calls
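A script that receives these tags could collect them along the lines of
the Python sketch below. It is illustrative only: parse_tags and DEFAULTS
are hypothetical names, tags are assumed to be name=value pairs joined by
"&" as in the example URLs, and apart from max (250) the defaults are
assumed to be the first value listed for each tag.

```python
# Assumed defaults: only max=250 is stated in the summary; the others
# take the first value listed for the tag.
DEFAULTS = {
    "style": "none",
    "max": 250,
    "grab": "no",
    "ftpstyle": "file",
    "linemode": "no",
}


def parse_tags(path_info):
    """Parse "file=...&style=..." into a dictionary of tags."""
    tags = dict(DEFAULTS)
    for pair in path_info.lstrip("/").split("&"):
        name, sep, value = pair.partition("=")
        if sep:
            tags[name] = value
    tags["max"] = int(tags["max"])  # raises ValueError if not a number
    return tags


print(parse_tags("/file=unige-pages.html&style=ol"))
```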
/parscan to /cgi-bin/parscan, they should be correctly interpreted
by htgrep.
Note that htgrep takes the file to search directly from the URL.
Although the package takes pains to ensure that only files
visible to the http server may be searched, there is presently
no further support for access control. It is, however, possible
to restrict the set of files that may be searched through an
alternative interface that hard-wires the parameters to the
htgrep package for a variety of search engines.
See cuisearch for an example of such a front-end.
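The idea of a front-end that hard-wires the parameters can be sketched
as follows in Python. This is not how cuisearch is written: the names
ENGINES and frontend_tags and the engine labels are hypothetical. The
point is only that the caller chooses an engine name, never a file.

```python
# Hypothetical table of search engines; the file for each one is fixed
# here and cannot be overridden by the request.
ENGINES = {
    "catalog": {"file": "W3catalog/cat.html"},
    "pages": {"file": "unige-pages.html", "style": "ol"},
}


def frontend_tags(engine, query):
    """Return the tags to pass to the htgrep package for one engine.

    Unknown engine names are rejected, so only the listed files can
    ever be searched through this interface.
    """
    if engine not in ENGINES:
        raise KeyError("unknown search engine: " + engine)
    tags = dict(ENGINES[engine])
    tags["isindex"] = query
    return tags


print(frontend_tags("pages", "oscar"))
```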