Annotation of /mitgcm.org/devel/buildweb/pkg/swish-e/prog-bin/README

Revision 1.1.1.1 (vendor branch)
Fri Sep 20 19:47:30 2002 UTC (22 years, 10 months ago) by adcroft
Branch: Import, MAIN
CVS Tags: baseline, HEAD
Changes since 1.1: +0 -0 lines
Importing web-site building process.

These are example scripts that you can use with the "prog" document source
feature of Swish-e.

The "prog" document source feature of Swish-e allows you to index any type of
document, provided you can convert the document into a format that Swish-e
can parse (text, HTML, or XML).
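
As a rough illustration of what such a program writes, the sketch below
emits a single document in the form the "prog" input method reads from the
program's standard output: a few header lines, a blank line, then the
document body. It is written in Python rather than Perl only to keep it
short. The header names (Path-Name, Content-Length, Last-Mtime) are recalled
from the Swish-e documentation and should be checked against the version you
run; format_document and hello.html are made-up names.

```python
#!/usr/bin/env python3
"""Minimal sketch of a "prog" feeder: writes one document to stdout."""
import sys
import time


def format_document(path, content, mtime=None):
    """Return header lines, a blank line, and the body as one bytes object.

    `content` must be bytes so that Content-Length counts bytes, not
    characters. The header names are an assumption; verify them against
    the documentation for your Swish-e version.
    """
    headers = [
        f"Path-Name: {path}",
        f"Content-Length: {len(content)}",
    ]
    if mtime is not None:
        # Last modification time as whole seconds since the epoch.
        headers.append(f"Last-Mtime: {int(mtime)}")
    head = "\n".join(headers) + "\n\n"
    return head.encode("utf-8") + content


if __name__ == "__main__":
    # Hypothetical document; a real feeder would fetch or convert its input.
    body = b"<html><head><title>Hello</title></head><body>Hello.</body></html>"
    sys.stdout.buffer.write(format_document("hello.html", body, mtime=time.time()))
```

The body is handled as bytes because the declared length has to match the
number of bytes that follow the blank line. Assuming the script is saved as
an executable file (say feed.py, a hypothetical name), it would be given to
Swish-e as the input source, with something like
swish-e -S prog -i ./feed.py.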

spider.pl
    Working example of a web spider. This program is a full-featured
    spider that is fully customizable through its configuration file.

    Type perldoc spider.pl from the prog-bin directory for documentation.

SwishSpiderConfig.pl
    Example configuration file for the spider.pl program.

file.pl
    A very simple example of a program that feeds documents to Swish-e.
    Its purpose is to demonstrate how to write a program for use with
    Swish-e's "prog" input method.

DirTree.pl
    A slightly more advanced example that reads a directory tree and indexes
    a few file types. Uses the pdf2xml module for PDF files.
    Its purpose is to demonstrate how to write a program for use with
    Swish-e's "prog" input method.

MySQL.pl
    Another simple example that shows how to index data stored in a
    MySQL database. Instructions are included on how to configure the
    swish.cgi program.

index_hypermail.pl
    An example program for indexing mailing list archives that are created
    with the popular Hypermail program.

pdf2xml.pm and pdf2html.pm
    Perl modules to convert PDF documents to XML or HTML for indexing.
    Requires the pdftotext program. Type perldoc pdf2xml.pm
    or perldoc pdf2html.pm from the prog-bin directory for documentation.

doc2txt.pm
    Perl module to convert MS Word documents to text.
    Requires the catdoc program. Type perldoc doc2txt.pm
    from the prog-bin directory for documentation.
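
The directory-walking idea described for DirTree.pl can be sketched as
follows. This is not the DirTree.pl code: it is a small Python illustration
that assumes the "prog" output layout of header lines (Path-Name,
Content-Length, Last-Mtime), a blank line, and the raw document bytes, and
it uses a made-up set of extensions. It passes files through unchanged and
leaves out PDF and Word conversion entirely.

```python
#!/usr/bin/env python3
"""Sketch: walk directory trees and feed selected files to stdout."""
import os
import sys

# Hypothetical choice of file types that need no conversion before indexing.
INDEXED_EXTENSIONS = {".txt", ".html", ".htm", ".xml"}


def find_documents(root):
    """Yield paths under `root` whose extension is in INDEXED_EXTENSIONS."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in INDEXED_EXTENSIONS:
                yield os.path.join(dirpath, name)


def emit(path, out):
    """Write one document to the binary stream `out`.

    The header names are an assumption; verify them against the
    documentation for your Swish-e version.
    """
    with open(path, "rb") as fh:
        content = fh.read()
    mtime = int(os.path.getmtime(path))
    header = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(content)}\n"
        f"Last-Mtime: {mtime}\n"
        "\n"
    )
    out.write(header.encode("utf-8"))
    out.write(content)


if __name__ == "__main__":
    # Directories to index come from the command line; with none given,
    # nothing is written.
    for root in sys.argv[1:]:
        for doc in find_documents(root):
            emit(doc, sys.stdout.buffer)
```

Filtering by extension keeps binary files that Swish-e cannot parse out of
the stream. A fuller version would hand such files to a converter such as
the pdf2xml or doc2txt modules listed above instead of skipping them.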