These are example scripts that you can use with the "prog" document source
feature of Swish-e.

The "prog" document source feature of Swish-e allows you to index any type of
document, provided you can convert the document into a format that Swish-e
can parse (text, html, or xml).
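
As a rough illustration of what such a program writes, the sketch below formats
one document the way a "prog" program is expected to hand it to Swish-e on
standard output: a few header lines, a blank line, then the document body. It
is written in Python only for brevity. The header names (Path-Name,
Content-Length, Last-Mtime, Document-Type) and the type values should be
checked against the Swish-e documentation for your version; the function name
format_document and the sample path are made up for this example.

```python
import sys
import time


def format_document(path, content, mtime=None, doc_type=None):
    """Return one document as bytes: header lines, a blank line, then the body.

    Content-Length is the size of the body in bytes, not in characters.
    """
    body = content.encode("utf-8")
    headers = [f"Path-Name: {path}", f"Content-Length: {len(body)}"]
    if mtime is not None:
        headers.append(f"Last-Mtime: {int(mtime)}")
    if doc_type is not None:
        # Assumed parser names such as "TXT", "HTML", or "XML".
        headers.append(f"Document-Type: {doc_type}")
    return ("\n".join(headers) + "\n\n").encode("utf-8") + body


if __name__ == "__main__":
    # Hypothetical document; a real program would loop over many of these.
    doc = format_document("docs/hello.html",
                          "<html><body>Hello</body></html>",
                          mtime=time.time(), doc_type="HTML")
    sys.stdout.buffer.write(doc)
```

A script like this would typically be named as the input to Swish-e with the
"prog" source selected (for example with the -S prog and -i options); the
exact invocation depends on your configuration.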

spider.pl
Working example of a web spider. This program is a full-featured
spider that is fully customizable through its configuration file.

Type perldoc spider.pl from the prog-bin directory for documentation.

SwishSpiderConfig.pl
Example configuration file for the spider.pl program.

file.pl
A very simple example of a program that feeds documents to Swish-e.
Its purpose is to demonstrate how to write a program for use with
Swish-e's "prog" input method.

DirTree.pl
A slightly more advanced example that reads a directory tree and indexes
a few file types. Uses the pdf2xml module for pdf files.
Its purpose is to demonstrate how to write a program for use with
Swish-e's "prog" input method.
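
The same idea can be sketched in a few lines of Python. This is not a port of
DirTree.pl, only an illustration of the approach: walk a directory tree, keep
the file types that Swish-e can parse directly, and write each one with its
headers. The extension-to-type table and the function names are assumptions
made for this example, and pdf handling is left out.

```python
import os
import sys

# Hypothetical mapping from file extension to a Document-Type value.
TYPES = {".txt": "TXT", ".html": "HTML", ".htm": "HTML", ".xml": "XML"}


def iter_documents(root):
    """Yield (path, doc_type, body, mtime) for each supported file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            doc_type = TYPES.get(os.path.splitext(name)[1].lower())
            if doc_type is None:
                continue  # skip file types this sketch cannot handle
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                body = fh.read()
            yield path, doc_type, body, int(os.path.getmtime(path))


def emit(out, path, doc_type, body, mtime):
    """Write one document (headers, blank line, body) to a binary stream."""
    header = (f"Path-Name: {path}\n"
              f"Content-Length: {len(body)}\n"
              f"Last-Mtime: {mtime}\n"
              f"Document-Type: {doc_type}\n\n")
    out.write(header.encode("utf-8") + body)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python dirtree_sketch.py /path/to/docs
    for doc in iter_documents(sys.argv[1]):
        emit(sys.stdout.buffer, *doc)
```

Reading and writing bytes rather than text keeps Content-Length equal to the
number of bytes actually written, whatever the encoding of the files.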

MySQL.pl
Another simple example that shows how to index data stored in a
MySQL database. Instructions are included on how to configure the
swish.cgi program.
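
To show the general shape of indexing database rows, here is a minimal Python
sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL so
that it runs without a server, and it is not how MySQL.pl itself is written.
The table name "articles", its columns, and the "article/ID" path scheme are
all invented for this example.

```python
import html
import sqlite3
import sys


def row_to_document(row_id, title, body):
    """Render one row as a small HTML page preceded by its "prog" headers."""
    page = (f"<html><head><title>{html.escape(title)}</title></head>"
            f"<body>{html.escape(body)}</body></html>").encode("utf-8")
    header = (f"Path-Name: article/{row_id}\n"  # made-up identifier, not a file
              f"Content-Length: {len(page)}\n"
              f"Document-Type: HTML\n\n").encode("utf-8")
    return header + page


def dump_table(conn, out):
    """Write every row of the hypothetical articles table to a binary stream."""
    query = "SELECT id, title, body FROM articles ORDER BY id"
    for row_id, title, body in conn.execute(query):
        out.write(row_to_document(row_id, title, body))


if __name__ == "__main__":
    # Build a throwaway in-memory table so the sketch runs on its own.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    conn.execute("INSERT INTO articles (title, body) VALUES (?, ?)",
                 ("Hello", "First <post>"))
    dump_table(conn, sys.stdout.buffer)
```

Because the Path-Name here is an identifier rather than a real file, whatever
displays the search results has to turn it back into a usable link.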

index_hypermail.pl
An example program for indexing mailing list archives that are created
with the popular Hypermail program.

pdf2xml.pm and pdf2html.pm
Perl modules to convert pdf files to xml (pdf2xml.pm) or html (pdf2html.pm)
documents for indexing.
Requires the pdftotext program. Type perldoc pdf2xml.pm
or perldoc pdf2html.pm from the prog-bin directory for documentation.
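
The conversion step can be sketched in Python as follows. This is an
illustration of the general technique (run pdftotext, then wrap the extracted
text in markup), not a description of what pdf2xml.pm does internally. The
element names "document", "title", and "content" are invented for this
example, and the sketch assumes a pdftotext that accepts "-enc UTF-8" and "-"
for standard output.

```python
import subprocess
from xml.sax.saxutils import escape


def text_to_xml(text, title=""):
    """Wrap plain text in a minimal XML document (element names are made up)."""
    return (f"<document><title>{escape(title)}</title>"
            f"<content>{escape(text)}</content></document>")


def pdf_to_xml(pdf_path):
    """Extract text with pdftotext and return it wrapped as XML."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],  # "-" means stdout
        capture_output=True, check=True)
    text = result.stdout.decode("utf-8", errors="replace")
    # pdftotext separates pages with form feeds, which XML 1.0 does not allow.
    text = text.replace("\f", "\n")
    return text_to_xml(text, title=pdf_path)
```

Escaping the extracted text matters because pdf content often contains
characters such as "<" and "&" that would otherwise break the XML parser.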

doc2txt.pm
Perl module to convert MS Word documents to text.
Requires the catdoc program. Type perldoc doc2txt.pm
from the prog-bin directory for documentation.