These are example scripts that you can use with the "prog" document source
feature of Swish-e.

The "prog" document source feature of Swish-e allows you to index any type of
document, provided you can convert the document into a format that Swish-e
can parse (text, HTML, or XML).
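
As a rough illustration of what such a program does, here is a minimal
sketch in Python (the bundled examples are Perl). It assumes the usual
"prog" record layout: Path-Name and Content-Length headers, optional
Last-Mtime and Document-Type headers, a blank line, then exactly
Content-Length bytes of content. The function name swish_record and the
command-line handling are made up for this example and are not part of
Swish-e.

```python
import os
import sys
from typing import Optional


def swish_record(path: str, content: bytes,
                 mtime: Optional[float] = None,
                 doc_type: Optional[str] = None) -> bytes:
    """Build one document record: headers, a blank line, then the content."""
    headers = [
        f"Path-Name: {path}",
        f"Content-Length: {len(content)}",  # length in bytes, not characters
    ]
    if mtime is not None:
        headers.append(f"Last-Mtime: {int(mtime)}")
    if doc_type is not None:
        # Assumed to name a Swish-e parser, for example "HTML*" or "TXT*".
        headers.append(f"Document-Type: {doc_type}")
    return ("\n".join(headers) + "\n\n").encode("utf-8") + content


def main(paths):
    """Write one record per file to standard output for Swish-e to read."""
    out = sys.stdout.buffer
    for path in paths:
        with open(path, "rb") as fh:
            data = fh.read()
        out.write(swish_record(path, data, os.path.getmtime(path)))
    out.flush()


if __name__ == "__main__":
    main(sys.argv[1:])
```

Assuming the sketch is saved as an executable file named feed.py (a
hypothetical name), it could be run with something like
swish-e -S prog -i ./feed.py, with its file arguments supplied through
the SwishProgParameters configuration directive.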

spider.pl
    Working example of a web spider.  This program is a full-featured
    spider that is fully customizable through its configuration file.

    Type perldoc spider.pl from the prog-bin directory for documentation.

SwishSpiderConfig.pl
    Example configuration file for the spider.pl program.

file.pl
    A very simple example of a program that feeds documents to swish.
    Its purpose is to demonstrate how to write a program for use with
    Swish-e's "prog" input method.

DirTree.pl
    A slightly more advanced example that reads a directory tree and indexes
    a few file types.  Uses the pdf2xml module for PDF files.
    Its purpose is to demonstrate how to write a program for use with
    Swish-e's "prog" input method.
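
The directory-tree approach can be sketched in a few lines of Python.
This is not a port of DirTree.pl: the extension-to-parser table, the
function names, and the choice to skip PDF files (which DirTree.pl
converts with pdf2xml) are all assumptions made for the example.

```python
import os
import sys

# Assumed mapping from file extension to a Swish-e parser name.
PARSERS = {".html": "HTML*", ".htm": "HTML*", ".txt": "TXT*", ".xml": "XML*"}


def find_documents(root):
    """Yield (path, parser) for each file under root with a known extension."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1].lower()
            if ext in PARSERS:
                yield os.path.join(dirpath, name), PARSERS[ext]


def emit(path, parser, out):
    """Write one "prog" record (headers, blank line, content) to out."""
    with open(path, "rb") as fh:
        data = fh.read()
    header = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(data)}\n"  # byte count of the content
        f"Last-Mtime: {int(os.path.getmtime(path))}\n"
        f"Document-Type: {parser}\n\n"
    )
    out.write(header.encode("utf-8") + data)


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for doc_path, doc_parser in find_documents(root):
        emit(doc_path, doc_parser, sys.stdout.buffer)
```

Files with unknown extensions are silently skipped here. A real program
would likely convert or log them instead.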

MySQL.pl
    Another simple example that shows how to index data stored in a
    MySQL database.  Instructions are included on how to configure the
    swish.cgi program.

index_hypermail.pl
    An example program for indexing mailing list archives that are created
    with the popular Hypermail program.

pdf2xml.pm and pdf2html.pm
    Perl modules to convert PDF files to XML or HTML documents for indexing.
    Requires the pdftotext program.  Type perldoc pdf2xml.pm
    or perldoc pdf2html.pm from the prog-bin directory for documentation.

doc2txt.pm
    Perl module to convert MS Word documents to text.
    Requires the catdoc program.  Type perldoc doc2txt.pm
    from the prog-bin directory for documentation.