Contents of /mitgcm.org/devel/buildweb/pkg/swish-e/conf/example6.config


Revision 1.1.1.1 (vendor branch)
Fri Sep 20 19:47:30 2002 UTC (22 years, 10 months ago) by adcroft
Branch: Import, MAIN
CVS Tags: baseline, HEAD
Changes since 1.1: +0 -0 lines
Importing web-site building process.

# ----- Example 6 - Spider using "prog" feature -------
#
# Please see the swish-e documentation for
# information on configuration directives.
# Documentation is included with the swish-e
# distribution, and can also be found online
# at http://swish-e.org
#
#
# This example demonstrates how to use the
# "prog" document source feature (new as of 2.2)
# to spider a web server.
#
# The "prog" document source feature allows
# an external program to feed documents to
# swish, one after another. This allows you
# to index documents from any source (e.g. web, DBMS)
# and to filter and adjust the content before swish
# indexes it.
#
# This example uses the provided spider.pl program
# to spider a remote web server. This spider offers
# more features than the "http" spider method shown
# in example7.config.
#
# ** Please don't test with this exact config. **
# Spider your own web server instead.
#
# Indexing (spidering) is started with the following
# command issued from the "conf" directory:
#
#    swish-e -S prog -c example6.config
#
# Note: You should have the current Bundle::LWP bundle
# of perl modules installed. This was tested with
# libwww-perl-5.53.
# Run "perldoc spider.pl" in the prog-bin directory for
# more information.
#
# ** Do not spider a web server without permission **
#
#---------------------------------------------------

# Include our site-wide configuration settings:

IncludeConfigFile example4.config

# Specify the program to run
IndexDir ../prog-bin/spider.pl


# When running under the "prog" document source method you can
# pass a list of parameters to the program (specified with -i or IndexDir).

# If a parameter is passed to spider.pl, it will use that as the configuration
# file.

# As a special case, you can pass the word "default" followed by one or more URLs.
# In this case the spider will use default settings to spider the provided URLs.

SwishProgParameters default http://swish-e.org

# Note: the default configuration file used by spider.pl is SwishSpiderConfig.pl.
# See prog-bin/SwishSpiderConfig.pl for examples
# that include filtering PDF and MS Word documents.
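
# A minimal sketch of the first form described above, where the parameter
# names a spider configuration file. The file name "my_spider.pl" is a
# hypothetical placeholder (it is not shipped with swish-e). The line is
# left commented out so that this example's behavior is unchanged:
#
# SwishProgParameters my_spider.pl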

# Tell swish how to parse the content
DefaultContents HTML
IndexContents HTML .htm .html
IndexContents TXT .txt .conf



# Just to make it interesting, let's modify the URLs that get indexed:
# replace http://swish-e.org/ => http://localhost/

ReplaceRules replace swish-e.org localhost
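
# ReplaceRules also accepts other actions. The lines below are a hedged
# sketch only: the strings and the pattern are illustrative assumptions,
# not part of this example, and they are left commented out. Check the
# ReplaceRules entry in the swish-e configuration documentation for the
# authoritative syntax before using them.
#
# ReplaceRules remove index.html
# ReplaceRules prepend http://localhost/
# ReplaceRules regex "#^http://swish-e\.org/#http://localhost/#"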


# end of example