Annotation of /mitgcm.org/devel/buildweb/pkg/swish-e/conf/example9.config

Revision 1.1.1.1 (vendor branch)
Fri Sep 20 19:47:30 2002 UTC by adcroft
Branch: Import, MAIN
CVS Tags: baseline, HEAD
Changes since 1.1: +0 -0 lines
Importing web-site building process.

# ----- Example 9 - Filtering PDF with "prog" -------
#
# Please see the swish-e documentation for
# information on configuration directives.
# Documentation is included with the swish-e
# distribution, and can also be found on-line
# at http://swish-e.org
#
#
# This example demonstrates how to use swish's
# "prog" document source feature to filter documents.
#
# The "prog" document source feature allows
# an external program to feed documents to
# swish, one after another. This allows you
# to index documents from any source (e.g. web, DBMS)
# and to filter and adjust the content before swish
# indexes it.
#
# Using the "prog" method to filter documents requires more
# work to set up than using the "filters" described in
# example8.config because you must write a program to retrieve
# the documents and feed them to swish.
#
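# As a rough sketch of what "feeding" means: for each document the
# program writes a few header lines, a blank line, and then the
# document content to standard output. The header names below follow
# the swish-e "prog" documentation as best recalled here; the path,
# size, and timestamp are made-up values for illustration only:
#
#   Path-Name: ./docs/manual.pdf
#   Content-Length: 1234
#   Last-Mtime: 1032551250
#   Document-Type: XML
#
#   <the converted document content, Content-Length bytes long>
#
# Check the documentation shipped with your swish-e version before
# relying on the exact set of headers.
#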
# On the other hand, the "prog" method should be faster than the
# filter method in example8.config because swish doesn't need to fork
# itself and run an external program for each document it filters.
# This can be significant if you are using a Perl script as a filter, since
# the Perl script must be compiled each time it is run. The "prog" method
# avoids that overhead.
#
# This example uses the example9.pl program. This program
# is very similar to the included DirTree.pl program found in
# the prog-bin directory. It simply reads files from the
# file system and passes their content on to swish if they are the correct
# type. PDF files are converted by the prog-bin/pdf2xml.pm module.
#
# The PDF info fields (e.g. author) are placed in XML tags,
# which allows indexing the PDF info as MetaNames.
# By specifying MetaNames you can limit searches by this PDF info.
#
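# For instance, assuming pdf2xml emits an <author> tag (the tag name
# and the search word are illustrative assumptions, not taken from
# the module), a search limited to that field might look like:
#
#   swish-e -w author=smith
#
# Add "-f <index file>" if your index is not the default one.
#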
# For this example, you will need the xpdf package.
# Type "perldoc pdf2xml" from the prog-bin directory for
# more information.
#
# Run this example as:
#
# swish-e -S prog -c example9.config
#
#---------------------------------------------------

# Include our site-wide configuration settings:
IncludeConfigFile example4.config


# Define the program to run
IndexDir ./example9.pl


# Pass in the top-level directory to index
# (here we specify the current directory)
SwishProgParameters .


# Swish can index a number of different types of documents.
# .config files are text, and .pdf files are converted (filtered) to XML:
IndexContents TXT .config
IndexContents XML .pdf


# Since the pdf2xml module generates XML for the PDF info fields and
# for the PDF content, let's use MetaNames.
# Instead of specifying each metaname, let's let swish do it automatically.
UndefinedMetaTags auto
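
# Alternatively (a sketch, not part of the original example), the
# metanames could be listed explicitly instead of discovered
# automatically. The names below are assumed for illustration and
# must match the tags pdf2xml actually emits:
#
#   MetaNames author title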



# Show what's happening

IndexReport 3


# end of example
