Annotation of /mitgcm.org/devel/buildweb/pkg/swish-e/README

Revision 1.1.1.1 (vendor branch)
Fri Sep 20 19:47:29 2002 UTC by adcroft
Branch: Import, MAIN
CVS Tags: baseline, HEAD
Changes since 1.1: +0 -0 lines
Importing web-site building process.

1     NAME
2     The Swish-e README File
3    
4     What is Swish-e?
5     Swish-e is Simple Web Indexing System for Humans - Enhanced. Swish-e can
6     quickly and easily index directories of files or remote web sites and
7     search the generated indexes.
8    
9     Swish-e is extremely fast in both indexing and searching, highly
10     configurable, and can be seamlessly integrated with existing web sites
11     to maintain a consistent design. Swish-e can index web pages, but can
12     just as easily index text files, mailing list archives, or data stored
13     in a relational database.
14    
15     Swish-e version 2.2 represents a major rewrite of the code and the
16     addition of many new features. Memory requirements for indexing have
17     been reduced, and indexing speed is significantly improved from previous
18     versions. New features allow more control over indexing, better document
19     parsing, improved indexing and searching logic, better filter code, and
20     the ability to index from any data source.
21    
22     Swish-e is not a "turn-key" indexing and searching solution. The Swish-e
23     distribution contains most of the parts to create such a system, but you
24     need to put the parts together as best meets your needs. You will need
25     to configure Swish-e to index your documents, create an index by running
26     Swish-e, and set up an interface such as a CGI script (a script is
27     included). Swish-e uses helper programs to index documents of types that
28     Swish-e cannot natively index. These programs may need to be installed
29     separately from Swish-e.
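
As a rough sketch of the configure, index, and search steps described above, the following Python builds a minimal configuration and the two command lines involved. It assumes a swish-e binary on your PATH and uses only the IndexDir, IndexFile, and IndexOnly directives and the -c, -f, and -w switches. The file names and function names are made up for illustration. See SWISH-CONFIG and SWISH-RUN for the authoritative list of directives and switches.

```python
def config_text(docs_dir, index_file, suffixes=(".html", ".htm", ".txt")):
    """Return the text of a minimal Swish-e configuration file."""
    lines = [
        f"IndexDir {docs_dir}",             # directory tree to index
        f"IndexFile {index_file}",          # where the index is written
        "IndexOnly " + " ".join(suffixes),  # only index these file suffixes
    ]
    return "\n".join(lines) + "\n"


def index_command(conf_path):
    """Command line that builds an index from a configuration file."""
    return ["swish-e", "-c", conf_path]


def search_command(index_file, query):
    """Command line that searches an existing index for the query words."""
    return ["swish-e", "-f", index_file, "-w", query]


if __name__ == "__main__":
    # Hypothetical file names; nothing is executed here, only printed.
    print(config_text("./docs", "index.swish-e"), end="")
    print(" ".join(index_command("swish.conf")))
    print(" ".join(search_command("index.swish-e", "install")))
```

The command lists are kept separate from any call that runs them so that you can inspect them first. Pass them to subprocess.run() once Swish-e is installed and the configuration has been saved to disk.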
30    
31     Swish-e is an Open Source (see: http://opensource.org ) program
32     supported by developers and a large group of users. Please take time to
33     join the Swish-e discussion list at http://Swish-e.org.
34    
35     Key features
36    
37     * Quickly index a large number of documents in different formats
38     including text, HTML, and XML
39    
40     * Use "filters" to index other types of files such as PDF, gzip, or
41     Postscript.
42    
43     * Includes a web spider for indexing remote documents over HTTP.
44     Follows Robots Exclusion Rules (including META tags).
45    
46     * Use an external program to supply documents to Swish-e, such as an
47     advanced spider for your web server or a program to read and format
48     records from a relational database.
49    
50     * Document "properties" (some subset of the source document, usually
51     defined as META or XML elements) may be stored in the index and
52     returned with search results
53    
54     * Document summaries can be returned with each search
55    
56     * Word stemming, soundex, metaphone, and double-metaphone indexing for
57     "fuzzy" searching
58    
59     * Phrase searching and wildcard searching
60    
61     * Limit searches to HTML links
62    
63     * Use powerful Regular Expressions to select documents for indexing or
64     exclusion
65    
66     * Easily limit searches to parts or all of your web site
67    
68     * Results can be sorted by relevance or by any number of properties in
69     ascending or descending order
70    
71     * Limit searches to parts of documents such as certain HTML tags
72     (META, TITLE, comments, etc.) or to XML elements.
73    
74     * Can report structural errors in your XML and HTML documents
75    
76     * Index file is portable between platforms.
77    
78     * A Swish-e library is provided to allow embedding Swish-e into your
79     applications. A Perl module is available that provides a standard
80     API for accessing Swish-e.
81    
82     * Includes example search scripts
83    
84     * Swish-e is fast.
85    
86     * It's open source and FREE! You can customize Swish-e and you can
87     contribute your fancy new features to the project.
88    
89     * Supported by on-line user and developer groups
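
To give a feel for the "fuzzy" searching mentioned in the feature list, here is a small sketch of the classic American Soundex algorithm in Python. It is an illustration of the general technique only and is not taken from the Swish-e source; Swish-e's own soundex, metaphone, and stemming code may differ in detail. The idea is that words that sound alike reduce to the same short code, so a search for one can match the other.

```python
def soundex(word):
    """Return the four-character American Soundex code for a word."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    word = "".join(ch for ch in word.upper() if ch.isalpha())
    if not word:
        return ""

    result = word[0]                 # the first letter is kept as-is
    prev = codes.get(word[0], "")    # its code still suppresses a repeat
    for ch in word[1:]:
        digit = codes.get(ch, "")    # vowels and H, W, Y have no code
        if digit and digit != prev:
            result += digit
        if ch not in "HW":           # H and W do not separate equal codes
            prev = digit
    return (result + "000")[:4]      # pad or truncate to four characters


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Ashcraft"):
        print(name, soundex(name))
```

With this sketch, "Robert" and "Rupert" both reduce to R163, which is why an index built on such codes can return either spelling for the same query.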
90    
91     Where do I get Swish-e?
92     The current version of Swish-e can be found at:
93    
94     http://Swish-e.org
95    
96     Please make sure you use a current version of Swish-e.
97    
98     Information about Windows binary distributions can also be found at this
99     site.
100    
101     How Do I Install Swish-e?
102     Read the INSTALL page.
103    
104     Building from source is recommended. On most platforms Swish-e should
105     build without problems. Information on building for VMS and Win32 can be
106     found in sub-directories of the "src" directory. Check the Swish-e site
107     for information about binary distributions (such as for Windows).
108    
109     In addition to the INSTALL page, make sure you read the SWISH-FAQ page
110     if you have any questions, or to get an idea of questions that you might
111     someday ask.
112    
113     Problems or questions about installing Swish-e should be directed to the
114     Swish-e discussion list (see the Swish-e web site at
115     http://Swish-e.org).
116    
117     The Swish-e Documentation
118     Documentation is provided in the Swish-e distribution package in two
119     forms, POD (Plain Old Documentation), and in HTML format. The POD
120     documentation is in the pod directory, and the HTML documentation is in
121     the html directory, of course.
122    
123     If your system includes the required support files and programs, the
124     distribution make files can also generate the documentation in these
125     formats:
126    
127     Postscript
128     PDF (Adobe Acrobat)
129     system man pages
130    
131     You may also build a "split" version of the documentation where each
132     topic heading is a separate web page. Building the split version also
133     creates a Swish-e index of the documentation that makes the
134     documentation searchable via the included Perl CGI program.
135    
136     Building these other forms of documentation requires additional helper
137     applications -- most modern Linux distributions will include all that's
138     needed (at least mine does...). You shouldn't have a problem if you have
139     kept your Perl and Perl libraries up to date.
140    
141     Online documentation can be found at the Swish-e web site listed above.
142    
143     See INSTALL for information on creating the PDF and Postscript versions
144     of the documentation, and for information on installing the SWISH-*
145     documentation as Unix man(1) pages.
146    
147     How do I read the Swish-e documentation?
148    
149     The Swish-e documentation included with the distribution is in POD and
150     HTML formats. The POD documentation can be found in the pod directory,
151     and the HTML documentation can be found in the html directory.
152    
153     To view the HTML documentation point your browser to the html/index.html
154     file.
155    
156     The POD documentation is displayed by the "perldoc" command that is
157     included with every Perl installation. For example, to view the Swish-e
158     installation documentation page called "INSTALL", type
159    
160     perldoc pod/INSTALL
161    
162     or to make life easier,
163    
164     cd pod
165     perldoc INSTALL
166     perldoc SWISH-RUN
167    
168     Complain to your system administrator if the "perldoc" command is not
169     available on your machine.
170    
171     Included Documentation
172    
173     The following documentation is included in this Swish-e distribution.
174    
175     If you are new to Swish-e read the INSTALL page to get Swish-e installed
176     and tested. Work through the example shown in the INSTALL page, and
177     the examples in the conf directory. Also review the SWISH-FAQ.
178    
179     * README - This file
180    
181     * INSTALL - Installation and basic usage instructions
182    
183     * SWISH-CONFIG - Configuration File Directives
184    
185     * SWISH-RUN - Running Swish and Command Line Switches
186    
187     * SWISH-SEARCH - All about Searching with Swish-e
188    
189     * SWISH-FAQ - Common questions, and some answers
190    
191     * SWISH-LIBRARY - Interface to the Swish-e C library
192    
193     * SWISH-PERL - Instructions for using the Perl library
194    
195     * CHANGES - List of feature changes and bug fixes
196    
197     * SWISH-BUGS - List of known bugs in the release
198    
199     Document Generation
200    
201     The Swish-e documentation in HTML format was created with
202     Pod::HtmlPsPdf, a package of Perl modules written and/or modified by
203     Stas Bekman to automate the conversion of documents in pod format (see
204     perldoc perlpod) to HTML, Postscript, and PDF. A slightly modified
205     version of this package is included with the Swish-e distribution and
206     used for building the HTML. As distributed, Swish-e contains only the
207     pod and HTML documentation. See INSTALL for instructions on creating
208     man(1), Postscript, and PDF formats.
209    
210     Thanks, Stas, for your help!
211    
212     What's included in the Swish-e distribution?
213     Here's an overview of the directories included in the Swish-e
214     distribution:
215    
216     conf/
217     Example Swish-e configuration setups to help you get started. After
218     reading the INSTALL page, and its included example, review the sample
219     configuration in this directory.
220    
221     conf/stopwords
222     In the "conf/stopwords" sub-directory are a number of stopword files
223     for different languages. Use of stopwords is not required with
224     Swish-e.
225    
226     doc/
227     Contains files required for building the HTML, PDF, and Postscript
228     documentation.
229    
230     example/
231     This contains a sample CGI script (swish.cgi) for searching with
232     Swish-e. Documentation for using swish.cgi is included within the
233     script. Type:
234    
235     perldoc example/swish.cgi
236    
237     from the top-level directory where the Swish-e distribution was
238     unpacked.
239    
240     filter-bin/
241     Sample programs to use with Swish-e's "filters". Examples include
242     PDF, MS Word, and binary strings filters. Filters often require
243     installing separate document conversion programs.
244    
245     html/
246     The documentation in HTML format.
247    
248     perl/
249     The Perl interface to the Swish-e C library. This Perl module
250     provides direct access to Swish-e from within your Perl programs. See
251     the perl/README file for more information.
252    
253     pod/
254     The source for all documentation in perldoc (pod) format.
255    
256     prog-bin/
257     Example programs and modules to use with the "prog" document source
258     access method. Examples include a web spider, a program to index
259     directly from a MySQL database, and a program to recurse a directory
260     tree. Example Perl modules are provided for converting PDF and
261     MS-Word documents into a format usable by Swish-e. See
262     prog-bin/README for an overview of the programs and modules, and
263     check each file for included documentation.
264    
265     The prog-bin/spider.pl program is a web spider program with many
266     features. It contains its own documentation. Type:
267    
268     perldoc prog-bin/spider.pl
269    
270     from the top-level directory where the Swish-e distribution was
271     unpacked.
272    
273     The "prog" document source feature is very powerful, but can be a
274     challenge to set up when first using Swish-e. Please contact the
275     Swish-e discussion list if you have any questions.
276    
277     src/
278     This directory contains the source code for Swish-e. OS-specific
279     directories are also found here.
280    
281     tests/
282     The documents used for running "make test".
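
For the "prog" document source described under prog-bin/ above, the external program writes each document to standard output as a few header lines, a blank line, and then the content. The Python below is a minimal sketch of such a program, assuming the Path-Name and Content-Length headers (plus the optional Last-Mtime) as described in SWISH-RUN. The record data, the function name, and the choice of UTF-8 encoding are assumptions made for illustration. Check SWISH-RUN and prog-bin/README before relying on it.

```python
import sys


def format_record(path_name, content, last_mtime=None):
    """Return one document as bytes: headers, a blank line, then the body."""
    body = content.encode("utf-8")
    headers = [
        f"Path-Name: {path_name}",
        f"Content-Length: {len(body)}",   # length in bytes, not characters
    ]
    if last_mtime is not None:
        headers.append(f"Last-Mtime: {int(last_mtime)}")  # seconds since epoch
    return ("\n".join(headers) + "\n\n").encode("utf-8") + body


if __name__ == "__main__":
    # Hypothetical records, e.g. rows read from a database.
    records = [
        ("db/record/1", "<html><title>First</title><body>one</body></html>"),
        ("db/record/2", "<html><title>Second</title><body>two</body></html>"),
    ]
    for path_name, content in records:
        sys.stdout.buffer.write(format_record(path_name, content))
```

The length is computed from the encoded bytes so that the header stays correct when the content contains non-ASCII characters.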
283    
284     Where do I get help with Swish-e?
285     If you need help with installing or using Swish-e please subscribe to
286     the Swish-e mailing list. Visit the Swish-e web site listed above for
287     information on subscribing to the mailing list.
288    
289     Before posting any questions please read QUESTIONS AND TROUBLESHOOTING
290     in the INSTALL documentation page.
291    
292     Speling mistakes
293     Please contact the Swish-e list with corrections to this documentation.
294     Any help in cleaning up the docs will be appreciated!
295    
296     Any patches should be made against the .pod files, not the .html files.
297    
298     Swish-e Development
299     Swish-e is currently being developed as an open source project on
300     SourceForge http://sourceforge.net.
301    
302     Contact the Swish-e list for questions.
303    
304     Swish-e's History
305     SWISH was created by Kevin Hughes to fill the need of the growing number
306     of Web administrators on the Internet - many of the indexing systems
307     were not well documented, were hard to use and install, and were too
308     complex for their own good. The system was widely used for several
309     years, long enough to collect some bug fixes and requests for
310     enhancements.
311    
312     In Fall 1996, the Library of UC Berkeley received permission from Kevin
313     Hughes to implement bug fixes and enhancements to the original binary.
314     The result is Swish-enhanced or Swish-e, brought to you by the Swish-e
315     Development Team.
316    
317     Document Info
318     Each document in the Swish-e distribution contains this section. It
319     refers only to the specific page it's located in, and not to the Swish-e
320     program or the documentation as a whole.
321    
322     $Id: README.pod,v 1.11 2002/08/20 22:24:08 whmoseley Exp $
323    
324     .
