Annotation of /mitgcm.org/devel/buildweb/pkg/swish-e/pod/README.pod

Revision 1.1.1.1 (vendor branch)
Fri Sep 20 19:47:29 2002 UTC (22 years, 10 months ago) by adcroft
Branch: Import, MAIN
CVS Tags: baseline, HEAD
Changes since 1.1: +0 -0 lines
Importing web-site building process.

1     =head1 NAME
2    
3     The Swish-e README File
4    
5     =head1 What is Swish-e?
6    
7     Swish-e is B<S>imple B<W>eb B<I>ndexing B<S>ystem for B<H>umans - B<E>nhanced.
8     Swish-e can quickly and easily index directories of files or remote web sites
9     and search the generated indexes.
10    
11     Swish-e is extremely fast in both indexing and searching, highly configurable,
12     and can be seamlessly integrated with existing web sites to maintain a consistent design.
13     Swish-e can index web pages, but can just as easily index text files, mailing list archives,
14     or data stored in a relational database.
15    
16     Swish-e version 2.2 represents a major rewrite of the code and the
17     addition of many new features. Memory requirements for indexing have
18     been reduced, and indexing speed is significantly improved from previous versions.
19     New features allow more control over indexing, better document parsing, improved indexing
20     and searching logic, better filter code, and the ability to index from any data source.
21    
22     Swish-e is not a "turn-key" indexing and searching solution. The Swish-e distribution contains
23     most of the parts to create such a system, but you need to put the parts together as best meets your needs.
24     You will need to configure Swish-e to index your documents, create an index by running Swish-e,
25     and set up an interface such as a CGI script (a script is included). Swish-e uses helper programs
26     to index documents of types that it cannot natively index. These programs may need to be installed
27     separately from Swish-e.
28    
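The first two of those steps (configure, then index) plus a test search can be sketched from the command line. This is a minimal sketch, not the project's recommended setup: the names F<docs>, F<demo.conf>, and F<demo.index> are placeholders invented for the example, and the directives and switches shown (C<IndexDir>, C<IndexFile>, C<IndexOnly>, C<-c>, C<-f>, C<-w>) should be checked against L<SWISH-CONFIG|SWISH-CONFIG> and L<SWISH-RUN|SWISH-RUN> for your version. The web interface step is not shown.

```shell
#!/bin/sh
# Placeholder names (assumptions): docs/, demo.conf, demo.index.
mkdir -p docs
printf 'Swish-e indexes plain text files.\n' > docs/sample.txt

# Step 1: configure. A minimal configuration file naming what to index
# and where to write the index.
cat > demo.conf <<'EOF'
IndexDir docs
IndexFile demo.index
IndexOnly .txt .html
EOF

# Steps 2 and 3: build the index, then run a test search.
# Both are skipped when the swish-e binary is not on the PATH.
if command -v swish-e >/dev/null 2>&1; then
    swish-e -c demo.conf
    swish-e -f demo.index -w indexes
else
    echo "swish-e not found; see the INSTALL page" >&2
fi
```

Keeping the configuration in its own file means the same index can later be rebuilt from a scheduled job without retyping any options.
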
29     Swish-e is an Open Source (see: http://opensource.org ) program supported by developers and a large group of users.
30     Please take time to join the Swish-e discussion list at http://Swish-e.org.
31    
32    
33     =head2 Key features
34    
35     =over 4
36    
37     =item *
38    
39     Quickly index a large number of documents in different formats
40     including text, HTML, and XML
41    
42     =item *
43    
44     Use "filters" to index other types of files such as PDF, gzip, or
45     Postscript.
46    
47     =item *
48    
49     Includes a web spider for indexing remote documents over HTTP.
50     Follows Robots Exclusion Rules (including META tags).
51    
52     =item *
53    
54     Use an external program to supply documents to Swish-e, such as an
55     advanced spider for your web server or a program to read and format
56     records from a relational database.
57    
58     =item *
59    
60     Document "properties" (some subset of the source document, usually defined
61     as META or XML elements) may be stored in the index and returned with
62     search results
63    
64     =item *
65    
66     Document summaries can be returned with each search
67    
68     =item *
69    
70     Word stemming, soundex, metaphone, and double-metaphone indexing for "fuzzy" searching
71    
72     =item *
73    
74     Phrase searching and wildcard searching
75    
76     =item *
77    
78     Limit searches to HTML links
79    
80     =item *
81    
82     Use powerful Regular Expressions to select documents for indexing or exclusion
83    
84     =item *
85    
86     Easily limit searches to parts or all of your web site
87    
88     =item *
89    
90     Results can be sorted by relevance or by any number of properties
91     in ascending or descending order
92    
93     =item *
94    
95     Limit searches to parts of documents such as certain HTML tags
96     (META, TITLE, comments, etc.) or to XML elements.
97    
98     =item *
99    
100     Can report structural errors in your XML and HTML documents
101    
102     =item *
103    
104     Index file is portable between platforms.
105    
106     =item *
107    
108     A Swish-e library is provided to allow embedding Swish-e into your applications.
109     A Perl module is available that provides a standard API for accessing Swish-e.
110    
111     =item *
112    
113     Includes example search scripts
114    
115     =item *
116    
117     Swish-e is fast.
118    
119     =item *
120    
121     It's open source and FREE! You can customize Swish-e and you can
122     contribute your fancy new features to the project.
123    
124     =item *
125    
126     Supported by on-line user and developer groups
127    
128     =back
129    
130    
131     =head1 Where do I get Swish-e?
132    
133     The current version of Swish-e can be found at:
134    
135     http://Swish-e.org
136    
137     Please make sure you use a current version of Swish-e.
138    
139     Information about Windows binary distributions can also be found at
140     this site.
141    
142     =head1 How Do I Install Swish-e?
143    
144     Read the L<INSTALL|INSTALL> page.
145    
146     Building from source is recommended. On most platforms Swish-e should build without problems.
147     Information on building for VMS and Win32 can be found in sub-directories of the C<src> directory.
148     Check the Swish-e site for information about binary distributions (such as for Windows).
149    
150     In addition to the INSTALL page, make sure you read the
151     L<SWISH-FAQ|SWISH-FAQ> page if you have any questions, or to get an idea
152     of questions that you might someday ask.
153    
154     Problems or questions about installing Swish-e should be directed to the Swish-e discussion list (see the
155     Swish-e web site at http://Swish-e.org).
156    
157    
158     =head1 The Swish-e Documentation
159    
160     Documentation is provided in the Swish-e distribution package in two forms:
161     POD (Plain Old Documentation) and HTML. The POD documentation
162     is in the F<pod> directory, and the HTML documentation is in the F<html>
163     directory, of course.
164    
165     If your system includes the required support files and programs, the
166     distribution make files can also generate the documentation in these
167     formats:
168    
169     Postscript
170     PDF (Adobe Acrobat)
171     system man pages
172    
173     You may also build a "split" version of the documentation where each
174     topic heading is a separate web page. Building the split version also
175     creates a Swish-e index of the documentation that makes the documentation
176     searchable via the included Perl CGI program.
177    
178     Building these other forms of documentation requires additional helper
179     applications -- most modern Linux distributions will include all that's
180     needed (at least mine does...). You shouldn't have a problem if you have
181     kept your Perl and Perl libraries up to date.
182    
183     Online documentation can be found at the Swish-e web site listed above.
184    
185     See L<INSTALL|INSTALL> for information on creating the PDF and Postscript
186     versions of the documentation, and for information on installing the
187     SWISH-* documentation as Unix man(1) pages.
188    
189    
190     =head2 How do I read the Swish-e documentation?
191    
192     The Swish-e documentation included with the distribution is in POD and
193     HTML formats. The POD documentation can be found in the F<pod> directory,
194     and the HTML documentation can be found in the F<html> directory.
195    
196     To view the HTML documentation point your browser to the
197     F<html/index.html> file.
198    
199     The POD documentation is displayed by the "perldoc" command that is
200     included with every Perl installation. For example, to view the Swish-e
201     installation documentation page called "INSTALL", type
202    
203     perldoc pod/INSTALL
204    
205     or to make life easier,
206    
207     cd pod
208     perldoc INSTALL
209     perldoc SWISH-RUN
210    
211     Complain to your system administrator if the C<perldoc> command is not
212     available on your machine.
213    
214     =head2 Included Documentation
215    
216     The following documentation is included in this Swish-e distribution.
217    
218     If you are new to Swish-e read the L<INSTALL|INSTALL> page to get Swish-e installed
219     and tested. Work through the example shown in the L<INSTALL|INSTALL> page, and
220     the examples in the F<conf> directory. Also review the L<SWISH-FAQ|SWISH-FAQ>.
221    
222     =over 4
223    
224     =item *
225    
226     L<README|README> - This file
227    
228     =item *
229    
230     L<INSTALL|INSTALL> - Installation and basic usage instructions
231    
232     =item *
233    
234     L<SWISH-CONFIG|SWISH-CONFIG> - Configuration File Directives
235    
236     =item *
237    
238     L<SWISH-RUN|SWISH-RUN> - Running Swish and Command Line Switches
239    
240     =item *
241    
242     L<SWISH-SEARCH|SWISH-SEARCH> - All about Searching with Swish-e
243    
244     =item *
245    
246     L<SWISH-FAQ|SWISH-FAQ> - Common questions, and some answers
247    
248     =item *
249    
250     L<SWISH-LIBRARY|SWISH-LIBRARY> - Interface to the Swish-e C library
251    
252     =item *
253    
254     L<SWISH-PERL|SWISH-PERL> - Instructions for using the Perl library
255    
256     =item *
257    
258     L<CHANGES|CHANGES> - List of feature changes and bug fixes
259    
260     =item *
261    
262     L<SWISH-BUGS|SWISH-BUGS> - List of known bugs in the release
263    
264     =back
265    
266     =head2 Document Generation
267    
268     The Swish-e documentation in HTML format was created with Pod::HtmlPsPdf,
269     a package of Perl modules written and/or modified by Stas Bekman to
270     automate the conversion of documents in pod format (see perldoc perlpod)
271     to HTML, Postscript, and PDF. A slightly modified version of this package
272     is included with the Swish-e distribution and used for building the HTML.
273     As distributed, Swish-e contains only the pod and HTML documentation.
274     See L<INSTALL|INSTALL> for instructions on creating man(1), Postscript,
275     and PDF formats.
276    
277     Thanks, Stas, for your help!
278    
279     =head1 What's included in the Swish-e distribution?
280    
281     Here's an overview of the directories included in the Swish-e
282     distribution:
283    
284     =over 3
285    
286     =item conf/
287    
288     Example Swish-e configuration setups to help you get started.
289     After reading the L<INSTALL|INSTALL> page, and its included example, review
290     the sample configuration in this directory.
291    
292     =item conf/stopwords
293    
294     In the C<conf/stopwords> sub-directory are a number of stopword files for different
295     languages. Use of stopwords is not required with Swish-e.
296    
297     =item doc/
298    
299     Contains files required for building the HTML, PDF, and Postscript
300     documentation.
301    
302     =item example/
303    
304     This contains a sample CGI script (F<swish.cgi>) for searching with Swish-e.
305     Documentation for using F<swish.cgi> is included within the script. Type:
306    
307     perldoc example/swish.cgi
308    
309     from the top-level directory where the Swish-e distribution was unpacked.
310    
311     =item filter-bin/
312    
313     Sample programs to use with Swish-e's "filters". Examples include PDF,
314     MS Word, and binary strings filters.
315     Filters often require installing separate document conversion programs.
316    
317     =item html/
318    
319     The documentation in HTML format.
320    
321     =item perl/
322    
323     The Perl interface to the Swish-e C library. This Perl module provides direct access to
324     Swish-e from within your Perl programs. See the F<perl/README> file for more information.
325    
326     =item pod/
327    
328     The source for all documentation in perldoc (pod) format.
329    
330     =item prog-bin/
331    
332     Example programs and modules to use with the "prog" document source
333     access method. Examples include a web spider, a program to index directly from
334     a MySQL database, and a program to recurse a directory tree.
335     Example Perl modules are provided for converting PDF and MS-Word documents
336     into a format usable by Swish-e. See F<prog-bin/README> for an overview of the
337     programs and modules, and check each file for included documentation.
338    
339     The F<prog-bin/spider.pl> program is a web spider program with many features.
340     It contains its own documentation. Type:
341    
342     perldoc prog-bin/spider.pl
343    
344     from the top-level directory where the Swish-e distribution was unpacked.
345    
346     The "prog" document source feature is very powerful, but can be a challenge to
347     set up when first using Swish-e. Please contact the Swish-e discussion list
348     if you have any questions.
349    
350     =item src/
351    
352     This directory contains the source code for Swish-e. OS-specific
353     directories are also found here.
354    
355     =item tests/
356    
357     The documents used for running C<make test>.
358    
359    
360     =back
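
The "prog" document source mentioned under F<prog-bin/> works by having an external program write documents to its standard output, each one preceded by a few header lines. The sketch below shows the general shape of such a program as a small shell script. It is an illustration only: the function name C<emit_doc> and the record names are invented here, and the header names (C<Path-Name>, C<Content-Length>) are given as recalled from L<SWISH-RUN|SWISH-RUN>, which remains the authoritative description of the format.

```shell
#!/bin/sh
# emit_doc NAME CONTENT (hypothetical helper): write one document as
# header lines, a blank line, and then the body.
emit_doc() {
    name=$1
    content=$2
    # Content-Length must be the exact byte count of the body.
    length=$(printf '%s' "$content" | wc -c | tr -d ' ')
    printf 'Path-Name: %s\n' "$name"
    printf 'Content-Length: %s\n' "$length"
    printf '\n'
    printf '%s' "$content"
}

# Stand-in records; a real program would read these from a database,
# a spider, or any other data source.
emit_doc "record-1.txt" "First record pulled from a hypothetical data source."
emit_doc "record-2.txt" "Second record."
```

Assuming the script is saved as an executable file, it would typically be named in the C<IndexDir> directive of a configuration file and run with C<swish-e -S prog -c> followed by that configuration file. Because the body length is declared up front, no separator is needed between documents.
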
361    
362    
363     =head1 Where do I get help with Swish-e?
364    
365     If you need help with installing or using Swish-e please subscribe to
366     the Swish-e mailing list. Visit the Swish-e web site listed above
367     for information on subscribing to the mailing list.
368    
369     Before posting any questions please read
370     L<QUESTIONS AND TROUBLESHOOTING|INSTALL/"QUESTIONS AND TROUBLESHOOTING">
371     in the L<INSTALL|INSTALL> documentation page.
372    
373     =head1 Speling mistakes
374    
375     Please contact the Swish-e list with corrections to this documentation.
376     Any help in cleaning up the docs will be appreciated!
377    
378     Any patches should be made against the .pod files, not the .html files.
379    
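One common way to prepare such a patch is a unified diff between the distributed .pod file and your corrected copy. The sketch below uses small stand-in files so that it can be run anywhere; the file names and the output name F<readme-typo.patch> are placeholders, and the list may prefer a different patch format.

```shell
#!/bin/sh
# Stand-in files for illustration only; in practice the original would be
# the distributed .pod file and the copy your corrected version.
printf 'Documetation is provided in two forms.\n' > README.pod.orig
sed 's/Documetation/Documentation/' README.pod.orig > README.pod

# diff exits with status 1 when the files differ, which is expected here.
diff -u README.pod.orig README.pod > readme-typo.patch || true

cat readme-typo.patch
```
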
380     =head1 Swish-e Development
381    
382     Swish-e is currently being developed as an open source project on
383     SourceForge http://sourceforge.net.
384    
385     Contact the Swish-e list for questions.
386    
387     =head1 Swish-e's History
388    
389     SWISH was created by Kevin Hughes to fill the need of the growing number
390     of Web administrators on the Internet - many of the indexing systems were
391     not well documented, were hard to use and install, and were too complex
392     for their own good. The system was widely used for several years, long
393     enough to collect some bug fixes and requests for enhancements.
394    
395     In Fall 1996, The Library of UC Berkeley received permission from
396     Kevin Hughes to implement bug fixes and enhancements to the original
397     binary. The result is Swish-enhanced or Swish-e, brought to you by the
398     Swish-e Development Team.
399    
400     =head1 Document Info
401    
402     Each document in the Swish-e distribution contains this section.
403     It refers only to the specific page it's located in, and not to the
404     Swish-e program or the documentation as a whole.
405    
406     $Id: README.pod,v 1.11 2002/08/20 22:24:08 whmoseley Exp $
407    
