Contents of /mitgcm.org/devel/buildweb/pkg/swish-e/README

Revision 1.1.1.1 - (show annotations) (download) (vendor branch)
Fri Sep 20 19:47:29 2002 UTC (21 years, 7 months ago) by adcroft
Branch: Import, MAIN
CVS Tags: baseline, HEAD
Changes since 1.1: +0 -0 lines
Importing web-site building process.

1 NAME
2 The Swish-e README File
3
4 What is Swish-e?
5 Swish-e is Simple Web Indexing System for Humans - Enhanced. Swish-e can
6 quickly and easily index directories of files or remote web sites and
7 search the generated indexes.
8
9 Swish-e is extremely fast in both indexing and searching, highly
10 configurable, and can be seamlessly integrated with existing web sites
11 to maintain a consistent design. Swish-e can index web pages, but can
12 just as easily index text files, mailing list archives, or data stored
13 in a relational database.
14
15 Swish-e version 2.2 represents a major rewrite of the code and the
16 addition of many new features. Memory requirements for indexing have
17 been reduced, and indexing speed is significantly improved from previous
18 versions. New features allow more control over indexing, better document
19 parsing, improved indexing and searching logic, better filter code, and
20 the ability to index from any data source.
21
22 Swish-e is not a "turn-key" indexing and searching solution. The Swish-e
23 distribution contains most of the parts to create such a system, but you
24 need to put the parts together as best meets your needs. You will need
25 to configure Swish-e to index your documents, create an index by running
26 Swish-e, and set up an interface such as a CGI script (a script is
27 included). Swish-e uses helper programs to index documents of types that
28 Swish-e cannot natively index. These programs may need to be installed
29 separately from Swish-e.
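As a rough illustration of the configure, index, and search steps described
above, the following Python sketch builds a minimal configuration file and
the two command lines involved. The IndexDir, IndexFile, and IndexOnly
directives and the -c, -f, and -w switches are covered in SWISH-CONFIG and
SWISH-RUN; the paths, the index file name, and the helper function names are
hypothetical. The sketch only prints what it would run, so check the
documentation for your version before relying on it.

```python
def build_config(doc_dir: str, index_file: str, suffixes: list[str]) -> str:
    """Return the text of a minimal Swish-e configuration file."""
    lines = [
        f"IndexDir {doc_dir}",                # directory (or file) to index
        f"IndexFile {index_file}",            # where the index is written
        "IndexOnly " + " ".join(suffixes),    # only index these file suffixes
    ]
    return "\n".join(lines) + "\n"


def index_command(config_path: str) -> list[str]:
    """Command line that creates an index from a configuration file."""
    return ["swish-e", "-c", config_path]


def search_command(index_file: str, query: str) -> list[str]:
    """Command line that searches an existing index for a query."""
    return ["swish-e", "-f", index_file, "-w", query]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(build_config("./htdocs", "site.index", [".html", ".txt"]), end="")
    print(" ".join(index_command("swish.conf")))
    print(" ".join(search_command("site.index", "apple")))
```

A CGI script such as the included swish.cgi plays the role of
search_command here: it runs the search and formats the results.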
30
31 Swish-e is an Open Source (see: http://opensource.org ) program
32 supported by developers and a large group of users. Please take time to
33 join the Swish-e discussion list at http://Swish-e.org.
34
35 Key features
36
37 * Quickly index a large number of documents in different formats
38 including text, HTML, and XML
39
40 * Use "filters" to index other types of files such as PDF, gzip, or
41 Postscript.
42
43 * Includes a web spider for indexing remote documents over HTTP.
44 Follows Robots Exclusion Rules (including META tags).
45
46 * Use an external program to supply documents to Swish-e, such as an
47 advanced spider for your web server or a program to read and format
48 records from a relational database.
49
50 * Document "properties" (some subset of the source document, usually
51 defined as META or XML elements) may be stored in the index and
52 returned with search results
53
54 * Document summaries can be returned with each search
55
56 * Word stemming, soundex, metaphone, and double-metaphone indexing for
57 "fuzzy" searching
58
59 * Phrase searching and wildcard searching
60
61 * Limit searches to HTML links
62
63 * Use powerful Regular Expressions to select documents for indexing or
64 exclusion
65
66 * Easily limit searches to parts or all of your web site
67
68 * Results can be sorted by relevance or by any number of properties in
69 ascending or descending order
70
71 * Limit searches to parts of documents such as certain HTML tags
72 (META, TITLE, comments, etc.) or to XML elements.
73
74 * Can report structural errors in your XML and HTML documents
75
76 * Index file is portable between platforms.
77
78 * A Swish-e library is provided to allow embedding Swish-e into your
79 applications. A Perl module is available that provides a standard
80 API for accessing Swish-e.
81
82 * Includes example search scripts
83
84 * Swish-e is fast.
85
86 * It's open source and FREE! You can customize Swish-e and you can
87 contribute your fancy new features to the project.
88
89 * Supported by on-line user and developer groups
90
91 Where do I get Swish-e?
92 The current version of Swish-e can be found at:
93
94 http://Swish-e.org
95
96 Please make sure you use a current version of Swish-e.
97
98 Information about Windows binary distributions can also be found at this
99 site.
100
101 How Do I Install Swish-e?
102 Read the INSTALL page.
103
104 Building from source is recommended. On most platforms Swish-e should
105 build without problems. Information on building for VMS and Win32 can be
106 found in sub-directories of the "src" directory. Check the Swish-e site
107 for information about binary distributions (such as for Windows).
108
109 In addition to the INSTALL page, make sure you read the SWISH-FAQ page
110 if you have any questions, or to get an idea of questions that you might
111 someday ask.
112
113 Problems or questions about installing Swish-e should be directed to the
114 Swish-e discussion list (see the Swish-e web site at
115 http://Swish-e.org).
116
117 The Swish-e Documentation
118 Documentation is provided in the Swish-e distribution package in two
119 forms: POD (Plain Old Documentation) and HTML. The POD
120 documentation is in the pod directory, and the HTML documentation is in
121 the html directory, of course.
122
123 If your system includes the required support files and programs, the
124 distribution make files can also generate the documentation in these
125 formats:
126
127 Postscript
128 PDF (Adobe Acrobat)
129 system man pages
130
131 You may also build a "split" version of the documentation where each
132 topic heading is a separate web page. Building the split version also
133 creates a Swish-e index of the documentation that makes the
134 documentation searchable via the included Perl CGI program.
135
136 Building these other forms of documentation requires additional helper
137 applications -- most modern Linux distributions will include all that's
138 needed (at least mine does...). You shouldn't have a problem if you have
139 kept your Perl and Perl libraries up to date.
140
141 Online documentation can be found at the Swish-e web site listed above.
142
143 See INSTALL for information on creating the PDF and Postscript versions
144 of the documentation, and for information on installing the SWISH-*
145 documentation as Unix man(1) pages.
146
147 How do I read the Swish-e documentation?
148
149 The Swish-e documentation included with the distribution is in POD and
150 HTML formats. The POD documentation can be found in the pod directory,
151 and the HTML documentation can be found in the html directory.
152
153 To view the HTML documentation point your browser to the html/index.html
154 file.
155
156 The POD documentation is displayed by the "perldoc" command that is
157 included with every Perl installation. For example, to view the Swish-e
158 installation documentation page called "INSTALL", type
159
160 perldoc pod/INSTALL
161
162 or to make life easier,
163
164 cd pod
165 perldoc INSTALL
166 perldoc SWISH-RUN
167
168 Complain to your system administrator if the "perldoc" command is not
169 available on your machine.
170
171 Included Documentation
172
173 The following documentation is included in this Swish-e distribution.
174
175 If you are new to Swish-e read the INSTALL page to get Swish-e installed
176 and tested. Work through the example shown in the INSTALL page, and
177 the examples in the conf directory. Also review the SWISH-FAQ.
178
179 * README - This file
180
181 * INSTALL - Installation and basic usage instructions
182
183 * SWISH-CONFIG - Configuration File Directives
184
185 * SWISH-RUN - Running Swish and Command Line Switches
186
187 * SWISH-SEARCH - All about Searching with Swish-e
188
189 * SWISH-FAQ - Common questions, and some answers
190
191 * SWISH-LIBRARY - Interface to the Swish-e C library
192
193 * SWISH-PERL - Instructions for using the Perl library
194
195 * CHANGES - List of feature changes and bug fixes
196
197 * SWISH-BUGS - List of known bugs in the release
198
199 Document Generation
200
201 The Swish-e documentation in HTML format was created with
202 Pod::HtmlPsPdf, a package of Perl modules written and/or modified by
203 Stas Bekman to automate the conversion of documents in pod format (see
204 perldoc perlpod) to HTML, Postscript, and PDF. A slightly modified
205 version of this package is included with the Swish-e distribution and
206 used for building the HTML. As distributed, Swish-e contains only the
207 pod and HTML documentation. See INSTALL for instructions on creating
208 man(1), Postscript, and PDF formats.
209
210 Thanks, Stas, for your help!
211
212 What's included in the Swish-e distribution?
213 Here's an overview of the directories included in the Swish-e
214 distribution:
215
216 conf/
217 Example Swish-e configuration setups to help you get started. After
218 reading the INSTALL page, and its included example, review the sample
219 configuration in this directory.
220
221 conf/stopwords
222 In the "conf/stopwords" sub-directory are a number of stopword files
223 for different languages. Use of stopwords is not required with
224 Swish-e.
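To make the idea of stopwords concrete, here is a small Python sketch of
what a stopword list does: common words are dropped before the remaining
words are indexed. This illustrates the concept only and is not Swish-e's
own code. The file layout assumed here (whitespace-separated words, with
"#" starting a comment) and both function names are assumptions; check the
files in conf/stopwords for their actual format.

```python
def load_stopwords(text: str) -> set[str]:
    """Parse stopword file text into a set of lower-cased words.

    Assumed layout: whitespace-separated words, '#' starts a comment.
    """
    words: set[str] = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0]          # strip any trailing comment
        words.update(w.lower() for w in line.split())
    return words


def remove_stopwords(tokens: list[str], stopwords: set[str]) -> list[str]:
    """Return the tokens that are not stopwords (case-insensitive)."""
    return [t for t in tokens if t.lower() not in stopwords]


if __name__ == "__main__":
    stop = load_stopwords("# English sample\nthe a an\nof and\n")
    print(remove_stopwords(["The", "history", "of", "Swish"], stop))
```

Dropping such words makes the index smaller, at the cost of not being able
to search for them.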
225
226 doc/
227 Contains files required for building the HTML, PDF, and Postscript
228 documentation.
229
230 example/
231 This contains a sample CGI script (swish.cgi) for searching with
232 Swish-e. Documentation for using swish.cgi is included within the
233 script. Type:
234
235 perldoc example/swish.cgi
236
237 from the top-level directory where the Swish-e distribution was
238 unpacked.
239
240 filter-bin/
241 Sample programs to use with Swish-e's "filters". Examples include
242 PDF, MS Word, and binary strings filters. Filters often require
243 installing separate document conversion programs.
244
245 html/
246 The documentation in HTML format.
247
248 perl/
249 The Perl interface to the Swish-e C library. This Perl module
250 provides direct access to Swish-e from within your Perl programs. See
251 the perl/README file for more information.
252
253 pod/
254 The source for all documentation in perldoc (pod) format.
255
256 prog-bin/
257 Example programs and modules to use with the "prog" document source
258 access method. Examples include a web spider, a program to index
259 directly from a MySQL database, and a program to recurse a directory
260 tree. Example Perl modules are provided for converting PDF and
261 MS-Word documents into a format usable by Swish-e. See
262 prog-bin/README for an overview of the programs and modules, and
263 check each file for included documentation.
264
265 The prog-bin/spider.pl program is a web spider program with many
266 features. It contains its own documentation. Type:
267
268 perldoc prog-bin/spider.pl
269
270 from the top-level directory where the Swish-e distribution was
271 unpacked.
272
273 The "prog" document source feature is very powerful, but can be a
274 challenge to set up when first using Swish-e. Please contact the
275 Swish-e discussion list if you have any questions.
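For orientation, the sketch below shows the general shape of a "prog"
program in Python: it writes each document to standard output as a few
header lines, a blank line, and then the document body. The header names
used here (Path-Name, Content-Length, Last-Mtime, Document-Type) are given
as recalled from the SWISH-RUN page and should be verified there for your
version. The function name, the sample records, and the timestamp are
hypothetical.

```python
import sys
from typing import Optional


def format_record(path_name: str, body: bytes,
                  mtime: Optional[int] = None,
                  doc_type: Optional[str] = None) -> bytes:
    """Format one document as header lines, a blank line, then the body."""
    headers = [
        f"Path-Name: {path_name}",
        f"Content-Length: {len(body)}",       # length of the body in bytes
    ]
    if mtime is not None:
        headers.append(f"Last-Mtime: {int(mtime)}")   # Unix timestamp
    if doc_type is not None:
        headers.append(f"Document-Type: {doc_type}")  # e.g. HTML, XML, TXT
    return ("\n".join(headers) + "\n\n").encode("utf-8") + body


if __name__ == "__main__":
    # Hypothetical records, e.g. rows read from a database.
    records = [
        ("db/record-1", b"<html><title>One</title>first row</html>"),
        ("db/record-2", b"<html><title>Two</title>second row</html>"),
    ]
    for name, body in records:
        sys.stdout.buffer.write(
            format_record(name, body, mtime=1032551249, doc_type="HTML"))
```

Such a program is named as the input source when indexing with the "prog"
access method, for example "swish-e -S prog -i ./myprog.py", where
myprog.py is a hypothetical file name.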
276
277 src/
278 This directory contains the source code for Swish-e. OS-specific
279 directories are also found here.
280
281 tests/
282 The documents used for running "make test".
283
284 Where do I get help with Swish-e?
285 If you need help with installing or using Swish-e please subscribe to
286 the Swish-e mailing list. Visit the Swish-e web site listed above for
287 information on subscribing to the mailing list.
288
289 Before posting any questions please read QUESTIONS AND TROUBLESHOOTING
290 in the INSTALL documentation page.
291
292 Speling mistakes
293 Please contact the Swish-e list with corrections to this documentation.
294 Any help in cleaning up the docs will be appreciated!
295
296 Any patches should be made against the .pod files, not the .html files.
297
298 Swish-e Development
299 Swish-e is currently being developed as an open source project on
300 SourceForge (http://sourceforge.net).
301
302 Contact the Swish-e list for questions.
303
304 Swish-e's History
305 SWISH was created by Kevin Hughes to fill the need of the growing number
306 of Web administrators on the Internet - many of the indexing systems
307 were not well documented, were hard to use and install, and were too
308 complex for their own good. The system was widely used for several
309 years, long enough to collect some bug fixes and requests for
310 enhancements.
311
312 In Fall 1996, the Library of UC Berkeley received permission from Kevin
313 Hughes to implement bug fixes and enhancements to the original binary.
314 The result is Swish-enhanced or Swish-e, brought to you by the Swish-e
315 Development Team.
316
317 Document Info
318 Each document in the Swish-e distribution contains this section. It
319 refers only to the specific page it's located in, and not to the Swish-e
320 program or the documentation as a whole.
321
322 $Id: README.pod,v 1.11 2002/08/20 22:24:08 whmoseley Exp $
323
324 .
