=head1 NAME

SWISH-LIBRARY - Interface to the Swish-e C library

=head1 What is the Swish-e C library

It is a C library implementation based on swish-e-2.1-dev, but many of
the functions have been rewritten in order to get a thread safe library.
That is not to say that it is currently thread safe.

The advantage of the library is that the index file(s) can be opened one time
and many queries made on the open index.  This saves the startup time required
to fork and run the swish-e binary, and the expensive time of opening up the
index file.  Some benchmarks have shown a threefold increase in speed.

The downside is that your program now has more code and data in it (the index tables can
use quite a bit of memory), and if a fatal error happens in swish it will bring down your
program.  These are things to think about, especially if embedding swish into a web server
such as Apache where there are many processes serving requests.

The best way to learn about the library is to look at two files included with
the swish-e distribution that make use of the library.

=over 4

=item src/libtest.c

This file gives a basic overview of linking a C program with the Swish-e library.
Not all available functions are used in that example, but it should give you a good overview
of building a program with swish-e.

To build and run libtest:

    $ make libtest
    $ ./libtest [optional name of index file(s)]

You will be prompted for the search words.  The default index used is F<index.swish-e>.
This can be overridden by placing a list of index files in a quote-protected string.

    $ ./libtest 'index1 index2 index3'

=item perl/SWISHE.xs

The F<SWISHE.xs> file contains more examples of how to call the library, in this case
from the Perl interface.  It includes example code for reading additional information
from the index files.

=back

Not all available functions are documented here.  That is due both to laziness and to the hope
that a better interface will be created for these functions.  Check the above files for details.

You should check for errors after every call.  See the F<src/libtest.c> file for examples.

=head1 Available Functions

=over 4

=item struct SWISH *SwishInit(char *IndexFiles);

This function opens and reads the header info of the index files
included in the IndexFiles string.  The string should contain a space-separated
list of index files.

    SWISH *myhandle;
    myhandle = SwishInit("file1.idx");

This function will return a swish handle.  You must check for errors, and on
error free the memory used by the handle, or abort.

Here's an example of aborting:

    SWISH *swish_handle;
    swish_handle = SwishInit("file1.idx file2.idx");
    if ( SwishError( swish_handle ) )
        SwishAbortLastError( swish_handle );

And here's an example of catching the error:

    SWISH *swish_handle;
    swish_handle = SwishInit("file1.idx file2.idx");
    if ( SwishError( swish_handle ) )
    {
        printf("Failed to connect to swish. %s\n", SwishErrorString( swish_handle ) );
        SwishClose( swish_handle );  /* free the memory used */
        return 0;
    }
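
Because IndexFiles is a single space-separated string, a program that keeps its
index names in an array has to join them first.  The following is a minimal sketch
of one way to do that.  The helper name C<join_index_files> is an assumption made for
illustration and is not part of the Swish-e library; it also cannot represent file
names that themselves contain spaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Swish-e): join n index file names into
 * one space-separated string of the form expected for IndexFiles.
 * Returns a malloc()ed string that the caller must free(), or NULL if
 * memory could not be allocated. */
char *join_index_files(const char *const *names, size_t n)
{
    size_t total = 1;   /* room for the terminating '\0' */
    size_t pos = 0;
    size_t i;
    char *joined;

    assert(names != NULL || n == 0);

    for (i = 0; i < n; i++)
        total += strlen(names[i]) + 1;   /* name plus one separator */

    joined = malloc(total);
    if (joined == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        size_t len = strlen(names[i]);

        if (i > 0)
            joined[pos++] = ' ';         /* separator between names */
        memcpy(joined + pos, names[i], len);
        pos += len;
    }
    joined[pos] = '\0';

    return joined;
}
```

The resulting string can be passed as the IndexFiles argument in the same way as
the literal strings in the examples above.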

=item struct SWISH *SwishOpen(char *IndexFiles);  [deprecated]

This function opens and reads the header info of the index files
included in IndexFiles.

    myhandle = SwishOpen("file1.idx");

Returns NULL on error.  This function is deprecated since there is no way to
find out what caused the error.  Use SwishInit() instead.

=item void SwishClose(struct SWISH *handle);

This function closes a Swish handle and frees its memory.

=item int SwishSearch(struct SWISH *handle,char *words,int structure,char *properties,char *sortspec);

This function executes a search on a handle.

Input data:

    handle     : value returned by SwishInit() (or the deprecated SwishOpen())
    words      : the search string
    structure  : At this moment always one (it will implement the -t option of Swish-e)
    properties : [Deprecated] Set to NULL.  See the text below for comments.
    sortspec   : Sort specs for the results.  Use NULL to sort by rank

Returns the number of hits or a negative value on error.

    num_results = SwishSearch(swish_handle, "title=test", 1, NULL, "date desc");

There is a new feature here that is not included in swish-e-2.0:
you can specify several sorting properties, including a combination
of descending and ascending fields.

    field1 asc field2 desc

Currently, when num_results is zero there is also an error condition set ("Word not found").
Therefore, only check and report errors if num_results is a negative number.

    if ( num_results < 0 && SwishError( swish_handle ) )
        SwishAbortLastError( swish_handle );
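
The rule above (a negative count is an error, a zero count is not, even though an
error condition is set) can be captured in a small helper.  This is only a sketch:
the C<search_outcome> enum and the C<classify_search> function are hypothetical names
introduced here, and the helper takes plain integers rather than a swish handle so
that it stays independent of the library.

```c
/* Hypothetical names, introduced for illustration only. */
enum search_outcome {
    SEARCH_FAILED,    /* negative return value: a real error */
    SEARCH_NO_HITS,   /* zero hits: not treated as an error  */
    SEARCH_HAS_HITS   /* one or more hits                    */
};

/* Classify the return value of SwishSearch().  error_code stands for the
 * value reported by SwishError(); it is deliberately ignored unless the
 * result count is negative, because a zero count also sets an error
 * condition ("Word not found"). */
enum search_outcome classify_search(int num_results, int error_code)
{
    (void)error_code;   /* only the sign of num_results decides the outcome */

    if (num_results < 0)
        return SEARCH_FAILED;
    if (num_results == 0)
        return SEARCH_NO_HITS;
    return SEARCH_HAS_HITS;
}
```

A caller would report or abort only on C<SEARCH_FAILED> and show a "no results"
message on C<SEARCH_NO_HITS>.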

The B<properties> parameter:

In general, you will find it easiest to use the functions described below to fetch properties:

    SwishResultPropertyStr()
    SwishResultPropertyULong()

You can also pass in a space-separated list of properties to the SwishSearch() function.
This will parse and cache the list of properties, and then the property IDs can be used
to fetch the property values.  This saves the time of converting the property names from
a string to a property ID value for each result.  It's unlikely that the speed-up is
significant.  See the F<perl/SWISHE.xs> code for an example of how this can be done.

=item int SwishSeek(struct SWISH *handle, int n)

This function moves the results pointer to the nth result.  The first result is
number zero.  Returns n if the operation succeeds or a negative number on error.
After calling SwishSeek(), call SwishNext() to fetch the record at the position
selected by SwishSeek().

Example:

    SwishSeek( swish_handle, 0 );  /* start at the beginning */
    SwishSeek( swish_handle, 5 );  /* start at the sixth record */

If you always read results from the very start you do not need to call SwishSeek().
After a query the position is set to the start of the result list.
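
A common use of SwishSeek() is paging through results.  Since the first result is
number zero, the position to seek to for a given page is the page number times the
page size.  The sketch below computes that position; C<page_start> is a hypothetical
helper, not a library function, and it assumes zero-based page numbers.

```c
/* Hypothetical helper: return the zero-based result number at which a
 * page starts, or -1 if the arguments are invalid or the page lies past
 * the end of the result list.  page is zero-based. */
int page_start(int page, int per_page, int num_results)
{
    long long start;

    if (page < 0 || per_page <= 0 || num_results <= 0)
        return -1;

    /* Multiply in a wider type so large page numbers cannot overflow. */
    start = (long long)page * per_page;
    if (start >= num_results)
        return -1;

    return (int)start;
}
```

A non-negative return value is what would be passed as n to SwishSeek(); the caller
then reads up to per_page results with SwishNext().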

=item struct result *SwishNext(struct SWISH *handle)

This function returns the next result.  It must be called after SwishSearch().
Returns NULL on error or when no more results are available.  Call SwishError()
to check for errors.

The value returned is used to fetch the various I<properties> for a given file (e.g. rank,
title, path name).  Typically, SwishNext() is called in a loop to fetch and display all the
properties.

=item char *SwishResultPropertyStr (SWISH *handle, RESULT *result, char *property )

Once you have a result returned from SwishNext() you can call this function to fetch
a string value of any property.

    printf("path = %s\n", SwishResultPropertyStr (swish_handle, result, "swishdocpath" ) );

If the property named is not defined (invalid name supplied) swish will return the string "(null)".
If the property does not exist for this result the null string will be returned.

You must not free the memory returned by the call, and you must copy the string to a new
memory location if you wish to keep the string around longer than just while processing the
current result.

Currently, a cache of one result's properties (per index) is stored in memory.
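
Because the returned string is only valid while the current result is being
processed, a program that collects values across results needs its own copy.  The
following sketch shows one way to make that copy using only standard C.  The name
C<copy_property> is hypothetical, and the NULL check is a defensive assumption rather
than documented library behaviour.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a property string so that it outlives
 * the current result.  Returns a malloc()ed copy that the caller must
 * free(), or NULL if value is NULL or memory could not be allocated.
 * The original string is never freed or modified here. */
char *copy_property(const char *value)
{
    size_t size;
    char *copy;

    if (value == NULL)
        return NULL;

    size = strlen(value) + 1;        /* include the terminating '\0' */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, value, size);

    return copy;
}
```

The copy stays intact even after the library reuses its internal buffer for the
next result.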

=item unsigned long SwishResultPropertyULong (SWISH *handle, RESULT *result, char *property )

This will return numeric (and date) properties as an unsigned long.

It will return ULONG_MAX on error, which can mean that the property name specified was
invalid, that the property specified was not a numeric or date property, or simply that no value
exists for the current result.  Check SwishError() to determine whether it's a real error or just
that the result does not have the property.
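
The two meanings of ULONG_MAX can be separated with a small helper such as the one
sketched below.  The C<prop_status> enum and C<ulong_property_status> function are
hypothetical names; the helper assumes, as the SwishError() tests elsewhere in this
document suggest, that a nonzero error code means an error occurred.  Note that a
property whose real value is ULONG_MAX cannot be told apart from a missing one.

```c
#include <limits.h>

/* Hypothetical names, introduced for illustration only. */
enum prop_status {
    PROP_OK,        /* a usable value was returned                  */
    PROP_MISSING,   /* ULONG_MAX, but no error: no value for result */
    PROP_ERROR      /* ULONG_MAX and an error code was set          */
};

/* Interpret the value returned by SwishResultPropertyULong() together
 * with the code reported by SwishError() right after the call.
 * Assumption: error_code is zero when no error occurred. */
enum prop_status ulong_property_status(unsigned long value, int error_code)
{
    if (value != ULONG_MAX)
        return PROP_OK;

    return error_code != 0 ? PROP_ERROR : PROP_MISSING;
}
```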

=item int SwishError(struct SWISH *handle)

This function returns the last error code.  It's often used as a test to see
if any errors happened on the last operation.

=item char *SwishErrorString(struct SWISH *handle)

Returns the string version of the error code.  See F<src/error.c>
for possible errors.  This is a generic error class.  See
SwishLastErrorMsg() for possible specific messages.

=item char *SwishLastErrorMsg(struct SWISH *handle)

This can return additional (more specific) information about the last error.
For example, SwishErrorString() might return:

    Index file error

But SwishLastErrorMsg() might give details like:

    Couldn't open the property file "index1.prop": No such file or directory

=item SwishAbortLastError( SWISH *handle )

This will format and print any error messages, and then abort the program.

=item SwishCriticalError( SWISH *handle )

This will return true if the last error was critical.  A critical error means
swish is in an unstable state and you must call SwishClose() on the handle.

=item SwishErrorsToStderr(void)

Call this after calling SwishInit() and any messages or warnings will be sent to
stderr (standard error) instead of to stdout.  This might be important when
running swish-e in a web server environment.

=item SetLimitParameter(handle,propertyname,low,hi)

This is used to set the limit ranges on a property (as is done with the -L switch
when running swish from the command line).

=item ClearLimitParameter(handle)

Clears the limits set by SetLimitParameter().  If you use limits you
will need to clear them after each request.

=item Stem(char **inword, int *lenword)

This can be used to convert a word to its stem.  The word is modified in place (or reallocated if
needed).

=back

=head1 Bug-Reports

Please send bug reports to the Swish-e discussion group.
Feel free to improve or enhance this feature.

=head1 Author

Aug 2000
Jose Ruiz
jmruiz@boe.es

Updated: Aug 22, 2002 - Bill Moseley

=head1 Document Info

$Id: SWISH-LIBRARY.pod,v 1.4 2002/08/22 23:08:07 whmoseley Exp $

.