4.16 Index Searching Routines



All the functions described up to this point work with any RMSfile regardless of the access method. This section describes functions that can be used only if an RMSfile is an indexed sequential file or if there are secondary key fields defined for the RMSfile.

When an RMSfile is created, it is possible to declare that a secondary key be maintained on a particular field. When this is done, RMS maintains all the values of that field in a separate file along with the record number where that value is stored. These values are kept in sorted order in a B-tree index . As records are added, updated or deleted in the RMS data file, the secondary index is also updated by RMS. If the value of the indexed field is changed, the old value is removed from the RMS index file, and the new value is inserted in the proper place. Since the index is organized as a B-tree, it is never necessary to reorganize RMSfiles as with some other index file organizations. For further information about B-tree indexes, see

Sorting and Searching [2].

One secondary index can actually index more than one data field. This allows an index into the subscriber RMSfile not only by zip code, but by zip code and name. Thus all the entries within one zip code are sorted alphabetically by name. This can be continued for up to eight fields. One RMSfile can have up to 127 different one secondary indexes. One index could be created by zip code, and a second index could be created for the city, etc. This would allow quick access to records either by zip code or by city.

All the secondary indexes for one RMSfile are stored in a separate file. This file has the same name as the RMS data file with a .i suffix. The RMS programmer never refers to this file directly. It is accessed and maintained by the RMS functions.

You can use the secondary index facility with any RMSfile regardless of the file access method defined for the file. Sequential RMSfiles, hashed RMSfiles and indexed sequential RMSfiles all can have secondary index fields. In fact, indexed sequential RMSfiles are implemented by using a sequential RMSfile and creating an additional B-tree index on the primary key fields.

There are routines to search the secondary index for a particular field value and return the RMS data file record that contains that value. These secondary index routines work in a manner similar to the primary key routines dfindk(C-3) and dfindnk(C-3). First the values to be found are placed into the user record. Next, the secondary index search routine is called to return the first record satisfying the secondary index field values. Additional function calls return the next record in sequential key order. The function dseti(C-3) selects one of the secondary indexes or the primary key index (if the RMSfile is indexed sequential) to be used by the secondary index routines when searching the RMSfile. The first parameter is the integer number of the secondary index. The first secondary index for a data file is numbered 1. The second parameter to dseti is the file pointer returned by calling dlopen(C-3) or dopen(C-3). Dseti returns a negative value if there is no secondary index corresponding to the first parameter. For valid secondary index numbers, dseti returns the total length of the secondary key fields in the selected index. The selected secondary index is used in subsequent secondary index search function calls until the next dseti call. An example using dseti is:

DFILE *sub;

sub = dlopen ("sub", "r");

index = dseti (1, sub);

if (index > 0) {

/* process records using secondary index #1 */

}

The dseti function can be used to discover whether secondary indexes are associated with an RMSfile. If dseti is called with a secondary index number of 1 and the function returns a negative value, there are no secondary indexes associated with the RMSfile.

If the first parameter to dseti is zero, dseti selects the index used for the primary key values in an indexed sequential RMSfile. Thus, the secondary index search routines can also be used on the primary key field index of an indexed sequential file. If the file access method is not indexed sequential, dseti returns a negative value. Once a particular secondary index has been selected with dseti, the search routines can be used to read records using the selected index.

Dfindm(C-3) finds a record by its secondary key values. First the secondary key values are placed into the user record. Next, dfindm is called to retrieve the first record that matches the secondary key values placed in the user record. If found, the record is returned in the user record, and the record number is returned. Dfindm returns a record number of -1 if no record matches the requested key values in the user record.

A sample call to dfindm might look like:

/*

* retrieve subscription record for rd

* This example assumes that secondary index 1

* is created on the "magazine" data field.

*/

dseti (1, script);

strncpy (scriptbuf.magazine, "rd",

sizeof (scriptbuf.magazine));

recno = dfindm (&subbuf, script);

if (recno > 0) {

/* process record for "rd" */

}

The first parameter to dfindm is the address of the user record. The second parameter is the file pointer returned by calling dlopen(C-3) or dopen(C-3).

RMS keeps track of its current position within the secondary index being used. Dfindnm(C-3) retrieves the next record that matches the secondary key values specified in the dfindm call. The secondary key values do not have to be placed into the user record before calling dfindnm. If there is a next record that matches the secondary key values, it is returned in the user record and dfindnm returns the record number of the retrieved record. Dfindnm returns a record number of -1 if there are no more matching key values. An example using dfindnm is:

/* continued from dfindm example */

dseti (1, script);

/* process all rd subscriptions */

strncpy (scriptbuf.magazine, "rd",

sizeof (scriptbuf.magazine));

for (recno = dfindm (&scriptbuf, script);

recno > 0;

recno = dfindnm (&scriptbuf, script)) {

/* process next record found */

}

When match values and index position have been established with some dfindm call, dfindnm finds a previous matching record. The initial contents of the user record are ignored by dfindnm.

Dfindm and dfindnm find records by secondary key value in the same way that dfindk(C-3) and dfindnk(C-3) find records by primary key value.

The functions dfindm and dfindnm also have counterparts for reading backwards. The first of these, dfindlm(C-3), finds the last matching record in the index. Like dfindm, the desired values are placed into the user record before the call. Dfindlm searches for the last matching record returns it in the user record. If no matching records are found, -1 is returned. Otherwise, the record number is returned.

When match values and index position have been established with some dfindm call, dfindpm(C-3) finds a previous matching record. Like dfindnm, the contents of the user record are ignored by dfindpm. If a previous matching record is found, it is returned in the user record and the record number is returned. If no record is found, -1 is returned. The following example illustrates both dfindlm and dfindpm:

dseti (1, script);

/* process all rd subscriptions */

strncpy (scriptbuf.magazine, "rd",

sizeof (scriptbuf.magazine));

for (recno = dfinlm (&scriptbuf, script);

recno > 0;

recno = dfindpm (&scriptbuf, script)) {

/* process next record found */

}

There are times when an application needs to read an RMSfile using part of a secondary key. Dnumidx(C-3) restricts the number of fields in a secondary key that are compared while searching the RMSfile. The following sample shows how to call dnumidx:

dnumidx (1, sub);

Dnumidx is passed the number of fields to participate in a comparison and a file pointer returned by either dlopen(C-3) or dopen(C-3).

The comparison restrictions established by dnumidx are short-lived. Each time an RMSfile function (one that is passed a RMSfile pointer from dlopen or dopen) is called, this comparison restriction is removed for further function calls. This prevents an earlier call to dnumidx from producing confusing results later in the program (when partial key comparisons are not desired or expected). It also eliminates the need for the program to know how many fields are in the full key.

Having the comparison restriction reset so frequently means the program must call dnumidx immediately before each function where it wants the restriction to be in force. For example, if index 2 of the subscriber file was on the state and zip fields, the following code could be used to print subscribers within the state of Washington sorted by zip-code:

/* select secondary index 2 */

dseti (2, sub);

/* read first subscriber */

dnumidx (1, sub);

strncpy (subbuf.state, "WA",

sizeof (subbuf.state));

recno = dfindm (&subbuf, sub);

/* loop through remaining subscribers */

while (recno > 0L) {

/* ... print subscriber information ... */

/* read next subscriber */

dnumidx (1, sub);

recno = dfindnm (&subbuf, sub);

}

Our earlier example for dfindlm and dfindpm can be made more useful by adding started as a second field in the index. In this case, the code could be used to show the most recent subscriptions to rd:

dseti (1, script);

/* process all rd subscriptions */

strncpy (scriptbuf.magazine, "rd",

sizeof (scriptbuf.magazine));

for (dnumidx (1, script),

recno = dfinlm (&scriptbuf, script);

recno > 0;

dnumidx (1, script),

recno = dfindpm (&scriptbuf, script)) {

/* process next record found */

}

Three additional secondary index functions, dfindi(C-3), dfindni(C-3), and dfindfi(C-3), retrieve records by approximate secondary key values. Dfindi finds the first record that has secondary key values equal to or greater than the key values placed in the user record, puts the data record into the user record and returns the record number of the retrieved record. Before dfindi is called, the beginning secondary key values should be placed in the user record. Dfindi returns a record number of -1 if no record is found with a secondary key value equal to or greater than the secondary key field values in the user record, The following example shows how to call dfindi:

/* select first secondary index */

dseti (1, sub);

/* put beginning key value into buffer */

strncpy (subbuf.name, "Jones", sizeof (subbuf.name));

/* find first record with name equal or greater than "Jones" */

recno = dfindi (&subbuf, sub);

RMS keeps track of its position within the current secondary index. The dfindi function call sets the current position to the record that was returned. Dfindni retrieves the next record in secondary key order. Any initial values placed in the user record before calling dfindni are ignored. If there is another record in secondary key order, it is returned in the user record and dfindni returns the number of the retrieved record. Dfindni returns -1 when the end of the index is reached. The following example shows how to call dfindni:

/* select first secondary index */

dseti (1, sub);

/* print subscribers with name starting with 'J' */

strncpy (subbuf.name, "J",

sizeof (subbuf.name));

for (recno = dfindi (&subbuf, sub);

recno > 0L;

recno = dfindni (&subbuf, sub) > 0) {

if (*subbuf.name != 'J')

break; /* end of J's */

puts (subbuf.name);

}

You can also read the entire data file in secondary key order. Dfindfi retrieves the first record in secondary key order into the user record. Any initial values placed in the user record before calling dfindfi are ignored. Dfindfi returns -1 if there are no records in the RMSfile. Dfindfi also sets the current position in the secondary index. Dfindni can be called after dfindfi to retrieve subsequent records in secondary key order. Dfindni returns -1 when all records have been read. The following example show how to call dfindfi:

/* print subscribers sorted by name */

dseti (1, sub);

for (recno = dfindfi (&subbuf, sub);

recno > 0L;

recno = dfindni (&subbuf, sub)) {

puts (subbuf.name);

}

Once an index position has been established (by using one of the dfindi functions) dfindpi(C-3) reads the record previous to the current record in index order. When the beginning of the index is reached, dfindpi returns -1. The counterpart of dfindfi is dfindli(C-3). This function reads the last record stored in the index. The following example illustrates how to call both dfindli and dfindpi:

/* print subscribers reverse sorted by name */

dseti (1, sub);

for (recno = dfindli (&subbuf, sub);

recno > 0L;

recno = dfindpi (&subbuf, sub)) {

puts (subbuf.name);

}

The last indexing function to discuss has several different applications. The function allows a program to make RMS establish its internal pointers as though it had just read a given record. A sample call to this function is:

recno = dsetpos (recwanted, index, file);

Dsetpos(C-3) is passed three arguments: the number of the record to use for positioning, the secondary index number to be used later, and an open file pointer returned by dlopen(C-3) or dopen(C-3). Record numbers are returned by all of the find functions, dinsert(C-3), and dupdate(C-3). The index number is identical to what you pass to dseti(C-3).

This function has use in interactive, table type programs where the user can scroll forwards and backwards through the file as shown in the following example:

#include <stdio.h>

#include <cbase/dirio.h>

#include <cbase/dtypes.h>

#include <cbase/escape.h>

#include "script.h"

/* maximum of 24 lines displayed on screen */

#define SCRZ 24

/* record images of what's on screen */

ScriptRec *sbufs[SCRZ];

/* record numbers of records on screen */

rno_t srecs[SCRZ];

/* number of records on screen */

int snum;

/* current row number */

int srow;

main()

{

rno_t recno;

register int key;

script = dlopen ("script", "r");

drlist (scriptlist, script);

dseti (1, script);

if (ttyinit() < 0)

exit (1);

escape (CLEARALL);

read_first();

while (1) {

escape (MOVCUR, srow+1, 1);

key = getkey();

if (key == 'Q' || key == 'q')

break;

switch (key) {

case CURUP:

if (srow == 0)

mov_up();

else

srow;

break;

case CURDOWN:

if (srow == snum-1)

mov_dn();

else

++srow;

break;

default:

break;

}

}

quit (0);

}

quit (n)

{

escape (MOVCUR, 24, 1);

ttyrestore();

dclose (script);

exit (n);

}

ScriptRec *

getbuf()

{

char *s, *malloc();

if ((s = malloc (sizeof (ScriptRec))) == NULL) {

fputs ("out of memory\r\n", stdout);

quit (2);

}

movb (s, &scriptbuf, sizeof (scriptbuf));

return (ScriptRec*) s;

}

freebuf (s)

ScriptRec *s;

{

free ((char *) s);

}

showbuf()

{

register int i;

escape (CLEARALL);

for (i=0; i<snum; ++i) {

escape (MOVCUR, i+1, 1);

print (sbufs[i]);

}

}

/* read first screen full into memory */

read_first()

{

rno_t recno;

srow = 0;

snum = 0;

for (recno = dfindfi (&scriptbuf, script);

recno > 0;

recno = dfindni (&scriptbuf, script)) {

sbufs[snum] = getbuf();

srecs[snum] = recno;

++snum;

escape (MOVCUR, snum, 1);

print (&scriptbuf);

if (snum == SCRZ)

break;

}

}

mov_up()

{

register int i;

rno_t recno;

if (snum <= 0)

return;

/* try to read previous record */

if (dsetpos (srecs[0], 1, script) < 0)

quit (3);

if ((recno = dfindpi (&scriptbuf, script)) < 0)

return;

/* shift screen information down */

freebuf (sbufs[snum-1]);

for (i=snum-1; i>0; i) {

srecs[i] = srecs[i-1];

sbufs[i] = sbufs[i-1];

}

/* set up new line information */

sbufs[0] = getbuf();

srecs[0] = recno;

/* update display */

escape (MOVCUR, 1, 1);

if (escape (ROLLDOWN) < 0)

showbuf();

else

print (&scriptbuf);

}

mov_dn()

{

register int i;

rno_t recno;

if (snum <= 0)

return;

/* try to read next record */

if (dsetpos (srecs[snum-1], 1, script) < 0)

quit (4);

if ((recno = dfindni (&scriptbuf, script)) < 0)

return;

/* shift screen information up */

freebuf (sbufs[0]);

for (i=0; i<snum-1; ++i) {

srecs[i] = srecs[i+1];

sbufs[i] = sbufs[i+1];

}

/* set up new line information */

sbufs[snum-1] = getbuf();

srecs[snum-1] = recno;

/* update display */

escape (MOVCUR, 24, 1);

if (escape (ROLLUP) < 0)

showbuf();

else

print (&scriptbuf);

}

print (crec)

ScriptRec *crec;

{

printf ("%-12s%-10s%10s%6d",

crec->subscriber,

crec->magazine,

datetoa (crec->started),

crec->issues);

}

Another use of this function requires a special record number to be passed. When dsetpos is passed a record number of zero (0), the current record is assumed. This feature can be used to locate a record by key or index value, and then switch to another index to start reading from the current record in either direction.