SHASH C-1


NAME



shash - compute statistics about a hashed RMSfile
[Requires: C/Base Utilities Module]

SYNOPSIS



shash lfile

DESCRIPTION



Shash writes statistics about the named RMSfile lfile to standard
output. Lfile must be either the logical name or the pathname of a hashed
RMSfile.

Shash reads each record in the RMSfile by key value and counts the number
of record accesses needed to read it. The count includes one read for the
record itself plus one read for each collision.
The following is sample output from shash:

211 records in file
49 records in use
23.2227% loading factor

1 minimum accesses per record
2 maximum accesses per record
1 median accesses per record
1.06122 average accesses per record

0 minimum access time
0 maximum access time
0 median access time
0 average access time

acc  #nrec   %rec   # >=   % >=
  1 (   46,   93%,    49,  100% ) ***************
  2 (    3,    6%,     3,    6% ) **

Records in file is the maximum number of records that can be in the RMSfile.
It is the number of records specified when the RMSfile was created or
expanded. This number should normally be a prime number.
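
As a rough illustration of choosing a prime file size, the sketch below
rounds a desired capacity up to the next prime. The names is_prime and
next_prime are hypothetical helpers written for this example; they are
not part of C/Base.

```python
def is_prime(n: int) -> bool:
    """Return True if n is prime, using simple trial division."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime(n: int) -> int:
    """Return the smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


# Hypothetical usage: a file meant to hold about 200 records.
print(next_prime(200))  # 211, the "records in file" value in the sample output
```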

Records in use is the number of active records in the RMSfile. Deleted
and never-used records are not counted.

Loading factor is the ratio of records in use to records in file, expressed
as a percentage. The performance of a hashed file deteriorates as the
loading factor rises. You can lower the loading factor by expanding
the maximum number of records in the RMSfile.
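
The loading factor in the sample output can be reproduced from the two
counts above it. This is only a sketch of the arithmetic; the function
name loading_factor is an assumption made for this example and is not a
C/Base routine.

```python
def loading_factor(records_in_use: int, records_in_file: int) -> float:
    """Return records in use as a percentage of records in file."""
    return 100.0 * records_in_use / records_in_file


# Values taken from the sample output: 49 records in use, 211 in file.
print(f"{loading_factor(49, 211):.4f}% loading factor")  # 23.2227% loading factor
```

Expanding the same file to roughly twice as many records would roughly
halve this figure, because the number of records in use stays the same.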

Minimum accesses per record is the smallest number of accesses to find
a record. This number is normally one.

Maximum accesses per record is the largest number of accesses to find
a record.

Median accesses per record is the 'midway' point between the minimum
number of accesses and the maximum number of accesses.

Average accesses per record is the average number of accesses to find
a record.
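
The average in the sample output is consistent with a weighted mean
over the histogram: 46 records found in one access and 3 records found
in two. The sketch below shows that calculation; average_accesses is a
hypothetical name, and whether shash computes the value this way
internally is an assumption.

```python
def average_accesses(histogram: dict) -> float:
    """Return the mean number of accesses per record.

    histogram maps an access count to the number of records
    that need that many accesses.
    """
    total_records = sum(histogram.values())
    total_accesses = sum(acc * nrec for acc, nrec in histogram.items())
    return total_accesses / total_records


# Histogram from the sample output: (46 * 1 + 3 * 2) / 49 = 52 / 49.
sample = {1: 46, 2: 3}
print(f"{average_accesses(sample):.5f} average accesses per record")  # 1.06122
```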

Minimum access time is the shortest access time to find a record. This
has a resolution of 1 second.

Maximum access time is the longest access time to find a record. This
has a resolution of 1 second.

Median access time is the 'midway' point between the shortest and the
longest access time to find a record. This has a resolution of 1 second.

Average access time is the average access time to find a record. This
has a resolution of 1 second.

The remaining output from shash is a histogram showing how many records
require a given number of accesses to find them. The acc column shows
the number of record accesses. The #nrec column shows the number of
records that require that number of accesses, and the %rec column shows
that number as a percentage of the number of active records. The # >=
column shows the number of records that require this or a greater number
of accesses, and the % >= column shows that number as a percentage of
the number of active records.
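
The histogram columns can be derived from the per-record access counts
as sketched below. The function name histogram_rows is hypothetical.
Truncating the percentages to whole numbers is an assumption inferred
from the sample output, where 46 of 49 records (about 93.9%) is shown
as 93%.

```python
def histogram_rows(histogram: dict) -> list:
    """Return one (acc, nrec, pct_rec, n_ge, pct_ge) tuple per access count.

    histogram maps an access count to the number of records
    that need that many accesses.
    """
    active = sum(histogram.values())
    remaining = active  # records needing this many accesses or more
    rows = []
    for acc in sorted(histogram):
        nrec = histogram[acc]
        # Integer division truncates the percentages (assumed behaviour).
        rows.append((acc, nrec, 100 * nrec // active,
                     remaining, 100 * remaining // active))
        remaining -= nrec
    return rows


# Histogram from the sample output.
for row in histogram_rows({1: 46, 2: 3}):
    print(row)
# (1, 46, 93, 49, 100)
# (2, 3, 6, 3, 6)
```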

NOTES



The access timing information is very approximate; the resolution is
one second, and it does not take system load into account.

This program is available only with the C/Base Utilities software package.