STRUCTURE Usage
High Cardinality: Fields that are identifying fields such as CUST# or PO#, and
fields like NAME are high cardinality, that is, lots of different values for
these fields, right on up to absolute uniqueness where every field has a
different value.
Low Cardinality: A field like SEX is an example of low cardinality. Even though
there may be 500 million records in the database, the value in the SEX field
for any one of those records will be one of: Male, Female, Unknown, or blank
(or perhaps M, F, U, or blank). So in all those records there are only four
unique values represented. The STATE field has 51 possible values; the STATUS
field has maybe 10 possible values, and so on. All of these would be low
cardinality fields.
Fields like CITY or ZIP might be high cardinality for one database application
but low for another, depending on how much geography your database spans.
In reality, Low Cardinality is not so low. Experience indicates that
combined cardinality up to 50000 is practical.
The number you must supply in Form 2 is the combined cardinality of all the
fields listed that follow the number. For example,
STRUCTURE/C65 SEX STATE STATUS
we got to the number 65 by using the numbers that were discussed
three paragraphs earlier in the "Low Cardinality" paragraph.
What that number does is tell us how many 4k buffers to set aside to handle the
total cardinality of the fields mentioned, and there must be one buffer for
each possible cardinal value. So if you used a number like 1000, for example,
that means that you have enough RAM to dedicate 4 megabytes to temporary buffers
to be used by STRUCTURE/C, and it also would mean that you have included
different fields or more fields than those shown above.
Form 2 structures all of the fields specified in its list in one pass of the
data, whereas Form 1 takes one pass of the data for each field that it
structures. A quite typical scenario for index building is as follows:
STRUCTURE/C
... (for
the low cardinality keys)
STRUCTURE (for
the rest of the keys)
The rest of the discussion is about Form 1 STRUCTURE.
Form 1 STRUCTURE without the P option requires that no structure exists for any
field that it is to structure. STRUCTURE will automatically act like a NOP (No
Operation), that is, skip structuring any field where existing structure is
encountered.
Form 1 STRUCTURE with the P option is used for add-on structure. So if you had
an existing database with 50000 records in it that are all fully structured,
then you do a LOAD that adds 500 more records, use STRUCTURE/P to include the
new data into the indexes for keyed fields.
STRUCTURE/P is generally faster than using CONSTRUCT in order to build add-on
structure. But not always. You will have to experiment to know which is best
for a given application, but 95% of the time you would be right if your first
guess was to use STRUCTURE/P.
Examples:
STRUCTURE
builds key index structure for every keyed field in the entire database.
STRUCTURE/P
builds add-on key index structure for every keyed field in the database for
which new unstructured records existed. At the end of this operation there
would be no unstructured data left in the database.
STRUCTURE
CUST#
builds key index structure for the CUST# field in every table where it is a
keyed field.
STRUCTURE/P
CUST#
builds add-on key index structure for the CUST# field in every table where it
is known and where there were unstructured records.
Copyright © 2019 , WhamTech, Inc. All rights reserved. This
document is provided for information purposes only and the contents hereof are
subject to change without notice. Names may be
trademarks of their respective owners.