ddlStructure() Usage

STRUCTURE Usage

High Cardinality: Fields that are identifying fields such as CUST# or PO#, and fields like NAME are high cardinality, that is, lots of different values for these fields, right on up to absolute uniqueness where every field has a different value.

Low Cardinality: A field like SEX is an example of low cardinality. Even though there may be 500 million records in the database, the value in the SEX field for any one of those records will be one of: Male, Female, Unknown, or blank (or perhaps M, F, U, or blank). So in all those records there are only four unique values represented. The STATE field has 51 possible values; the STATUS field has maybe 10 possible values, and so on. All of these would be low cardinality fields.

Fields like CITY or ZIP might be high cardinality for one database application but low for another, depending on how much geography your database spans.

In reality, Low Cardinality is not so low. Experience indicates that combined cardinality up to 50000 is practical.

The number you must supply in Form 2 is the combined cardinality of all the fields listed that follow the number. For example,

                STRUCTURE/C65 SEX STATE STATUS

we got to the number 65 by using the numbers that were discussed three paragraphs earlier in the "Low Cardinality" paragraph.

What that number does is tell us how many 4k buffers to set aside to handle the total cardinality of the fields mentioned, and there must be one buffer for each possible cardinal value. So if you used a number like 1000, for example, that means that you have enough RAM to dedicate 4 megabytes to temporary buffers to be used by STRUCTURE/C, and it also would mean that you have included different fields or more fields than those shown above.

Form 2 structures all of the fields specified in its list in one pass of the data, whereas Form 1 takes one pass of the data for each field that it structures. A quite typical scenario for index building is as follows:

                                STRUCTURE/C ...                                                (for the low cardinality keys)
                                STRUCTURE                                                        (for the rest of the keys)

The rest of the discussion is about Form 1 STRUCTURE.

Form 1 STRUCTURE without the P option requires that no structure exists for any field that it is to structure. STRUCTURE will automatically act like a NOP (No Operation), that is, skip structuring any field where existing structure is encountered.

Form 1 STRUCTURE with the P option is used for add-on structure. So if you had an existing database with 50000 records in it that are all fully structured, then you do a LOAD that adds 500 more records, use STRUCTURE/P to include the new data into the indexes for keyed fields.

STRUCTURE/P is generally faster than using CONSTRUCT in order to build add-on structure. But not always. You will have to experiment to know which is best for a given application, but 95% of the time you would be right if your first guess was to use STRUCTURE/P.

Examples:

                STRUCTURE

builds key index structure for every keyed field in the entire database.

                STRUCTURE/P

builds add-on key index structure for every keyed field in the database for which new unstructured records existed. At the end of this operation there would be no unstructured data left in the database.

                STRUCTURE CUST#

builds key index structure for the CUST# field in every table where it is a keyed field.

                STRUCTURE/P CUST#

builds add-on key index structure for the CUST# field in every table where it is known and where there were unstructured records.