A SERVICE OF

logo

40 Using the DBINDEX=, DBKEY=, and MULTI_DATASRC_OPT= Options Chapter 5
DBKEY= does not require that any database indexes be defined; nor does it check
the DBMS system tables. This option instructs SAS to use the specified DBMS column
name or names in the WHERE clause that is passed to the DBMS in the join.
The DBKEY= option can also be used in a SAS DATA step, with the KEY= option in
the SET statement, to improve the performance of joins. You specify a value of
KEY=DBKEY in this situation. The following DATA step creates a new data file by
joining the data file KEYVALUES with the DBMS table MYTABLE. The variable
DEPTNO is used with the DBKEY= option to cause a WHERE clause to be issued by
SAS/ACCESS.
data sasuser.new;
set sasuser.keyvalues;
set dblib.mytable(dbkey=deptno) key=dbkey;
run;
Note: When you use DBKEY= with the DATA step MODIFY statement, there is no
implied ordering of the data that is returned from the database. If the master DBMS
table contains records with duplicate key values, using DBKEY= can alter the outcome
of the DATA step. Because SAS regenerates result sets (open cursors) during
transaction processing, the changes you make during processing have an impact on the
results of subsequent queries. Therefore, before you use DBKEY= in this context,
determine whether your master DBMS file has duplicate values for keys (remembering
that the REPLACE, OUTPUT, and REMOVE statements can cause duplicate values to
appear in the master table).
The DBKEY= option does not require or check for the existence of indexes created on
the DBMS table. Therefore, the DBMS system tables are not accessed when you use
this option. The DBKEY= option is preferred over the DBINDEX= option for this
reason. If you perform a join and use PROC SQL, you must ensure that the columns
that are specified through the DBKEY= option match the columns that are specified in
the SAS data set.
CAUTION:
Before you use the DBINDEX= option, take extreme care to evaluate some characteristics
of the DBMS data.
The number of rows in the table, the number of rows returned in
the query, and the distribution of the index values in the table are among the factors
to take into consideration. Some experimentation might be necessary to discover the
optimum settings.
You can use the DBINDEX= option instead of the DBKEY= option if you know that
the DBMS table has one or more indexes that use the column(s) on which the join is
being performed. Use DBINDEX=index-name if you know the name of the index, or use
DBINDEX=YES if you do not know the name of the index. Use this option as a data set
option, and not a LIBNAME option, because index lookup can potentially be an
expensive operation.
DBINDEX= requires that the join table must have a database index that is defined
on the columns involved in the join. If there is no index, then all processing of the join
takes place in SAS, where all rows from each table are read into SAS and SAS performs
the join.
Note: The data set options NULLCHAR= and NULLCHARVAL= determine how SAS
missing character values are handled during DBINDEX= and DBKEY= processing.