SELEX: An Expert System for Evaluating Published Data on Selenium in Foods

D.W. Bigwood (1), S.R. Heller (*)(2), W.R. Wolf (3), A. Schubert (3), and J.M. Holden (3), USDA, ARS (1) Program Resources, Inc., (2) Model and Database Coordination Laboratory, Nutrient Composition Laboratory (3), Beltsville, MD 20705, USA.

* Author to whom correspondence should be addressed.

ABSTRACT

Providing consistent and objective evaluation of published nutrient composition data is critical for planning future analytical studies and effective use of data. Using a commercial expert system shell, a computer system of approximately 200 rules has been created to evaluate and quantitatively rate published data on selenium in foods. The evaluation scheme uses five general categories for its rule-making process: number of samples, analytical method, sample handling, sampling plan, and analytical quality control. For each selenium value to be evaluated, ratings are assigned in each category by the expert system based on input which is derived from the information reported in a given paper. A Quality Index (QI), which is derived from the ratings, is a measure of the reliability of a given selenium value over all categories for a given study. The concepts used in developing SELEX have the potential of establishing criteria for assisting journal editors and their reviewers in their evaluation of many manuscripts submitted for publication.

INTRODUCTION

Increasing interest in the selenium intake of Americans, due to the potential relationship of selenium to cancer prevention, has generated a need for the compilation, evaluation, and improvement of data on selenium in foods. Reasons for undertaking this work include concern about the uneven quality of the data and the lack of supporting documentation. A set of criteria was developed to evaluate the quality of existing, peer-reviewed, published selenium data (1). A manual system for post-publication evaluation of selenium data (2) using these criteria proved successful in identifying foods for which the quality of data was poor or for which there were no acceptable data. However, this manual system was more tedious, more time consuming, and less consistent than desired. Consequently, an expert system, SELEX, was developed to automate the evaluation process. Developed directly from the previously established criteria, this expert system provides users with several advantages over the manual system. These include a faster evaluation process and more consistent numeric ratings. The expert system also allows users who have less expertise than the domain experts to generate ratings.

Figure 1 shows the overview of the entire evaluation procedure, including the selection of the selenium core foods as well as the individual rating process which is addressed by SELEX. For each food within a study, a rating is assigned in each of five different categories. These five categories are: number of samples, analytical method, sample handling, sampling plan, and analytical quality control. The ratings assigned by SELEX, the selenium mean, and ancillary information from the publication are written into a computer file which can be read by a SAS (Statistical Analysis System) program which determines the Quality Index (QI), selenium mean, and Confidence Code (CC) for each particular food. The QI is determined from the five ratings, and with a few exceptions, is equal to the simple mean of the five numbers. The ratings and QI range from 0 to 3. A QI of 1.0 or greater indicates that the selenium mean is considered acceptable. All acceptable means for a particular food are averaged to yield a grand selenium mean for that food. The CC (A, B, or C), derived from the sum of the QI's, represents the confidence that can be attributed to the grand selenium mean.
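The acceptance step and the grand mean described above can be sketched as follows. This is a minimal illustration and not the SAS program itself: the function name, the data layout, and the selenium values are made up for the example, and an unweighted average of the acceptable means is assumed. The Confidence Code is not computed because the thresholds that map the QI sum to A, B, or C are not given here.

```python
from statistics import mean

ACCEPTABLE_QI = 1.0  # from the text: a QI of 1.0 or greater is acceptable


def summarize_food(studies):
    """Return (grand_mean, qi_sum) for one food.

    `studies` is a list of (quality_index, selenium_mean) pairs, one per
    published value. Values with a QI below 1.0 are excluded.
    """
    accepted = [(qi, se) for qi, se in studies if qi >= ACCEPTABLE_QI]
    if not accepted:
        return None, 0.0
    # ASSUMPTION: simple unweighted average of the acceptable means.
    grand_mean = mean(se for _, se in accepted)
    qi_sum = sum(qi for qi, _ in accepted)
    return grand_mean, qi_sum


# Made-up selenium means (arbitrary units), for illustration only.
studies = [(1.5, 10.0), (0.5, 40.0), (1.0, 20.0)]
grand_mean, qi_sum = summarize_food(studies)
print(grand_mean, qi_sum)
```

The second study is dropped because its QI is below 1.0, so only the first and third contribute to the grand mean and to the QI sum from which the CC would be derived.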

Using the concepts and methods created for the development of the process of evaluating published selenium data, we have considered the broader implications of these methods. It is hoped that the concepts, principles, and rules developed for the selenium data evaluation system will be considered by journal editors and their reviewers for use in their pre-publication review process. At the least, this work indicates that better defined procedures are possible for analytical chemical data evaluation. By employing such techniques it is anticipated that a better dialog could be developed between the journal editors and authors.

It is well known that the quality of much of the scientific literature is often lower than desired. There is probably far more poor and irreproducible research being published than there should be. As Lide (4) rather bluntly points out, the "scientific literature contains vast amounts of data collected for a specific purpose and presented by authors to support their conclusions... Unfortunately, the quality of the data preserved in the literature leaves much to be desired. This becomes apparent when data on a much-studied subject are systematically retrieved... The measurements for (about 200 values of the thermal conductivity of copper as a function of temperature) were analyzed by the Center for Information and Numeric Data Analysis and Synthesis at Purdue University. The scatter of these data illustrates the pitfalls of relying on a single value retrieved from the literature." Can the scientific community find a way to improve the peer review process? Based upon this system for published data on selenium in foods, it appears that this goal is achievable, at least in certain cases.

DATA QUALITY CRITERIA

For each of the five areas or categories used in the evaluation process (1), a detailed description of the criteria was prepared using knowledge of accepted analytical methodology, sample handling procedures, and quality control measures for selenium, as well as a knowledge of statistical methods, including statistically based sampling methods. As stated above, the ratings ranged from 3 (highest and most desirable) to 0 (lowest and unacceptable). For example, the evaluation criteria for the analytical method category are:

Rating 3 (Highest)

The official fluorometric method (reference provided) or other method was used and is documented by a complete write-up with validation studies for the foods analyzed. This includes use of an appropriate Standard Reference Material where available, 95-105% recoveries on a food similar to the samples analyzed which were reported in the same or another paper, and the selenium concentration above the quantitation limit of the method.

Rating 2

A modified fluorometric or other method was used and is partially documented, but validation studies for the foods analyzed are incomplete. There must be at least 90-110% recoveries on a food similar to the samples analyzed which were reported in the same or another paper, or good recoveries for which no statistics are given in the paper, and/or the authors have used another method (official fluorometric, isotope dilution, or neutron activation analysis) on the same sample with good agreement (which is defined as within 10%).

Rating 1

A non-fluorometric method was used and is only partly described. Recoveries were either 80-90% or > 110% on a food similar to the samples analyzed; alternatively, better recoveries were obtained, or a comparison method was used, but only on food samples somewhat related in nature to the sample in question.

Rating 0 (Lowest)

The method used for selenium analysis was not documented or referenced or the reference was inaccessible. No validation studies were performed or selenium levels found in the food sample by the test method compared poorly to those found by the comparison method (>10%).
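The recovery ranges in these criteria can be read as an upper bound on the analytical method rating. The sketch below encodes only that part of the criteria and is not the SELEX rule set: the function name is hypothetical, the result is a ceiling (documentation and validation requirements must still be met), a recovery of exactly 90% is assigned to the Rating 2 range as an assumption, and recoveries below 80% return None because the criteria do not address them explicitly.

```python
def recovery_ceiling(recovery_pct):
    """Highest analytical-method rating that a recovery value can support.

    The ranges come from the rating criteria; the result is only an upper
    bound, since the other conditions for each rating must also hold.
    """
    if 95 <= recovery_pct <= 105:
        return 3
    # ASSUMPTION: the shared boundary at 90% is treated as Rating 2.
    if 90 <= recovery_pct <= 110:
        return 2
    if 80 <= recovery_pct < 90 or recovery_pct > 110:
        return 1
    return None  # below 80%: not covered explicitly by the criteria


for pct in (100, 92, 85, 115, 70):
    print(pct, recovery_ceiling(pct))
```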

With the above definitions it is expected that trained evaluators will derive the same ratings. Table 1 reproduces the manual worksheet for raw egg white which shows the ratings assigned to each of 8 selenium values found in the literature.

TABLE 1

Manual Worksheet for Rating Raw Egg Whites

             <---------- Data Quality Criteria Ratings ---------->
             Number                                   Analytical
             of       Analytical  Sample    Sampling  Quality      Quality
Description  Samples  Method      Handling  Plan      Control (b)  Index    Comments

White        1        2           1         2         0            1.2      Duplicates, No Quality Control (QC) Document.
Albumen      1        1           1         0         0            0.6      No Sampling Plan or QC Document.
White        2        2           2         2         0            1.6      No QC Document.
White        1        0           1         2         0            0        No Method Validation
White        3        2           1         0         0            1.2      Canadian
White        1        2           2         0         0            1.0      Canadian; Part Triplicates
White        2        0           0         1         0            0        No Method Validation
White        1        1           0         0         0            0        Mercury Contaminated Feed

Confidence Code = B. (a)

(a) The Confidence Code is derived from the sum of the Quality Indexes of the acceptable studies (QI of 1.0 or greater). In this case the sum is equal to 5.0 (1.2 + 1.6 + 1.2 + 1.0).

(b) It is interesting and disappointing to see that analytical quality control measures are almost universally not reported.
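The Quality Index column of Table 1 can be reproduced with a short calculation. The sketch below is illustrative only. The QI is taken as the simple mean of the five ratings, and the exceptions are not spelled out in the text, so the rule used here (a rating of 0 for analytical method or sample handling forces the QI to 0) is an assumption inferred solely from the rows of Table 1. All names are hypothetical.

```python
CATEGORIES = ("number_of_samples", "analytical_method", "sample_handling",
              "sampling_plan", "analytical_quality_control")

# ASSUMPTION: inferred from the rows of Table 1 only, not stated in the text.
VETO_CATEGORIES = ("analytical_method", "sample_handling")


def quality_index(ratings):
    """Mean of the five ratings, forced to 0 if a veto category is rated 0."""
    named = dict(zip(CATEGORIES, ratings))
    if any(named[c] == 0 for c in VETO_CATEGORIES):
        return 0.0
    return round(sum(ratings) / len(ratings), 1)


# Ratings copied from Table 1 (raw egg white), in column order.
TABLE_1 = [
    (1, 2, 1, 2, 0),
    (1, 1, 1, 0, 0),
    (2, 2, 2, 2, 0),
    (1, 0, 1, 2, 0),
    (3, 2, 1, 0, 0),
    (1, 2, 2, 0, 0),
    (2, 0, 0, 1, 0),
    (1, 1, 0, 0, 0),
]

indexes = [quality_index(row) for row in TABLE_1]
acceptable_sum = round(sum(qi for qi in indexes if qi >= 1.0), 1)
print(indexes)
print(acceptable_sum)
```

Under this assumption the computed indexes match the worksheet, and the four acceptable studies sum to the 5.0 quoted in footnote (a).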

SELEX IMPLEMENTATION

The initial SELEX implementation was written in ART (the Automated Reasoning Tool) on a VAXStation II. The main inferencing mechanism was backward-chaining (goal-driven reasoning), although approximately 10% of the rules were forward-chaining (data-driven reasoning). The system was driven backwards from the so-called "rating rules," which generated an integer rating from 0 to 3 for each of the 5 major categories. The system was then rewritten as completely forward-chaining because the automatic goal-generating mechanism of ART made response times unacceptably slow for users. The forward-chaining ART version was then converted to CLIPS (the C Language Integrated Production System) (3), a forward-chaining rule-based system which uses the Rete pattern-matching algorithm also used by ART and the computer language OPS5. Examples of two rules and their English translations are given in Figure 2.

CLIPS was written by NASA's Artificial Intelligence Section, Mission Planning and Analysis Division at the Johnson Space Center (3). CLIPS provided three immediate benefits. First, the CLIPS syntax is based closely on ART syntax so that SELEX could be ported quickly. Second, because CLIPS was written in standard C, it will run on any machine which has a suitable C compiler. This is particularly important because ART runs on a limited number of computers. Third, the source code was provided along with a built-in mechanism for adding functions so that extending and customizing CLIPS for SELEX was easily accomplished. For example, two extensions to CLIPS provide SELEX with the capabilities of verifying user input and keeping an audit trail file which contains the sequence of questions and the user's input for each session. The final system consists of approximately 200 rules and currently is implemented on VAX VMS and IBM PC MS-DOS machines, such as the IBM AT.
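The two extensions mentioned above, input verification and an audit trail, can be illustrated with a short sketch. This is not the C code added to CLIPS. The function name and its parameters are hypothetical, and an in-memory list stands in for the audit trail file.

```python
def ask_yes_no(question, read=input, audit=None):
    """Ask until the reply is Y or N; record every question and reply."""
    while True:
        reply = read(f"{question}\nResponse (Y or N): ").strip().lower()
        if audit is not None:
            audit.append((question, reply))  # stand-in for the audit file
        if reply in ("y", "n"):
            return reply == "y"


# Scripted replies stand in for a live user: one invalid entry, then "y".
replies = iter(["maybe", "y"])
trail = []
answer = ask_yes_no("Was the food preparation documented?",
                    read=lambda prompt: next(replies), audit=trail)
print(answer, trail)
```

The invalid entry is rejected and the question repeated, while both replies remain in the trail so that a session can be reviewed afterwards.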

As already stated, SELEX derives ratings for five major categories of evaluation: number of samples, analytical method, sample handling, sampling plan, and analytical quality control. SELEX gathers information by a process of intelligent questioning of the user. The system was designed so that only pertinent questions are asked. The responses are provided in accordance with information derived from the publication containing the selenium value to be rated. Depending upon the responses, SELEX can produce a rating for each category from as few as 6 to as many as 65 answers. Approximately 90% of the questions require only a yes or no response, with the remaining 10% requiring numeric input. A portion of a sample session with SELEX is shown in Figure 3. As soon as SELEX has enough information to determine a rating for each of the five categories, the ratings are written to a file along with associated information such as a publication reference number and a description of the food. Periodically, this file is merged with a master file containing information from previously evaluated data. The master file is then analyzed with a SAS program which calculates a QI, a mean selenium value for each food, and a Confidence Code (CC) for that mean. The CC is derived from the QI's for all acceptable selenium values pertaining to a particular food.

SELEX VALIDATION

During development, SELEX was validated in two distinct ways. First, several of the 65 post-1960 publications reporting original analytical selenium data for foods (drawn from 33 different journals, reports, proceedings, and books), all of which had been manually evaluated by the domain experts, were run through SELEX. Where the manual ratings and the expert system ratings differed, the differences were examined. When necessary, existing rules were clarified or changed. Also, if needed, additional rules were written to assure a correct evaluation. Second, hypothetical cases were run through the system to validate decision paths which were not encompassed by actual data from the publications. Validation will continue until the domain experts are satisfied that SELEX performs at an acceptable level.
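The first validation step, comparing manual ratings with those produced by the expert system, amounts to a category-by-category comparison. A minimal sketch follows. The function name is hypothetical and the ratings shown are invented for illustration, not taken from an evaluated publication.

```python
def rating_differences(manual, automated):
    """Return {category: (manual, automated)} for categories that disagree."""
    return {
        category: (manual[category], automated[category])
        for category in manual
        if manual[category] != automated[category]
    }


# Invented ratings for one selenium value, for illustration only.
manual = {
    "number_of_samples": 1,
    "analytical_method": 2,
    "sample_handling": 1,
    "sampling_plan": 2,
    "analytical_quality_control": 0,
}
automated = dict(manual, sample_handling=2)

differences = rating_differences(manual, automated)
print(differences)
```

Each reported category points to a rule that may need to be clarified, changed, or supplemented.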

BENEFITS and CONCLUSIONS

SELEX has several benefits over the original manual rating system. They are:

1. The manual system and the rules developed for SELEX incorporate knowledge from several domain experts who have complementary expertise. Therefore, the knowledge base is both broader and deeper than if only one expert had been used. With these rules incorporated in SELEX, publications can be rated by users who have less expertise than the domain experts.

2. During the process of formally defining the rating criteria as a rule set for SELEX, it was necessary to refine or restate some of the original criteria in more detail. Therefore, SELEX should produce more consistent results.

3. The formalization of the knowledge base facilitates its transfer to other users.

4. SELEX speeds the evaluation process and automatically maintains detailed records (audit trail) for each session.

5. SELEX reduces the "human error" factor by minimizing transcription, data entry, and calculation errors. The determination of a rating for a category, e.g., analytical method, results from the synthesis of several pieces of information. SELEX minimizes the errors that may be caused by the omission of information.

6. Since new publications with selenium data are evaluated intermittently, SELEX eliminates the need for the users to continually refamiliarize themselves with the complex set of heuristics.

The overall benefit, of course, is that SELEX will improve the definition and evaluation of the quality of the information available to identify any selenium-cancer correlation, since the results will be more accurate using an automated (objective) method rather than a manual one.

Although SELEX reduces the need for domain expertise, the user must have a certain level of understanding of analytical chemistry and nutrition science. Further refinement should reduce the level of expertise required by the user. SELEX will be generalized so that it is valid for the evaluation of published data from areas outside the United States. SELEX will provide the foundation of an expert system which can be adapted to evaluate data for a variety of nutrients. Part of this effort will include the engineering of an expert system which will automatically build rule sets for each nutrient. Such an expert system is possible because the structure of SELEX can be utilized as the template for new rules; the five categories and the rating process for each will be similar for many, if not most, nutrients. For example, the rating criteria for analytical method will always include the use of a standard analytical method or methods, the description of non-standard analytical methods, validation of these analytical methods, and the use of reference materials. The expert system will be able to query the user about the specific details for each nutrient and generate a rule set which is analogous to the SELEX rule set.

REFERENCES:

1. J.M. Holden, A. Schubert, W.R. Wolf, and G.R. Beecher, Food and Nutrition Bulletin, 9 (Suppl. - Food Composition Data: The User's Perspective), (1987).

2. A. Schubert, J. Holden, W. R. Wolf, J. Am. Diet. Assoc., 87 (1987) 285.

3. Gary Riley or Chris Culbert, NASA/Johnson Space Center, Mission Planning & Analysis Division, Artificial Intelligence Section - FM72, Houston, TX 77058.

4. D. R. Lide, Jr., Science, 212 (1981) 1343.

Figure 2.

Two rules used to determine a rating for sample handling. The first rule asserts a rating from information that has been obtained from the user. The second rule is an example of a rule which queries the user for information. Each rule is followed by an English translation.

(defrule Rating-sample-handling-10
   (declare (salience 100))
   (seeking-rating sample-handling)
   (homogenization-validation-data optimal)
   (moisture-level-documented false)
   =>
   (assert (rating sample-handling 2)))

Translation of rule Rating-sample-handling-10:

If you are seeking a rating for sample handling and the homogenization validation data is optimal and the moisture level was not documented, then the rating for sample handling is 2.

NOTE: This rule has a declared salience of 100. The system will "fire" this rule ahead of rules with lower salience. In this case we want rating rules to fire ahead of information gathering rules such as the one below (rules with no declared salience are assigned a default salience of 0) because once SELEX can determine a rating, no further information is needed. This exemplifies one key element of expert systems - intelligent questioning.

(defrule Food-preparation-documented
   (seeking-rating sample-handling)
   (or (perishable-food false)
       (shipping-and-storage-appropriate true)
       (shipping-and-storage-documented false))
   (not (food-preparation-documented ?))
   =>
   (if (y-or-n-p 3060 0 "Was the food preparation documented")
    then (assert (food-preparation-documented true))
    else (assert (food-preparation-documented false))
         (assert (food-preparation-appropriate true))))

English translation for rule Food-preparation-documented:

If you are seeking a rating for sample handling and either the food is not perishable or the shipping and storage procedures were appropriate or the shipping and storage procedures were not documented and it is not known whether or not the food preparation was documented, then ask the yes-or-no question "Was the food preparation documented?". If the answer is yes then assert that the food preparation was documented or else assert that the food preparation was not documented and assume that the food preparation was appropriate.
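The effect of salience on the order in which rules fire can be illustrated outside CLIPS with a small forward-chaining loop. The sketch below is a simplification and not the SELEX implementation. It re-evaluates every rule on each cycle instead of using the Rete algorithm, it omits the "or" clause of the question rule, and it drops the goal fact once a rating is assigned, which is an assumption about how further questioning is stopped. All Python names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]
    salience: int = 0  # rules with no declared salience default to 0


def run(rules, facts):
    """Fire eligible rules one at a time, highest salience first."""
    fired = []
    while True:
        agenda = [r for r in rules
                  if r.name not in fired and r.condition(facts)]
        if not agenda:
            return fired
        rule = max(agenda, key=lambda r: r.salience)
        rule.action(facts)
        fired.append(rule.name)


asked = []  # questions put to the (simulated) user


def assign_rating(facts):
    facts["rating"] = 2
    # ASSUMPTION: dropping the goal is how this sketch stops further questions.
    del facts["seeking"]


def ask_food_preparation(facts):
    asked.append("Was the food preparation documented?")
    facts["food-preparation-documented"] = True  # scripted "yes"


rules = [
    Rule("Food-preparation-documented",
         lambda f: f.get("seeking") == "sample-handling"
         and "food-preparation-documented" not in f,
         ask_food_preparation),
    Rule("Rating-sample-handling-10",
         lambda f: f.get("seeking") == "sample-handling"
         and f.get("homogenization-validation-data") == "optimal"
         and f.get("moisture-level-documented") is False,
         assign_rating, salience=100),
]

facts = {"seeking": "sample-handling",
         "homogenization-validation-data": "optimal",
         "moisture-level-documented": False}
fired = run(rules, facts)
print(fired, asked, facts["rating"])
```

Both rules are eligible at the start, but the rating rule has the higher salience and fires first. Once the rating is known the question rule no longer matches, so the user is never asked an unnecessary question.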

Figure 3. Part of a typical session with SELEX. This portion represents the rating process for sample handling for a hypothetical example.

=============================================================

Now seeking a rating for sample-handling for selenium.

=============================================================

Was the sample handling procedure documented?

Response (Y or N): y

Was the sample food perishable?

Response (Y or N): y

Were the shipping and storage procedures documented?

Response (Y or N): n

Was the food preparation documented?

Response (Y or N): y

Was the method of food preparation appropriate?

Response (Y or N): y

Was only the edible portion of the food analyzed?

Response (Y or N): y

Was homogenization of the sample required?

Response (Y or N): n

Was the sample moisture level documented?

Response (Y or N): y

Was the moisture level of the sample appropriate?

Response (Y or N): y

The rating for sample-handling is 2.

Caption for Figure 1:

System overview of the selection of selenium core foods and evaluation of published data.