Providing consistent data evaluation is critical to
scientific studies. An expert system for evaluating the efficacy
of the reported methodology for determining aqueous solubility is
described and compared with two other similar manual data quality
evaluation systems. The expert system, SOL, is a post-peer
review filter for data evaluation. SOL has been designed to run
on any IBM-PC compatible computer using the CLIPS public domain
expert system shell.
Data quality is an important issue frequently
underemphasized in scientific publications. Without accurate
data, proper decisions cannot be made. With the current
information explosion, more and more data are being reported. A
thorough evaluation of the quality of the data is often slighted
in the rush to publish. In the past it was possible to either
check a few journals or check with a few colleagues to find the
specific data required for a project. Today, that is no longer
the case since the scientific community has become much larger
and the number and diversity of compounds that people are
studying have become much greater (1). The work described here is
an attempt to define the quality of published data and provide a
way to improve the quality of data that will be published in the
future.
As Lide has pointed out (1), CODATA, the Committee on Data
for Science and Technology of the International Council of
Scientific Unions, has proposed a three category scheme for
classifying scientific and technical data. Class A data are
repeatable measurements on well-defined systems. This would
include data which are subject to verification by repeating the
measurements in different labs at different times. Class B data
are observational data, which are dependent on such variables as
time, or in the case of agrochemical studies, they are dependent
on the nature of the soil in a particular location. Often they
cannot be repeated, and vary from one lab to another. Class C
data are statistical data, which would include health data,
demographic data, and so on. For class A data it should be
possible to devise an acceptable method for evaluation of the
data. That is, it should be possible to provide experimental
evidence that the data are acceptable.
As concern over the quality of the nation's groundwater has
increased in the past few years, the U.S. Agricultural Research
Service (ARS) has been charged with the task of trying to model
or predict the effects of pesticide contamination of groundwater.
With the increased emphasis on the use of simulation and
modeling, it is critical that accurate data be used. The
parameters required for developing models of groundwater
contamination include a number of fundamental chemical and
physical properties of pesticides: aqueous solubility, vapor
pressure, water/octanol partition coefficients, hydrolysis and
photolysis rate constants, Henry's law constant, soil absorption,
and so forth. We believe that all these types of data, except
soil absorption data, fall within the CODATA definition of Class A
data. Aqueous solubility is certainly an essential property for
studies in this field, because pesticide solubility
plays a major role in determining how much of a chemical can be
dissolved and carried into the groundwater. Providing consistent
and objective evaluation of physico-chemical data is critical both
for planning future physico-chemical studies and for the
effective use of data in modeling, assessments, and other
activities.
As a search of literature containing aqueous solubility data
was undertaken, it quickly became clear that the variations
in reported results were much greater than the expected
scientific measurement errors could justify. Examples of the
data variability for the aqueous solubility of representative
pesticides can be seen from the plots of log solubility vs.
1/Temperature (K) in Figure 1. The plots of log (base 10)
solubility vs. the inverse of the temperature (in Kelvin) should
yield a straight line, but not a vertical straight line (such as
one might consider drawing based on the predominance of the data
in Figure 1a). The data in Figure 1e and the other plots
are labeled as to their source (e.g., a handbook or the scientific
literature; in the second plot in Figure 1e the squares represent
data from the literature) and their rating using the SOL expert
system. However, such labeling seems to have little significance,
since most of the values come from handbooks and the literature
without reference or proper experimental information, and hence
receive the rating of "U", or unknown quality (described later in
the text). In addition to the values shown in these figures,
there were other values from the scientific literature that had no
specific temperature stated for the solubility reported and hence
could not be plotted. These examples clearly show the overall
inconsistency in the quality of the data reported in the
literature and in handbooks, as well as the lack of quality
control of data reported in these sources.
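The expected straight-line behavior can be checked numerically. The Python sketch below fits log10(solubility) against 1/T by ordinary least squares. The function name and the data points are hypothetical: the points are generated from an assumed line purely for illustration and are not measurements from this study or from any database.

```python
import math

def fit_log_solubility(temps_k, solubilities_mg_l):
    """Least-squares fit of log10(S) = intercept + slope * (1/T).

    Returns (slope, intercept, max_abs_residual). A large residual
    flags points that depart from the expected straight line.
    """
    xs = [1.0 / t for t in temps_k]
    ys = [math.log10(s) for s in solubilities_mg_l]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)]
    return slope, intercept, max(residuals)

# Hypothetical illustration only: points generated from an assumed line
# log10(S) = 6.0 - 1500 / T, not from any measured data set.
temps = [283.15, 293.15, 298.15, 303.15]
sols = [10 ** (6.0 - 1500.0 / t) for t in temps]
slope, intercept, worst = fit_log_solubility(temps, sols)
```

If every value in a set is reported at the same temperature, `sxx` is zero and the fit is undefined; this is the "vertical straight line" case, in which scattered values at one temperature carry no information about the temperature dependence.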
Fenthion was the first pesticide that we examined closely
for the quality of reported aqueous solubility data (2).
Fenthion was introduced in 1957 and is used as an insecticide,
primarily for ornamental plants. In this study, we found that
most of the aqueous solubility values in the latest editions of
handbooks were old and dubious, all without reference
to the source of the information. In general, further scrutiny
of the values reported for aqueous solubility of pesticides made
it clear that for any credibility to be given to groundwater
modeling and chemical assessment efforts, the input data must be
We do not use the mean value or the value most frequently reported in the literature (primarily journals and handbooks), since it is not clear that the numbers are from different experiments. Furthermore, reported values can cluster around multiple values, as seen in Figure 1e. There are 8 values for the solubility of atrazine at 20 °C to 27 °C clustered about 30 mg/L or ppm (1.4 × 10^-4 mol/L); in the same temperature range there are 4 values clustered at 70 mg/L or ppm (3.2 × 10^-4 mol/L). Therefore a majority vote for the "best value" should not be considered an acceptable approach for evaluating solubility data.
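The molar values quoted for the two atrazine clusters follow from a simple unit conversion, sketched below in Python. The function and constant names are illustrative; the molar mass of atrazine (C8H14ClN5, 215.68 g/mol) is not given in the text and is supplied here as a standard value.

```python
ATRAZINE_MOLAR_MASS = 215.68  # g/mol for C8H14ClN5 (value supplied here, not from the text)

def mg_per_l_to_mol_per_l(mg_per_l, molar_mass_g_per_mol):
    """Convert a mass concentration in mg/L to a molar concentration in mol/L."""
    grams_per_l = mg_per_l / 1000.0
    return grams_per_l / molar_mass_g_per_mol

# The two clusters of reported atrazine solubilities described in the text.
low_cluster = mg_per_l_to_mol_per_l(30.0, ATRAZINE_MOLAR_MASS)
high_cluster = mg_per_l_to_mol_per_l(70.0, ATRAZINE_MOLAR_MASS)

print(f"30 mg/L = {low_cluster:.1e} mol/L")
print(f"70 mg/L = {high_cluster:.1e} mol/L")
```

The two clusters differ by a factor of about 2.3, far more than measurement error would explain, so neither the mean nor the more populous cluster can be taken as the "best value" without examining the underlying methods.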
Providing consistent and objective evaluation of physico-chemical studies is critical for planning future similar studies
and effectively using data in modeling, assessments, and other
activities. Since the current collection of solubility data was
less than credible, it was desirable to develop a method to
evaluate, to the extent possible, the quality of the data which
have been reported. Lacking expertise in the analytical
chemistry of solubility, we (SRH and DWB), the ARS
scientists, initiated a joint project between the ARS and the
Organic Analytical Research Division of the National Institute of
Standards and Technology (NIST), formerly the National Bureau of
Standards (NBS). The basic assumption in the critical
evaluation of aqueous solubility data, using an expert system
approach, is that sound methods should give good data. While a
good evaluation system cannot rectify a poor measurement, it can
assess the way the measurements were made and allow the users of
the data to be aware of their limitations and/or problems. The
experience gained from the development of the SELEX Expert System
for evaluating published data on selenium in foods (3) led to the
rapid development of the ARS SOL aqueous solubility expert system
described in this paper.
Data Quality Criteria
The first step in developing the Solubility Expert System,
SOL, was to decide which criteria would be used in the evaluation
scheme. Other criteria schemes for evaluating solubility have
been developed by Yalkowsky (4) (Table 1) and Kollig (5)
(Table 2).
The works of Yalkowsky and Kollig provided a foundation on
which this work is based. We produced a list of important
factors that should be considered in a laboratory which plans to
generate high quality aqueous solubility data. The final
criteria were chosen from an analysis of the two existing schemes
plus additional criteria which we felt to be most important and
practical for the task at hand. Practicality was a major
consideration in the final determination of the criteria that
would be included in the SOL expert system. We feel that SOL is
a robust system, but we chose to ask only the most important
questions and deleted any questions which we felt would not
affect the final results of the data quality rating.
The Yalkowsky scheme, shown in Table 1, rates the quality
with which the data are reported by assigning points and averaging
the five criteria. It provided an excellent starting point. In this
scheme, all five criteria are weighted equally. However, when we
looked into programming the scheme, a number of difficulties were
discovered regarding the specificity of the criteria. The
primary difficulty was felt to be that this scheme does not
differentiate data quality sufficiently, owing
to the broad definitions of rating categories for a given
parameter. For example, in Yalkowsky's scheme the parameter of
purity gives the same rating to a compound of 50%, 90%, and 99.5%
purity, so long as the range of the purity is given. The five
categories used in the Yalkowsky scheme are also in the ARS/NIST
scheme (Figures 2a and 2b). Yalkowsky does not prioritize or
weight his data quality parameters, but rather gives equal weight
to all criteria. The fundamental difference between the Yalkowsky
scheme and the ARS/NIST scheme presented here is that we use a
weighting factor for all criteria or parameters.
The Yalkowsky solubility database (6,7) is the most comprehensive database of solubility data. Because the database aims to be comprehensive, Yalkowsky has chosen to have the user decide which data to use. There is little critical evaluation of the data, and values are listed for aqueous solubility which, in some cases, are highly questionable (e.g., see fenthion, Figure 1a). The database does have the considerable benefit of referencing all reported values, but many are from handbooks or other compilations which do not refer back to the original literature. Therefore, it is not possible to check the validity of the value or learn about the experimental conditions used to determine the reported value. Solubility values for which the temperature is not stated (i.e., either not stated at all or stated only as room temperature) were considered not worth entering into our database. The data which Yalkowsky extracts directly from literature citations may be useful as part of the regulatory process, but are not always internally consistent. For example, listing the solubility as "insoluble", "almost insoluble", "moderately soluble", "nearly insoluble", "practically insoluble", "very soluble", "very slightly soluble", "sparingly soluble", or "soluble" was felt to be too subjective. Certainly, if a student in a quantitative analytical chemistry course reported a solubility using one of these terms, it is not likely the student would receive a passing grade for that experiment. However, as demonstrated by the data collected by Yalkowsky, data compilations are blindly accepted by some groups in the scientific community. With the recent public concern about the quality of scientific studies (8), an expert system such as SOL has the potential to help alleviate concerns about the soundness of scientific experiments and help reduce cases of blind acceptance of data results.
One colleague has even suggested that such "expert systems could become codes of experimental procedure and documentation much more usable and complete than written manuals" (9). While Yalkowsky makes the reader choose which solubility value to use, the ARS/NIST criteria provide the reader with more information, and effectively make the choice for the user as to which solubility value to use.
For our database and evaluation scheme we decided that,
without a stated temperature, the aqueous solubility value should
not be accepted. Reporting the temperature as "room temperature"
or "ambient" means different things to different people. Such
terminology is not very specific, since room temperature covers a
range of more than ten degrees Celsius. For example, the
solubilities of a few insecticides from the work of Bowman and
Sans (9), shown in Table 4, reveal considerable variations in
solubility over the temperature range from 20 °C to 25 °C, often
referred to as "room temperature".
In addition, the purity of the solute must be at least 95% and the actual value for purity must be known if the information is to be useful. While a purity of less than 95% may be acceptable for some purposes, such as those found in regulatory agencies where the actual commercial formulations are evaluated, higher purity should be required for a scientific database of reference data. This is primarily due to the extreme bias impurities introduce. In a recent study (11) it was found that a small amount of the more soluble phenanthrene as an impurity in anthracene caused the apparent solubility of anthracene to be measured as 0.075 mg/L instead of 0.045 mg/L. The correct value is obtained when ultra-high purity anthracene is used, or when techniques are employed that isolate the anthracene signal from that of phenanthrene, such as fluorescence (12) (for optical resolution) or liquid chromatography (for separation in time, i.e., time resolution).
Yalkowsky's evaluation for the process involved in
generating saturated solutions (equilibrium - agitation time), is
good for any evaluation scheme. In our scheme an acceptable
analytical method must be given. For the "analysis" parameter,
we believe that unless the analytical method is stated, the data
have insufficient merit to be used. Lastly, for the "accuracy
and precision" parameter we felt that the number of significant
figures did not take account of the absolute value
of the aqueous solubility. It did not provide sufficient
differentiation between errors in solubility values in the parts
per million (ppm) and higher range as compared to the parts per
billion (ppb) range. In the latter case (ppb) larger percentage
errors are the norm since one is measuring values towards the
lower limit of instrument sensitivity. The Yalkowsky scheme does
not allow for the fact that state-of-the-art experimental
precision may still give a large relative error when a very low
solubility is measured. For example, we feel that reporting a
value of 10 ppb with an error of 2 ppb is quite acceptable. This
would be given a rating of "2" in our system (on a scale of 1-4,
with 1 being the highest possible rating) while 10 ppm with an
error of 2 ppm is less acceptable, and would be given a rating of
"4" (the lowest rating) in our scheme. (Furthermore, a rating of
"4" at this point would end the questioning and exit the user
from the SOL system with the final rating value being "4".) In
the Yalkowsky scheme both would be given a zero, the lowest
rating for that parameter.
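The comparison above can be made concrete with a small Python sketch. It computes the standard error as a percentage of the mean for the two cases quoted in the text. The function name is illustrative, and only the two ratings stated in the text are recorded; no rating thresholds are assumed or implied.

```python
def percent_standard_error(mean_value, standard_error):
    """Standard error expressed as a percentage of the mean value."""
    if mean_value <= 0:
        raise ValueError("mean solubility must be positive")
    return 100.0 * standard_error / mean_value

# The two cases quoted in the text, with the ratings the text assigns them.
# Only these two ratings come from the text; no thresholds are implied.
cases = [
    {"label": "10 ppb +/- 2 ppb", "mean": 10.0, "error": 2.0, "stated_rating": 2},
    {"label": "10 ppm +/- 2 ppm", "mean": 10.0, "error": 2.0, "stated_rating": 4},
]
for case in cases:
    case["percent_error"] = percent_standard_error(case["mean"], case["error"])
    print(f'{case["label"]}: {case["percent_error"]:.0f}% error, '
          f'SOL rating {case["stated_rating"]}')
```

Both cases have the same relative error, so a scheme that looks only at the relative error or at the number of significant figures cannot separate them. The concentration range must enter the rating as well.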
The Kollig solubility evaluation scheme (5), shown in Table 2, is part of a larger plan which discusses criteria for evaluating the reliability of data on 12 environmental parameters. Only the criteria for the evaluation of aqueous solubility are included in Table 2. The Kollig evaluation criteria are extensive, having 31 general questions about the experiment and 6 specific questions about the experimental data for solubility. Very few, if any, published reports are detailed enough to answer all of the questions. Even if it were possible to do so, it would be impractical to expect anyone to spend the necessary time answering all these questions in an evaluation scheme. While the Yalkowsky scheme seems rudimentary, the Kollig scheme seems to include more questions than are absolutely necessary to determine data quality. In addition, the Kollig scheme does not have a numerical score. We believe that it is unnecessary to answer all 37 questions to obtain an unambiguous decision on the quality of the data, and it is unlikely that anyone would use a rating scheme which required so much time and effort.
In addition, some of the questions are too ambiguous to be answered reproducibly by different analytical chemists. For example, any answer to the question "Could you repeat the analytical part with the available information?" is subjective without work on the part of the reader. Other questions are not relevant to the quality of the data. For example, the question "Is a clear objective stated?" has little connection with data quality, but rather probably relates to why a given level of accuracy was acceptable for that experiment. The question "Was the paper peer-reviewed?" may not be meaningful with regard to data quality, for a number of reasons. One is that years after the paper was published part or all of the methodology used may have been discovered to be unreliable. Another is that the thrust of the paper may not have had to do with a specific physical or chemical parameter, and hence the reviewer may not have examined or reviewed specific data reported in the paper. It is not clear how one could readily answer the question "Do independent lab data confirm results?", as agreement with other studies is not usually presented. If there are no estimated data or an acceptable method for estimation, the question "Do estimated data confirm results?" cannot be answered. Lastly, for solubility data, asking "Was the chemical studied at a minimum of three temperatures?" would eliminate the vast majority of the reported data.
Analyzing the Yalkowsky and Kollig strategies led to the
creation of an evaluation scheme with the seven parameters shown
in Table 5 and described in detail in Figures 2a and 2b. The
considerations in developing the scheme were that the scheme had
to give a reasonable answer (that is, one which a researcher in
the field would reasonably expect) and the system had to be easy
to use. In addition, the method was ordered so as to ask the
minimum number of questions in order to get a realistic data
quality rating. That is, not all questions need be asked if an
early answer in the sequence of questions produces the same
rating as going through the entire sequence of
questions. For example, if an early answer gives a "2" rating
and subsequent responses would not raise the rating, the program
terminates. If a critical question, such as solute purity or
temperature, cannot be answered, then it was also deemed
unnecessary to ask any further questions. We base the rating in
our scheme on the lowest rating for any of the seven criteria and
do not average the sum of all the criteria as is the case in the
Yalkowsky scheme.
The ease of use, credibility of the experts who established
the criteria, acceptability, and practicality of the scheme are
considered critical if the scheme is to be accepted and used.
Thus, additional questions which did not change the rating of a
solubility value, were eliminated from the system. Furthermore,
as described above, questioning terminates as soon as the
ultimate rating is determined. The resulting questions are
considered the minimum needed to obtain an objective and
reproducible rating value. While these criteria of the SOL
expert system may or may not be better than other published
criteria, it is felt that they are clearly more practical.
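The difference between taking the lowest rating and averaging, together with the early termination described above, can be sketched in Python as follows. This is a simplified illustration and not the SOL rule base: the function names and the per-criterion ratings in the example are hypothetical, and the stopping rules are reduced to the two cases stated in the text (an unanswerable critical question, and a "4" rating).

```python
from statistics import mean

def sol_style_rating(answers):
    """Combine per-criterion ratings given in priority order.

    Each answer is an integer 1 (best) to 4 (worst), or "U" when the
    question cannot be answered. Returns (overall_rating, questions_asked).
    The overall rating is the worst individual rating; questioning stops
    early once the outcome can no longer change.
    """
    worst = 1
    asked = 0
    for rating in answers:
        asked += 1
        if rating == "U":
            return "U", asked      # unanswerable: no further questions
        worst = max(worst, rating)
        if worst == 4:
            return 4, asked        # already the poorest numeric rating
    return worst, asked

def averaged_rating(scores):
    """Equal-weight averaging, shown on the same 1-4 scale only for contrast."""
    return mean(scores)

# Hypothetical ratings for seven criteria, in priority order.
example = [1, 2, 1, 3, 1, 2, 1]
overall, asked = sol_style_rating(example)
average = averaged_rating(example)
```

For the example list, the lowest-rating rule reports "3" because one criterion was only acceptable, while the average of about 1.6 would hide that weakness behind the stronger criteria.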
Expert System Development
Once the criteria were selected, a computer system (of
approximately 40 rules) concerned with the seven factors listed
in Table 5 was created to evaluate the quality of reported
aqueous solubility data. The system used was the NASA-developed
C Language Integrated Production System, known as CLIPS (12),
a public domain expert system shell (13). CLIPS is written in
the C programming language and is basically a forward chaining
rule-based system based on the Rete pattern-matching algorithm
(14). Having the program in the C language makes it portable to
other computer systems. In fact, the CLIPS expert system shell
now runs on the IBM compatible PC computers, the Apple Macintosh
computer series, as well as the DEC VAX series of computers.
The evaluation scheme is based on approximately 40 rules which are concerned with seven general categories, as shown in Table 3. The categories are given in the priority order which we think reflects their scientific importance in determining the quality of aqueous solubility data. These criteria were ordered into a decision tree structure. The way in which the first four SOL rules used in the expert system were programmed in CLIPS is shown in Figure 3. The purpose of this figure is to show how simply the rules are actually written. One can also see that a rule is a simple set of a few lines of computer code. We hope this figure will dispel any notion the reader might have that expert systems must be sophisticated and complex computer programs.
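To illustrate how little logic a single question involves, the Python sketch below maps a reported solute purity onto the four answer choices (a-d) offered by the purity question in Figure 3. It is an illustration in a different language, not part of the SOL rule base. The function name is hypothetical, and treating exactly 100% as choice "a" is an assumption, since the menu lists "99% <= Purity < 100%".

```python
def purity_category(purity_percent=None):
    """Return the menu choice (a-d) of the solute purity question.

    a: 99% <= purity (100% is included here by assumption)
    b: 95% <= purity < 99%
    c: purity < 95%
    d: purity not stated (pass None)
    """
    if purity_percent is None:
        return "d"
    if not 0.0 <= purity_percent <= 100.0:
        raise ValueError("purity must be between 0 and 100 percent")
    if purity_percent >= 99.0:
        return "a"
    if purity_percent >= 95.0:
        return "b"
    return "c"

# Example: a solute reported as 97% pure falls in choice "b".
choice = purity_category(97.0)
```

Because the boundaries are fixed, two evaluators who read the same purity from a paper will always select the same choice, which is the sense in which the ratings are reproducible.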
Clearly, the order of the seven categories in Table 5 is subjective, and the rating value decisions in Figures 2a and 2b for each category, or question, are subjective. (For example, the decision to rate 99% purity of the solute as the highest, rather than using 98%, is subjective.) However subjective some researchers may find these criteria and decisions, the ratings are believed to be defined properly and clearly enough that they are reproducible. (Thus, again for discussion purposes, once we have chosen 99% or higher purity of the solute as needed for the highest rating, anyone using the system and entering a purity of 99% or greater will get the same rating result.) In addition, the criteria are objective in that they follow a scientific method of development and analysis. The results of the solubility evaluation range from a rating of "1" (the best) to a rating of "U" (unevaluatable), which means the experimental information provided in the publication, or source, is not sufficient to undertake an evaluation. The ratings and their meaning are as follows:
1 - Highest rating. Experimental method of high quality. Not many data values are expected to meet this high level of quality and few data will get this score.
2 - Good rating. Some parts of the experimental method were below the highest standards. Many experiments published in the literature and elsewhere will meet these criteria.
3 - Acceptable rating. Experimental methods were all defined, but work was performed or reported at a minimal scientific level. Many good experimental values will fall in this rating category, owing to poor reporting, either from a lack of space in the journal, or the secondary nature of the solubility data in the particular reference.
4 - Poor rating. Experimental method was given for all parts of the experiment, but the methods or values indicated poor experimental procedures. Some of the older studies fall in this category, as more recent analytical chemistry has shown some problems with older techniques. In some cases, researchers, often not analytical chemists or appropriately trained, did not undertake the solubility measurements as correctly as needed.
U - Unevaluatable. There is insufficient data/information to evaluate the numerical value presented. U could also stand for unknown, but it was felt the word unevaluatable gets across the point of poor experimental data reporting in a more forceful manner.
A number of published solubility studies were used to
evaluate the rating scheme. The SOL expert system provided
results which are consistent with the subjective opinions of the
solubility experts at NIST and ARS who have tested the SOL
program. For example, the Pesticide Manual (15), Merck Index
(16), and the Agrochemical Handbook (17) do not publish
references or methodologies, and therefore all the aqueous
solubility data from these three sources are rated "U".
The expert system, called SOL, is an IBM-PC based computer program which requires 256K of memory and less than 300,000 bytes of disk storage (which allows it to fit on a low density floppy disk). Help messages have been written to assist users at every step of the evaluation scheme. A typical help message is shown in Figure 4. The SOL expert system program creates two disk files to keep track of the results. One file contains the final SOL expert system rating. The second file contains all the answers a user has entered in response to the questions, so it is possible to repeat an evaluation to see whether similar answers were given at different times or whether different people gave different answers. The system is both easy to use and easy to update with new rules. However, in order to assure consistency in ratings, the system is released only in a computer executable version (with no source code). Should changes need to be made in the SOL system in the future, these will be made and the new executable version distributed to all users of SOL.
The system can be run from either a floppy or hard disk.
Routinely it is run from 80X86 cpu desk-top computers, as well as
from portable computers. The response time for answering each
question is effectively instantaneous on all computers. The SOL
expert system is available upon request from the USDA (19).
In order to test the SOL expert system, we both undertook some experimental work and analyzed some existing literature studies which we felt were of high quality. For the experimental work we measured the solubility of chlorpyrifos and bromacil (57). The results are shown in Figures 1e and 1f, respectively. We believe these results, which lead to high ratings in our evaluation process, show that it is possible to obtain good solubility data by using and reporting good methodology.
Secondly, we analyzed a number of published reports on
aqueous solubility data for three hydrocarbons and one PCB. In
each case the literature citation was obtained and one of the
authors went through the SOL expert system with the literature
reference in hand and answered the questions from the system.
This was repeated for all 22 solubility measurements found in the
15 literature citations (19-33). These results are shown in
Table 5. The ratings are what the reviewer expected
based on a subjective opinion concerning the quality of the
solubility studies. The results of this table show that the
values which have been assigned high ratings (2's and 3's) are
consistent with the reviewer's opinion concerning the data quality.
The SOL expert system also is able to screen out erratic values,
such as the 1.29 × 10^-8 and 6.34 × 10^-8 values for the solubility
of PCB 101. The fact that some apparently good or correct values
have been given low ratings is due to the lack of experimental
detail and information reported. Thus, while some good data may
be rated lower, it does not appear that any reliably reported bad
data have been given a higher rating than they deserve. In a
perfect evaluation system good data would not be rated low, but
without the needed experimental information, that is, correct and
honest documentation of the actual experimental work performed,
which must be available when the SOL expert system is used, it is
not possible to achieve accurate results, let alone perfection.
An easy to use and reproducible data evaluation scheme for
aqueous solubility has been developed. The scheme has been
implemented as an expert system, SOL, for IBM PC based computers.
The SOL program is considered to be user friendly and requires a
minimum number of questions to provide the user with a rating for
the particular solubility study being examined. In testing the
SOL software, the ratings which the system reports are in
agreement with qualitative opinions of the experts who examined
the various articles.
Additional expert systems for evaluating other CODATA Class A physical and chemical parameters are being developed, vapor pressure being the next property under study, and will be used in the evaluation of data for the ARS Pesticide Properties Database (PPD). It is hoped that other solubility data evaluation projects will also make use of the SOL program.
The authors wish to thank Michele M. Schantz and Franklin Guenther, NIST for their experimental work in support of this project. The authors also wish to thank Sam Yalkowsky for his valuable thoughts and comments on solubility data evaluation and Karen Scott for her assistance in the PPD data collection activities.
Figure captions for Figures 1a - 1d:
Title: Examples of Data Variability - Solubility Values are from the ARS PPD or the Yalkowsky SOLUB database. The data are plotted with the y axis being the log (base 10) of the solubility in mg/L and the x axis being 1/Temperature in Kelvin (multiplied by 10-3).
Title: Comparisons of solubility data from the literature and from NIST experiments for chlorpyrifos.
Solubility data from NIST experiments for bromacil.
Footnote to Figures 1a-1f:
The numbers in parentheses next to each point refer to the source (literature reference) of the data, followed by the SOL rating value.
Evaluation Criteria (Weighted 1-5)
1. Is analytical method recognized as an acceptable method? (4)
2. Could you repeat the analytical part with the available information? (5)
3. Was the chemical analyzed within the linear range of the instrumentation? (3)
4. Is the detection limit stated? (2)
5. Was either HPLC or GC used? (4)
6. Is extraction efficiency stated? (3)
1. Could you repeat the experiment with available information? (5)
2. Is a clear objective stated? (1)
3. Is water quality characterized or identified? (2)
1. Were replicate samples analyzed? (3)
2. Were replicate sample systems run? (3)
3. Is precision of analytical technique reported? (3)
4. Is precision of sample analysis reported? (3)
5. Is precision of replicate sample systems reported? (3)
1. Was paper peer-reviewed? (5)
2. Do independent lab data confirm results? (3)
3. Do estimated data confirm results? (2)
Specific Additional Information for Solubility Data
1. Was final equilibrium shown over time? (5)
2. Was the reaction vessel capped? (3)
3. Was the sample kept in a constant temperature bath? (4)
4. Was a thermostated centrifuge used at the same temperature? (4)
5. Was stability of the chemical in water shown? (5)
6. Was the chemical studied at a minimum of three temperatures? (4)
Sample SOL rules as programmed using the CLIPS Expert System Shell program
;;; Start of questions
(printout t crlf "Purity of solute")
(printout t crlf crlf "a. 99% <= Purity < 100%"
crlf "b. 95% <= Purity < 99%"
crlf "c. Purity < 95%"
crlf "d. Purity not stated")
(printout t crlf crlf "Enter purity (a-d): ")
(assert (solute-purity =(read 1010 a d))))
(printout t crlf "Purity of Water Used in Experimental Procedure"
crlf crlf "a. HPLC grade (organic free)"
crlf "b. distilled/demineralized"
crlf "c. demineralized"
crlf "d. distilled"
crlf "e. none of the above"
crlf crlf "Enter purity (a-e): ")
(assert (water-purity =(read 1020 a e))))
(if (y-or-n-p 1030 0 "Was the temperature stated")
then (assert (temperature-stated true))
else (assert (temperature-stated false))))
(if (y-or-n-p 1040 0 "Was the temperature controlled?")
then (assert (temperature-controlled true))
else (assert (temperature-controlled false))))
(if (y-or-n-p 1050 0 "Was the solubility plotted vs. temperature or otherwise corrected for")
then (assert (temperature-corrected true))
else (assert (temperature-corrected false))))
HELP11 - Mean Value for the Solubility
This question is asked to decide what rating will be given
to the standard error reported. Solubility values greater than 1
ppm should be reported to two (2) significant figures. If the
mean solubility value is greater than 1 ppm (answer 1) and there
are fewer than two significant figures (i.e., a "no" answer is
given to the question asking if there are at least two
significant figures in the mean value of the solubility), the
solubility is given a "4" rating. If the answer is "yes", the
program then continues on to ask for the standard error
(described in the next paragraph).
If the answer is category 2 or 3 (solubility less than 1 ppm
or solubility less than 100 ppb), you will then be asked for the
standard error for the solubility value as a percentage of the
mean value. The standard error is a number from 0 to 100.
The Standard Error Rating Table used by the SOL expert system is shown below:
[Table not recovered; surviving column headings: "Maximum for a ..." and "Percent Standard ..."]
Thus, for example, if the reported solubility is 10 ppm
(category a1), then a 10% error yields a "2" rating. A 10% error
for a solubility value of 10 ppb (category c) yields a "1"
rating. A 20% error for a solubility value of 200 ppb (category
b) yields a "2" rating, but would yield a "4" rating if the
solubility value was 100 ppm.
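The first part of this decision flow (assigning a solubility category and applying the significant-figures rule) can be sketched in Python as follows. This is an illustration under stated assumptions, not the SOL source: the function names are hypothetical, the category letters follow the worked examples above, and the treatment of values exactly at 1 ppm or 100 ppb is assumed because the help text does not specify it. The standard error lookup itself is not reproduced, since it depends on the rating table.

```python
def solubility_category(ppm):
    """Classify a mean solubility value given in ppm.

    'a' : 1 ppm or more
    'b' : 100 ppb up to (but not including) 1 ppm
    'c' : less than 100 ppb
    ASSUMPTION: values exactly at 1 ppm and 100 ppb fall in the
    higher category; the help text does not say.
    """
    if ppm < 0:
        raise ValueError("solubility cannot be negative")
    if ppm >= 1.0:
        return "a"
    if ppm >= 0.1:  # 100 ppb expressed in ppm
        return "b"
    return "c"


def mean_value_step(ppm, has_two_sig_figs):
    """Apply the significant-figures rule from HELP11.

    Returns 4 when the solubility is in category 'a' and the mean value
    has fewer than two significant figures. Returns None otherwise,
    meaning the program would go on to ask for the standard error.
    """
    if solubility_category(ppm) == "a" and not has_two_sig_figs:
        return 4
    return None


# Solubilities from the worked examples in the text, in ppm.
EXAMPLES = [(10.0, "a"), (0.010, "c"), (0.200, "b"), (100.0, "a")]
for value, expected in EXAMPLES:
    assert solubility_category(value) == expected
```

Keeping the categorization separate from the rating lookup means the thresholds can be checked against the worked examples without depending on the rating table.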
1. solute purity
2. water purity
3. temperature/temperature control
4. polarity of solute
7. saturated solution methodology
1. Lide, D., Critical Data for Critical Needs, Science, 212,
2. Heller, S. R., Scott, K., and Bigwood, D. W., "The Need for
Data Evaluation of Physical and Chemical Properties of
Pesticides", J. Chem. Inf. Comput. Sci., 29, 159-162 (1989).
3. Bigwood, D. W., Heller, S. R., Wolf, W. R., Schubert, A., and
"SELEX: An Expert System for Evaluating Published
Data on Selenium in Foods", Anal. Chim. Acta, 200, pages 411-419
4. Yalkowsky, S., Presentation at "The Estimation of Physical
Data for Organic Compounds", Beilstein Workshop, Bolzano, Italy,
16-20 May, 1988.
5. Kollig, H. P., "Criteria for Evaluating the Reliability of
Literature Data on Environmental Process Constants", Toxicol.
Environ. Chem., 17, 287-311 (1988).
6. Adb, the ARIZONA dATAbASE of Aqueous Solubility, College of Pharmacy, University of Arizona, Tucson, AZ 85721.
7. See, for example, Science, 244, page 765, 19 May 1989 and
pages 642-646, 12 May 1989.
8. Personal communication, 5 June 1989, from S. Pacenka, Cornell
University, New York State Water Resources Institute, Ithaca, NY
9. Bowman, B. T., and Sans, W. W., "Effect of Temperature on the
Water Solubility of Insecticides", J. Environ. Sci. Health,
B20(2), pages 625-631 (1985).
10. May, W. E., Wasik, S. P., and Freeman, D. H., "Determination
of the Aqueous Solubility of Polynuclear Aromatic Hydrocarbons by
a Coupled-Column Liquid Chromatographic Technique", Anal. Chem.,
50, 1 (1978).
11. Schwarz, F. J., "Determination of Temperature Dependence of Solubilities of Polycyclic Aromatic Hydrocarbons in Aqueous Solutions by a Fluorescence Method", J. Chem. Eng. Data, 22, 273-277 (1977).
12. CLIPS, Catalog # MSC-21208, is available for the IBM PC,
Macintosh, and VAX computers from the NASA COSMIC Software
Catalog, 1988 Edition, 382 East Broad Street, Athens, GA 30602.
The cost is $250 for the software and $62 for the documentation.
13. Raeth, P. G., "Two PC-based Expert System Shells for the
First-time Developer", Computer, 73-81, November 1988.
14. Bridgeland, D. and Lafferty, L., "Scavenger: An Experimental
Rete Compiler", Proceedings of the International Society for
Optical Engineering, 635, 487-496 (1986), Bellingham, WA.
15. "The Pesticide Manual, A World Compendium", 7th edition,
Edited by Worthing, C. R., and Walker, S. B., The British Crop
Protection Council, 144-150 London Road, Croydon CR0 2TD,
16. "The Merck Index", 10th Edition, G. Delko, Editor, Merck &
Co., Inc., Rahway, NJ 07065-0900, 1983.
17. Royal Society of Chemistry, "The Agrochemicals Handbook", 2nd
Edition, Royal Society of Chemistry, The University, Nottingham
NG7 2RD, England, 1987.
18. Requests for the IBM PC version should be addressed to:
Office of Genome Mapping, USDA, ARS, Bldg. 005, Room 333,
Beltsville, MD 20705. It is available only on a 5 1/4 inch
19. Tewari, Y.B., Miller, M.M., Wasik, S.P., and Martire, D.E.,
Aqueous Solubility and Octanol/Water Partition Coefficient of
Organic Compounds at 25.0 °C, J. Chem. Eng. Data, 27, 451-454
20. McAuliffe, C., Solubility in Water of Paraffin,
Cycloparaffin, Olefin, Acetylene, Cycloolefin, and Aromatic
Hydrocarbons, J. Phys. Chem., 70, 1267-1275 (1966).
21. Fühner, H., Die Wasserlöslichkeit in Homologen Reihen, Chem.
Ber., 57, 510-515 (1924).
22. Sanemasa, I., Araki, M., Deguchi, T., and Nagai, H.,
Solubility Measurements of Benzene and The Alkylbenzenes in Water
by Making Use of Solute Vapor, Bull. Chem. Soc. Jpn., 55, 1054-1062 (1982).
23. Polak, J. and Lu, B. C.-Y., Mutual Solubilities of
Hydrocarbons and Water at 0 and 25 °C, Can. J. Chem., 51, 4018-4023 (1973).
24. Nelson, H.D. and DeLigny, C.L., The Determination of the
Solubilities of Some n-Alkanes in Water at Different
Temperatures, by means of Gas Chromatography, Recueil, 87, 528-544 (1968).
25. Miller, M.M., Ghodbane, S., Wasik, S.P., Tewari, Y.B., and
Martire, D.E., Aqueous Solubilities, Octanol/Water Partition
Coefficients, and Entropies of Melting of Chlorinated Benzenes
and Biphenyls, J. Chem. Eng. Data, 29, 184-190 (1984).
26. Dickhut, R. M., Andren, A. W., and Armstrong, D. E., Aqueous
Solubilities of Six Polychlorinated Biphenyl Congeners at Four
Temperatures, Environ. Sci. Technol., 20, 807-810 (1986).
27. Burkhard, L.P., Armstrong, D.E., and Andren, A.W., Henry's
Law Constants for the Polychlorinated Biphenyls, Environ. Sci.
Technol., 19, 590-596 (1985).
28. Haque, R. and Schmedding, D., A Method of Measuring the
Water Solubility of Hydrophobic Chemicals: Solubility of Five
Polychlorinated Biphenyls, Bull. Environ. Contam. & Toxicol., 14,
29. Weil, V. L., Duré, G., and Quentin, K.-E., Wasserlöslichkeit
von Insektiziden Chlorierten Kohlenwasserstoffen und
Polychlorierten Biphenylen im Hinblick auf eine Gewässerbelastung mit diesen Stoffen, Wasser und Abwasser-Forschung, 6,
30. Chiou, C.T., Freed, V.H., Schmedding, D.W., and Kohnert,
R.L., Partition Coefficient and Bioaccumulation of Selected
Organic Chemicals, Environ. Sci. Technol., 11, 475-478 (1977).
31. Dexter, R.N. and Pavlou, S.P., Mass Solubility and Aqueous
Activity Coefficients of Stable Organic Chemicals in the Marine
Environment: Polychlorinated Biphenyls, Mar. Chem., 6, 41-53
32. Bohon, R.L. and Claussen, W.F., The Solubility of Aromatic
Hydrocarbons in Water, J. Amer. Chem. Soc., 73, 1571-1578 (1951).
33. Sutton, C. and Calder, J.A., Solubility of Alkylbenzenes in
Distilled Water and Seawater at 25.0 °C, J. Chem. Eng. Data, 20,
34. Brust, H. F., "A Summary of Chemical and Physical Properties
of Dursban", Down To Earth, 22, 21-22 (1966).
35. Felsot, A. and Dahm, P., "Sorption of Organophosphorus and
Carbamate Insecticides by Soil", J. Agric. Food Chem., 27, 557-563 (1979).
36. "The Pesticide Manual", 4th edition, Edited by Martin, H., The British Crop Protection Council, 144-150 London Road, Croydon CR0 2TD, England, 1977.
37. Table A-2, page 220 in Khan, S. U., "Pesticides in the Soil
Environment", Volume 5 in the series "Fundamental Aspects of
Pollution Control and Environmental Science", Elsevier, Amsterdam,
38. "Farm Chemicals Handbook", Meister Publishing Co., 37841
Euclid Avenue, Willoughby, OH 44094, USA 1988 (216-942-2000).
39. Suntio, L. R., Shiu, W. Y., Mackay, D., Seiber, J. N., and
Glotfelty, D., "A Critical Review of Henry's Law Constants for
Pesticides", Rev. Environ. Contam. Toxicol., 103, 1-59 (1988).
40. Bowman, B. T., and Sans, W. W., "Further Water Solubility
Determinations of Insecticidal Compounds", J. Environ. Sci.
Health, B18(2), pages 221-227 (1983).
41. Mobay Corporation, Agricultural Chemicals Division, Report
#94648, "Water Solubility of Fenthion Pure Active Ingredient",
42. Weed Science Society of America, "Herbicide Handbook", 5th
edition, 1983. This handbook is available from the Weed Science
Society of America (WSSA), 309 W. Clark Street, Champaign, IL
61820 USA (217-356-3182).
43. Spillner, C. J., Thomas, V. M., Takahashi, D. G., and Scher,
H. B., "A Comparative Study of the Relationships Between the
Mobility of Alachlor, Butylate, and Metolachlor in Soil and Their
Physicochemical Properties", Chapter 12 in ACS Symposium Series
#225, "Fate of Chemicals in the Environment", ACS Washington, DC
44. Y. Eshel, "Phytotoxicity, Leachability, and Site of Uptake of
2-chloro-2',6'-diethyl-N-(methoxymethyl) Acetanilide", Weed Sci.,
45. Ciba Geigy, "Toxicology Data", Technical Report 1977, page 1,
Department of Industrial Medicine, Agricultural Division,
Ardsley, NY 10502.
46. Beilstein, P., Cook, A. M., and Hutter, R., "Determination of
Seventeen s-Triazine Herbicides and Derivatives by HPLC", J.
Agric. Food Chem., 29, 1132-1135 (1981).
47. Calvet, R., Terce, M., and Le Renard, J., "Kinetics of
Dissolution of Atrazine, Propazine, and Simazine in Water", Weed
Res., 15, 387-392 (1975).
48. Hartley, G. S., and Graham-Bryce, I. J., "Physical Principles
of Pesticide Behavior", Volume 4, Appendix 4, Academic
49. Bartley, C. E., "Triazine Compounds", Fm. Chem., 122, 28-34
50. Melnikov, N. N., "Chemistry of Pesticides", Residue Reviews,
Volume 36, Springer-Verlag, New York (1971).
51. Internal data submitted to the states of Arizona and California, Ciba-Geigy Corporation, Agrochemical Division, Animal Health and State Regulatory Affairs, PO Box 18300, Greensboro, NC 27419.
52. Hormann, W. D., and Eberle, D. O., "The Aqueous Solubility of
Obtained by an Improved Analytical Method", Weed Res., 12, 199-202 (1972).
53. Hurle, K. B. and Freed, V. H., "Effect of Electrolytes on the
Solubility of some 1,3,5-triazines and Substituted Ureas and
their Adsorption on Soil", Weed Res., 12, 1-10 (1972).
54. Verschueren, K., "Handbook of Environmental Data on Organic
Chemicals", page 231, 2nd Edition, Van Nostrand Reinhold, New
55. Ward, T. M., and Weber, J. B., "Aqueous Solubility of
Alkylamino-s-Triazines as a Function of pH and Molecular
Structure", J. Agric. Food Chem., 16, 959-961 (1968).
56. Getzen, F. W. and Ward, T. M., "Influence of Water Structure
on Aqueous Solubility", Ind. Eng. Chem. Prod. Res. Develop., 10,
57. M. M. Schantz and F. Guenther, private communication. The
chemicals used had a purity of 99% or higher and were not
ionizable. HPLC grade, organic-free water was used as the
solvent. The solubility experiments were done over several
days in a controlled temperature bath. At 25 °C there were 12
different measurements made and the reported value is the
averaged measurement. At the other temperatures there were at
least six different measurements made and then averaged to
produce the one reported. All the data were plotted for the
parameters solubility vs. temperature as shown in
Figures 1e and