The definitive screening design (DSD) is a new design-of-experiments (DOE) technique that is expected to bring major benefits when used within a Six Sigma optimization strategy
Statistical tools are deeply ingrained in the Six Sigma methodology to optimize processes and products during chemical process industries (CPI) operations. In fact, Six Sigma has been an important contributing factor for the widespread use of statistics in many different industrial sectors over the past several decades [1]. In Six Sigma’s DMAIC (define-measure-analyze-improve-control) roadmap, many statistical methods are pivotal for the proper collection of data and the ability to translate gathered data into useful information and actionable knowledge. In particular, the design of experiments (DOE) methodology appeals to many chemical engineers (see, for example, Chem. Eng., Nov. 2014 and Sept. 2016 issues [2,3]) as a methodology for systematically quantifying cause-and-effect relationships between input and output variables during both manufacturing processes and laboratory research and development (R&D) efforts. DOE is also widely used during the Improve phase of Six Sigma projects.
This article discusses the importance of DOE in R&D, and the new definitive screening design (DSD) technique. It also presents a case study to illustrate the power of the DSD technique by comparing its results with those obtained from a classic DOE process, the latter with a much larger number of experimental runs. Overall, with this practical tutorial, we aim to bring the DSD tool to the attention of individuals throughout the CPI.
DOE for scientific investigation
When a chemical engineer is confronted with a manufacturing or product-related problem, he or she will initially propose a first tentative model or hypothesis (usually based on speculation) to explain or solve the issue, as shown at the left side of Figure 1. From this first model, the engineer will then deduce certain inherent consequences, which, in a next logical step, should be compared with data to support or refute the first model. Here, DOE most often comes into play as a technique to acquire new data using a scientific approach.
In many cases, the newly gathered data will not agree (or will only partially match) with the consequences of the initial model. As a result, engineers will usually call for a second, data-driven DOE model, which, likewise, will often lead again to some necessary consequences. Some variables from the DOE will turn out to have or not have an effect on the studied problem; some unexpected insights may appear and the engineer’s way of thinking about the studied problem will change accordingly. Hence, a second cycle in the iteration of the deduction-induction process is initiated, and so on [4]. Throughout this process, the discrepancy between model consequences and data diminishes (this is visualized by the converging lines, from left to right, in Figure 1), and new knowledge is generated — the key to process and product improvement.
Through the different stages of experimentation, different types of DOE are typically used, as shown in Figure 1. In the early stages of research, a screening DOE (called fractional factorial) is used to identify the important input variables (called the vital few) and to eliminate the irrelevant ones (called the trivial many) that affect the process performance or product quality. Starting with a large number of potentially important input variables (also called factors), screening DOEs aim to identify the vital few variables that demand further investigation; that is, the active factors that have the largest effect on the response of interest.
In the next stage of research, a full factorial DOE consists of all possible combinations of levels for the active factors. The purpose here is to quantify the effects of the latter on the response in a more precise and reliable way using linear models (main effects and interactions). Because the input variables are changed simultaneously in a DOE, possible synergistic and antagonistic interactions between the input variables can be detected; this is in contrast to a so-called OVAT approach, which involves changing only “one variable at a time.” Equally important to the detection of possible interactions is the detection of possible departures from linear relationships between the response and the input variables by including a center point in the design. This is shown on the left side of Figure 2.
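To make this concrete, a minimal Python sketch of how such a design is enumerated is shown below. The factor settings are in coded units (–1 = low, +1 = high, 0 = midpoint), and the function name is ours for illustration, not from any DOE package:

```python
from itertools import product

def full_factorial(n_factors, center_points=1):
    """Two-level full factorial in coded units (-1 = low, +1 = high),
    plus one or more center runs (all factors at 0, the midpoint)."""
    runs = [list(combo) for combo in product((-1, +1), repeat=n_factors)]
    runs += [[0] * n_factors for _ in range(center_points)]
    return runs

design = full_factorial(3)  # 2**3 = 8 factorial runs + 1 center run = 9 runs
```

The center run does not affect the estimates of main effects or interactions, but comparing its observed response with the average of the factorial runs reveals curvature.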
The first-order design can detect global curvature, but it cannot separately estimate the quadratic effects of each factor. At this stage of experimentation, interpretation of the results combined with process and product expertise may allow the identification of the direction of steepest improvement toward an optimum in the response, possibly leading to another series of designed experiments — this time closer to the optimum — which can yield another first-order model before proceeding to the final stage. Finally, when it comes to optimization of the manufacturing process or product, a more elaborate model will be needed to describe the region around the optimal response and to locate the latter. A linear model will no longer be sufficient. Instead, a quadratic model will be used in a so-called response surface method (RSM) to fit the optimum.
Central composite design
Since we initially made use of a central composite design (CCD) in our case study discussed below, we will briefly discuss this type of RSM first. At the left side of Figure 2, a CCD is depicted for three factors. The CCD is very flexible, as it can be set up in a modular way: initially, one starts with the execution of a two-level factorial DOE, where each factor is set at its low (–1) and high (+1) level to verify the effects on the response of interest. A center point, depicted as (0,0,0) in the center of the 3-D representation in Figure 2, may help the engineer to detect curvature in the relation between the response and the factors. Star points (at levels –a and +a) can then be added to the design afterward to allow the engineer to properly quantify quadratic effects. This flexibility makes a CCD very popular in industrial process development [6]. From the CCD representation shown in Figure 2, it is clear that the design points are uniformly distributed in the experimental space.
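The modular build-up just described (cube points, then star points and center runs) can be sketched as follows. The helper name is ours, and the default star-point level of 2.83 simply mirrors the value used in our case study; other choices of a (for example, one giving rotatability) are equally valid:

```python
from itertools import product

def central_composite(n_factors, alpha=2.83, center_points=1):
    """Modular CCD in coded units: two-level cube points, then star
    points at -alpha and +alpha on each factor axis, then center runs."""
    cube = [list(c) for c in product((-1.0, 1.0), repeat=n_factors)]
    star = []
    for i in range(n_factors):
        for a in (-alpha, +alpha):
            run = [0.0] * n_factors
            run[i] = a          # all other factors stay at their midpoint
            star.append(run)
    center = [[0.0] * n_factors for _ in range(center_points)]
    return cube + star + center

ccd = central_composite(3)  # 8 cube + 6 star + 1 center = 15 runs
```

Because the star and center runs can be appended after the factorial block has been executed, the engineer pays for quadratic-effect estimation only when the center point actually signals curvature.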
Definitive screening design
In contrast to the most familiar screening designs, where input variables are set at only two levels (–1 and +1, or low and high level), the definitive screening design (DSD), introduced in 2011 by Jones and Nachtsheim, employs three levels for the variables: –1, 0 and +1 (low, center and high level). For N variables, the DSD requires only 2N + 1 experimental runs. In Table 1, the designs for the case of 4 to 6 factors are shown, as presented in the paper by the DSD inventors Jones and Nachtsheim. A DSD comprises N fold-over pairs plus one overall center run consisting of the center values of all variables (indicated in red on the last rows of Table 1).
The following design pattern further characterizes a DSD:
- Regarding the location of the zeros (highlighted in grey in Table 1) — The first two runs have zeros in the column of the first variable, X1; the next two runs have zeros in the column of the second variable, X2; and so on
- Regarding the pairs of runs — These are mirrored (folded over), which means that the second run of a pair is found by multiplying the first run of the pair by –1. Hence, the first 2N runs each have exactly one variable at its center value (0), while all other variables are at their extremes (–1 or +1); these are referred to as “edge runs,” because in the 3-D projections involving these variables, they lie on the edges of the cube. In case the number of factors N is odd, it is recommended to choose the DSD for N + 1 factors and then drop the extraneous column, which again results in an N-factor design, now with 2N + 3 runs. The two extra runs are, inherently, without center values for any factor and are referred to as “vertex runs” by the DSD inventors. Likewise, to increase the power of any DSD, one may initially choose a design with a larger number of factors than necessary (for example, N + k factors in total), and then drop the extra k columns [7–9].
For the sake of visualization, a 3-D representation of a DSD with three factors is shown at the right side of Figure 2, although a DSD with only three factors is not recommended [9]. For this visualization, we took the DSD design for N = 4 from Table 1 and then dropped the last column. One can clearly see the so-called DSD edge runs and vertex runs (the latter corresponding to cube points), together with the single DSD center run, at the right side of Figure 2.
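The fold-over construction can also be made concrete in code. The Python sketch below builds the 13-run, six-factor DSD from a 6×6 conference matrix (zero diagonal, ±1 elsewhere, mutually orthogonal rows). The particular matrix shown is one valid choice we supply for illustration; the published tables of Jones and Nachtsheim may list an equivalent but differently arranged design:

```python
# One valid 6x6 conference matrix: zero diagonal, entries +/-1 elsewhere,
# and mutually orthogonal rows (C times its transpose equals 5 * I).
C = [
    [ 0,  1,  1,  1,  1,  1],
    [ 1,  0, -1, -1,  1,  1],
    [ 1, -1,  0,  1, -1,  1],
    [ 1, -1,  1,  0,  1, -1],
    [ 1,  1, -1,  1,  0, -1],
    [ 1,  1,  1, -1, -1,  0],
]

def dsd_from_conference(C):
    """Definitive screening design from a conference matrix: each row
    and its fold-over (the row multiplied by -1) form a pair, followed
    by a single overall center run with every factor at 0."""
    runs = []
    for row in C:
        runs.append(list(row))
        runs.append([-x for x in row])
    runs.append([0] * len(C))
    return runs

dsd = dsd_from_conference(C)  # 2*6 + 1 = 13 runs
```

Note how the zero diagonal of the conference matrix produces exactly the pattern described above: runs 1 and 2 have their zero in X1, runs 3 and 4 in X2, and so on.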
The advantages of a DSD are as follows (for more, see [7–9]):
- Each factor is analyzed at three levels, which makes it possible to analyze quadratic effects, and, thus, to model curvature with a very limited number of runs, rendering DSDs a relatively inexpensive technique for users in the field
- Specific advantages are found with regard to confounding (note: the term confounding, used in statistics, indicates that the effects of model terms cannot be estimated separately):
-The main effects are completely independent of each other, completely independent of two-factor interactions, and completely independent of quadratic effects (thus, there is no confounding at all)
-The two-factor interactions are not completely confounded with other two-factor interactions, although they may be correlated a little
-The quadratic effects are estimable and are independent of the main effects (no confounding) and not completely confounded, though correlated, with the two-factor interaction effects
-Last but not least, for DSDs with at least six factors, the DSD can fit — with a very high level of statistical efficiency — the full quadratic model in case only three or fewer factors turn out to be active. As such, the DSDs become efficient response surface designs with three or fewer factors, rendering follow-up experiments (for the purpose of model optimization) unnecessary in many circumstances. “The capability to project these designs to efficient response surface designs makes possible the screening and optimization of a system in a single step,” according to the co-inventors Jones and Nachtsheim. This potentially huge benefit (making a shortcut from screening straight to optimization) is shown in Figure 1 with the dashed purple arrow. For comparison, a response surface design with three factors would require 20 runs.
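These confounding properties can be checked numerically. The sketch below takes a 13-run, six-factor DSD (built from an illustrative conference matrix of our own choosing, not the published table) and computes Pearson correlations between model-term columns:

```python
def pearson(u, v):
    """Pearson correlation between two equal-length columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

# Illustrative 6-factor DSD: fold-over pairs from a conference matrix
# plus one center run (an assumed valid instance, not the published table).
C = [[0, 1, 1, 1, 1, 1], [1, 0, -1, -1, 1, 1], [1, -1, 0, 1, -1, 1],
     [1, -1, 1, 0, 1, -1], [1, 1, -1, 1, 0, -1], [1, 1, 1, -1, -1, 0]]
dsd = [r for row in C for r in (list(row), [-x for x in row])] + [[0] * 6]

x1 = [r[0] for r in dsd]
x2 = [r[1] for r in dsd]
x1x2 = [a * b for a, b in zip(x1, x2)]   # two-factor interaction column
x1sq = [a * a for a in x1]               # quadratic-effect column

r_main_2fi = pearson(x1, x1x2)    # main effect vs. interaction: exactly 0
r_main_quad = pearson(x1, x1sq)   # main effect vs. quadratic: exactly 0
```

The fold-over pairing is what forces the main-effect columns to be exactly orthogonal to every interaction and quadratic column, while pairs of interaction columns remain only partially correlated.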
In the original DSDs, the factors had to be quantitative to allow for three levels. New DSDs have very recently been developed that allow for combinations of continuous and categorical input variables [10].
Case study
Part of a particular Six Sigma project consisted of making a model using a CCD for the pH of formulations made of six ingredients (A–F), whereby the ingredients were varied independently of each other. The number of laboratory test formulations prepared and evaluated was 90. In a laboratory-scale environment, this is doable. However, in a manufacturing plant, this would be difficult, perhaps impossible. The different test runs are shown in Table 2, where the cube points, star points and center points are clustered together for the sake of overview. The run order as executed in the laboratory is shown in the first column. The variables are shown with their coded levels, with the value of a (star points) being 2.83. The corresponding concentrations (wt.%) of the variables A–F are summarized in Table 3. The measured pH of the formulations can be found in the last column.
After these CCD experiments were executed, Six Sigma training was given at the Bayer Antwerp site (where the work was carried out), including the new statistical technique of DSD. In the context of the training, the Six Sigma project leader sought to verify which model would be found using a DSD with only 13 runs for the six factors, compared to the CCD model obtained earlier. For this, 13 extra formulations were prepared, as shown in Table 4.
The data sets were analyzed using Design-Expert (from Stat-Ease Inc.), including only model terms at a significance level of 0.05. A summary of both models is depicted in Table 5. The CCD model includes the main effects A, B, C and F, the interaction effects AB, AC and BC, and the quadratic effects A², B² and C². The DSD model includes fewer terms, namely the main effects A, B and C, and the interaction effects AB and BC. It is remarkable that the coefficients in the DSD model are very similar to those of the CCD model.
Regarding the CCD model, the R²adj (that is, the raw coefficient of determination, R², adjusted for the number of model predictors) indicates that 99.5% of the variation in the pH is explained by the variation in the input variables. Based on the complete CCD statistical analysis (not shown), the terms F, AC, A², B² and C² — which are not included in the DSD model — account for only 2% of the explained pH variation in the CCD data set, which is, after all, a rather minor contribution from these extra CCD model terms. Based on their high R²adj values (99.5% for the CCD, and 98.5% for the DSD), we can state that both models fit their respective data sets well.
But how do the two models compare when it comes to predicting the pH of new mixtures? We will first have a look at their predicted R² (R²pred). This model summary statistic is calculated by setting aside a single observation from the whole model data set, and then re-estimating the model based on all remaining observations (for example, in the case of the CCD, this would be 90 – 1 = 89 observations). Then the pH is calculated for the mixture that was intentionally left out when building the model. Next, the predicted residual error is estimated for this single observation; that is, the difference between the calculated pH (pHcalc) and the experimentally determined pH (pHobs) for that observation. This procedure is repeated for all observations. Finally, the so-called predicted residual error sum of squares (PRESS) is calculated according to Equation (1):

PRESS = Σi (pHobs,i – pHcalc,i)²    (1)
This PRESS value is compared to the sum of squares (SS) around the mean value of the whole data set (SStot); in other words, in the latter case, a “mean model” is used that simply takes the mean response as the prediction for every mixture in the data set. The R²pred is finally found via Equation (2):

R²pred = 1 – PRESS / SStot    (2)
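For a model with a single predictor, the leave-one-out procedure behind PRESS and R²pred can be sketched in a few lines of Python. The data and the fit_line helper (a closed-form ordinary-least-squares fit) are our own toy constructions, not taken from the case study:

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = b0 + b1 * x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    return ybar - b1 * xbar, b1

def r2_pred(xs, ys):
    """Leave each observation out in turn, refit the model, predict the
    left-out point, accumulate PRESS, then compare PRESS to the total
    sum of squares around the mean (the 'mean model')."""
    press = 0.0
    for i in range(len(xs)):
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (b0 + b1 * xs[i])) ** 2
    ybar = sum(ys) / len(ys)
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return 1.0 - press / ss_tot
```

For perfectly linear data, PRESS is zero and R²pred equals 1; measurement noise and model misspecification push it downward.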
The somewhat larger R²pred of 99.2% for the CCD model, compared to 97.7% for the DSD model, suggests that the CCD model has somewhat greater predictive ability. Nevertheless, the DSD model can certainly be classified as a model with strong predictive power.
Another way (and by far a better way) to validate a model is to use a completely new data set that was not used during the development of the model. This new data set — used in the stage of model validation — is frequently called a test set, in contrast to a training set, which is used to calibrate (build) the model. The test set should ideally span the same “space” of the input variables as the training set. We then let the calibrated model predict the responses of the test set and compare them to the known, real responses.
In this case study, we will use the data set of the CCD as a test set for the DSD model; likewise, we will use the DSD data as a test set to validate the CCD model. Because the 12 axial points in the CCD data set are beyond the design space in which the DSD model was built (remember, the DSD model factor settings were between –1 and +1, whereas the CCD star points ranged from –2.83 to +2.83), we exclude the CCD mixtures with the star points when validating the DSD model; hence, the pH values of the remaining 78 CCD mixtures were predicted with the DSD model and compared with the measured pH values. Figure 3 presents the predicted versus measured pH values for the validation of both models. Visually, one can see that both models adequately predict the pH, with somewhat more variation in the case of the DSD model. In the context of model validation, the root mean square error of prediction (RMSEP) can be calculated as an estimate of the prediction error, using Equation (3):

RMSEP = √[ Σi (pHobs,i – pHcalc,i)² / n ]    (3)
The RMSEP can be interpreted as the average error to be associated with future predictions. In practice, t0.025;n–1 times the RMSEP (with t obtained from a Student’s t table, and n the number of data points in the test set) may be used as the estimated precision for predicted pH values. For example, based on the test set used to validate the CCD model, t0.025;13–1 is 2.16; with larger test sets, the t value approaches 1.96. For the CCD model, the RMSEP is 0.07 pH units, and for the DSD model, it amounts to a somewhat higher value of 0.09 pH units. For example, the pH of the mixture of DSD run 13, with an observed pH of 7.40 (first row in Table 4), is predicted by the CCD model to be 7.30 ± 0.15 (the latter being 2.16 times 0.07).
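The RMSEP calculation itself is short enough to sketch directly. The observed and predicted pH values below are hypothetical placeholders, not the case-study data, and the t value of 2.16 is the one quoted in the text for a 13-point test set; in general it would come from a t table or a statistics library:

```python
import math

def rmsep(observed, predicted):
    """Root mean square error of prediction over a test set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2
                         for o, p in zip(observed, predicted)) / n)

obs = [7.40, 7.12, 6.95]    # hypothetical test-set pH measurements
pred = [7.30, 7.15, 7.00]   # hypothetical model predictions
half_width = 2.16 * rmsep(obs, pred)  # approximate prediction precision
```

A predicted pH would then be reported as pHcalc ± half_width, exactly as done for DSD run 13 above.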
Chemical engineers and scientists in the CPI embrace the DOE methodology developed by statisticians to efficiently investigate and optimize products and manufacturing processes. We therefore gratefully acknowledge the work done by statisticians to constantly search for even more efficient design-of-experiments approaches.
The DSD methodology recently developed by Jones and Nachtsheim is highly efficient, with the number of runs required far below the number needed in classical screening designs. Because a low number of experimental runs is in most cases desirable for experimenters in the CPI, and certainly in cases where experiments need to be done in manufacturing plants, the DSD can be called a revolution in performing designed experiments. When the statistical analysis of the experiments indicates that the number of active factors is limited, definitive screening allows the engineer to take a shortcut from screening straight to optimization. It can be expected that the CPI will benefit from this latest statistical research, and that definitive screening designs will be applied with growing frequency over time, especially in CPI Six Sigma projects where many factors need to be tested.
Edited by Suzanne Shelley
Bart Peeters is a manufacturing technologist at Bayer Crop Science (Haven 627, Scheldelaan 460, 2040 Antwerp, Belgium; Phone: +32 3 568 5762; Email: email@example.com), where he has been working since 1998. He first served as a process improvement engineer at Eastman’s PVB polymer manufacturing plant onsite (until 2004). Since then, he has been working at the environmental department of the company. Peeters is a certified Six Sigma Black Belt and coordinates the Six Sigma program at the Bayer Antwerp site. While working at Bayer’s WWTP, he obtained his Ph.D. in engineering from the KU Leuven (Belgium) on the research topic “Effect of activated sludge composition on its dewaterability and sticky phase.” Prior to that, he earned an M.S.Ch.E. degree from the KU Leuven, and an M.Bio.Ch.E. degree in 1996 from the university college De Nayer. Peeters is the author of 20 articles in scientific journals, technical magazines and international conferences.
Guido Desmarets is a senior management consultant at Stanwick NV (Axess Business Park, Building B, Guldensporenpark 20, 9820 Merelbeke, Belgium; Phone: +32 9 210 59 50; Email: firstname.lastname@example.org), where he has been working since 1988. As a Master Black Belt, he coaches companies worldwide in the fields of continuous improvement, Lean Six Sigma and operational excellence, with major results in efficiency and quality improvements, cost reductions and delivery performance. Prior to joining Stanwick NV, he held positions in R&D management, process engineering, quality assurance and production management at international companies, where his interest in using DOE as a major process improvement tool was triggered and developed. He holds a master’s degree in chemistry (University of Ghent, 1976). Subsequent research work resulted in several publications and patents on slow-release formulations for pesticides.
Marc Roels is a crop protection products laboratory specialist at Bayer Crop Science Europe N.V. (Phone: +32 3 568 5185; Email: email@example.com), where he has been working since 1984. Roels provides analytical support for research, development, registration and manufacturing of herbicides at the Bayer Antwerp site and at different toller operations. For more than 15 years, he has been an enthusiastic user of DOE to study formulation robustness and formulation optimization. He received his B.S. in pharmaceutical and biological techniques in 1983. He is a certified Six Sigma Green Belt.
Sam Van Aeken was the team lead of the analytical services laboratory at Bayer Crop Science (Haven 627, Scheldelaan 460, 2040 Antwerp, Belgium), where he has been working since 2014. Prior to joining the firm, he worked from 2010 until 2012 as a process chemist at Eastman Ghent, managing new product trials. He was later employed at W.R. Grace / DeNeef as an R&D engineer, where he researched new urethane-based grouting systems. Van Aeken graduated as a bio-engineer in 2004 from the Vrije Universiteit Brussel and obtained a Ph.D. in organic chemistry in 2010 with a dissertation on new synthetic routes towards aza-heterocyclic quinone compounds.
References
1. Steinberg, D.M., Industrial statistics: the challenges and the research, Quality Engineering, 28 (1), pp. 45–59, 2016.
2. Kleppmann, W., Design of Experiments (DOE): Optimizing products and processes efficiently, Chem. Eng., November, pp. 50–57, 2014.
3. Anderson, M.J., Design of Experiments (DOE): How to handle hard-to-change factors using a split plot, Chem. Eng, September, pp. 83–86, 2016.
4. Box, G.E.P., Hunter, J.S. and Hunter, W.G. “Statistics for experimenters: design, innovation, and discovery”, 2nd Ed., John Wiley & Sons, Inc., 2005.
5. Jones, B., 21st century screening experiments: what, why and how. Quality Engineering, 28 (1), pp. 98–106, 2016.
6. Anderson, M.J. “RSM simplified: optimizing processes using surface methods for design of experiments”, 2nd Ed., Productivity Press, 2016.
7. Jones, B. and Nachtsheim, C.J., A class of three-level designs for definitive screening in the presence of second-order effects, Journal of Quality Technology, 43, pp. 1–15, 2011.
8. Jones, B. and Nachtsheim, C.J. Blocking schemes for definitive screening designs. Technometrics 58 (1), pp. 74-83, 2016.
9. Jones, B., JMP Blog on proper and improper use of Definitive Screening Designs (DSDs) (https://community.jmp.com/t5/JMP-Blog/Proper-and-improper-use-of-Definitive-Screening-Designs-DSDs/ba-p/30703), 2016.
10. Jones, B. and Nachtsheim, C.J., Definitive screening designs with added two-level categorical factors. Journal of Quality Technology, 45, pp. 121–129, 2013
11. Esbensen, K.H. “Multivariate Data Analysis – In Practice. An Introduction to Multivariate Data Analysis and Experimental Design”, 5th Ed., CAMO software, 2010.
12. Bass, I. “Six Sigma Statistics with Excel and Minitab,” 1st Ed., Mc. Graw Hill, 2007.