12. Experimental Design 2
Coagulation process to remove microplastics from water
Factors: Coagulant dosage, pH, stirring speed
Response: Removal efficiency (% microplastics)
How do process parameters affect microplastic removal efficiency?
Experimental Design: A structured approach to planning experiments
Goal: Maximize information while minimizing resources
Focus: Identify key factors and their interactions
Key Benefits:
Improves efficiency and reliability of results
Provides insights into cause-and-effect relationships
1: Method optimization
Focus: Identify the best combination of factors (dynamically exploring design space)
Goal: Maximize performance (e.g., sensitivity, efficiency)
Example: Optimizing pH, temperature, and time for a chemical reaction
2: Determination of effects
Focus: Quantify the impact of individual factors and their interactions (statically exploring design space)
Goal: Understand cause-and-effect relationships
Example: Assessing how pH, stirring speed, and dosage influence pollutant removal
What is DoE?
Design of Experiments (DoE) is a systematic approach to planning, conducting, and analyzing experiments in order to understand how multiple factors (e.g., water pH, pollutant concentration) affect key outcomes (e.g., degradation efficiency).
Goals of DoE
Characterize relationships between factors and the response to understand their influence on system behavior.
Develop a mathematical model that enables interpolation across the chemical space for predictive analysis.
Capture interactions between factors to identify combined effects on the response.
Core Components of DoE
Factors: Controlled variables such as pollutant type, pH, or catalyst concentration.
Levels: Specific values, e.g., pollutant concentrations of 10, 50, and 100 mg/L.
Response: Measurable outcomes like pollutant degradation (%) or toxicity reduction.
Why Structure Matters
A structured design helps uncover relationships like how pH influences heavy metal precipitation.
Efficient Experimentation
DoE minimizes tests needed to determine the optimal dosage of coagulants in water treatment.
Identifies Key Insights
Reveals how interactions between pollutant types and catalysts affect degradation rates.
Improves Process Understanding
Optimizes processes like advanced oxidation or nutrient recovery from wastewater.
Full Factorial Design
Explores all combinations of factors, e.g., pH, temperature, and pollutant type, providing comprehensive insights.
Fractional Factorial Design
Tests a subset of combinations to reduce effort, e.g., identifying key factors in pesticide degradation.
Box-Behnken Design (BBD)
Efficiently explores factor interactions using midpoints of edges and the center of the design space, avoiding extreme conditions.
Objective
Investigate how temperature and pH influence pollutant removal efficiency in water samples.
Factors and Levels
Factor 1: Temperature → Levels: -1 (Low, 20°C), +1 (High, 40°C)
Factor 2: pH → Levels: -1 (Low, pH 5), +1 (High, pH 9)
Design Type
Full factorial design with 2 levels per factor (2 × 2 = 4 experiments).
Design Matrix
All combinations of factors and levels (-1, +1) form the matrix.
| Run | Temperature (X₁) | pH (X₂) | Removal Efficiency (Y) |
|---|---|---|---|
| 1 | -1 | -1 | 70 |
| 2 | -1 | +1 | 75 |
| 3 | +1 | -1 | 85 |
| 4 | +1 | +1 | 90 |
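The design matrix above can be generated rather than typed by hand. The following is a minimal sketch, assuming coded levels of -1 and +1; the function name `full_factorial` is illustrative and not part of the original material.

```python
from itertools import product

def full_factorial(k, levels=(-1, 1)):
    """Return every combination of the given coded levels for k factors."""
    return list(product(levels, repeat=k))

# Two factors (temperature, pH) at two levels each: 2 x 2 = 4 runs
design = full_factorial(2)
for run, (x1, x2) in enumerate(design, start=1):
    print(run, x1, x2)
```

The rows come out in the same order as runs 1 to 4 in the table; only the measured responses have to be added afterwards.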
Design Matrix (for Regression)
Every factor \(X_i\) is a column in the design matrix, with each row representing a run.
In addition, the very first column is a column of 1s, representing the intercept term.
The intercept is the predicted response when all coded factors are 0 (the center of the design), i.e., the part of the response that is not attributed to the factors.
\[ X = \begin{bmatrix} +1 & -1 & -1 \\ +1 & -1 & +1 \\ +1 & +1 & -1 \\ +1 & +1 & +1 \end{bmatrix} \]
Fitting the Model
Using the data:
Run 1: Y = 70
Run 2: Y = 75
Run 3: Y = 85
Run 4: Y = 90
Estimated Coefficients
β₀: Average response (intercept)
β₁: Effect of temperature on removal efficiency
β₂: Effect of pH on removal efficiency
Example Result
Y = 80 + 7.5X₁ + 2.5X₂
Regression Model
Fit a linear regression model:
Y = β₀ + β₁X₁ + β₂X₂
Y: Removal efficiency (%)
X₁: Temperature (-1, +1)
X₂: pH (-1, +1)
Apply the Regression
Calculate the coefficients \(\beta_0, \beta_1, \beta_2\) using the design matrix \(X\) and the response vector \(Y\).
\[ \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix} = (X^T X)^{-1} X^T Y \]
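For the four runs of the example (Y = 70, 75, 85, 90), the calculation can be written out step by step. Because the coded columns of a two-level full factorial design are mutually orthogonal, \(X^T X\) is diagonal and its inverse is simply \(\tfrac{1}{4} I\):

\[ X^T X = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \quad X^T Y = \begin{bmatrix} 70 + 75 + 85 + 90 \\ -70 - 75 + 85 + 90 \\ -70 + 75 - 85 + 90 \end{bmatrix} = \begin{bmatrix} 320 \\ 30 \\ 10 \end{bmatrix} \]
\[ \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix} = \frac{1}{4} \begin{bmatrix} 320 \\ 30 \\ 10 \end{bmatrix} = \begin{bmatrix} 80 \\ 7.5 \\ 2.5 \end{bmatrix} \]

This reproduces the example result Y = 80 + 7.5X₁ + 2.5X₂.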
Insights from Coefficients
β₁ (7.5): Higher temperature significantly improves removal efficiency.
β₂ (2.5): Higher pH has a smaller positive effect.
Predictions
Predict removal efficiency for any combination of temperature and pH using the regression equation.
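A prediction in real units first requires converting temperature and pH to the coded scale. The sketch below assumes the example model Y = 80 + 7.5X₁ + 2.5X₂ and the factor levels 20 to 40 °C and pH 5 to 9; the helper names `to_coded` and `predict_removal` are illustrative.

```python
def to_coded(value, low, high):
    """Map a real value onto the coded scale, where low -> -1 and high -> +1."""
    center = (high + low) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range

def predict_removal(temperature_c, ph):
    """Evaluate Y = 80 + 7.5*X1 + 2.5*X2 for inputs given in real units."""
    x1 = to_coded(temperature_c, 20, 40)  # temperature levels: 20 °C to 40 °C
    x2 = to_coded(ph, 5, 9)               # pH levels: 5 to 9
    return 80 + 7.5 * x1 + 2.5 * x2

print(predict_removal(30, 7))  # center of the design space
print(predict_removal(40, 9))  # conditions of run 4
```

Predictions should stay within the studied ranges; the linear model says nothing reliable about conditions outside 20 to 40 °C or pH 5 to 9.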
Visualization
Create 2D or 3D plots to illustrate the effects of temperature and pH on removal efficiency.

Hypothesis Testing
For each coefficient (β₁, β₂):
- Null hypothesis (H₀): The coefficient is not significant (β = 0).
- Alternative hypothesis (H₁): The coefficient is significant (β ≠ 0).
t-Test for Significance
Use the formula for the t-statistic:
t = β / SE(β)
Where SE(β) is the standard error of the coefficient.
Decision Rule
Compare the calculated t-value to the critical t-value from the t-distribution table at a chosen significance level (e.g., α = 0.05).
If |t| > critical value, the coefficient is significant.
Example Results
t₁ (β₁ for temperature): Significant (p < 0.05).
t₂ (β₂ for pH): Not significant (p > 0.05).
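The t-test can be sketched in a few lines. Note the assumptions: the responses below are hypothetical (the 2 × 2 design run twice with made-up values, chosen only for illustration), and the shortcut for the coefficients and standard errors holds only for balanced two-level designs with orthogonal coded columns. The critical value 2.571 is the two-sided table value for α = 0.05 and 5 degrees of freedom.

```python
from math import sqrt

# HYPOTHETICAL data: the 2x2 design run twice, with made-up responses.
x1 = [-1, -1, 1, 1, -1, -1, 1, 1]
x2 = [-1, 1, -1, 1, -1, 1, -1, 1]
y = [67, 78, 88, 87, 73, 72, 82, 93]

n = len(y)
columns = {"b0": [1] * n, "b1": x1, "b2": x2}

# Orthogonal +/-1 columns: each coefficient is (column . y) / n.
beta = {name: sum(c * yi for c, yi in zip(col, y)) / n
        for name, col in columns.items()}

# Residual variance with df = N - p - 1 (p = 2 factors, 1 for the intercept).
fitted = [beta["b0"] + beta["b1"] * a + beta["b2"] * b for a, b in zip(x1, x2)]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
df = n - 2 - 1
se = sqrt(sse / df / n)  # identical for every coefficient in this design

t_crit = 2.571  # two-sided critical t for alpha = 0.05 and df = 5 (t-table)
t_values = {name: b / se for name, b in beta.items()}
significant = {name: abs(t) > t_crit for name, t in t_values.items()}
print(t_values)
print(significant)
```

Replicates are needed here because a single pass through the four runs leaves only one residual degree of freedom, which gives a very poor estimate of the standard error.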
Understanding Coefficients
Regression coefficients in coded levels (\([-1, 1]\)) describe the change in response per coded unit, i.e., over half of the factor's range. Moving a factor across its full range (\(+2\) units in coded space) therefore changes the response by \(2\beta\).
Converting to Real Units
The effect of a factor in real units is calculated by dividing the total response change by the real range of the factor.
Response Change per Real Unit = 2 × β (coded) / Real Range
Example: Temperature
β₁ (coded): 7.5
Real Range: 20°C
Total response change over full range: \( 2 × 7.5 = 15 \)
Change per °C: \( 15 / 20 = 0.75 \) response units per °C
Final Model Interpretation
The temperature coefficient (\( \beta_1 \)) means that for every 1°C increase, the response increases by 0.75 units.
For pH or other factors, use the same formula to adjust the coefficient to the real range.
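As a small sketch of the conversion, the formula can be wrapped in a function and applied to both factors of the example (temperature 20 to 40 °C with β₁ = 7.5, pH 5 to 9 with β₂ = 2.5). The function name `effect_per_real_unit` is illustrative.

```python
def effect_per_real_unit(beta_coded, low, high):
    """Response change per real unit: 2 * beta (coded) / real range."""
    return 2 * beta_coded / (high - low)

print(effect_per_real_unit(7.5, 20, 40))  # temperature: change per °C
print(effect_per_real_unit(2.5, 5, 9))    # pH: change per pH unit
```

The per-unit values are not directly comparable across factors because the units differ; the coded coefficients are the better basis for ranking factor importance.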
Number of Experiments
In a full factorial design, the total number of experiments (\(N\)) depends on the number of factors (\(k\)) and the levels per factor (\(L\)).
Formula
\( N = L^k \)
Where:
- \(L\): Number of levels per factor (e.g., 2 for \([-1, +1]\))
- \(k\): Number of factors
Example
For \(k = 3\) factors (\(X_1, X_2, X_3\)) and \(L = 2\) levels (\([-1, +1]\)):
\( N = 2^3 = 8 \) experiments
Key Insight
The number of experiments grows exponentially with the number of factors. For \(k = 5\) factors and \(L = 2\): \( N = 2^5 = 32 \).
Experiment Count Examples
| Number of Factors (\(k\)) | Experiments (\(N = 2^k\)) |
|---|---|
| 2 | 4 |
| 3 | 8 |
| 4 | 16 |
| 5 | 32 |
Degrees of Freedom (df) in Full Factorial Designs
The degrees of freedom (\(df\)) are calculated as:
df = N - p - 1
Where:
- \(N\): Number of experiments
- \(p\): Number of coefficients (factors or interactions)
- \(1\): For the intercept
Model Scenarios
Main Effects Model: Only main factors are included (\(p = k\)).
Main Effects + Pairwise Interactions: Includes main factors and pairwise interactions (\(p = k + \frac{k(k-1)}{2}\)).
Key Insight
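The model requires \(df \geq 0\) to estimate coefficients and fit the data. With \(df = 0\) the model reproduces the data exactly and leaves no residual information for estimating error. Pairwise interactions significantly increase \(p\), reducing \(df\).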
Comparison Table
| Number of Factors (\(k\)) | Experiments (\(N = 2^k\)) | df (Main Effects) | df (Main + Pairwise Interactions) |
|---|---|---|---|
| 2 | 4 | 1 (4 - 2 - 1) | 0 (4 - 3 - 1) |
| 3 | 8 | 4 (8 - 3 - 1) | 1 (8 - 6 - 1) |
| 4 | 16 | 11 (16 - 4 - 1) | 5 (16 - 10 - 1) |
| 5 | 32 | 26 (32 - 5 - 1) | 16 (32 - 15 - 1) |
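The degrees-of-freedom bookkeeping can be checked with a short sketch. It assumes a full factorial design with the same number of levels for every factor; the function name `degrees_of_freedom` is illustrative.

```python
def degrees_of_freedom(k, pairwise=False, levels=2):
    """df = N - p - 1 for a full factorial design with k factors."""
    n_runs = levels ** k                      # N = L^k
    p = k                                     # main effects
    if pairwise:
        p += k * (k - 1) // 2                 # pairwise interactions
    return n_runs - p - 1                     # minus 1 for the intercept

for k in range(2, 6):
    print(k, 2 ** k, degrees_of_freedom(k), degrees_of_freedom(k, pairwise=True))
```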
What is an Interaction?
An interaction occurs when the effect of one factor on the response depends on the level of another factor.
Interactions highlight synergies or conflicts between factors that cannot be explained by their individual effects alone.
Example
Consider two factors, temperature (\(X_1\)) and pH (\(X_2\)):
Without interaction: \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2\).
With interaction: \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2\).
The interaction term (\(\beta_{12} X_1 X_2\)) modifies the response based on the combination of \(X_1\) and \(X_2\).
Impact on the Design Matrix
Adding interactions introduces new columns to the design matrix, representing products of factors.
Design Matrix Comparison
| Experiment | \(X_1\) | \(X_2\) | \(X_1 X_2\) (Interaction) |
|---|---|---|---|
| 1 | -1 | -1 | 1 |
| 2 | -1 | 1 | -1 |
| 3 | 1 | -1 | -1 |
| 4 | 1 | 1 | 1 |
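To see what the interaction column measures, it can be applied to the responses of the 2 × 2 example (70, 75, 85, 90). This is a minimal sketch that relies on the orthogonality of the coded columns, so the coefficient is just the column-weighted average of the responses.

```python
x1 = [-1, -1, 1, 1]
x2 = [-1, 1, -1, 1]
y = [70, 75, 85, 90]  # responses of runs 1-4 in the 2x2 example

# Interaction column: element-wise product of the factor columns
x1x2 = [a * b for a, b in zip(x1, x2)]
beta_12 = sum(c * yi for c, yi in zip(x1x2, y)) / len(y)

# Temperature effect (low -> high) evaluated separately at each pH level
effect_at_low_ph = y[2] - y[0]   # run 3 minus run 1
effect_at_high_ph = y[3] - y[1]  # run 4 minus run 2

print(x1x2, beta_12, effect_at_low_ph, effect_at_high_ph)
```

For these data the temperature effect is the same at both pH levels, so the estimated interaction coefficient is zero. A nonzero value would mean the temperature effect differs between low and high pH.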
What is a Quadratic Term?
A quadratic term models the nonlinear effects of a factor on the response. It represents the curvature of the relationship between a factor and the response.
Quadratic terms allow for the identification of optimal factor levels, where the response is maximized or minimized.
Example
Consider two factors, temperature (\(X_1\)) and pH (\(X_2\)):
Without quadratic terms: \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2\).
With quadratic terms: \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2 + \beta_{11} X_1^2 + \beta_{22} X_2^2\).
The quadratic terms (\(\beta_{11} X_1^2, \beta_{22} X_2^2\)) allow for curvature in the response.
Impact on the Design Matrix
Adding quadratic terms introduces new columns to the design matrix, representing squared factors.
Design Matrix Comparison
| Experiment | \(X_1\) | \(X_2\) | \(X_1 X_2\) (Interaction) | \(X_1^2\) (Quadratic) | \(X_2^2\) (Quadratic) |
|---|---|---|---|---|---|
| 1 | -1 | -1 | 1 | 1 | 1 |
| 2 | -1 | 1 | -1 | 1 | 1 |
| 3 | 1 | -1 | -1 | 1 | 1 |
| 4 | 1 | 1 | 1 | 1 | 1 |
Is this design plan suitable?
Why are Additional Points Needed?
To model quadratic terms, additional experiments are required to capture the curvature of the response.
These additional points ensure sufficient information to estimate quadratic effects accurately.
Types of Additional Points
1. Center Points (\(X_1 = 0, X_2 = 0\)): Measure the response at the geometric center of the design space.
2. Edge-Center Points (\(X_1 = 0\) or \(X_2 = 0\)): Measure the response at the midpoint of one factor's range while the other factor is held at its low or high level (\(\pm 1\)).
Key Insight
Without these points, the curvature of the response cannot be determined, and quadratic terms cannot be estimated reliably.
Updated Design Matrix
| Experiment | \(X_1\) | \(X_2\) | \(X_1 X_2\) (Interaction) | \(X_1^2\) (Quadratic) | \(X_2^2\) (Quadratic) |
|---|---|---|---|---|---|
| 1 | -1 | -1 | 1 | 1 | 1 |
| 2 | -1 | 1 | -1 | 1 | 1 |
| 3 | 1 | -1 | -1 | 1 | 1 |
| 4 | 1 | 1 | 1 | 1 | 1 |
| 5 (Center Point) | 0 | 0 | 0 | 0 | 0 |
| 6 (Edge Center) | 0 | -1 | 0 | 0 | 1 |
| 7 (Edge Center) | 0 | 1 | 0 | 0 | 1 |
| 8 (Edge Center) | -1 | 0 | 0 | 1 | 0 |
| 9 (Edge Center) | 1 | 0 | 0 | 1 | 0 |
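Why the four corner runs alone are not enough can be checked directly: in a two-level design every squared column is constant and therefore indistinguishable from the intercept column. The sketch below compares the squared columns before and after adding the center and edge-center points; the helper names are illustrative.

```python
factorial_runs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
extra_runs = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0)]  # runs 5-9

def squared_column(runs, factor_index):
    """Return the X_i^2 column for the given runs."""
    return [run[factor_index] ** 2 for run in runs]

def is_constant(column):
    """A constant column carries the same information as the intercept."""
    return len(set(column)) == 1

# Corner points only: X1^2 is all ones, so beta_11 cannot be estimated
print(is_constant(squared_column(factorial_runs, 0)))
# With the five additional points the X1^2 column varies
print(is_constant(squared_column(factorial_runs + extra_runs, 0)))
```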
Why Use Fractional Designs?
Full factorial designs require exponentially more experiments as the number of factors increases (\(N = 2^k\)).
Fractional designs allow us to study key effects with fewer experiments, saving time and resources.
What is a Fractional Design Plan?
A fractional design tests only a subset (fraction) of the combinations in a full factorial design.
It uses mathematical rules to carefully select combinations that maximize the information gained.
Key Idea
Fractional designs assume that higher-order interactions (e.g., \(X_1 X_2 X_3\)) are negligible, focusing only on main effects and lower-order interactions (e.g., \(X_1 X_2\)).
Benefits of Fractional Designs
Reduces the number of experiments without significantly compromising the quality of insights.
Efficiently identifies main effects and key interactions.
Suitable for initial screening of factors.
Steps to Create a Fractional Design
1. Start with a full factorial design for a smaller set of factors (e.g., \(X_1, X_2\)).
2. Define additional factors (\(X_3, X_4, \dots\)) as interactions of existing factors.
3. Construct the design matrix by combining these relationships.
Example: Fractional \(2^{3-1}\) Design
\(k = 3\) factors (\(X_1, X_2, X_3\)).
Full factorial: \(2^3 = 8\) experiments.
Fractional plan: \(2^{3-1} = 4\) experiments.
Define \(X_3 = X_1 X_2\).
Fractional Design Matrix
| Experiment | \(X_1\) | \(X_2\) | \(X_3 = X_1 X_2\) |
|---|---|---|---|
| 1 | -1 | -1 | 1 |
| 2 | -1 | 1 | -1 |
| 3 | 1 | -1 | -1 |
| 4 | 1 | 1 | 1 |
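The construction steps can be sketched as follows: build the full factorial for \(X_1\) and \(X_2\), then derive \(X_3\) from the generator \(X_3 = X_1 X_2\). The function name `fractional_2_3_1` is illustrative.

```python
from itertools import product

def fractional_2_3_1():
    """Build the 2^(3-1) design with the generator X3 = X1 * X2."""
    runs = []
    for x1, x2 in product((-1, 1), repeat=2):  # full factorial in X1, X2
        runs.append((x1, x2, x1 * x2))         # X3 is defined, not chosen
    return runs

for run, levels in enumerate(fractional_2_3_1(), start=1):
    print(run, *levels)
```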
Assumptions of Fractional Designs
Fractional designs assume that higher-order interactions are negligible.
If higher-order interactions are significant, the results can be misleading.
Confounding
In fractional designs, some effects are confounded, meaning they cannot be distinguished from each other (same column pattern).
For example, if \(X_3 = X_1 X_2\), the effect of \(X_3\) is inseparable from the interaction \(X_1 X_2\).
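Confounding can be detected by comparing columns. The sketch below takes the four runs of the \(2^{3-1}\) design with \(X_3 = X_1 X_2\), forms every pairwise interaction column, and reports which main-effect column it coincides with. The variable names are illustrative.

```python
from itertools import combinations

# Runs of the 2^(3-1) design, with X3 = X1 * X2
runs = [(-1, -1, 1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]

main = {f"X{i + 1}": [r[i] for r in runs] for i in range(3)}
interactions = {
    f"{a}{b}": [u * v for u, v in zip(main[a], main[b])]
    for a, b in combinations(main, 2)
}

# An interaction is confounded with a main effect if the columns are identical
aliases = {
    inter: name
    for inter, col in interactions.items()
    for name, main_col in main.items()
    if col == main_col
}
print(aliases)
```

In this design every pairwise interaction coincides with one main effect, which is why main effects can only be interpreted if the interactions are assumed to be negligible.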
Validation is Required
Fractional designs are suitable for initial screening, but full designs or additional experiments may be needed for confirmation.
Key Challenges
Risk of missing significant higher-order interactions.
Interpretation requires careful consideration of confounding.
Reduced resolution compared to full factorial designs.
| Experiment | \(X_1\) | \(X_2\) | \(X_3\) | \(X_1 X_2\) | \(X_1 X_3\) | \(X_2 X_3\) |
|---|---|---|---|---|---|---|
| 1 | -1 | -1 | 1 | 1 | -1 | -1 |
| 2 | -1 | 1 | -1 | -1 | 1 | -1 |
| 3 | 1 | -1 | -1 | -1 | -1 | 1 |
| 4 | 1 | 1 | 1 | 1 | 1 | 1 |
\(X_3\) is confounded with the interaction term \(X_1 X_2\) (identical columns).
\(X_1 X_3\) is a hidden quadratic term (\(X_1^2 X_2\)). Because \(X_1^2 = 1\) in every run of a two-level design without center points, its column is identical to \(X_2\), so the two effects cannot be separated.
\(X_2 X_3\) is a hidden quadratic term (\(X_1 X_2^2\)). Because \(X_2^2 = 1\) in every run, its column is identical to \(X_1\), so the two effects cannot be separated.
What is a Box-Behnken Design?
A Box-Behnken design (BBD) is a response surface methodology (RSM) design used for modeling nonlinear relationships between factors and responses.
Unlike full factorial designs, BBDs avoid extreme corner points, reducing the risk of running impractical or unsafe experiments.
Key Features
Includes center points (\(X = 0\)) and edge-center points.
Provides sufficient data to estimate quadratic effects.
Requires fewer experiments than a three-level full factorial design (for 3 factors, 13 runs instead of \(3^3 = 27\)).
When to Use?
Box-Behnken designs are ideal when:
Quadratic effects need to be modeled.
Experiments at extreme conditions (e.g., corners) are undesirable.
Benefits of Box-Behnken Designs
Reduces the number of experiments compared to three-level full factorial designs.
Efficiently models curvature (quadratic effects).
Avoids impractical or unsafe experimental conditions.
How is a Box-Behnken Design Built?
Combines edge-center points and center points in the design space.
Requires a minimum of 3 factors (Box-Behnken is not defined for 2 factors).
For 3 factors (\(X_1, X_2, X_3\)), a Box-Behnken plan consists of 12 edge-center points and 1 center point.
Design Matrix Example: 3 Factors
| Experiment | \(X_1\) | \(X_2\) | \(X_3\) |
|---|---|---|---|
| 1 | -1 | 0 | 1 |
| 2 | 1 | 0 | -1 |
| ... | ... | ... | ... |
| 13 (Center Point) | 0 | 0 | 0 |
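The elided rows of the matrix follow a simple pattern: for each pair of factors, run all four ±1 combinations while the remaining factor sits at 0, then add the center point. The following is a minimal sketch of that construction for 3 factors; the function name `box_behnken` is illustrative, the run order differs from the table, and the pairing scheme is an assumption that matches the 12 + 1 layout described above rather than a general generator for any number of factors.

```python
from itertools import combinations, product

def box_behnken(k=3, center_points=1):
    """Edge-center points for every factor pair, plus center point(s)."""
    runs = []
    for i, j in combinations(range(k), 2):       # each pair of factors
        for a, b in product((-1, 1), repeat=2):  # all +/-1 combinations
            run = [0] * k                        # other factors stay at 0
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * k] * center_points)      # center of the design
    return runs

design = box_behnken(3)
print(len(design))
for run in design:
    print(run)
```

No run has all three factors at ±1 simultaneously, which is how the design avoids the corner points of the cube.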
Key Insight
The Box-Behnken design carefully balances the placement of points to capture quadratic effects while minimizing the number of experiments.
Visualizing a Box-Behnken Plan
A Box-Behnken design for 3 factors creates a cube with edge-center points and a center point.
Key Limitations
Not defined for 2 factors; requires at least 3 factors.
Inefficient for large numbers of factors compared to other RSM methods (e.g., Central Composite Designs).
Assumes quadratic effects dominate; may miss higher-order interactions.
Practical Challenges
Limited flexibility for factor ranges; scaling is often required.
Requires careful validation to ensure results are robust.
Example Limitation
With 5 factors, a Box-Behnken design requires significantly more experiments than a fractional factorial design, reducing efficiency.
Key Insight
While Box-Behnken designs are efficient for 3–4 factors, alternative methods may be better for larger systems.
Key Considerations
The choice of experimental design depends on:
The number of factors (\(k\)) and their levels (\(L\)).
The type of effects to be studied (main effects, interactions, quadratic terms).
The available resources (time, budget, and experimental feasibility).
Design Overview
| Design Type | When to Use | Key Features |
|---|---|---|
| Full Factorial | Small number of factors (\(k \leq 3\)); all interactions important | Comprehensive insights but resource-intensive |
| Fractional Factorial | Screening for main effects when \(k \geq 4\) | Fewer experiments with confounding of higher-order interactions |
| Box-Behnken | Curvature modeling; \(k = 3\) to \(5\) | No corner points; robust for quadratic effects |
| Central Composite | Curvature modeling; \(k \geq 5\) | Includes axial points; flexible for large ranges |
| ... | ... | ... |
Full Factorial
Includes main effects and all interactions:
\( Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3 + \beta_{123} X_1 X_2 X_3 \)
Fractional Factorial
Includes only main effects and selected interactions (e.g., confounded terms):
\( Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2 \)
where \(X_3 = X_1 X_2\), so \(\beta_{12}\) cannot be separated from the main effect of \(X_3\)
Box-Behnken
Captures quadratic effects without corner points:
\( Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3 + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{33} X_3^2 \)
Comparison of Models
Full Factorial: Captures all effects but resource-intensive.
Fractional Factorial: Simplified model, ideal for screening.
Box-Behnken: Focuses on quadratic effects, avoids extremes.
Backstory
A water treatment facility is seeking to optimize its process for removing pollutants. The response variable of interest is the removal efficiency (%).
Two factors have been identified as critical to the process:
Temperature (\(X_1\)): Range \(20^\circ\text{C}\) to \(60^\circ\text{C}\)
pH (\(X_2\)): Range 6 to 8
Objective
To better understand the influence of these factors, you need to design and execute an experimental plan.
The goal is to identify the factor levels that maximize removal efficiency.
Key Questions
What type of experimental design should you use?
How many experiments are necessary to capture the relevant effects?
Which effects (e.g., main effects, interactions, quadratic effects) should be included in the model?
Your Task
Develop a design plan to investigate the effects of temperature and pH on pollutant removal efficiency.
Create a design matrix that captures the desired effects.
Data