Minimum Energy Designs – from Nanostructure Synthesis to Sequential Optimization

C. F. Jeff Wu+ (joint with Roshan Joseph+ and Tirthankar Dasgupta*) • +Georgia Institute of Technology • *Harvard University

What are Nanostructures? • Functional structures designed from the atomic or molecular scale with at least one characteristic dimension measured in nanometers (1 nm = $10^{-9}$ meter). • Exhibit novel and significantly improved physical, chemical and biological properties, phenomena and processes. • Building blocks for nano-devices. • Likely to impact many fields ranging from electronics, photonics and optoelectronics to life sciences and healthcare.

Statistical modeling and analysis for robust synthesis of nanostructures • Dasgupta, Ma, Joseph, Wang and Wu (2008), J. Amer. Stat. Assoc. • Robust conditions for synthesis of Cadmium Selenide (CdSe) nanostructures derived • New sequential algorithm for fitting multinomial logit models. • Internal noise factors considered.

Fitted quadratic response surfaces & optimal conditions

The need for more efficient experimentation • A 9x5 full factorial experiment was too expensive and time-consuming. • Quadratic response surface did not capture nanowire growth satisfactorily (generalized $R^2$ was 50% for the CdSe nanowire sub-model).

What makes exploration of optimum difficult? • Complete disappearance of morphology in certain regions leading to large, disconnected, non-convex yield regions. • Multiple optima. • Expensive and time-consuming experimentation • 36 hours for each run • Gold catalyst required

“Actual” contour plot of CdSe nanowire yield • Obtained by averaging yields over different substrates. • Large no-yield region (deep green). • Small no-yield region embedded within yield regions. • Scattered regions of highest yield.

How many trials are needed to hit the point of maximum yield? (Figure axes: Pressure, Temperature)

A 5x9 full-factorial experiment • 17 out of 45 trials wasted (no morphology)! • Yield = f(temperature, pressure)

Why are traditional methods inappropriate? • Need a sequential approach to keep run size to a minimum. • Fractional factorials / orthogonal arrays • Large number of runs as the number of levels increases. • Several no-morphology scenarios possible. • Do not facilitate sequential experimentation. • Response Surface Methods • Complexity of response surface. • Categorical (binary in the extreme case) responses possible. • No clever search algorithm.

The Objective • To find a design strategy that • Is model-independent, • Can “carve out” regions of no-morphology quickly, • Allows for exploration of complex response surfaces, • Facilitates sequential experimentation.

Pros and Cons of space filling designs • LHD (McKay et al. 1979), Uniform designs (Fang 2002) are primarily used for computer experiments. • Can be used to explore complex surfaces with small number of runs. • Model free. • Not designed for sequential experimentation. • No provision to carve out regions of no-morphology quickly.

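Sequential Minimum Energy Designs (SMED) • Physical connection: treat design points as positively charged particles. • Charge inversely proportional to yield, e.g., q = 1 - yield. • Potential energy of two particles a distance d apart: $E = K q_1 q_2 / d$. • Illustration (vertical axis: Pressure): a point with Y = 40% carries charge $q_1 = 0.6$; a point with Y = 0 carries charge $q_2 = 1.0$.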

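What position will a newly introduced particle occupy? • The one at which the total potential energy is minimized! • Illustration (vertical axis: Pressure): existing particles with charges $q_1 = 0.6$ and $q_2 = 1.0$.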
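The charged-particle picture can be tried out numerically. The sketch below is only an illustration: the particle positions, the unit charge given to the new particle, the constant K = 1, and the 11 x 11 candidate grid are all assumptions made for the example, not values from the talk.

```python
import math

def total_energy(candidate, q_candidate, points, charges, K=1.0):
    """Sum of pairwise energies K * q_c * q_i / d_i between a candidate
    position and the particles already placed."""
    energy = 0.0
    for (x, y), q in zip(points, charges):
        d = math.hypot(candidate[0] - x, candidate[1] - y)
        if d == 0.0:
            return math.inf  # cannot sit on top of an existing particle
        energy += K * q_candidate * q / d
    return energy

# Hypothetical positions for the two particles in the illustration.
points = [(0.2, 0.2), (0.8, 0.8)]
charges = [0.6, 1.0]  # q1 = 0.6 (yield 40%), q2 = 1.0 (no yield)

# Candidate positions on a coarse grid over the unit square (assumed region).
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]

# The new particle (assumed unit charge) settles where the energy is lowest.
best = min(grid, key=lambda c: total_energy(c, 1.0, points, charges))
print(best, total_energy(best, 1.0, points, charges))
```

Because energy grows with charge and shrinks with distance, the minimizer stays away from both particles and is pushed hardest by the no-yield one.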

Key idea • Pick a point x. • Conduct the experiment at x and observe the yield p(x). • Assign a charge q(x) inversely proportional to p(x), e.g., q(x) = 1 - p(x). • Use q(x) to update your knowledge about yields at various points in the design space. • Pick the next point as the one that minimizes the total potential energy in the design space.

The next design point

How the algorithm works

Inverse distance weighting as interpolator • Not yet an algorithm: q(x) needs to be “predicted”. • Use inverse distance weighting to assign charges to each yellow point based on yields observed at red (sampled) points. • The yellow point that minimizes the potential energy with the four red points is the next choice.
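A minimal sketch of this step, assuming the common form of inverse distance weighting in which weights are proportional to 1/d^power. The exponent `power=2`, the function names `idw_charge` and `next_point`, and the example points are hypothetical; the talk's exact weighting formula is not reproduced here.

```python
import math

def idw_charge(x, sampled, charges, power=2.0):
    """Predict the charge at x as a distance-weighted average of the
    charges at the sampled points (weights 1/d**power, an assumption)."""
    num = den = 0.0
    for s, q in zip(sampled, charges):
        d = math.dist(x, s)
        if d == 0.0:
            return q  # x is a sampled point: return its observed charge
        w = 1.0 / d ** power
        num += w * q
        den += w
    return num / den

def next_point(candidates, sampled, charges):
    """Pick the unsampled candidate with the lowest total potential energy."""
    def energy(c):
        q_c = idw_charge(c, sampled, charges)
        return sum(q_c * q / math.dist(c, s) for s, q in zip(sampled, charges))
    unsampled = [c for c in candidates if c not in sampled]
    return min(unsampled, key=energy)

# Four sampled ("red") points with charges, and a grid of candidates.
sampled = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
charges = [1.0, 1.0, 0.6, 0.2]
candidates = [(i / 4, j / 4) for i in range(5) for j in range(5)]
print(next_point(candidates, sampled, charges))
```

The predicted charge is low near low-charge (high-yield) points, so candidates in promising regions are penalized less for being close to existing runs.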

The SMED algorithm
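One way the steps under "Key idea" and the inverse-distance interpolator could be chained into a loop is sketched below. It is not a statement of the authors' algorithm: the charge function `(1 - a*p)**g`, the candidate grid, the IDW exponent, and the toy yield surface are all assumptions chosen for illustration.

```python
import math

def charge(p, a=1.0, g=1.0):
    # Assumed charge function: high for low yield, zero once a * p reaches 1.
    return max(0.0, 1.0 - a * p) ** g

def idw(x, points, values, power=2.0):
    # Inverse-distance-weighted prediction at an unsampled point x.
    weights = [1.0 / math.dist(x, s) ** power for s in points]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def smed(yield_fn, candidates, start, n_runs, a=1.0, g=1.0):
    """Run a sequential minimum-energy search over a finite candidate set."""
    design = [start]
    charges = [charge(yield_fn(start), a, g)]
    while len(design) < n_runs:
        remaining = [c for c in candidates if c not in design]
        if not remaining:
            break

        def energy(c):
            q_c = idw(c, design, charges)  # predicted charge at candidate
            return sum(q_c * q / math.dist(c, s)
                       for s, q in zip(design, charges))

        nxt = min(remaining, key=energy)               # lowest-energy point
        design.append(nxt)
        charges.append(charge(yield_fn(nxt), a, g))    # run the experiment
    return design, charges

def toy_yield(x):
    # Hypothetical yield: a single peak at (0.7, 0.7), zero far from it.
    return max(0.0, 1.0 - 4.0 * ((x[0] - 0.7) ** 2 + (x[1] - 0.7) ** 2))

grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
design, charges = smed(toy_yield, grid, start=(0.0, 0.0), n_runs=15)
print(design)
```

Restricting the search to a grid keeps the minimization trivial; a continuous optimizer could replace the `min` over candidates.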

Choice of a • Because , where . • Lemma 1: For $a = 1/p_g$, if $x_n = x_g$ for some $n = n_0$, then $x_n = x_g$ for all $n \ge n_0$. Once it reaches $x_g$, SMED will stick to the global optimum (i.e., total energy ). • Undesirable to choose $a < 1/p_g$; see Theorem 2 later.

Choice of tuning constants • In practice, $p_g$ will not be known. • Thus a will be estimated iteratively. • First, let’s examine the performance for deterministic yield functions with fixed a ($a = p_g^{-1}$) and g.

Performance with known a

Performance with known a (with different starting points and g=1)

Convergence of SMED

Proof (Continued) For any , Since is a convergent sequence and , of as , a contradiction. □

Divergence of SMED with wrong a • Theorem 2. Under the same assumptions, if $a < 1/p_g$, then the SMED sequence is a dense subset of the design region. • Proof based on similar ideas. • Implications: the SMED sequence will visit every part of the design region, an erratic behavior like the Peano curve. • The proofs reveal how the charges q and the distances d work together to move the sequence toward the optima.

Accelerated SMED • For a convergent subsequence, its d values → 0. Then the corresponding q values must also go to 0, which explains why $a = 1/p_g$. • By flipping this argument, we can move a SMED subsequence quickly out of a region with low q values (i.e., get out of a peak already identified) by redefining the q values for this subsequence to a much higher value. This will force SMED to move quickly out of the region.

Performance Comparison SMED Accelerated SMED

Criteria for estimator of a

Iterative estimation of a Fit the logistic model Where is the asymptotic value of the fitted logistic curve. Use

Some performance measures for $n_0$-run designs

Performance evaluation with nanowire yield data

Modified Branin function • The Branin function, a standard test function in global optimization, has three global minima. • To create a large nonconvex and disconnected no-yield region, use modified Branin function where
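For reference, the unmodified Branin function in its usual textbook parameterization is sketched below, together with a check of its three global minima. The modification used in the talk to create the no-yield region is not reproduced, since its definition is not given above.

```python
import math

def branin(x1, x2):
    """Standard Branin test function, usually evaluated on
    -5 <= x1 <= 10, 0 <= x2 <= 15."""
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    t = 1.0 / (8.0 * math.pi)
    return (x2 - b * x1 ** 2 + c * x1 - 6.0) ** 2 \
        + 10.0 * (1.0 - t) * math.cos(x1) + 10.0

# The three global minimizers; each attains the value 10 / (8 * pi).
minimizers = [(-math.pi, 12.275), (math.pi, 2.275), (3.0 * math.pi, 2.475)]
for x1, x2 in minimizers:
    print((round(x1, 4), x2), round(branin(x1, x2), 6))
```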

Performance with modified Branin function

Performance with modified Branin function (contd.)

Random functions • In actual practice the yield function is random. • We actually observe

Performance of usual algorithm with random functions • Result of 100 simulations, starting point = (0,0). • Concern: as r decreases, the number of cases in which the global optimum is identified reduces.

Improved SMED for random response • Instead of an interpolating function, use a smoothing function to predict yields (and charges) at unobserved points. • Update the charges of selected points as well, using the smoothing function. • Local polynomial smoothing used. • Two parameters: • $n_T$ (threshold number of iterations after which smoothing is started). • l (smoothing constant; small l: local fitting).
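To make the role of the smoothing constant concrete, here is a minimal sketch of the simplest local polynomial smoother, the degree-0 (local constant) fit with a Gaussian kernel. Treating l as the kernel bandwidth, the kernel choice, and the function name `smooth_yield` are assumptions; the talk does not state the polynomial degree or kernel.

```python
import math

def smooth_yield(x, sampled, yields, l=0.2):
    """Kernel-weighted average of noisy yields around x.
    Small l: only nearby runs matter (local fitting).
    Large l: the estimate approaches the overall mean."""
    weights = [math.exp(-0.5 * (math.dist(x, s) / l) ** 2) for s in sampled]
    total = sum(weights)
    if total == 0.0:
        # Every run is far away relative to l: fall back to the nearest run.
        nearest = min(range(len(sampled)),
                      key=lambda i: math.dist(x, sampled[i]))
        return yields[nearest]
    return sum(w * y for w, y in zip(weights, yields)) / total

# Two noisy replicates at (0, 0) and one run at (1, 1).
sampled = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0)]
yields = [0.4, 0.6, 0.0]
print(smooth_yield((0.0, 0.0), sampled, yields, l=0.1))    # near 0.5
print(smooth_yield((0.0, 0.0), sampled, yields, l=100.0))  # near 1/3
```

Unlike an interpolator, the smoother does not reproduce each noisy observation, which is why the charges of already selected points can be updated from it as well.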

Improved performance with smoothing algorithm, r = 10

Summary • A new sequential space-filling design, SMED, proposed. • SMED is model-independent, can quickly “carve out” no-morphology regions and allows for exploration of complex surfaces. • Originates from the laws of electrostatics. • Some desirable convergence properties. • Modified algorithm for random functions. • Performance studied using nanowire data, modified Branin (2-dimensional) and Levy-Montalvo (4-dimensional) functions.

Predicting the future • (Cartoon; characters labeled Stat and Nano) “Use my SMED!” “What the hell! I don’t want to use this stupid strategy for experimentation!” • Image courtesy: www.cartoonstock.com

Thank you

How many trials? Let’s try one-factor-at-a-time! • Could not find optimum • Almost 50% of trials wasted (no yield) • Too few data for statistical modeling (Figure axes: Pressure, Temperature)

Sequential experimentation strategies for global optimization • SDO, a grid-search algorithm by Cox and John (1997) • Initial space-filling design. • Prediction using Gaussian Process Modeling. • Lower bounds on predicted values used for sequential selection of evaluation points. • Jones, Schonlau and Welch (1998) • Similar to SDO. • Expected Improvement (EI) Criterion used. • Balances the need to exploit the approximating surface with the need to improve the approximation.
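The expected improvement criterion has a closed form when the prediction at a point is Gaussian with mean `mu` and standard deviation `sigma`. The sketch below states it for minimization; the function and argument names are illustrative, and the Gaussian process fit that would supply `mu` and `sigma` is not shown.

```python
import math

def expected_improvement(mu, sigma, f_min):
    """Expected improvement over the current best value f_min (minimization)
    for a Gaussian prediction with mean mu and standard deviation sigma."""
    if sigma <= 0.0:
        return max(f_min - mu, 0.0)  # no uncertainty: plain improvement
    z = (f_min - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # First term rewards a good predicted value (exploitation),
    # second term rewards high uncertainty (exploration).
    return (f_min - mu) * cdf + sigma * pdf

print(expected_improvement(mu=0.0, sigma=1.0, f_min=0.0))
print(expected_improvement(mu=1.0, sigma=0.0, f_min=2.0))
```

The two terms make the balance between exploiting the approximating surface and improving the approximation explicit.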

Why they are not appropriate • Most of them are good for multiple optima, but do not shrink the experimental region fast. • Algorithms that reduce the design space (Henkenjohann et al. 2005) assume connected and convex failure regions. • Initial design may contain several points of no-morphology. • Current scenario focuses more on quickly shrinking the design space.

Performance in higher-dimensions (Levy-Montalvo function)
