Statistical Modeling in Stochastic Dynamic Programming for a Decision-Making Framework
Dr. Julia C. Tsai
Krannert School of Management
Purdue University
December 15, 2003
Outline
• Decision-Making Framework
• Stochastic Dynamic Programming
• Statistical Modeling within the DMF
• Multivariate Adaptive Regression Splines
• Parallel MARS
• Flexible Implementations of MARS
• DMF Results
• Conclusions
A Modular Decision-Making Framework
For time period/level/stage t:
• x_t = state of the system
• u_t = decision/control
Stochastic Dynamic Programming
• Solves a problem that unfolds over multiple periods/levels/stages
• Applications:
• Inventory Forecasting:
-- Up to 9 dimensions (Chen 1999)
• Airline Revenue:
-- 31 flight legs (Chen, Günther, Johnson 2000)
• Wastewater Treatment System:
-- 20 dimensions (Tsai et al. 2002)
Inventory Forecasting
Modeled by Heath and Jackson (1991) using the Martingale Model of Forecast Evolution
Objective: Minimize inventory holding and backorder costs.
Time Periods/Levels/Stages: Months, weeks.
State x_t at the beginning of Stage t: Inventory levels and product forecasts.
Decision u_t in Stage t: Amount ordered.
Constraints: Capacities on order quantities.
Random Variables: Errors in the forecasts.
Transition: For inventory, x_{t+1} = x_t - demand + order quantity.
Airline Revenue Management
Research with Ellis Johnson (Georgia Tech), Dirk Günther (Sabre), and Jay Rosenberger (UTA)
Objective: Maximize revenue before a specified departure date.
Time Periods/Levels/Stages: Weeks, days.
State x_t at the beginning of Stage t: Remaining capacities on the flight legs in the network.
Decision u_t in Stage t: Accept or reject a customer's airfare request for a specified origin-destination itinerary.
Constraints: Capacities on flight legs.
Random Variables: Customer demand.
Transition: x_{t+1} = x_t - (# seats sold in stage t).
Wastewater Treatment System [1]
• 11-level liquid line and 6-level solid line
• At each level, select one of several unit processes to complete the treatment system
• Objectives:
-- Evaluate various technologies in different levels
-- Identify which technologies should be explored more in the future
[1] Developed by Dr. Bruce Beck and Dr. Jining Chen
Technology Units Liquid Line:
Objectives of the SDP:
• To minimize
-- Economic Cost (Capital & Operating)
-- Odor Emissions
-- Size of treatment system (land area or volume)
• or maximize
-- Robustness against extreme conditions
-- Desirability of the global environment
• Constraints:
1. Cleanliness of the influent entering each level
2. Stringent clean water targets exiting the final level of the system
Stochastic Dynamic Programming (SDP)
Objective: Minimize expected cost over T stages.
Optimal Value Function F_t(x_t) in Stage t: Minimum expected cost to operate the system over stages t through T.
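In symbols, the definition of the optimal value function is commonly written as the recursion below. This is a sketch in assumed notation: the stage cost c_t, the random vector ε_t, the transition function f_t, and the constraint set Γ_t are introduced here for illustration; the slides name only x_t, u_t, and F_t.

```latex
F_t(x_t) \;=\; \min_{u_t \in \Gamma_t} \;
  \mathbb{E}_{\epsilon_t}\!\left[\, c_t(x_t, u_t, \epsilon_t) + F_{t+1}(x_{t+1}) \,\right],
\qquad x_{t+1} = f_t(x_t, u_t, \epsilon_t),
\qquad F_{T+1} \equiv 0 .
```

Solving backward from t = T to t = 1 yields F_1, the minimum expected cost over all T stages.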
Algorithm for Continuous-State SDP
• Choose S discretization points in the state space.
• In each stage t = T, …, 1:
-- At each discretization point x_j, j = 1, …, S: minimize the expected cost to obtain F_t(x_j).
-- Approximate F_t(x) with a fitted statistical model (Chen, Ruppert, Shoemaker 1999).
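The backward loop above can be sketched on a toy one-dimensional inventory problem. Everything specific here is an assumption made for illustration, not taken from the slides: the function name `solve_sdp`, the cost rates `h` and `b`, the order and demand sets, and especially the use of piecewise-linear interpolation as a simple stand-in for the statistical model of the future value function.

```python
def solve_sdp(T=3, S=16, x_min=-5.0, x_max=10.0,
              orders=(0, 2, 4), demands=(1, 3), h=1.0, b=4.0):
    """Backward recursion on a grid; returns the grid and a value table per stage."""
    # S evenly spaced discretization points in the state space (hypothetical design)
    grid = [x_min + (x_max - x_min) * j / (S - 1) for j in range(S)]
    step = grid[1] - grid[0]

    def interp(values, x):
        # Piecewise-linear approximation of the future value function,
        # clamped to the grid; a placeholder for a fitted statistical model.
        x = min(max(x, grid[0]), grid[-1])
        i = min(int((x - grid[0]) / step), S - 2)
        w = (x - grid[i]) / step
        return (1 - w) * values[i] + w * values[i + 1]

    future = [0.0] * S          # F_{T+1} = 0
    tables = {}
    for t in range(T, 0, -1):   # stages T, ..., 1
        current = []
        for x in grid:          # each discretization point x_j
            best = float("inf")
            for u in orders:    # candidate decisions
                expected = 0.0
                for d in demands:   # equally likely demand outcomes
                    nxt = x - d + u  # transition: inventory - demand + order
                    cost = h * max(nxt, 0.0) + b * max(-nxt, 0.0)
                    expected += (cost + interp(future, nxt)) / len(demands)
                best = min(best, expected)
            current.append(best)
        tables[t] = current
        future = current
    return grid, tables


grid, tables = solve_sdp()
print(tables[1][grid.index(2.0)])  # expected 3-stage cost starting from inventory 2
```

In a high-dimensional setting the full grid would be replaced by an experimental design and the interpolation by a fitted model, which is the role the statistical modeling step plays in the framework.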
Statistical Modeling Process
SDP stages are solved in the order t+1, t, t-1. Within SDP Period/Stage/Level t:
Experimental Design → State Vector Values → Optimization → Data for the Future Value Function → Statistical Model → Estimated Future Value Function
The estimated future value function is then passed on to SDP Period/Stage/Level t-1.
Design of Experiments
Each experimental run
• sets each factor at a specific level
• corresponds to a point in the n-dimensional space
Design of Experiments Options
• FF: Full factorial or complete grid designs
• OA: Orthogonal array designs (Bose and Bush 1952, Chen 2001)
• LH: Latin hypercube designs (McKay et al. 1979)
• OA-LH: Hybrid (Tang 1993)
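As a concrete illustration of one of these options, the sketch below generates a basic random Latin hypercube design on the unit cube. The function name `latin_hypercube` and the choice to jitter each point uniformly within its stratum are assumptions for this example; the cited papers may construct their designs differently.

```python
import random


def latin_hypercube(n_points, n_factors, seed=0):
    """Return n_points design points in [0, 1)^n_factors.

    Each factor's range is cut into n_points equal strata, and every
    stratum is used exactly once per factor.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        strata = list(range(n_points))
        rng.shuffle(strata)  # random pairing of strata across factors
        columns.append([(s + rng.random()) / n_points for s in strata])
    # Transpose the per-factor columns into one tuple per experimental run
    return [tuple(col[i] for col in columns) for i in range(n_points)]


design = latin_hypercube(5, 3)
for run in design:
    print(run)
```

A full factorial with the same 5 levels per factor would need 5^3 = 125 runs; the Latin hypercube covers every level of every factor with only 5.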
Orthogonal Array Designs
• OA Parameters:
-- n factors
-- strength d (d < n)
-- p levels
-- frequency λ
• When projected down onto any d dimensions, it produces a FF grid of p^d points replicated λ times.
• A LH design is equivalent to an OA of strength 1.
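The projection property can be checked directly. The following sketch, with the hypothetical helper name `oa_frequency` and levels coded as integers 0, …, p-1, tests whether every projection onto d columns is a full grid with a common replication count.

```python
from collections import Counter
from itertools import combinations, product


def oa_frequency(design, levels, strength):
    """Return the common frequency if `design` is an orthogonal array of the
    given strength, otherwise None."""
    n_factors = len(design[0])
    freq = None
    for cols in combinations(range(n_factors), strength):
        counts = Counter(tuple(row[c] for c in cols) for row in design)
        for combo in product(range(levels), repeat=strength):
            if freq is None:
                freq = counts[combo]
            # Every level combination must appear, and equally often
            if counts[combo] != freq or freq == 0:
                return None
    return freq


# A 4-run design with 3 two-level factors
oa = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(oa_frequency(oa, levels=2, strength=2))  # 1
print(oa_frequency(oa, levels=2, strength=3))  # None

# A 3-point Latin hypercube (levels = number of points) passes as strength 1
lh = [(0, 2), (1, 0), (2, 1)]
print(oa_frequency(lh, levels=3, strength=1))  # 1
```

The last call illustrates the statement that a Latin hypercube behaves as an orthogonal array of strength 1: each one-dimensional projection hits every level exactly once.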
Cubic Regression Splines Univariate cubic regression splines commonly have the form:
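One widely used form is the truncated power basis, written out here as a sketch. The number of knots K, the knot locations k_j, and the coefficients β are notation assumed for this example and may differ from the formula on the original slide.

```latex
f(x) \;=\; \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3
      \;+\; \sum_{j=1}^{K} \beta_{3+j}\, \big[(x - k_j)_+\big]^3 ,
\qquad (z)_+ = \max(0, z).
```

Each truncated term is zero to the left of its knot, so the fitted curve is a cubic polynomial between knots with continuous first and second derivatives at every knot.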
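Multivariate Adaptive Regression Splines
Basis-function tree: a split on variable v_a at knot k_a produces B1 (-) and B2 (+); a further split of B1 on variable v_b at knot k_b produces B3 (-) and B4 (+).
B1 = H[-(X_va - k_a)], B2 = H[+(X_va - k_a)]
B3 = H[-(X_va - k_a)] H[-(X_vb - k_b)]
B4 = H[-(X_va - k_a)] H[+(X_vb - k_b)]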
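The four basis functions can be evaluated as below. This sketch assumes that H is the truncated linear function H(z) = max(0, z), the usual choice in MARS; the slide does not define H, so treat that as an assumption. The function name `mars_basis` and the sample point are also illustrative.

```python
def H(z):
    """Truncated linear function (assumed form of H): max(0, z)."""
    return max(0.0, z)


def mars_basis(x, va, ka, vb, kb):
    """Evaluate B1..B4 at point x for a split on variable va at knot ka,
    followed by a split of B1 on variable vb at knot kb."""
    b1 = H(-(x[va] - ka))
    b2 = H(+(x[va] - ka))
    b3 = b1 * H(-(x[vb] - kb))  # two-way interaction, child of B1
    b4 = b1 * H(+(x[vb] - kb))  # two-way interaction, child of B1
    return b1, b2, b3, b4


print(mars_basis((1.0, 5.0), va=0, ka=3.0, vb=1, kb=4.0))  # (2.0, 0.0, 0.0, 2.0)
```

At any point, at most one function of each mirrored pair is nonzero, which is what lets MARS fit different slopes on either side of a knot.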
MARS Forward Stepwise
1. Loop through potential new basis functions:
• Select parent basis function m
• Select variable v
• Select knot k
2. For each m, v, k:
• Compute lack-of-fit
• Compare to current best based on lack-of-fit
3. For the best m, v, k: Create two new basis functions
4. Continue searching for new basis functions until the stopping rule (e.g. Mmax) is met
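A single pass through steps 1 to 3 might look like the following pure-Python sketch. Several choices are assumptions made to keep the example small: lack-of-fit is taken to be the residual sum of squares of a least-squares fit, knots are tried at every observed data value, a tiny ridge term keeps the normal equations solvable when a candidate column is all zeros, and the names `solve`, `lack_of_fit`, and `forward_step` are hypothetical.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def lack_of_fit(columns, y, ridge=1e-8):
    """Residual sum of squares of a (lightly ridged) least-squares fit."""
    m = len(columns)
    A = [[sum(a * b for a, b in zip(columns[i], columns[j])) + (ridge if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    rhs = [sum(a * t for a, t in zip(columns[i], y)) for i in range(m)]
    beta = solve(A, rhs)
    fitted = [sum(beta[i] * columns[i][n] for i in range(m)) for n in range(len(y))]
    return sum((t - f) ** 2 for t, f in zip(y, fitted))


def forward_step(X, y, basis):
    """Try every parent m, variable v, and knot k; add the best mirrored pair."""
    best = None
    for m, parent in enumerate(basis):
        for v in range(len(X[0])):
            for k in sorted({row[v] for row in X}):
                neg = [p * max(0.0, -(row[v] - k)) for p, row in zip(parent, X)]
                pos = [p * max(0.0, +(row[v] - k)) for p, row in zip(parent, X)]
                lof = lack_of_fit(basis + [neg, pos], y)
                if best is None or lof < best[0]:
                    best = (lof, m, v, k, neg, pos)
    lof, m, v, k, neg, pos = best
    return basis + [neg, pos], (m, v, k, lof)


# Toy data: y = |x - 4| has a single kink, so the best knot should be 4
X = [[float(i)] for i in range(11)]
y = [abs(row[0] - 4.0) for row in X]
basis, (m, v, k, lof) = forward_step(X, y, [[1.0] * len(X)])
print(m, v, k)
```

Step 4 would wrap `forward_step` in a loop that ends when the number of basis functions reaches Mmax. The triple loop over m, v, and k is what makes the search expensive.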
Parallel MARS
• Master-Slave paradigm
• Software: MPI (Message Passing Interface)
Parallel MARS Algorithm
• C0: Initialization/data processing.
• C1, C2, …, C(P-1): run in parallel.
• C0: Select the overall best knot and update the basis functions (b.f.).
• Meet Mmax? NO: repeat the search. YES: STOP.
Parallel Performance
Measure:
• tP: Time using Parallel MARS with P processors
• t1: Time using Parallel MARS with 1 processor
• Speedup (SP) = t1 / tP
Computing Facility:
• Processor: 550 MHz Pentium III Xeon
• Storage: 4 GB RAM, 18 GB SCSI disk
• OS: RedHat Linux 7.1
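A worked example of the speedup measure, using made-up timings (the numbers below are hypothetical and are not results from the talk). Parallel efficiency, SP divided by P, is a standard companion measure that the slides do not mention; it is included here only as an assumption about what a reader might compute next.

```python
def speedup(t1, tP):
    """Speedup S_P = t1 / tP."""
    return t1 / tP


def efficiency(t1, tP, P):
    """Parallel efficiency S_P / P (not defined in the slides)."""
    return speedup(t1, tP) / P


# Hypothetical timings: 120 s on 1 processor, 40 s on 4 processors
print(speedup(120.0, 40.0))        # 3.0
print(efficiency(120.0, 40.0, 4))  # 0.75
```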
Results Speedup vs. No. of Processors [ N = 289, K = 35]
The Drawbacks of MARS
• Mmax is difficult to select
-- Different SDP time periods may require different Mmax for a good approximation
-- Computational effort required to identify the best Mmax for each time period is impractical
• Multiple basis functions can be "equivalently" good based on lack-of-fit
• MARS is a greedy algorithm
-- Final approximation may involve more higher-order interaction terms than necessary
ASR-MARS (Automatic Stopping Rule)
• Uses R² and R²_a, the coefficient of determination and its adjusted version
• ASR-I: Stop the MARS approximation search when the improvement ΔR² or ΔR²_a falls below a tolerance
• ASR-II: Stop the MARS approximation search when the relative improvement ΔR²/R² or ΔR²_a/R²_a falls below a tolerance
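The two rules can be sketched as a check on the history of R² values recorded after each forward step. This reads both rules as thresholds on the change in R² between consecutive steps, which is an interpretation rather than something the slide spells out. The names `asr_stop` and `adjusted_r2`, the sample history, and the choice of the current R² as the denominator for ASR-II are all assumptions.

```python
def adjusted_r2(r2, n_obs, n_terms):
    """Standard adjusted R^2 for n_terms predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_terms - 1)


def asr_stop(r2_history, tol, relative=False):
    """Return the index of the first step whose gain falls below tol.

    relative=False mimics ASR-I (absolute gain);
    relative=True mimics ASR-II (gain divided by the current R^2).
    Returns len(r2_history) if the rule never triggers.
    """
    for i in range(1, len(r2_history)):
        gain = r2_history[i] - r2_history[i - 1]
        if relative:
            gain = gain / r2_history[i] if r2_history[i] != 0 else float("inf")
        if gain < tol:
            return i
    return len(r2_history)


history = [0.50, 0.80, 0.90, 0.9001, 0.95]  # hypothetical R^2 after each step
print(asr_stop(history, tol=0.0002))             # 3
print(asr_stop(history, tol=0.2, relative=True)) # 2
```

Passing a history of adjusted values from `adjusted_r2` instead of raw R² gives the R²_a variants of the same rules.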
Results
Mmax Relaxation: Slow vs. ASR-I (tolerance = 0.0002)
• Run Time
• MAD (mean absolute deviation) & M (number of basis functions)
Robust MARS
• Choose lower-order interaction terms
For example, if the highest allowable interaction order is 3, then three values I(i, Bi) are used to store the best basis function (Bi) of each order:
I(1, B1) = best among univariate options
I(2, B2) = best among two-way interaction options
I(3, B3) = best among three-way interaction options
Assume I(3, B3) > I(2, B2) > I(1, B1).
Selection flowchart: from Start, a first test either selects B3 as the best b.f. (NO) or proceeds (YES) to a second test, which selects B2 (NO) or B1 (YES).
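The tests inside the flowchart are not reproduced in the text, so the sketch below substitutes a purely hypothetical rule: step down to the next lower interaction order whenever the higher-order candidate improves on it by less than a tolerance. The function name `choose_basis`, the tolerance, and the sample values of I are all assumptions. It illustrates the idea of preferring lower-order terms and should not be read as the rule used in the talk.

```python
def choose_basis(improvement, tol):
    """Pick an interaction order given improvement[order] = I(order, B_order).

    Hypothetical rule: start from the highest order and move to the next
    lower order while the gain from the higher order is below tol.
    """
    orders = sorted(improvement, reverse=True)
    chosen = orders[0]
    for lower in orders[1:]:
        if improvement[chosen] - improvement[lower] < tol:
            chosen = lower   # higher order is not worth the extra complexity
        else:
            break
    return chosen


scores = {1: 0.50, 2: 0.58, 3: 0.60}  # made-up values with I(3) > I(2) > I(1)
print(choose_basis(scores, tol=0.01))  # 3
print(choose_basis(scores, tol=0.05))  # 2
print(choose_basis(scores, tol=0.10))  # 1
```

With two comparisons for a maximum order of 3, the control flow has the same shape as the flowchart: one exit to B3, one to B2, and a final fall-through to B1.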
Results Robust MARS Results
DMF Evaluation Measures
• Count = # times chosen as best
• MOD = mean overall deviation
• MLD = mean local deviation
• MLRD = mean local relative deviation
• A promising technology has higher Count and lower MOD, MLD, MLRD
Results DMF Solution (Count): Slow vs. ASR
Conclusions
• Parallel-MARS: Speedup becomes more significant as Mmax increases
• ASR-MARS: Greatly reduced the runtime of the statistical modeling process, and selected the same promising technologies as the "Slow" Mmax relaxation
• Robust MARS: Reduced the mean absolute deviation on the test data set, which suggests a better statistical model