Dynamic Single Machine Scheduling Using Q-Learning
Outline • Description of Problem (Dynamic Single Machine Scheduling Problem) • The work of Q-learning • Action&State • Simulated annealing-based Q-learning
Description of Problem • A finite set of n jobs • Each job consists of a chain of operations • A finite set of m machines • Each machine can handle at most one operation at a time • Each operation needs to be processed during an uninterrupted period of a given length on a given machine • The purpose is to find a schedule, that is, an allocation of the operations to time intervals on machines, that has minimal length
[Figure: jobs from other machines converge on a bottleneck machine, illustrating single machine scheduling]
Description of Problem(cont.) • Dispatching Rules • Essentially “selecting” the next job from the queue in front of a machine based on a rule. • Common rules: • SPT – shortest processing time • EDD – earliest due date • FCFS – first come, first served • LTWK – least total work • LWKR – least work remaining • WINQ – work in next queue • Probably hundreds of dispatching rules
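A dispatching rule is essentially a ranking criterion over the jobs waiting in the queue. The following is a minimal sketch of that idea; the Job fields and the small rule set are illustrative, not taken from the slides:

```python
from dataclasses import dataclass

# Illustrative job record; field names are assumptions for this sketch.
@dataclass
class Job:
    processing_time: float
    due_date: float
    arrival_time: float

# Each rule maps a job to the key it is ranked by; the smallest key wins.
RULES = {
    "SPT":  lambda job: job.processing_time,  # shortest processing time
    "EDD":  lambda job: job.due_date,         # earliest due date
    "FCFS": lambda job: job.arrival_time,     # first come, first served
}

def dispatch(queue, rule):
    """Return the job that the given dispatching rule would process next."""
    return min(queue, key=RULES[rule])
```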
Description of Problem(cont.) • No single rule has been found to perform well for all system objectives. • So we need an intelligent agent-based scheduling system. • The agent's function is to use the Q-learning technique to select a dispatching rule for the machine.
Outline • Description of Problem (Dynamic Single Machine Scheduling Problem) • The work of Q-learning • Action&State • Simulated annealing-based Q-learning
The work of Q-learning • Reinforcement learning problem • Direct utility estimation • Adaptive dynamic programming (ADP) • Temporal difference (TD)
The work of Q-learning(cont.) • Q-learning (Q(s, a)) • Q-learning learns a Q-function, which gives the expected utility of taking a given action in a given state. • Q(s, a) is updated whenever action a is executed in state s leading to state s'. • It lets the agent compare the values of its available choices without needing to know their outcomes. • U(s) = max_a Q(s, a) • More details in Chapter 21.
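A minimal sketch of the tabular Q-learning update described above; the learning rate and discount factor are generic illustrative values, not parameters from the slides:

```python
from collections import defaultdict

Q = defaultdict(float)      # Q[(state, action)] -> estimated utility
alpha, gamma = 0.1, 0.9     # learning rate and discount factor (illustrative)

def q_update(state, action, reward, next_state, actions):
    """One Q-learning step after executing `action` in `state` and observing
    `reward` and the resulting `next_state`."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

def utility(state, actions):
    """U(s) = max_a Q(s, a)."""
    return max(Q[(state, a)] for a in actions)
```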
Outline • Description of Problem (Dynamic Single Machine Scheduling Problem) • The work of Q-learning • Action&State • Simulated annealing-based Q-learning
Action&State(cont.) • Action A = {a1, a2, a3} • a1: SPT – shortest processing time • a2: EDD – earliest due date • a3: RR – round-robin • Environment's states (buffer) • We define different descriptions of the states according to different system objectives. • AST: average slack of the jobs waiting in the buffer. • MST: the maximum slack among the jobs waiting in the buffer. • PTJ: the number of tardy jobs at the current time.
Action&State(cont.) • AST (S1) • ct: the current time at which the agent makes a decision • ddi: job i's due date • epti: job i's expected processing time • n: the number of jobs waiting in the buffer at that time
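The AST, MST, and PTJ formulas appeared as figures in the original slides; the sketch below reconstructs them from the definitions above, taking a job's slack to be dd_i - ct - ept_i. The field names and the reading of "tardy" are assumptions:

```python
from collections import namedtuple

# Illustrative record for a job waiting in the buffer.
WaitingJob = namedtuple("WaitingJob", ["due_date", "expected_processing_time"])

def slack(job, ct):
    """Slack of a waiting job at decision time ct: dd_i - ct - ept_i."""
    return job.due_date - ct - job.expected_processing_time

def ast(buffer, ct):
    """AST (S1): average slack of the jobs waiting in the buffer."""
    return sum(slack(j, ct) for j in buffer) / len(buffer)

def mst(buffer, ct):
    """MST (S2): maximum slack among the jobs waiting in the buffer."""
    return max(slack(j, ct) for j in buffer)

def ptj(buffer, ct):
    """PTJ: number of waiting jobs whose due date has already passed at ct
    (one plausible reading of 'tardy jobs at the time')."""
    return sum(1 for j in buffer if j.due_date < ct)
```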
Action&State(cont.) • Reward function • The tardiness of the job (TT) • TT = ddi – fti, where fti is the actual finishing time of job i
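The exact reward formulas in the original slides were figures; the sketch below only illustrates how a reward signal could be built from the TT term defined above:

```python
def tardiness_term(due_date, finish_time):
    """TT = dd_i - ft_i as defined above: positive when job i finishes before
    its due date, negative when it finishes late."""
    return due_date - finish_time

def interval_reward(finished):
    """One plausible reward: the average TT over the (due_date, finish_time)
    pairs of jobs completed since the last dispatching decision."""
    if not finished:
        return 0.0
    return sum(tardiness_term(dd, ft) for dd, ft in finished) / len(finished)
```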
Action&State(cont.) • MST(S2)
Action&State(cont.) • Reward function
Outline • Description of Problem (Dynamic Single Machine Scheduling Problem) • The work of Q-learning • Action&State • Simulated annealing-based Q-learning
Simulated annealing-based Q-learning(cont.) • where ti is job i's arrival time and k is the coefficient of tightness, which represents the pressure of the job's due date.
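The details of the simulated annealing-based Q-learning algorithm are not reproduced here, so the sketch below only illustrates the general idea: action selection uses a Metropolis-style acceptance test whose temperature is annealed over time, so early decisions explore and later decisions exploit. The acceptance form and cooling schedule are assumptions:

```python
import math
import random

def sa_select_action(state, actions, Q, temperature):
    """Choose an action with a Metropolis-style acceptance criterion.

    A randomly proposed action replaces the greedy (max-Q) action with
    probability exp((Q_proposal - Q_greedy) / temperature), so a high
    temperature favors exploration and a low temperature favors exploitation.
    """
    greedy = max(actions, key=lambda a: Q[(state, a)])
    proposal = random.choice(actions)
    delta = Q[(state, proposal)] - Q[(state, greedy)]
    if delta >= 0 or random.random() < math.exp(delta / temperature):
        return proposal
    return greedy

def anneal(temperature, rate=0.99, minimum=1e-3):
    """Geometric cooling applied after each dispatching decision (illustrative)."""
    return max(minimum, temperature * rate)
```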
Conclusion • How should the Q-learning parameters be set? • If the state's parameters change, what will happen? • Would using other dispatching rules be better?