SYNTHESIS of PIPELINED SYSTEMS for the CONTEMPORANEOUS EXECUTION of PERIODIC and APERIODIC TASKS with HARD REAL-TIME CONSTRAINTS Paolo Palazzari Luca Baldini Moreno Coli ENEA – Computing and Modeling Unit University “La Sapienza” – Electronic Engineering Dep’t
Outline of Presentation • Problem statement • Asynchronous events • Mapping methodology • Searching space • Optimization by RT-PSA Algorithm • Results • Conclusions
Problem Statement • We want to synthesize a synchronous pipelined system which executes both the periodic task PSy, sustaining its throughput, and m mutually exclusive aperiodic tasks PAs1, PAs2, …, PAsm, whose activations are randomly triggered and whose results must be produced within a prefixed time.
Problem Statement • We represent the tasks as Control Data Flow Graphs (CDFG) G = (N, E), where N = {n1, n2, …, nN} is the set of the operations of the task and E is the set of edges expressing data and control dependencies.
Problem Statement • Aperiodic tasks, characterized by random execution requests and called asynchronous to mark the difference with the synchronous nature of the periodic task, are subject to Real-Time constraints (RTC), collected in the set RTCAs = {RTCAs1, RTCAs2, ..., RTCAsm}, where RTCAsi contains the RTC on the ith aperiodic task. • Input data for the synchronous task PSy arrive with frequency fi = 1/Δt, where Δt is the period characterizing PSy.
Problem Statement We present a method to determine • The target architecture: a (nearly) minimum set of HW devices to execute all the tasks (synchronous and asynchronous); • The feasible mapping onto the architecture: the allocation and the scheduling on the HW resources so that PSy is executed sustaining its prefixed throughput and all the mutually exclusive asynchronous tasks PAs1, PAs2, …, PAsm satisfy the constraints in RTCAs.
Problem Statement • The adoption of a parallel system can be mandatory when the Real-Time constraints are computationally demanding • The iterative arrival of input data makes pipelined systems a very suitable solution for the problem.
Problem Statement • Example of a pipeline serving the synchronous task PSy
Problem Statement • The time step Sk starts at (k−1)·Tck and ends at k·Tck • In a pipeline with L stages, SL denotes the last stage • DII = Δt/Tck is the Data Introduction Interval, expressed in clock cycles
Problem Statement • We assume the absence of synchronization delays due to control or data dependencies: the throughput of the pipelined system is 1/DII, i.e. one result is produced every DII clock cycles.
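A minimal numeric sketch of these timing relations; the clock period and task period below are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of the timing relations above.
# Numeric values are illustrative assumptions.

T_CK = 10e-9       # clock period Tck, seconds (assumed)
DELTA_T = 40e-9    # period Δt of the synchronous task PSy, seconds (assumed)

DII = round(DELTA_T / T_CK)               # Data Introduction Interval, in clock cycles
throughput_cycles = 1.0 / DII             # results per clock cycle = 1/DII
throughput_seconds = 1.0 / (DII * T_CK)   # results per second = 1/Δt

print(DII, throughput_cycles, throughput_seconds)   # 4 0.25 25000000.0
```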
Outline of Presentation • Problem statement • Asynchronous events • Mapping methodology • Searching space • Optimization by RT-PSA Algorithm • Results • Conclusions
Asynchronous events • We assume the asynchronous tasks to be mutually exclusive, i.e. the activation of only one asynchronous task can be requested between two successive activations of the periodic task
Asynchronous events The asynchronous service requests (shown in red) in a pipelined system.
Asynchronous events • As for the synchronous task, we represent the asynchronous tasks {PAs1, PAs2, ..., PAsm} through a set of CDFGs ASG = {AsG1(NAs1, EAs1), ... , AsGm(NAsm, EAsm)}
Asynchronous events • We consider a unique CDFG made up by composing the graph of the periodic task with the m graphs of the aperiodic tasks: G(N, E) = SyG(NSy, ESy) ∪ AsG1(NAs1, EAs1) ∪ AsG2(NAs2, EAs2) ∪ … ∪ AsGm(NAsm, EAsm)
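A small sketch of this composition, treating each CDFG as a pair of node and edge sets and taking their union; the toy graphs below are assumptions for illustration only.

```python
# Sketch: G(N, E) as the union of the periodic CDFG with the aperiodic CDFGs.
# Toy graphs below are illustrative assumptions.

def compose(sy_graph, as_graphs):
    """Each graph is a (nodes, edges) pair; return their union."""
    nodes, edges = set(sy_graph[0]), set(sy_graph[1])
    for as_nodes, as_edges in as_graphs:
        nodes |= set(as_nodes)
        edges |= set(as_edges)
    return nodes, edges

SyG = ({"s1", "s2"}, {("s1", "s2")})
AsG1 = ({"a1", "a2"}, {("a1", "a2")})
G = compose(SyG, [AsG1])
print(G)
```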
Asynchronous events • Aperiodic tasks are subject to Real-Time constraints (RTC): the result of each PAsi must be produced within its deadline Di from the activation request. • As all the RTC must be respected, the mapping function M has to define a scheduling such that Ti − Di ≤ 0 ∀ RTCAsi ∈ RTCAs, where Ti is the completion time of the ith aperiodic task.
Outline of Presentation • Problem statement • Asynchronous events • Mapping methodology • Searching space • Optimization by RT-PSA Algorithm • Results • Conclusions
Mapping methodology In order to develop a pipeline system implementing G • a HW resource rj = D(nj) and • a time step Sk must be associated to each nj ∈ N
Mapping methodology • We must determine the mapping function M: N → UR × S • UR is the set of the used HW resources (each rj is replicated kj times); S is the set of time steps
Mapping methodology • rj = D(ni) is the HW resource on which ni will be executed • S(ni) is the stage of the pipeline, or the time step, in which ni will be executed
Mapping methodology We search for the mapping function M’ which, for a given DII: • Respects all the RTC • Uses a minimum number ur of resources • Gives the minimum pipeline length for the periodic task
Mapping methodology • The mapping is determined by solving a minimization problem whose cost function C(M) combines three terms: • C1(M) is responsible for the fulfillment of all the RTC • C2(M) minimizes the used silicon area • C3(M) minimizes the length of the pipeline.
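A sketch of how such a cost function could look. The penalty form of each term and the weights are assumptions; the slides only state what each term is responsible for.

```python
# Sketch of a cost function C(M) = w1*C1 + w2*C2 + w3*C3.
# Term shapes and weights are assumptions for illustration.

def cost(completion_times, deadlines, used_resources, pipeline_length,
         w1=1e6, w2=1.0, w3=0.1):
    # C1: penalize violated RTC (dominant weight so feasibility is enforced)
    c1 = sum(max(0.0, t - d) for t, d in zip(completion_times, deadlines))
    # C2: number of used HW resources (proxy for silicon area)
    c2 = used_resources
    # C3: length of the pipeline serving the periodic task
    c3 = pipeline_length
    return w1 * c1 + w2 * c2 + w3 * c3
```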
Mapping methodology • While searching for a mapping of G, we force the response to aperiodic tasks to be synchronous with the periodic task • The execution of an aperiodic task, requested at a generic time instant t0, is delayed till the next start of the pipeline of the periodic task.
Mapping methodology In a pipelined system with DII=1 • the used resource set is maximum • the execution time of each AsGi on the pipeline is minimum • A lower bound for the execution time of AsGi is given by the lowest execution time of the longest path of AsGi: LpAsi is such a lower bound, expressed in number of clock cycles
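The lower bound LpAsi can be computed as the longest (critical) path of the graph, weighting each node with its latency in clock cycles. A sketch follows, with an assumed toy DAG and assumed node latencies.

```python
# Sketch: longest-path lower bound LpAs of a CDFG, in clock cycles.
# The toy DAG and the node latencies are assumptions.

from functools import lru_cache

def longest_path_cycles(nodes, edges, cycles):
    """nodes: node ids; edges: set of (u, v) pairs of a DAG;
    cycles: dict node -> latency in clock cycles."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)

    @lru_cache(maxsize=None)
    def lp(n):
        # longest path starting at n, including n's own latency
        return cycles[n] + max((lp(s) for s in succ[n]), default=0)

    return max(lp(n) for n in nodes)

nodes = ["a", "b", "c", "d"]
edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
print(longest_path_cycles(nodes, edges, {"a": 1, "b": 2, "c": 3, "d": 1}))  # 5
```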
Mapping methodology Maximum value allowed for DII, compatible with all the RTCAsi ∈ RTCAs: • LpAsi·Tck gives the minimal execution time for AsGi • The deadline associated to AsGi is Di.
Mapping methodology Maximum value allowed for DII, compatible with all the RTCAsi ∈ RTCAs (continued): • In the worst case the request for the aperiodic task is sensed immediately after a pipeline start; the aperiodic task will then begin to be executed DII·Tck seconds after the request, at the next start of the pipeline.
Mapping methodology Maximum value allowed for DII, compatible with all the RTCAsi ∈ RTCAs (continued): • A necessary condition to match all the RTCAsi ∈ RTCAs is that the lower bound of the execution time of each asynchronous task must be smaller than the associated deadline diminished by DII·Tck, i.e. Di ≥ DII·Tck + LpAsi·Tck, i = 1, 2, ..., m
Mapping methodology • Combining the previous relations with a congruence condition between the period of the synchronous task (Δt) and the clock period (Tck), we obtain the set DIIp which contains all the admissible DII values.
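A sketch of how the set DIIp could be built from these relations. Two assumptions are made, since the exact formulas are not reproduced in the slides: the congruence condition is read as "DII must divide Δt/Tck", and the deadline bound is taken as Di ≥ DII·Tck + LpAsi·Tck from the previous slide. All numeric values are illustrative.

```python
# Sketch: building the set DIIp of admissible DII values.
# Assumptions: congruence = "DII divides Δt/Tck"; deadline bound from above.

def admissible_dii(delta_t, t_ck, deadlines, lp_as):
    """deadlines: Di in seconds; lp_as: LpAsi in clock cycles."""
    cycles_per_period = round(delta_t / t_ck)      # Δt/Tck
    dii_p = set()
    for dii in range(1, cycles_per_period + 1):
        if cycles_per_period % dii != 0:           # assumed congruence condition
            continue
        # necessary condition: Di >= DII*Tck + LpAsi*Tck for every aperiodic task
        if all(d >= (dii + lp) * t_ck for d, lp in zip(deadlines, lp_as)):
            dii_p.add(dii)
    return dii_p

# Illustrative values only
print(admissible_dii(delta_t=80e-9, t_ck=10e-9,
                     deadlines=[120e-9, 90e-9], lp_as=[5, 3]))   # {1, 2, 4}
```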
Mapping methodology Steps of the Mapping methodology: • A set of allowed values of DII is determined • A sufficient HW resource set UR0, with ur0 resources, is determined • At the end of the optimization process the number of used resources ur could be less than ur0 if mutually exclusive nodes are contained in the graph
Mapping methodology Steps of the Mapping methodology (continued): • An initial feasible mapping M0 is determined; SL0 is the last time step needed to execute P by using M0. • Starting from M0, we use the Simulated Annealing algorithm to solve the minimization problem
Outline of Presentation • Problem statement • Asynchronous events • Mapping methodology • Searching space • Optimization by RT-PSA Algorithm • Results • Conclusions
Searching space • In order to represent a mapping function M we adopt the formalism based on the Allocation Tables t(M) • t(M) is a table with ur horizontal lines and DII vertical sectors Osi, with i = 1, 2, ..., DII • Each Osi contains the time steps Si+kDII (k = 0, 1, 2, ...), which will be overlapped during the execution of P
Searching space • Each node is assigned to a cell of t(M), i.e. it is associated to an HW resource and to a time step. • For example, we consider the 23-node graph AsG1
Searching space • For DII=3, a possible mapping M is described through the following t(M)
Searching space An allocation table t(M) must respect both 1. the causality condition and 2. the overlapping condition
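A sketch of both checks on a table represented as a node → (resource, time step) map. The reading of the two conditions is an assumption: causality is taken as "every node runs strictly after its predecessors", and overlapping as "no resource is required by two nodes whose time steps coincide modulo DII"; mutually exclusive nodes, which the method allows to share a cell, are ignored in this simplified sketch.

```python
# Sketch of the two feasibility checks on an allocation table t(M).
# Interpretation of the conditions is an assumption (see lead-in above).

def is_feasible(mapping, edges, dii):
    """mapping: dict node -> (resource, time step); edges: set of (u, v)."""
    # 1. Causality: every edge (u, v) must satisfy S(u) < S(v)
    for u, v in edges:
        if mapping[u][1] >= mapping[v][1]:
            return False
    # 2. Overlapping: on each resource, all time steps must differ modulo DII,
    #    because steps that coincide modulo DII overlap at run time
    seen = set()
    for res, step in mapping.values():
        key = (res, step % dii)
        if key in seen:
            return False
        seen.add(key)
    return True

M = {"n1": (0, 1), "n2": (1, 2), "n3": (0, 3)}          # illustrative, DII = 3
print(is_feasible(M, {("n1", "n2"), ("n2", "n3")}, 3))  # True
```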
Searching space • We define the searching space Ω over which the minimization of C(M) must be performed. • Ω is the space containing all the feasible allocation tables: Ω = {t(M) | t(M) is a feasible mapping}; • Any t(M) ∉ Ω is not feasible.
Searching space • We can write the minimization problem in terms of the cost associated with the mapping M represented by the allocation table: find the allocation table t(M') ∈ Ω with minimal cost C(t(M')).
Searching space • We solve the problem by using a Simulated Annealing (SA) algorithm • SA requires the generation of a sequence of points belonging to the searching space; each point of the sequence must be close, according to a given measure criterion, to its predecessor and to its successor.
Searching space • As Ω consists of allocation tables, we have to generate a sequence of allocation tables t(Mi) ∈ Neigh[t(Mi−1)], being Neigh[t(M)] the set of the allocation tables adjacent to t(M) according to some adjacency criteria
Searching space • Searching space connection: Theorem 2. The searching space Ω is connected under the adopted adjacency criteria.
Outline of Presentation • Problem statement • Asynchronous events • Mapping methodology • Searching space • Optimization by RT-PSA Algorithm • Results • Conclusions
Optimization by RT-PSA Algorithm • We start from a feasible allocation table t(M0) ∈ Ω • We rely on the optimization algorithm to find the wanted mapping M'
Optimization by RT-PSA Algorithm • We iterate over all the allowed values of DII • The final result of the whole optimization process will be the allocation table characterized by minimal cost.
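A sketch of this outer loop around a plain simulated annealing kernel. The neighbor move, cooling schedule and stopping rule are generic SA placeholders, not the actual RT-PSA internals; `initial_table_for`, `neighbor` and `cost` are assumed user-supplied callables.

```python
# Sketch: SA over feasible allocation tables, repeated for each admissible DII.
# Generic SA kernel only; RT-PSA specifics are not reproduced here.

import math
import random

def anneal(initial_table, neighbor, cost, t0=10.0, alpha=0.95, iters=2000):
    current, best = initial_table, initial_table
    c_cur = c_best = cost(initial_table)
    temp = t0
    for _ in range(iters):
        cand = neighbor(current)               # adjacent feasible table
        c_cand = cost(cand)
        if c_cand < c_cur or random.random() < math.exp((c_cur - c_cand) / temp):
            current, c_cur = cand, c_cand      # Metropolis acceptance
            if c_cur < c_best:
                best, c_best = current, c_cur
        temp *= alpha                          # geometric cooling
    return best, c_best

def optimize(dii_p, initial_table_for, neighbor, cost):
    """Run SA once per admissible DII; return the cheapest mapping found."""
    best = None
    for dii in sorted(dii_p):
        table, c = anneal(initial_table_for(dii), neighbor,
                          lambda t, d=dii: cost(t, d))
        if best is None or c < best[1]:
            best = (table, c)
    return best
```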
Outline of Presentation • Problem statement • Asynchronous events • Mapping methodology • Searching space • Optimization by RT-PSA Algorithm • Results • Conclusions
Results • In order to illustrate the results achievable through the presented RT-PSA algorithm, we consider the following graphs