1 / 29

Blocking, Monotonicity, and Turing Completeness in a Database Language for Sequences and Streams

Blocking, Monotonicity, and Turing Completeness in a Database Language for Sequences and Streams. Yan-Nei Law, Haixun Wang, Carlo Zaniolo 12/06/2002. Outline. Database Application in ATLaS Turing Completeness Data Streams Blocking and Monotonicity. User Defined Aggregates (UDAs).

lyn
Download Presentation

Blocking, Monotonicity, and Turing Completeness in a Database Language for Sequences and Streams

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Blocking, Monotonicity, and Turing Completeness in a Database Language for Sequences and Streams Yan-Nei Law, Haixun Wang, Carlo Zaniolo 12/06/2002

  2. Outline • Database Application in ATLaS • Turing Completeness • Data Streams • Blocking and Monotonicity

  3. User Defined Aggregates (UDAs) • Important for decision support, stream queries and other advanced database applications. • UDA consists of 3 parts: • INITIALIZE • ITERATE • TERMINATE

  4. Standard aggregate average AGGREGATE myavg(Next Int) : Real { TABLE state(tsum Int, cnt Int); INITIALIZE : { INSERT INTO state VALUES (Next, 1); } ITERATE : { UPDATE state SET tsum=tsum+Next, cnt=cnt+1; } TERMINATE : { INSERT INTO RETURN SELECT tsum/cnt FROM state; } }

  5. Online aggregation • Standard average aggregate returns results at TERMINATE state. • Online aggregate: Get results before input all the tuples • e.g. return the average for every 200 input tuples. • RETURN statements appear in ITERATE instead of TERMINATE.

  6. Online averages AGGREGATE online_avg(Next Int) : Real { TABLE state(tsum Int, cnt Int); INITIALIZE : { INSERT INTO state VALUES (Next, 1); } ITERATE: { UPDATE state SET tsum=tsum+Next, cnt=cnt+1; INSERT INTO RETURN SELECT sum/cnt FROM state WHERE cnt % 200 = 0; } TERMINATE : { } }

  7. Table from external source Access tables stored in external source 'C:\mydb\employees'. TABLE employee(Eno Int, Name Char(18), Sal Real, Dept Char(6)) SOURCE 'C:\mydb\employees'; SELECT Sex, online_avg(Sal) FROM employee WHERE Dept=1024 GROUP BY Sex;

  8. Calling other UDAs • UDAs can call other UDAs, including recursively calling themselves. • Example: Computation of Transitive Closure (the graph contains no directed cycle). • Find all the nodes that is reachable from ‘000’.

  9. Transitive Closure TABLE dgraph(start Char(10), end Char(10)) SOURCE mydb; AGGREGATE reachable(Rnode Char(10)) : Char(10) { INITIALIZE: ITERATE: { INSERT INTO RETURN VALUES (Rnode) INSERT INTO RETURN SELECT reachable(end) FROM dgraph WHERE start=Rnode; } } SELECT reachable(dgraph.end) FROM dgraph WHERE dgraph.start='000';

  10. Turing Completeness • Turing Completeness is not a trivial property for a language. • SQL is not Turing Complete. • A Turing Machine is a tuple M=(Q,,,,q0,!,F). • TM is determined by 4 elements • Transition map ; • Accepting states F; • Input tape ; • Initial state q0.

  11. Turing Machine for L3eq = {anbncn|n≥1} • Transition map: • Input tape: aaabbbccc • Accepting state: {z} • Initial state: p

  12. Turing Machine for L3eq INSERT INTO transition VALUES ('p','a',1,'q','x'), ('q','a',1,'q','a'), ('t','a',-1,'u','a'), ('u','a',-1,'u','a'), ('q','b',1,'r','x'), ('r','b',1,'r','b'), ('t','b',-1,'t','b'), ('r','c',1,'s','x'), ('s','c',1,'s','c'), ('t','c',-1,'t','c'), ('q','x',1,'q','x'), ('r','x',1,'r','x'), ('t','x',-1,'t','x'), ('u','x',1,'p','x'), ('v','x',1,'v','x'), ('s','!',-1,'t','!'), ('t','!',1,'v','x'), ('v','!',0,'z','x'); INSERT INTO accept VALUES ('z'); INSERT INTO tape VALUES ('a',0),('a',1), ('a',2), ('b',3), ('b',4), ('b',5), ('c',6), ('c',7),('c',8);

  13. Turing Machine in ATLaS(1) • For each iteration: • pass the current state, current symbol, position on the tape to UDA called turing. • If  is defined, obtain the next state, next symbol and movement of the head. • current state  next state; • current symbol  next symbol; • next position = current position+movement. If  is not defined, TM halts. check whether current state is an accepting state.

  14. Turing Machine in ATLaS(2) TABLE current(stat Char(1), symbol Char(1), pos Int); TABLE tape(symbol Char(1), pos Int); TABLE transition(curstate Char(1), cursymbol Char(1), move int, nextstate Char(1), nextsymbol Char(1)); TABLE accept(accept Char(1));

  15. Turing Machine in ATLaS(3) AGGREGATE turing(stat Char(1), symbol Char(1), curpos Int) : Int { INITIALIZE: ITERATE: { /*If TM halts, return 1 or 0 (accept or reject)*/ INSERT INTO RETURN SELECT R.C FROM (SELECT count(accept) C FROM accept A WHERE A.accept = stat) R WHERE NOT EXISTS ( SELECT * FROM transition T WHERE stat = T.curstate AND symbol = T.cursymbol);

  16. Turing Machine in ATLaS(4) /* write tape */ DELETE FROM tape WHERE pos = curpos; INSERT INTO tape SELECT T.nextsymbol, curpos FROM transition T WHERE T.curstate = stat AND T.cursymbol = symbol;

  17. Turing Machine in ATLaS(5) /* add blank symbol if necessary */ INSERT INTO tape SELECT '!', curpos + T.move FROM transition T WHERE T.curstate = stat AND T.cursymbol = symbol AND NOT EXISTS ( SELECT * FROM tape WHERE pos = curpos +T.move);

  18. Turing Machine in ATLaS(6) /* move head to the next position */ INSERT INTO current SELECT T.nextstate, A.symbol, A.pos FROM tape A, transition T WHERE T.curstate = stat AND T.cursymbol = symbol) AND A.pos=curpos+T.move; } }

  19. Turing Machine in ATLaS(7) INSERT INTO current SELECT 'p', A.symbol, 0 FROM tape A WHERE A.pos = 0; SELECT turing(stat, symbol, pos) FROM current;

  20. Data Streams(1) • For stream processing and continuous queries, results must be returned promptly, without waiting for future input. • Non-blocking operators are suitable. • No terminate state is needed.

  21. Data Streams(2) • ATLaS UDAs is a powerful tool for approximate aggregates, synopses, and adaptive data mining. • Stream is a source of data instead of table. • Data from a stream are time-stamped. • Example: Compute the average of long distance calls placed by each customer.

  22. Data from a stream STREAM calls(customer_id Int, type Char(6), minutes Int, Tstamp: Timestamp) SOURCE mystream; SELECT periodic_avg(S.minutes) FROM Calls S WHERE S.type = 'Long Distance‘ GROUP BY S.customer_id

  23. Average of the last 200 inputs AGGREGATE periodic_avg(Next Int, Time: Timestamp) : (Real, Timestamp) { TABLE state(tsum Int, cnt Int); INITIALIZE : { INSERT INTO state VALUES (Next, 1); } ITERATE : { UPDATE state SET tsum=tsum+Next, cnt=cnt+1; INSERT INTO RETURN SELECT tsum/cnt, Time FROM state WHERE cnt % 200 = 0; UPDATE state SET tsum=0, cnt=0 WHERE cnt % 200 = 0; } }

  24. Average on a five-minute window(1) AGGREGATE window_avg(Next Int, Time Timestamp): (Real,Timestamp) { TABLE state(tsum Int); TABLE memo(mNext Int, mTime Timestamp); INITIALIZE : { INSERT INTO state VALUES (Next); INSERT INTO memo VALUES (Next, Time); }

  25. Average on a five-minute window(2) ITERATE: { INSERT INTO memo VALUES (Next, Time); UPDATE state SET tsum=tsum+Next - ( SELECT sum(mNext) FROM memo WHERE mTime+ 5 MINUTES < Time); DELETE * FROM memo WHERE mTime+ 5 MINUTES < Time; INSERT INTO RETURN SELECT tsum, Time FROM state; } }

  26. Blocking and Monotonicity • A blocking query operator is a query operator that is unable to produce the first tuple of the output until it has seen the entire input • e.g. myavg in Example 1 • A non-blocking query operator is one that produces all the tuples of the output before it has detected the end of the input. • e.g. online_avg in Example 2

  27. Definition • Sequence of Length n: S=[t1,…,tn]. [ ] has length 0. • Presequence: Sk=[t1,…tk], 0<k≤n. • S  L: Lk = S for some k. ( is partial order.) • For an operator G, Gj(S) is the cumulative output produced up to step j by G. • blocking: Gj(S) = [ ] for j<n. • non-blocking: Gj(S)= Gj(Sj)=G(Sj). • partial blocking: [ ] < Gj(S) < G(Sj).

  28. ATLaS UDA is • blocking: returns answers only in TERMINATE state; • non-blocking: returns answers only in INITIALIZE or ITERATE states; • partially blocking: returns answers in TERMINATE and other states.

  29. A function F can be computed using a non-blocking operator if F is monotonic wrt . • ATLaS is complete wrt non-blocking computations. • Every monotonic function can be computed by an ATLaS program with no TERMINATE state.

More Related