Claudio F. R. Geyer, II - UFRGS; Inês de Castro Dutra, COPPE - Sistemas - UFRJ

Presentation Transcript

    1. Claudio F. R. Geyer, II - UFRGS; Inês de Castro Dutra, COPPE - Sistemas - UFRJ

    2. Outline

    Introduction
    Sequential Implementation
    Parallel Implementation
    Performance
    Conclusions
    Future Work

    3. Introduction

    Why logic programming?
        formal basis
        expressive power
        implicit parallelism
        suitability to some problems
    Main language: Prolog
        syntax
        declarative and operational semantics

    4. Introduction

    parent(arthur,carol).
    parent(carol,john).
    grandparent(X,Y) :- parent(X,Z), parent(Z,Y).

    length([H|T],N) :- length(T,N1), N is N1+1.
    length([],0).

    Syntax, semantics, unification, backtracking
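
Unification, named on the slide above, can be illustrated with a minimal Python sketch. The term encoding is an assumption made for this example only: variables are strings starting with an uppercase letter, atoms are lowercase strings, and compound terms are tuples `(functor, arg1, ...)`. The occurs check is omitted, as it is in most Prolog systems by default.

```python
def is_var(term):
    # Assumed encoding: a variable is a string starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def walk(term, subst):
    # Follow variable bindings until reaching an unbound variable or a non-variable.
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    # Return an extended substitution (a dict), or None if the terms do not unify.
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# Matching the goal parent(X, john) against the fact parent(carol, john):
print(unify(("parent", "X", "john"), ("parent", "carol", "john"), {}))
```

A failed unification returns `None`, which is the point where a Prolog engine would backtrack to the next candidate clause.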

    5. Sequential Implementation

    Interpreters x Compilers
    WAM (Warren Abstract Machine)
        structure copying
        environments
        choicepoints
        heap
        trail
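
The interplay between choicepoints and the trail can be sketched in a few lines of Python. This is a toy model, not the WAM itself: the class and method names are hypothetical, and every binding is trailed for simplicity, whereas a real WAM trails only bindings to variables older than the most recent choicepoint.

```python
class Store:
    """Toy binding store with a trail and a choicepoint stack."""

    def __init__(self):
        self.bindings = {}      # variable name -> value
        self.trail = []         # variables bound so far, in binding order
        self.choicepoints = []  # trail heights saved at each choicepoint

    def bind(self, var, value):
        self.bindings[var] = value
        self.trail.append(var)  # simplification: trail every binding

    def push_choicepoint(self):
        # Remember how tall the trail was when the alternative was created.
        self.choicepoints.append(len(self.trail))

    def backtrack(self):
        # Undo every binding made since the most recent choicepoint.
        mark = self.choicepoints.pop()
        while len(self.trail) > mark:
            del self.bindings[self.trail.pop()]

store = Store()
store.bind("X", "arthur")
store.push_choicepoint()
store.bind("Z", "carol")
store.backtrack()
print(store.bindings)  # only the binding made before the choicepoint remains
```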

    6. Sequential Implementation

    7. Parallel Implementation

    Control Parallelism
        ORP: Or-parallelism
        ANDP: And-parallelism
        And + Or
    Data Parallelism
        Unification
        Path

    8. Parallel Implementation: ORP

    Problem: representation of multiple bindings to the same variable
    Solutions
        stack sharing
        stack copying

    9. Parallel Implementation: ORP

    10. Parallel Implementation: ORP

    Stack sharing
        binding arrays
        hash windows
        version vectors
        variable importation
        …
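
Of the stack-sharing schemes listed, binding arrays are the easiest to sketch. The idea is that a conditional variable in the shared stacks stores only a slot number, and each worker keeps its own private array indexed by that number, so two workers on different or-branches can bind the same variable differently. The class names below are hypothetical, and the sketch ignores how bindings are installed when a worker switches branches.

```python
class SharedVar:
    """A conditional variable in the shared stacks: it holds only a slot index."""

    def __init__(self, name, slot):
        self.name = name
        self.slot = slot

class Worker:
    """Each worker owns a private binding array."""

    def __init__(self, n_slots):
        self.binding_array = [None] * n_slots

    def bind(self, var, value):
        self.binding_array[var.slot] = value

    def deref(self, var):
        return self.binding_array[var.slot]

x = SharedVar("X", slot=0)
w1, w2 = Worker(4), Worker(4)
w1.bind(x, "a")   # w1 explores the branch where X = a
w2.bind(x, "b")   # w2 explores the branch where X = b
print(w1.deref(x), w2.deref(x))
```

Dereferencing stays constant-time because the slot number is known; the cost of this scheme shows up elsewhere, when a worker moves to another part of the tree and must update its array.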

    11. Parallel Implementation: ORP

    Speculative work
    Prolog semantics?
    Side-effects and pruning
    Scheduling

    12. Parallel Implementation: ANDP

    IAP: Independent and-parallelism
    DAP: Dependent and-parallelism
    DetAP: Determinate and-parallelism

    13. Parallel Implementation: IAP

    Goals that do not share variables can proceed in parallel.
    Compiler support
    CGEs: Conditional Graph Expressions

    Example of IAP: execution tree and program

    14. Parallel Implementation: IAP

    paper(P,A,D,L) :- author(A), date(D), loc(P,A,D,L).

    Possible CGE:
    indep(A) & indep(D) => author(A) & date(D), loc(P,A,D,L)
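
The run-time behaviour of a CGE can be approximated in Python: test a condition, run the two goals in parallel if it holds, and fall back to sequential execution otherwise. Everything here is an assumption for illustration: the `independent` test is simplified to "the goals share no unbound variable", the goals are plain functions, and the facts they return are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def independent(vars_a, vars_b):
    # Simplified independence condition: no unbound variable in common.
    return set(vars_a).isdisjoint(vars_b)

def author():
    return ["a1", "a2"]   # placeholder solutions for author(A)

def date():
    return ["d1", "d2"]   # placeholder solutions for date(D)

def run_cge(goal_a, vars_a, goal_b, vars_b):
    # Returns (solutions of goal_a, solutions of goal_b, execution mode).
    if independent(vars_a, vars_b):
        with ThreadPoolExecutor(max_workers=2) as pool:
            future_a = pool.submit(goal_a)
            future_b = pool.submit(goal_b)
            return future_a.result(), future_b.result(), "parallel"
    return goal_a(), goal_b(), "sequential"

print(run_cge(author, {"A"}, date, {"D"}))
```

The check is made at run time because whether two goals share a variable can depend on how the clause was called.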

    15. Parallel Implementation: IAP

    Cross-product of solutions
    Recomputation

    qsort([], []).
    qsort([P|T],L) :- partition(T,P,A,B), qsort(A,L1), qsort(B,L2), append(L1,[P|L2],L).
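
The difference between the two strategies named on this slide can be shown with a small Python sketch, under the assumption that each goal simply returns a list of solutions (the goals and their solutions are placeholders). Cross-product solves each independent goal once and combines the solution sets; recomputation re-runs the right-hand goal for every solution of the left-hand goal, as sequential Prolog would on backtracking.

```python
from itertools import product

calls = {"date": 0}   # counts how often the right-hand goal is executed

def author():
    return ["a1", "a2", "a3"]

def date():
    calls["date"] += 1
    return ["d1", "d2"]

def cross_product():
    # Solve each goal once, then combine the two solution sets.
    authors, dates = author(), date()
    return list(product(authors, dates))

def recomputation():
    # Re-run date() for every solution of author().
    return [(a, d) for a in author() for d in date()]

calls["date"] = 0
answers_cp = cross_product()
runs_cp = calls["date"]

calls["date"] = 0
answers_rc = recomputation()
runs_rc = calls["date"]

print(runs_cp, runs_rc, answers_cp == answers_rc)
```

Both strategies yield the same answers in the same order; they trade stored solutions against repeated work.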

    16. Parallel Implementation: DAP

    Goals that share variables can proceed in parallel
    Producer and consumer
        chosen at compile-time or runtime
        one value or stream
    Compiler support

    17. Parallel Implementation: DAP

    producer(N,Out) :- N > 0, N1 is N - 1, Out = [ferrari|Ms], producer(N1,Ms).
    producer(0,Out) :- Out = [].

    consumer([ferrari|Ms]) :- go_ride_ferrari, consumer(Ms).
    consumer([]).
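
The producer/consumer pair above communicates through a shared, incrementally bound list. A rough Python analogue, assuming a thread per goal and a `queue.Queue` standing in for the stream, looks like this; the `None` sentinel plays the role of the closing `[]`, and the function names are chosen for this example only.

```python
import queue
import threading

def producer(n, stream):
    # Extend the stream one element at a time, like Out = [ferrari|Ms].
    for _ in range(n):
        stream.put("ferrari")
    stream.put(None)  # end of stream, like Out = []

def consumer(stream, rides):
    while True:
        item = stream.get()  # blocks until the producer has supplied the next element
        if item is None:
            break
        rides.append(item)

def run(n):
    stream, rides = queue.Queue(), []
    consumer_thread = threading.Thread(target=consumer, args=(stream, rides))
    producer_thread = threading.Thread(target=producer, args=(n, stream))
    consumer_thread.start()
    producer_thread.start()
    producer_thread.join()
    consumer_thread.join()
    return rides

print(run(3))
```

The blocking `get` corresponds to the consumer goal suspending on an unbound shared variable until the producer binds it.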

    18. Parallel Implementation: DetAP

    Goals that match at most one clause can be executed first and in parallel
    Compiler support
    Reduction of search space

    Tree and program
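
How running determinate goals first shrinks the search space can be seen in a small Python sketch. The model is an assumption for illustration: each goal is a variable with a list of candidate values (its "clauses"), a goal is determinate when at most one candidate is still consistent, and the solver runs sequentially rather than in parallel. The function names and the puzzle are hypothetical.

```python
def candidates(var, domains, assignment, consistent):
    # Values (clauses) that still match this goal under the current bindings.
    return [v for v in domains[var] if consistent({**assignment, var: v})]

def solve(domains, consistent, determinate_first):
    """Return (first solution or None, number of clause tries)."""
    tries = 0

    def step(assignment, pending):
        nonlocal tries
        if not pending:
            return assignment
        var = pending[0]  # default: leftmost goal, as in Prolog
        if determinate_first:
            # Prefer any goal that matches at most one clause.
            for v in pending:
                if len(candidates(v, domains, assignment, consistent)) <= 1:
                    var = v
                    break
        rest = [v for v in pending if v != var]
        for value in candidates(var, domains, assignment, consistent):
            tries += 1
            result = step({**assignment, var: value}, rest)
            if result is not None:
                return result
        return None

    return step({}, list(domains)), tries

def all_different(assignment):
    values = list(assignment.values())
    return len(values) == len(set(values))

domains = {"A": [1, 2, 3], "B": [1, 2], "C": [1]}
print(solve(domains, all_different, determinate_first=False))  # 8 clause tries
print(solve(domains, all_different, determinate_first=True))   # 3 clause tries
```

Both runs find the same solution; the determinate-first run never has to backtrack on this example because each early reduction constrains the remaining goals.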

    19. Parallel Platforms

    Shared-memory
    Distributed memory
    Distributed-shared memory
    Implicit x Explicit Parallelism
    Programming Model
        process- or processor-based

    20. Shared-memory Or-Parallel Systems

    Aurora
        WAM-based
        processor-based
        shared stacks
        binding arrays

    21. Aurora: Binding Arrays

    22. Shared-memory Or-Parallel Systems

    Scheduling in Aurora
        Wavefront
        Argonne
        Manchester
        Bristol
        Dharma

    23. Shared-memory Or-Parallel Systems

    Wavefront, Manchester and Argonne: topmost dispatching
    Bristol and Dharma: bottom-most dispatching
        speculative work
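
The two dispatching policies differ in which node an idle worker takes work from. A deliberately simplified Python sketch, assuming the scheduler sees a flat list of `(depth, node_id)` pairs for nodes that still have untried alternatives (real schedulers work per branch and track far more state):

```python
def pick_node(open_nodes, strategy):
    # open_nodes: (depth, node_id) pairs for nodes with untried alternatives.
    if strategy == "topmost":
        return min(open_nodes)   # closest to the root
    if strategy == "bottom-most":
        return max(open_nodes)   # deepest in the tree
    raise ValueError(f"unknown strategy: {strategy}")

open_nodes = [(3, "n3"), (1, "n1"), (5, "n5")]
print(pick_node(open_nodes, "topmost"))
print(pick_node(open_nodes, "bottom-most"))
```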

    24. Shared-memory Or-Parallel Systems

    Muse
        WAM-based
        processor-based
        stack copying

    25. Muse: Stack Copying

    Multiple environments maintained via stack-copying
    Memory space divided into identical address spaces to avoid pointer relocation
    Incremental copying

    Tree and example of Muse
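
Incremental copying can be sketched in Python by treating each worker's stack as a list of frames. This is an assumed, much simplified model: the idle worker keeps the part of its stack it already shares with the busy worker and copies only the segment that differs. The function name and frame labels are hypothetical.

```python
def incremental_copy(busy_stack, idle_stack):
    """Make idle_stack match busy_stack, copying only the differing segment.

    Returns the number of frames copied.
    """
    common = 0
    for busy_frame, idle_frame in zip(busy_stack, idle_stack):
        if busy_frame != idle_frame:
            break
        common += 1
    del idle_stack[common:]                  # back up to the common ancestor
    idle_stack.extend(busy_stack[common:])   # copy only the new frames
    return len(busy_stack) - common

busy = ["root", "a", "b", "c"]
idle = ["root", "a", "x"]
copied = incremental_copy(busy, idle)
print(copied, idle)
```

Because every worker maps its stacks at the same addresses, the copied frames need no pointer relocation; the list model above hides that detail.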

    26. Shared-memory Or-Parallel Systems

    Scheduling in Muse
        Sophisticated operations to avoid data races
            workers keep data structures about idle and busy workers in the subtrees below them
        Shadowing
        Preference to leftmost work

    27. Shared-memory And-Parallel Systems

    &-Prolog
    &ACE
    DASWAM

    28. Shared-memory And-Parallel Systems

    &-Prolog
        RAP-WAM
        CGEs
        compiler support
    &ACE
        based on &-Prolog
    DASWAM
        DAP and IAP, producer determined at runtime

    29. Shared-memory And+Or Systems

    Andorra-I
    ACE
    SBA
    ParAKL
    Penny
    DAOS

    30. Andorra-I

    Determinate and-parallelism
    Or-parallelism
    Side-effects, cuts and commits
    Teams of workers
    Scheduling
    Reduction of search space

    31. Andorra-I

    DetAP phase -> ORP phase when #det goals = 0
    ORP phase -> DetAP phase when #det goals <> 0

    32. Shared-memory And+Or Systems

    ACE
        IAP + ORP
        Stack copying
        IAP a la &-Prolog
        Composition tree
        Last parallel call optimisation

    33. ACE

    34. SBA

    IAP + ORP
    Stack sharing
    Shared Binding Arrays
    IAP a la &ACE
    Binding array divided into segments of fixed size
    Conditional variable bound to a pair <seg#,offset>
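
The `<seg#,offset>` addressing can be sketched as follows. Only the two facts on the slide are taken from the source (fixed-size segments, variables bound to a segment/offset pair); the segment size, the class name and the lazy allocation of segments are assumptions made for this example.

```python
SEGMENT_SIZE = 4  # assumed value; the slide only says the size is fixed

class SegmentedBindingArray:
    """Binding array split into fixed-size segments, addressed by (seg, offset)."""

    def __init__(self):
        self.segments = {}  # seg number -> list of SEGMENT_SIZE slots

    def bind(self, ref, value):
        seg, offset = ref
        segment = self.segments.setdefault(seg, [None] * SEGMENT_SIZE)
        segment[offset] = value

    def deref(self, ref):
        seg, offset = ref
        segment = self.segments.get(seg)
        return None if segment is None else segment[offset]

array = SegmentedBindingArray()
x_ref = (2, 1)   # a conditional variable bound to the pair <2,1>
array.bind(x_ref, "carol")
print(array.deref(x_ref), array.deref((0, 0)))
```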

    35. Performance Andorra-I

    36. Performance

    prog name        Andorra-I     JAM   Aurora    Muse
    nrv400                8.25    8.37     ----    ----
    bt_cluster            9.37    9.70     ----    ----
    bt_wms                3.32    ----     ----    ----
    road_markings         6.24    ----     ----    ----
    chat_80_db5           7.30    ----     7.30    5.91
    5x4x3_puzzle          9.66    ----     9.51    8.69
    warplan               1.20    ----     2.63    1.06
    protein_all           6.81    ----     9.49    8.64
    protein_1st           2.78    ----     4.10    3.12
    fly_pan               6.88    ----     ----    ----
    scanner               5.47    ----     ----    ----
    cipher                5.65    ----     ----    ----

    37. Performance

    Pgm          map   8queen   Xword   8queenp   zebra   flypan
    Prolog      5003   383146    6377    133612   19404    10539
    Andorra-I   1047   214918     835      8496    5757     1517

    Reduction in search space
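
The size of the reduction for each program follows directly from the two rows of the table. The short Python snippet below computes the ratio of the Prolog figure to the Andorra-I figure; the numbers are copied from the slide, which does not state their unit, and the function name is chosen for this example.

```python
prolog = {"map": 5003, "8queen": 383146, "Xword": 6377,
          "8queenp": 133612, "zebra": 19404, "flypan": 10539}
andorra_i = {"map": 1047, "8queen": 214918, "Xword": 835,
             "8queenp": 8496, "zebra": 5757, "flypan": 1517}

def reduction_factors(baseline, reduced):
    # Ratio baseline / reduced for every program present in the baseline.
    return {name: baseline[name] / reduced[name] for name in baseline}

factors = reduction_factors(prolog, andorra_i)
for name, factor in factors.items():
    print(f"{name:8s} {factor:6.2f}x")
```

On these figures the reduction ranges from under 2x for 8queen to over 15x for 8queenp.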

    38. Performance: bt_cluster

    39. Performance: chat-80

    40. Performance: floorplan design

    41. Applications

    Optimisation Problems
    Databases
    Natural Language Processing
    Data Mining
    Constraint Satisfaction Problems
    …

    42. Conclusions

    Logic programming: high level of abstraction
    Favours implicit parallelism
    Several applications
    Good performance on small to medium parallel architectures
    High performance is coming!

    43. Future Work

    More efficient methods to combine and + or parallelism
    Scheduling is an important issue
    Sophisticated compiler support
    Memory management
    Parallel constraint logic programming
    Efficient cluster implementations
    Applications

    44. Future Work

    Ideal System

    45. Perspectives
