Fast user notification in large-scale digital libraries: the model, the notification service, the matching algorithms, their optimizations, and experimental results. The slides explain how information propagates from publishers to subscribers and how subscriptions are expressed over a taxonomy of terms.
Fast User Notification in Large-Scale Digital Libraries: Experiments and Results
Belhaj Frej Hanen, Philippe Rigaux & Nicolas Spyratos
Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud 11
Outline
• Introduction
• The model
• The notification service
  • Refinement Relation
  • Matching Algorithms
  • Optimizations
• Experimental results
• Related work, Future work & Conclusion
Introduction
[Diagram: subscribers, publishers, the digital library and the notification service]
• Subscribers register with the notification service and query the repository
• Publishers add, remove or modify documents
• The digital library provides document access and informs the notification service about changes
• The notification service notifies subscribers when necessary
Introduction
• How is information propagated from publishers to subscribers?
Main issues:
• How are subscriptions expressed?
• How is the notification service implemented? Matching/filtering algorithms
The Model
[Figure: taxonomy tree over the terms Programming, Theory, Languages, Algorithms, DataBases, Logics, OOL, Sort, C++, Java, MergeSort, QuickSort, BubbleSort, JSP, JavaBeans]
• A document: (Id, D)
• A document description D: one or several terms from a taxonomy
• A taxonomy (T, ⪯):
  • A set of terms T
  • A subsumption relation ⪯ between terms
• Example: OOL ⪯ Languages, i.e. Languages subsumes OOL (Languages is more general than OOL)
• Example document: (id1, {Java, MergeSort})
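A minimal sketch of the model, assuming the taxonomy is a tree stored as a child-to-parent map. The parent links below are an assumption read off the order of the terms in the figure (in particular, placing DataBases and Logics under Theory is a guess). The names `PARENT` and `subsumes` are illustrative and do not come from the slides.

```python
# Assumed taxonomy (child -> parent). Only "Languages subsumes OOL" is stated
# explicitly on the slide; the other links are a plausible reading of the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}


def subsumes(general, specific):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)  # None once we step above the root
    return False


# A document is a pair (id, description); the description is a set of terms.
document = ("id1", frozenset({"Java", "MergeSort"}))

print(subsumes("Languages", "OOL"))   # True: Languages is more general than OOL
print(subsumes("OOL", "Languages"))   # False: subsumption is not symmetric
```

Subsumption is treated as reflexive here (a term subsumes itself), which is what makes plain set containment a special case of taxonomy-aware matching.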
The Model
• A user subscription: a "registered" query
• A conjunctive query, for example:
  • Sort
  • Java ∧ C++
  • Sort ∧ DataBases
• Seen as a set of terms:
  • Java ∧ C++ → {Java, C++}
  • Sort ∧ DataBases → {Sort, DataBases}
• Meaning with the taxonomy:
  • Sort: Sort, QuickSort, MergeSort, or BubbleSort
  • Java ∧ C++: (Java, JSP or JavaBeans) and C++
• Users directory: U
  • For each user u with subscription q, a pair (u, q) is added to U
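The reading of a subscription term as "the term or anything it subsumes" can be sketched as follows. The taxonomy links are the same assumed tree as in the figure reading above, and `expand`, `TERMS` and the two directory entries are hypothetical names and data used only for illustration.

```python
# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}
TERMS = set(PARENT) | {"Programming"}


def subsumes(general, specific):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)
    return False


def expand(term):
    """All terms that a subscription term stands for: itself and its descendants."""
    return {t for t in TERMS if subsumes(term, t)}


# Users directory U: one (user, subscription) pair per registered query.
# Both users are invented for this example.
U = [
    ("u1", frozenset({"Java", "C++"})),        # Java ∧ C++
    ("u2", frozenset({"Sort", "DataBases"})),  # Sort ∧ DataBases
]

print(sorted(expand("Sort")))  # ['BubbleSort', 'MergeSort', 'QuickSort', 'Sort']
print(sorted(expand("Java")))  # ['JSP', 'Java', 'JavaBeans']
```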
Notification: When? Whom?
• Event e = (id, D): addition, removal or modification of a document id described by D
• Whom to notify? Notify(e) = {u | (u, q) ∈ U and id ∈ ans(q)}
• Two ways to decide whether id ∈ ans(q):
  • Compute ans(q) and verify that it contains the new document
  • Compare q with D (D = the description of the document)
Notification: When? Whom?
• Whom to notify? Notify(e) = {u | (u, q) ∈ U and id ∈ ans(q)}
• Without the taxonomy: if q ⊆ D
• With the taxonomy: if every term in q subsumes a term in D
Example: e1 = (id1, {Java, C++, Sort})
• Without the taxonomy: Notify(e1) = {U2}
• With the taxonomy: Notify(e1) = {U1, U2, U4}
• How? Compare all the subscriptions with D
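The two matching rules and the naive "compare every subscription with D" strategy can be sketched as below. The subscriptions of U1 to U4 are not given in the slide text, so the directory here is invented, chosen only so that the two rules give different answers for e1. The taxonomy links are likewise an assumption.

```python
# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}


def subsumes(general, specific):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)
    return False


def matches_plain(q, d):
    """Without the taxonomy: q must be a subset of the description."""
    return q <= d


def matches_taxonomy(q, d):
    """With the taxonomy: every term of q subsumes some term of the description."""
    return all(any(subsumes(t, s) for s in d) for t in q)


# Hypothetical users directory (NOT the one from the slide).
DIRECTORY = [
    ("U1", frozenset({"OOL", "Algorithms"})),
    ("U2", frozenset({"Java", "Sort"})),
    ("U3", frozenset({"JSP"})),
    ("U4", frozenset({"Languages"})),
]


def notify(description, use_taxonomy=True):
    """Naive strategy: test every subscription against the description."""
    test = matches_taxonomy if use_taxonomy else matches_plain
    return {user for user, q in DIRECTORY if test(q, description)}


e1 = ("id1", frozenset({"Java", "C++", "Sort"}))
print(sorted(notify(e1[1], use_taxonomy=False)))  # ['U2']
print(sorted(notify(e1[1], use_taxonomy=True)))   # ['U1', 'U2', 'U4']
```

The naive strategy costs one comparison per subscription for every event, which is the cost the refinement graph is meant to reduce.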
Notification: The refinement relation (1/3)
The intuition: more specific = finer
[Diagram: subscriptions q3 = {OOL, Sort}, q7 = {OOL, MergeSort}, q6 = {Sort, C++}, q4 = {C++, MergeSort, BubbleSort} and q5 = {OOL, Sort, Theory}, linked by "more specific = finer" arrows, with one pair marked "not comparable"]
• q is finer than q' iff every term in q' subsumes a term in q
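The refinement test follows directly from the definition. This is a sketch under the assumed taxonomy below; which pairs the slide's diagram actually connects is not recoverable, so the results printed here are simply what the definition yields under that assumption.

```python
# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}


def subsumes(general, specific):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)
    return False


def finer(q, q_prime):
    """q is finer than q' iff every term in q' subsumes a term in q."""
    return all(any(subsumes(g, s) for s in q) for g in q_prime)


def comparable(a, b):
    """Two subscriptions are comparable if one is finer than the other."""
    return finer(a, b) or finer(b, a)


q3 = frozenset({"OOL", "Sort"})
q4 = frozenset({"C++", "MergeSort", "BubbleSort"})
q5 = frozenset({"OOL", "Sort", "Theory"})
q6 = frozenset({"Sort", "C++"})
q7 = frozenset({"OOL", "MergeSort"})

print(finer(q7, q3))       # True: MergeSort is a kind of Sort
print(finer(q5, q3))       # True: q5 adds a term to q3
print(finer(q4, q6))       # True
print(comparable(q7, q6))  # False under the assumed taxonomy
```

Note that a description D matches a subscription q exactly when D, seen as a set of terms, is finer than q.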
Notification: The refinement relation (2/3)
Construct a graph of subscriptions:
[Graph with nodes:]
• q0 = {Programming}
• q1 = {Languages}
• q2 = {Languages, Algorithms}
• q3 = {Algorithms}
• q4 = {Sort}
• q5 = {QuickSort, MergeSort}
• q6 = {QuickSort, BubbleSort}
• q7 = {Java}
• q8 = {JSP, DataBases}
• q9 = {JSP, C++}
• q10 = {C++, DataBases}
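One way to build such a graph is to link each subscription to its direct refinements only (no intermediate subscription in between). The quadratic construction below is an illustrative sketch, not the authors' algorithm; the edges it prints depend on the assumed taxonomy and may differ from the slide's drawing.

```python
# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}

SUBS = {
    "q0": {"Programming"}, "q1": {"Languages"}, "q2": {"Languages", "Algorithms"},
    "q3": {"Algorithms"}, "q4": {"Sort"}, "q5": {"QuickSort", "MergeSort"},
    "q6": {"QuickSort", "BubbleSort"}, "q7": {"Java"}, "q8": {"JSP", "DataBases"},
    "q9": {"JSP", "C++"}, "q10": {"C++", "DataBases"},
}


def subsumes(general, specific):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)
    return False


def finer(q, q_prime):
    """q is finer than q' iff every term in q' subsumes a term in q."""
    return all(any(subsumes(g, s) for s in q) for g in q_prime)


def build_graph(subs):
    """Map each subscription to the list of its direct (immediate) refinements."""
    def strictly_finer(a, b):
        return finer(subs[a], subs[b]) and not finer(subs[b], subs[a])

    children = {name: [] for name in subs}
    for parent in subs:
        for child in subs:
            if not strictly_finer(child, parent):
                continue
            # Skip the edge if some m sits strictly between child and parent.
            if any(strictly_finer(child, m) and strictly_finer(m, parent) for m in subs):
                continue
            children[parent].append(child)
    return children


graph = build_graph(SUBS)
for name, kids in graph.items():
    print(name, "->", kids)
```

Under these assumptions q2 has two parents (q1 and q3), which is why the structure is a graph rather than a tree.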
Notification: The matching (1/2)
Recall: Notify(e) = {u | (u, q) ∈ U and id ∈ ans(q)}
• An event e = (id, D) matches a subscription q iff D ⊑ q (D is finer than q)
• With the taxonomy: every term in q subsumes a term in D
• Pruning property: if D ⋢ q then ∀ q' ⊑ q, D ⋢ q' (if D does not match q, it matches no refinement of q)
• Hence a top-down traversal of the graph
Example: e = (id1, {QuickSort, MergeSort}), with q1 = {Languages}, q2 = {C++, DataBases}, q3 = {C++, DataBases, QuickSort}
Notification: The matching (2/2)
Top-down traversal of the graph:
[Graph with nodes:]
• q0 = {Programming}
• q1 = {Languages}
• q2 = {Languages, Algorithms}
• q3 = {Algorithms}
• q4 = {Sort}
• q5 = {QuickSort, MergeSort}
• q6 = {QuickSort, BubbleSort}
• q7 = {Java}
• q8 = {JSP, DataBases}
• q9 = {JSP, C++}
• q10 = {C++, DataBases}
e = (id1, {QuickSort, MergeSort, BubbleSort})
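A sketch of the top-down traversal with pruning: a node's children are visited only if the node itself matches. The edge list `CHILDREN` is an assumption (it is what the refinement definition yields under the assumed taxonomy, not a transcription of the slide's drawing), and `match_top_down` is an illustrative name.

```python
# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}

SUBS = {
    "q0": {"Programming"}, "q1": {"Languages"}, "q2": {"Languages", "Algorithms"},
    "q3": {"Algorithms"}, "q4": {"Sort"}, "q5": {"QuickSort", "MergeSort"},
    "q6": {"QuickSort", "BubbleSort"}, "q7": {"Java"}, "q8": {"JSP", "DataBases"},
    "q9": {"JSP", "C++"}, "q10": {"C++", "DataBases"},
}

# Assumed refinement edges (parent -> direct refinements).
CHILDREN = {
    "q0": ["q1", "q3"], "q1": ["q2", "q7", "q10"], "q2": [], "q3": ["q2", "q4"],
    "q4": ["q5", "q6"], "q5": [], "q6": [], "q7": ["q8", "q9"],
    "q8": [], "q9": [], "q10": [],
}


def subsumes(general, specific):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)
    return False


def matches(q, description):
    """Every term of q subsumes some term of the description."""
    return all(any(subsumes(t, s) for s in description) for t in q)


def match_top_down(root, description):
    """Return (matching subscriptions, number of subscriptions actually tested)."""
    matched, seen, stack = [], set(), [root]
    while stack:
        name = stack.pop()
        if name in seen:          # a node can be reached through several parents
            continue
        seen.add(name)
        if matches(SUBS[name], description):
            matched.append(name)
            stack.extend(CHILDREN[name])
        # otherwise prune: no refinement of a non-matching node can match
    return sorted(matched), len(seen)


matched, tested = match_top_down("q0", {"QuickSort", "MergeSort", "BubbleSort"})
print(matched)  # ['q0', 'q3', 'q4', 'q5', 'q6']
print(tested)   # 7 of the 11 subscriptions were compared with D
```

Pruning is safe because every ancestor of a matching subscription is coarser and therefore matches too, so each matching node is reachable through matching nodes only.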
Optimizations: Spanning tree (1/2)
• The maintenance of the whole graph is very costly
• Matching only needs a top-down traversal of the graph
• Solution: construct a spanning tree of the graph
Optimizations: Spanning tree (2/2)
A spanning tree of the profiles graph:
[Tree with nodes:]
• q0 = {Programming}
• q3 = {Algorithms}
• q1 = {Languages}
• q2 = {Languages, Algorithms}
• q4 = {Sort}
• q10 = {C++, DataBases}
• q7 = {Java}
• q5 = {QuickSort, MergeSort}
• q6 = {QuickSort, BubbleSort}
• q9 = {JSP, C++}
• q4 = {Java, Sort}
• q8 = {JSP, DataBases}
The matching process remains the same
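A sketch of a tree-only index: each new subscription gets a single parent, found by descending from the root while some child is coarser than it. This insertion policy is an assumption made for illustration (the slides do not describe how the spanning tree is maintained), as are the class name and the extra subscription `q_new`.

```python
# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}


def subsumes(general, specific):
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)
    return False


def finer(q, q_prime):
    """q is finer than q' iff every term in q' subsumes a term in q."""
    return all(any(subsumes(g, s) for s in q) for g in q_prime)


class SubscriptionTree:
    """Spanning tree of the refinement graph: every node keeps one parent."""

    def __init__(self, root_name, root_query):
        self.root = root_name
        self.query = {root_name: frozenset(root_query)}
        self.children = {root_name: []}

    def insert(self, name, query):
        """Attach `query` below the deepest node reached greedily; return its parent."""
        query = frozenset(query)
        node = self.root
        while True:
            nxt = next((c for c in self.children[node]
                        if finer(query, self.query[c])), None)
            if nxt is None:
                break
            node = nxt
        self.query[name] = query
        self.children[name] = []
        self.children[node].append(name)
        return node

    def match(self, description):
        """Same top-down traversal as on the graph; no node is reached twice."""
        matched, stack = [], [self.root]
        while stack:
            name = stack.pop()
            if finer(description, self.query[name]):  # D matches q iff D is finer than q
                matched.append(name)
                stack.extend(self.children[name])
        return sorted(matched)


tree = SubscriptionTree("q0", {"Programming"})
for name, q in [("q1", {"Languages"}), ("q2", {"Languages", "Algorithms"}),
                ("q3", {"Algorithms"}), ("q4", {"Sort"}),
                ("q5", {"QuickSort", "MergeSort"}), ("q6", {"QuickSort", "BubbleSort"}),
                ("q7", {"Java"}), ("q8", {"JSP", "DataBases"}),
                ("q9", {"JSP", "C++"}), ("q10", {"C++", "DataBases"})]:
    tree.insert(name, q)

print(tree.insert("q_new", {"Java", "Sort"}))   # q2: one parent is kept, not all of them
print(tree.match({"QuickSort", "MergeSort", "BubbleSort"}))
```

Keeping the invariant "each node is finer than its parent" is enough for the pruning argument, so matching is unchanged while insertion only touches one root-to-leaf path.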
Optimizations: Clustering
• The performance of the algorithm decreases when some profiles have a huge number of children!
[Tree with nodes:]
• q0 = {Programming}
• q1 = {Java}
• q3 = {JSP, Logic}
• q4 = {JSP, DB}
• q5 = {JSP, MS}
• q6 = {Java, C++, QS}
• q7 = {Java, BS, QS}
• q8 = {Java, QS, logic}
D = {Java}
• Solution: clustering
Optimizations: Clustering
• Dynamic clustering by computing the least upper bound (lub) of the children
[Tree with nodes:]
• q0 = {Programming}
• q1 = {Java}
• q5 = {JSP}
• q8 = {Java, QS}
• q3 = {JSP, Logic}
• q4 = {JSP, C++}
• q6 = {JSP, MS}
• q6 = {Java, C++, QS}
• q7 = {Java, BS, QS}
• q9 = {Java, QS, logic}
D = {Java}
• Cost model
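One possible way to compute a cluster label, assuming the taxonomy is a tree: take the lowest common ancestor of every pair of terms drawn from the two subscriptions, then keep only the most specific results. This construction is an assumption for illustration; the slides name the lub but do not say how it is computed, and they do not detail the cost model that decides when to cluster. QS, BS, MS and Logic are written out as QuickSort, BubbleSort, MergeSort and Logics.

```python
from functools import reduce

# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}


def ancestors(term):
    """The term itself followed by its ancestors, up to the root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT.get(term)
    return chain


def subsumes(general, specific):
    return general in ancestors(specific)


def lca(a, b):
    """Lowest common ancestor of two terms in the tree-shaped taxonomy."""
    above_a = set(ancestors(a))
    return next(t for t in ancestors(b) if t in above_a)


def minimal(terms):
    """Drop every term that strictly subsumes another term of the set."""
    return frozenset(t for t in terms
                     if not any(u != t and subsumes(t, u) for u in terms))


def lub(q1, q2):
    """Finest subscription that both q1 and q2 refine (tree taxonomy assumed)."""
    return minimal({lca(a, b) for a in q1 for b in q2})


def matches(q, description):
    return all(any(subsumes(t, s) for s in description) for t in q)


jsp_group = [{"JSP", "Logics"}, {"JSP", "C++"}, {"JSP", "MergeSort"}]
java_group = [{"Java", "C++", "QuickSort"}, {"Java", "BubbleSort", "QuickSort"},
              {"Java", "QuickSort", "Logics"}]

jsp_cluster = reduce(lub, jsp_group)
java_cluster = reduce(lub, java_group)
print(sorted(jsp_cluster))    # ['JSP']
print(sorted(java_cluster))   # ['Java', 'QuickSort']

# For D = {Java}, neither cluster label matches, so two tests replace six.
print(matches(jsp_cluster, {"Java"}), matches(java_cluster, {"Java"}))  # False False
```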
Experimental results (1/2)
• Insertions
[Chart comparing NCmatching (without clustering) and Cmatching (with clustering)]
Experimental results (2/2)
• Notifications
[Chart comparing NCmatching (without clustering) and Cmatching (with clustering)]
Related work
• Two families of techniques, both relying on a two-step approach:
1/ Predicate-based:
  i) predicates are evaluated with respect to the event's values,
  ii) the matching subscriptions are determined by counting their number of satisfied predicates.
  Cannot be applied to our system: the indexing of equality predicates does not extend to the subsumption relation.
2/ Subscription-based:
  i) the subscription directory is organized in a special structure,
  ii) this structure is used to filter the incoming events.
  Some of these approaches organize the subscriptions in a redundant way, so space requirements are significant, and the proposed structure is more suitable for equality tests.
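For readers unfamiliar with the first family, here is a generic sketch of the counting idea with equality predicates. It does not reproduce any specific system from the related work; the attribute names, subscriptions and function names are all invented.

```python
from collections import defaultdict


def build_index(subs):
    """Index subscriptions by their equality predicates (attribute, value)."""
    index = defaultdict(list)
    for name, predicates in subs.items():
        for predicate in predicates:
            index[predicate].append(name)
    return index


def match_by_counting(subs, index, event):
    """Step 1: look up the predicates satisfied by the event.
    Step 2: a subscription matches when all of its predicates were counted."""
    counts = defaultdict(int)
    for predicate in event.items():
        for name in index.get(predicate, ()):
            counts[name] += 1
    return sorted(n for n, c in counts.items() if c == len(subs[n]))


# Hypothetical subscriptions: conjunctions of equality predicates.
subs = {
    "s1": {("topic", "Java")},
    "s2": {("topic", "Java"), ("lang", "en")},
    "s3": {("topic", "Sort")},
}
index = build_index(subs)
print(match_by_counting(subs, index, {"topic": "Java", "lang": "fr"}))  # ['s1']
```

The hash lookup in step 1 only finds predicates whose value equals the event's value, which illustrates why this indexing does not carry over to subsumption: a subscription on Languages would not be found when looking up Java.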
Future Work & Conclusion
• A model and an optimized algorithm for the filtering of notifications in a digital library.
Future Work
• A preference function on the terms
• Adding backgrounds to the subscriptions
• Adding prerequisites to the document descriptions
• Constructing trails of documents
Thank you! Questions?
Belhaj Frej Hanen (hanen@lri.fr), Philippe Rigaux (rigaux@lamsade.dauphine.fr) & Nicolas Spyratos (spyratos@lri.fr)
Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud 11
Experimental results
• Notification
[Chart comparing NCmatching (without clustering) and Cmatching (with clustering)]
Clustering
[Diagram, three versions of the same tree for D = {Java}:]
• Without clustering: q0 = {Programming}, q1 = {OOL}, with children q3 = {JSP, Logic}, q4 = {JSP, C++}, q6 = {JSP, MS}
• With a cluster node q5 = {Java} above q3, q4 and q6
• With a cluster node q5 = {JSP} above q3, q4 and q6
Filtering rates
• Selectivity of a term t: probability for a description to contain t or a term subsumed by t
• Filtering rate of S: probability for a description to match S
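Both quantities can be estimated as relative frequencies over a sample of document descriptions. This is a sketch only: the sample below is invented, the taxonomy links are an assumption, and S is taken here to be a single subscription, which the slide text does not spell out.

```python
# Assumed taxonomy (child -> parent); the exact links are a guess from the figure.
PARENT = {
    "Theory": "Programming", "Languages": "Programming", "Algorithms": "Programming",
    "DataBases": "Theory", "Logics": "Theory",
    "OOL": "Languages", "Sort": "Algorithms",
    "C++": "OOL", "Java": "OOL",
    "MergeSort": "Sort", "QuickSort": "Sort", "BubbleSort": "Sort",
    "JSP": "Java", "JavaBeans": "Java",
}


def subsumes(general, specific):
    term = specific
    while term is not None:
        if term == general:
            return True
        term = PARENT.get(term)
    return False


def matches(q, description):
    """Every term of q subsumes some term of the description."""
    return all(any(subsumes(t, s) for s in description) for t in q)


def selectivity(term, descriptions):
    """Fraction of descriptions containing `term` or a term it subsumes."""
    hits = sum(1 for d in descriptions if any(subsumes(term, s) for s in d))
    return hits / len(descriptions)


def filtering_rate(subscription, descriptions):
    """Fraction of descriptions that match the subscription."""
    hits = sum(1 for d in descriptions if matches(subscription, d))
    return hits / len(descriptions)


# Hypothetical sample of document descriptions.
SAMPLE = [
    {"Java", "MergeSort"},
    {"C++"},
    {"QuickSort", "Logics"},
    {"JSP", "DataBases"},
]

print(selectivity("Sort", SAMPLE))             # 0.5
print(selectivity("OOL", SAMPLE))              # 0.75
print(filtering_rate({"OOL", "Sort"}, SAMPLE))  # 0.25
```

A low filtering rate is what makes a node a useful pruning point: the fewer descriptions match it, the more often its whole subtree is skipped.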