
Restrictions on Concept Lattices for Pattern Management

This article explores the use of restrictions on concept lattices to manage patterns in data sets, including the concepts of projection and selection. Various algorithms are discussed, such as DFS, BFS, LBS, and BUS.


Presentation Transcript


  1. Restrictions on Concept Lattices for Pattern Management Léonard Kwuida, Rokia Missaoui, Beligh Ben Amor, Lahcen Boumedjout, Jean Vaillancourt October 20, 2010

  2. Outline • Introduction • Pattern management • Restrictions on concept lattices • Projection • Selection • Algorithms • Depth-first Search (DFS) • Breadth-first Search (BFS) • Leading Bits Sort (LBS) • Bottom-up Search (BUS) • Experiments • Conclusion

  3. Objectives • Adapt the relational operators (e.g. projection) to the formal concept analysis framework to manipulate sets of concepts. • Manage patterns using restrictions on the objects or attributes of a given data set. • Query a concept lattice through a restriction (projection or selection). • Compare restriction on formal contexts vs. restriction on concept lattices.

  4. Pattern Management • Objective • Store, process and retrieve patterns defined over raw data. • Different types of patterns • Rules, clusters, decision trees, … • Basic operations • Selection, projection, join, union, difference, … • Cross-over operations • Drill-through: from a pattern to raw data • Covering: does a pattern hold for a given dataset? • Approximation (Quafafou, Missaoui & Kwuida)

  5. Pattern Management • European PANDA Project • A generic framework to model various classes of patterns. • SQL operators • CINQ Project • Inductive databases. • Terrovitis et al. (2007) • A uniform framework for data and pattern management. • Links between data and pattern spaces. • Jeudy et al. (2007) • A Model for Managing Collections of Patterns.

  6. Restrictions on Concept lattices

  7. Restrictions on Concept lattices • Projection of a concept set onto N. • The projection of a concept set r over a set of attributes N ⊆ M is given by: π_N(r) = Project(r, N) = {c1 = (Ext(c), Int(c) ∩ N) | c ∈ r and c1 is maximal in its equivalence class}. • Two concepts c1 and c2 are equivalent if Int(c1) ∩ N = Int(c2) ∩ N.
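The definition of Project(r, N) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the concept list `RAW` is transcribed from the basket example later in the deck (slides 9 to 11), the function name `project` is hypothetical, and r is assumed to be a complete concept lattice, so that every equivalence class has a single greatest element (the one with the largest extent).

```python
# Concepts of the basket example, written as "extent:intent" (assumed transcription).
RAW = ("12345678:a 12356:ab 34678:ac 5678:ad 1234:ag 36:abc 123:abg 678:acd "
       "568:adf 234:agh 56:abdf 23:abgh 68:acdf 34:acgh 7:acde 6:abcdf "
       "3:abcgh 4:acghi :abcdefghi")
CONCEPTS = [(frozenset(e), frozenset(i))
            for e, i in (pair.split(":") for pair in RAW.split())]


def project(concepts, attrs):
    """Return {(Ext(c), Int(c) & N)} keeping the greatest concept of each class."""
    attrs = frozenset(attrs)
    best = {}  # restricted intent Int(c) & N -> largest extent seen in that class
    for extent, intent in concepts:
        key = intent & attrs
        if key not in best or len(extent) > len(best[key]):
            best[key] = extent
    return {(extent, key) for key, extent in best.items()}


if __name__ == "__main__":
    for extent, intent in sorted(project(CONCEPTS, "abcd"),
                                 key=lambda c: (len(c[1]), sorted(c[1]))):
        print("".join(sorted(intent)), "".join(sorted(extent)))
```

Grouping by Int(c) ∩ N is exactly the equivalence relation of the slide; picking the largest extent is one way to select the maximal element of a class.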

  8. Restrictions on Concept lattices • Selection on a concept set. • The selection on a concept set r w.r.t. a (conjunctive) restriction F on attributes Ai (i ≤ N) is the set of concepts c that logically satisfy that restriction. Select(r, F = {A1 = a1 ∧ … ∧ AN = aN}) = {c | c ∈ r and c ⊨ F} • The output corresponds to the order ideal in r generated by ⋀_{i ≤ N} μ(ai), where μ(ai) = (ai′, ai″). • For simplicity, we assume that F is in conjunctive form.
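A minimal sketch of Select(r, F) for a binary context, assuming that each condition Ai = ai simply means "attribute ai is present" and that F is a conjunction. The concept list is transcribed from the basket example later in the deck, and the names `select` and `order_ideal` are hypothetical. The second function computes the order ideal generated by the meet of the attribute concepts, so the two characterisations on the slide can be compared.

```python
RAW = ("12345678:a 12356:ab 34678:ac 5678:ad 1234:ag 36:abc 123:abg 678:acd "
       "568:adf 234:agh 56:abdf 23:abgh 68:acdf 34:acgh 7:acde 6:abcdf "
       "3:abcgh 4:acghi :abcdefghi")
CONCEPTS = [(frozenset(e), frozenset(i))
            for e, i in (pair.split(":") for pair in RAW.split())]


def select(concepts, required):
    """Concepts whose intent contains every attribute required by F."""
    required = frozenset(required)
    return {c for c in concepts if required <= c[1]}


def order_ideal(concepts, required):
    """Concepts below the meet of the attribute concepts mu(a), a in F."""
    required = frozenset(required)
    # The meet of the mu(a) is the largest concept whose intent contains F.
    generator = max((c for c in concepts if required <= c[1]),
                    key=lambda c: len(c[0]))
    return {c for c in concepts if c[0] <= generator[0]}


if __name__ == "__main__":
    for extent, intent in sorted(select(CONCEPTS, "cd"), key=lambda c: -len(c[0])):
        print("".join(sorted(extent)) or "-", "".join(sorted(intent)))
```

On this example both functions return the same five concepts for F = {c, d}, which illustrates why the selection can be answered by walking down from a single generator instead of scanning the whole lattice.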

  9. Example • Basket market analysis • Transactions and items (products) • Context K := (G, M, I), with transactions as objects (rows) and items as properties (columns):

        a  b  c  d  e  f  g  h  i
     1  X  X  .  .  .  .  X  .  .
     2  X  X  .  .  .  .  X  X  .
     3  X  X  X  .  .  .  X  X  .
     4  X  .  X  .  .  .  X  X  X
     5  X  X  .  X  .  X  .  .  .
     6  X  X  X  X  .  X  .  .  .
     7  X  .  X  X  X  .  .  .  .
     8  X  .  X  X  .  X  .  .  .
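To make the example concrete, the context can be encoded as a dictionary and its concepts enumerated by closing the object intents under intersection. This is a sketch, not the authors' code: the rows in `ROWS` are an assumed transcription of the cross table (they agree with the extents and intents shown on the lattice slides), and the function names are hypothetical.

```python
# Object intents of the basket context K (assumed transcription of the cross table).
ROWS = {1: "abg", 2: "abgh", 3: "abcgh", 4: "acghi",
        5: "abdf", 6: "abcdf", 7: "acde", 8: "acdf"}
M = frozenset("abcdefghi")


def extent(attrs):
    """A' : objects that have every attribute in attrs."""
    return frozenset(g for g, row in ROWS.items() if set(attrs) <= set(row))


def intent(objs):
    """B' : attributes shared by every object in objs (all of M if objs is empty)."""
    shared = set(M)
    for g in objs:
        shared &= set(ROWS[g])
    return frozenset(shared)


def all_concepts():
    """Intents are M plus every intersection of object intents."""
    intents = {M}
    for row in ROWS.values():
        intents |= {frozenset(row) & known for known in intents}
    return {(extent(i), i) for i in intents}


if __name__ == "__main__":
    for ext, itt in sorted(all_concepts(), key=lambda c: (-len(c[0]), sorted(c[1]))):
        print("".join(map(str, sorted(ext))) or "-", "".join(sorted(itt)))
```

The intersection closure is fine for a context of this size; it is not meant as an efficient lattice construction algorithm.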

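  10. Example • Concept lattice of K, listed as (extent, intent) pairs: • (12345678, a) • (12356, ab), (34678, ac), (5678, ad), (1234, ag) • (36, abc), (123, abg), (678, acd), (568, adf), (234, agh) • (56, abdf), (23, abgh), (68, acdf), (34, acgh), (7, acde) • (6, abcdf), (3, abcgh), (4, acghi) • (∅, abcdefghi)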

  11. Example - Projection • Project(r, {abcd}) applied to the concept lattice of K, from the top concept (12345678, a) down to the bottom concept (∅, abcdefghi).

  12. Projection • Projection on {S, T, U, V} of the initial concept lattice. • On the left, the equivalence classes are marked on the initial lattice. • On the right, each equivalence class is represented by a single node (to which the whole class is attached).

  13. Algorithms - Projection • Depth-first Search (DFS) • Breadth-first Search (BFS) • Leading Bits Sort (LBS) • Bottom-up Search (BUS)

  14. Depth-first Search • Algorithm idea: • Set the first class with the top element. • Test whether the current node is in the same class as one of its marked parents or children. • If it is not, create a new equivalence class for it. • Set up the links between the representatives of the equivalence classes. • Input lattice B • Output lattice B1, built step by step starting from the top concept (12345678, a)
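The depth-first idea can be sketched as follows. Two simplifications are assumptions of this sketch rather than part of the slides: the test against marked parents or children is replaced by a dictionary lookup on Int(c) ∩ N, and the links between representatives are recomputed by brute force with a hypothetical helper `lower_covers`. The concept list is transcribed from the basket example.

```python
RAW = ("12345678:a 12356:ab 34678:ac 5678:ad 1234:ag 36:abc 123:abg 678:acd "
       "568:adf 234:agh 56:abdf 23:abgh 68:acdf 34:acgh 7:acde 6:abcdf "
       "3:abcgh 4:acghi :abcdefghi")
CONCEPTS = [(frozenset(e), frozenset(i))
            for e, i in (pair.split(":") for pair in RAW.split())]


def lower_covers(concepts):
    """Map each concept to its children (brute force on extent inclusion)."""
    covers = {}
    for c in concepts:
        below = [d for d in concepts if d[0] < c[0]]
        covers[c] = [d for d in below if not any(d[0] < e[0] for e in below)]
    return covers


def dfs_project(concepts, attrs):
    """Traverse B depth-first from the top and return the cover relation of B1."""
    attrs = frozenset(attrs)
    children = lower_covers(concepts)
    top = max(concepts, key=lambda c: len(c[0]))
    rep = {}            # class key Int(c) & N -> representative (largest extent)
    seen = set()
    stack = [top]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        key = node[1] & attrs
        if key not in rep or len(node[0]) > len(rep[key][0]):
            rep[key] = node
        stack.extend(children[node])
    projected = [(c[0], key) for key, c in rep.items()]
    return lower_covers(projected)   # links between the representatives


if __name__ == "__main__":
    for (ext, itt), kids in dfs_project(CONCEPTS, "abcd").items():
        print("".join(sorted(itt)), "->", sorted("".join(sorted(k[1])) for k in kids))
```

A depth-first walk may reach a class member before the class maximum, which is why the sketch keeps the candidate with the largest extent instead of the first node visited.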


  20. Breadth-first Search • Algorithm idea: • Start with the top element e. • Move to each child of this element and compare it with e. • If it is not in the same class, check whether all its parents are marked; if so, create a new class for it. • Set up the links between the representatives of the equivalence classes. • Input lattice B • Output lattice B1, starting from the top concept (12345678, a)
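A sketch of the breadth-first marking, under the assumption that a node is only marked once all of its parents are marked, so the greatest element of each class is always met first and becomes its representative. The helper `lower_covers`, the function name `bfs_project` and the concept list (transcribed from the basket example) are hypothetical; linking the representatives is left out.

```python
from collections import deque

RAW = ("12345678:a 12356:ab 34678:ac 5678:ad 1234:ag 36:abc 123:abg 678:acd "
       "568:adf 234:agh 56:abdf 23:abgh 68:acdf 34:acgh 7:acde 6:abcdf "
       "3:abcgh 4:acghi :abcdefghi")
CONCEPTS = [(frozenset(e), frozenset(i))
            for e, i in (pair.split(":") for pair in RAW.split())]


def lower_covers(concepts):
    """Map each concept to its children (brute force on extent inclusion)."""
    covers = {}
    for c in concepts:
        below = [d for d in concepts if d[0] < c[0]]
        covers[c] = [d for d in below if not any(d[0] < e[0] for e in below)]
    return covers


def bfs_project(concepts, attrs):
    """Return a map concept -> representative of its equivalence class."""
    attrs = frozenset(attrs)
    children = lower_covers(concepts)
    parents = {c: [] for c in concepts}
    for c, kids in children.items():
        for k in kids:
            parents[k].append(c)
    top = max(concepts, key=lambda c: len(c[0]))
    cls = {top: top}                       # the top element opens the first class
    queue = deque([top])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            # Skip marked nodes; postpone nodes that still have an unmarked parent.
            if child in cls or not all(p in cls for p in parents[child]):
                continue
            same = [p for p in parents[child] if p[1] & attrs == child[1] & attrs]
            cls[child] = cls[same[0]] if same else child   # join a class or open one
            queue.append(child)
    return cls


if __name__ == "__main__":
    classes = bfs_project(CONCEPTS, "abcd")
    for rep in sorted(set(classes.values()), key=lambda c: -len(c[0])):
        print("".join(sorted(rep[1] & frozenset("abcd"))), "".join(sorted(rep[0])))
```

Comparing a node only with its parents is enough here because every non-maximal member of a class has at least one parent in the same class.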

  22. Breadth-first Search • Output lattice B1 after the children of the top element have been processed: (12345678, a) with children (12356, ab), (34678, ac), (5678, ad)

  24. Leading Bits Sort • Algorithm idea: • The lectic order on subsets of M states that A precedes B if the first position in which A and B differ contains 0 in A and 1 in B. • Once the attributes of N occupy the leading positions, equivalent concepts/intents are necessarily consecutive. • Use the iPred procedure of Baixeries et al. to set the links between the representatives of the equivalence classes. • Example: intents of the input lattice B, Project(r, {abcd})
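A sketch of the sorting step, assuming that each intent is encoded as a bit vector whose leading positions are the attributes of N. Comparing such tuples lexicographically is the lectic order as stated on the slide, and equivalent intents then share a prefix and end up next to each other. The function name `lbs_project` and the concept list (from the basket example) are hypothetical, and the iPred linking step is not included.

```python
from itertools import groupby

RAW = ("12345678:a 12356:ab 34678:ac 5678:ad 1234:ag 36:abc 123:abg 678:acd "
       "568:adf 234:agh 56:abdf 23:abgh 68:acdf 34:acgh 7:acde 6:abcdf "
       "3:abcgh 4:acghi :abcdefghi")
CONCEPTS = [(frozenset(e), frozenset(i))
            for e, i in (pair.split(":") for pair in RAW.split())]


def lbs_project(concepts, attrs, all_attrs):
    """Sort by leading bits, then keep the first concept of each run."""
    lead = sorted(attrs)
    order = lead + sorted(set(all_attrs) - set(attrs))

    def bits(concept):
        # 0/1 vector of the intent, attributes of N first.
        return tuple(int(m in concept[1]) for m in order)

    ranked = sorted(concepts, key=bits)
    projected = []
    for _, run in groupby(ranked, key=lambda c: bits(c)[:len(lead)]):
        first = next(run)   # smallest intent of the class, hence largest extent
        projected.append((first[0], first[1] & frozenset(attrs)))
    return projected


if __name__ == "__main__":
    for ext, itt in lbs_project(CONCEPTS, "abcd", "abcdefghi"):
        print("".join(sorted(itt)), "".join(sorted(ext)))
```

The single linear pass after sorting is what keeps the marking of equivalence classes cheap compared with the traversal-based variants.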

  26. Leading Bits Sort • Output lattice B1 for Project(r, {abcd}), with intents a, ab, ac, ad, abc, abd, acd, abcd

  27. Bottom-up Search • Algorithm idea: • We start the exploration of the input lattice B (upwards from the bottom) at the most general concept c whose intent contains N, i.e. c = (N′, N″). • There are two possibilities: • If the concept c has exactly N as intent, then the output of the projection is the filter generated by c. • If N is not an intent, then the attributes that are in N″ \ N are deleted one by one from the intents of the concepts in the filter generated by c.
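A sketch of the bottom-up idea. It reads the attributes to delete as those of N″ that are not in N (an interpretation, since deleting attributes of N itself would defeat the projection), removes them in one step rather than one by one, and merges concepts whose intents coincide afterwards. The function name `bus_project` and the concept list (from the basket example) are hypothetical.

```python
RAW = ("12345678:a 12356:ab 34678:ac 5678:ad 1234:ag 36:abc 123:abg 678:acd "
       "568:adf 234:agh 56:abdf 23:abgh 68:acdf 34:acgh 7:acde 6:abcdf "
       "3:abcgh 4:acghi :abcdefghi")
CONCEPTS = [(frozenset(e), frozenset(i))
            for e, i in (pair.split(":") for pair in RAW.split())]


def bus_project(concepts, attrs):
    """Project onto attrs using only the filter above c = (N', N'')."""
    attrs = frozenset(attrs)
    # Most general concept whose intent contains N.
    c = max((x for x in concepts if attrs <= x[1]), key=lambda x: len(x[0]))
    filt = [x for x in concepts if c[0] <= x[0]]      # filter generated by c
    if c[1] == attrs:                                  # N is itself an intent
        return set(filt)
    extra = c[1] - attrs                               # attributes of N'' outside N
    best = {}                                          # reduced intent -> largest extent
    for extent, intent in filt:
        key = intent - extra
        if key not in best or len(extent) > len(best[key]):
            best[key] = extent
    return {(extent, key) for key, extent in best.items()}


if __name__ == "__main__":
    for ext, itt in sorted(bus_project(CONCEPTS, "abcd"), key=lambda x: -len(x[0])):
        print("".join(sorted(itt)), "".join(sorted(ext)))
```

For N = {a, b, c, d} the filter contains 10 of the 19 concepts, so most of the lattice is never visited; for N = {a, c, d}, which is an intent, the filter is returned unchanged.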

  29. Bottom-up Search • The filter c

  30. Bottom-up Search • Output lattice B1

  31. Experiments • Environment • Java, 1.9 GHz processor and 3 GB memory • Parameters • Number of concepts in K = (G, M, I) • Density of K: 40%, 50%, 60% • Ratio N/M: 10%, ..., 80% • Data: from 71114 to 234946 concepts

  32. Experiments • Results • LBS and BUS perform better when the percentage of projection is higher than 40% • LBS has lower variation than BUS • DFS is the worst algorithm • Projection on the context is not the best choice!

  33. Experiments

  34. Conclusion • Focus on projection • The work can be adapted to selection • Possibility to handle the two operations in one shot on a given concept lattice • Projection on lattices vs. on contexts • Special cases where the projection on lattices is more efficient • More experiments are needed

  35. Future Work • An important fact: projection is the inverse operation of the assembly of two lattices! • Projection on implication sets • Algorithm improvement • Execution time and memory consumption • Other operations on concept lattices

  36. THANK YOU!

  37. Projection • K=(G, M, W, I) • Projection on a set N of attributes

  38. Selection • K=(G, M, W, I) • Selection on a set of objects

  39. DFS complexity • To analyze the complexity of this procedure, we consider the number of accesses to each node and the number of comparisons. • Each node is visited at least twice (on the way down and on the way back). • If q is the number of equivalence classes, then marking a node takes q/2 comparisons on average.

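  40. BFS complexity • To evaluate the complexity of this algorithm, we consider two parameters: the number of comparisons needed and the number of times each node is accessed. • Each node o is visited exactly #parent(o) + 1 times. • The overall number of node accesses is therefore Σ_o (#parent(o) + 1), summed over the nodes o of B, i.e. the number of nodes plus the number of edges of the lattice diagram.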

  41. LBS complexity • Sorting with respect to the lectic order can be done in O(n ln n), where n is the number of concepts in B. • Marking the equivalence classes on B is straightforward, since it takes one linear pass over the sorted set of concepts. • Thus, the overall process has a complexity of O(n ln n).

  42. iPred • It sorts the elements of the lattice by size. • Δ[ci] is initialized to the empty set for each element ci of the input set. • Δ[ci] will contain the accumulation of faces for each element. • The first element in the border is the first element in the sequence. • All remaining elements in the input sequence are processed in the order in which they appear in the enumeration. • The candidate set is computed by intersecting the current element ci with all the elements in the border. • We check whether the current element belongs to the upper set of the elements that are in the candidate set. • If the test result is positive, ci ≺ c̃, so we add this connection to the output set, then we add that face to the set of accumulated faces of c̃ and finally we remove c̃ from the border. • Before the next element is processed, we make sure that ci is added to the border.
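The steps above can be sketched on a family of intents. Two points are assumptions of this sketch: the input family is closed under intersection (true for the intents of a concept lattice), and the membership test is written as "Δ[c̃] ∩ ci is empty", which is one reading of the slide's check. The function name `ipred` and the list of intents (from the basket example) are hypothetical.

```python
INTENTS = [frozenset(s) for s in
           ("a ab ac ad ag abc abg acd adf agh abdf abgh acdf acgh acde "
            "abcdf abcgh acghi abcdefghi").split()]


def ipred(intents):
    """Return the cover pairs (smaller intent, larger intent) of the family."""
    seq = sorted((frozenset(i) for i in intents), key=len)   # sort by size
    faces = {c: frozenset() for c in seq}    # Delta[c]: accumulated faces of c
    border = [seq[0]]
    links = set()
    for ci in seq[1:]:
        candidates = {ci & b for b in border}
        for cand in candidates:
            if not (faces[cand] & ci):       # no known cover of cand lies below ci
                links.add((cand, ci))
                faces[cand] |= ci - cand     # record the new face of cand
                border = [b for b in border if b != cand]
        border.append(ci)
    return links


if __name__ == "__main__":
    for small, large in sorted(ipred(INTENTS), key=lambda p: (len(p[1]), sorted(p[1]))):
        print("".join(sorted(small)), "<", "".join(sorted(large)))
```

Each pair links an intent to an immediate superset in the family; in the concept lattice the smaller intent belongs to the upper neighbour, since extents grow as intents shrink.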

  43. BUS complexity • The complexity of this procedure depends on two factors: • When we find the most general concept whose intent contains the set of attributes N. • The number of attributes to be deleted

  44. Work of Jeudy et al. • Sort the concepts in topological order. • Find the equivalence classes and their representatives. • Scan the input lattice a second time to build the links between the representatives of the equivalence classes.
