Learn about OPTICS, density-reachable objects, core distances, and more in clustering algorithms. Understand the challenges and approaches to effective clustering in data science.
p is an (r,M)-coreobject iff |D(p,r)| ≥ M (e.g., M = 4, r = ε), where D(p,r) is the disk of radius r about p.
x is directly density reachable (DDR) from a coreobject p iff x ∈ D(p,r).
THM: if p_i and p_{i+1} are coreobjects and p_{i+1} is DDR from p_i, then p_i is DDR from p_{i+1}.
x is density reachable (DR) from a coreobject p iff ∃ p = p_1, …, p_n = x such that each p_i is DDR from p_{i-1}.
x is density connected to y iff ∃ a coreobject o such that x and y are each DR from o.
A cluster is a set of (mutually?) density reachable objects which is maximal wrt density reachability (the DBSCAN definition). This means the set on the left has two clusters as shown on the right (blue is in both). Is that how you would want clusters defined?
Note that density connected is symmetric but not transitive. Did the DBSCAN people intend that? Maybe so. But then a clustering is not a partition.
OPTICS walk – animated (DataSURG)
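The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the brute-force neighbor search, and the sample points `pts` are hypothetical, and p is counted as a member of its own disk D(p,r).

```python
from math import dist

def disk(points, p, r):
    """Indices of all points within distance r of points[p] (p itself included)."""
    return [i for i, q in enumerate(points) if dist(points[p], q) <= r]

def is_core(points, p, r, M):
    """p is an (r,M)-coreobject iff |D(p,r)| >= M."""
    return len(disk(points, p, r)) >= M

def directly_density_reachable(points, x, p, r, M):
    """x is DDR from p iff p is a coreobject and x lies in D(p,r)."""
    return is_core(points, p, r, M) and dist(points[p], points[x]) <= r

def density_reachable(points, x, p, r, M):
    """x is DR from p iff a chain p = p1, ..., pn = x exists with each step DDR."""
    if not is_core(points, p, r, M):
        return False
    seen, stack = {p}, [p]
    while stack:
        cur = stack.pop()
        if cur == x:
            return True
        if not is_core(points, cur, r, M):
            continue  # a border object can end a chain but cannot extend it
        for nxt in disk(points, cur, r):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

# Hypothetical sample: four points in a row plus one far away.
pts = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
```

With r = 1 and M = 3, points 1 and 2 are coreobjects and points 0 and 3 are border objects, so point 0 is DDR from point 1 but not the other way around.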
If p is a coreobject, its core-distance, CD(p), is the distance to its Mth nearest neighbor (the smallest distance with respect to which it is a coreobject).
If p is any object, its reachability-distance, RD(p), is the smallest distance r' such that p is density-reachable from an r'-coreobject.
Note: if RD(x) = r, then ∃ a sequence of coreobjects p = p_1, p_2, …, p_{n-1}, p_n = x such that each p_i is r-DR from p_{i-1}, and if you decrease r this is no longer true. Then x is r-DR from p_{n-1}, and if you decrease r, x won't be r-DR from p_{n-1}.
Therefore, RD(x) = r iff r is the smallest number such that within D(x,r) there is an r-coreobject.
Note that there can be a coreobject q even closer than r, so the coreobject that determines RD(x) may not be the closest one (i.e., RD(x) is not necessarily just the distance to the closest coreobject).
In fact, as you build out r-rings from x, RD(x) is the first radius r at which a coreobject q is encountered with CD(q) ≤ r.
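A minimal sketch of the two distances, assuming brute-force search and that p counts as its own nearest neighbor (consistent with |D(p,r)| ≥ M). The function names and the sample points are hypothetical. RD(x) is computed here as the smallest max(CD(q), d(x,q)) over the other objects q, which is the "first ring radius r at which a coreobject q with CD(q) ≤ r is encountered".

```python
from math import dist

def core_distance(points, p, M):
    """Distance from points[p] to its Mth nearest neighbor (p itself counts as the 1st)."""
    d = sorted(dist(points[p], q) for q in points)
    return d[M - 1] if len(d) >= M else None

def reachability_distance(points, x, M):
    """Smallest r such that some other object q has CD(q) <= r and d(x, q) <= r."""
    best = None
    for q in range(len(points)):
        if q == x:
            continue  # assumption: x is reached from a different object
        cd = core_distance(points, q, M)
        if cd is None:
            continue
        r = max(cd, dist(points[x], points[q]))
        if best is None or r < best:
            best = r
    return best

# Hypothetical sample: x = pts[0]; pts[1] is close to x but isolated,
# while pts[2..4] form a tight group two units away.
pts = [(0, 0), (0, 1), (2, 0), (2, 0.1), (2, -0.1)]
```

In this sample the nearest object to x is pts[1] at distance 1, but its core-distance is larger than 2, so RD(x) is set by pts[2] at distance 2. That is the situation the note above describes: the closest coreobject is not necessarily the one that determines RD(x).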
Next, I argue that OPTICS does not address the curse of dimensionality very well (said another way, the example on the first slide ought to be one cluster, not two).
SIZE(n-sphere) → 0 as n → ∞ means all objects are in the middle-range corners of the n-cube, not in the major corners (the “points” on the n-cube) or minor corners (the dimples on the n-cube).
The curse of dimensionality says “as you build out n-spheres you get no nbrs, no nbrs, …, no nbrs, too many nbrs!”
This is because if you look at the n-cube (say of radius = 1), the number of “points” that have radius = 1 in k ≤ n dimensions (k=1 is a minor corner or intercept or dimple and k=n is a major point or main diagonal) is given by the kth coefficient in the binomial formula
$(b+c)^n = C(n,0)\,b^n c^0 + C(n,1)\,b^{n-1} c^1 + \dots + C(n,k)\,b^{n-k} c^k + \dots + C(n,n)\,b^0 c^n$
where $C(n,k) = \frac{n!}{(n-k)!\,k!}$, and as k goes from 0 to n it is:
$1,\ n,\ \frac{n(n-1)}{2},\ \frac{n(n-1)(n-2)}{2\cdot 3},\ \dots,\ \frac{n(n-1)\cdots(n-k+1)}{2\cdot 3\cdots k},\ \dots,\ 1$
For large n the big numbers are in the middle.
The point is that the L∞ spheres may not suffer from the curse of dimensionality like the Euclidean ones do, and therefore they may be preferable in OPTICS! (We can calculate them quickly, and then a “square OPTICS” would make the group on slide 1 into one cluster!)
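A small numerical check of the two claims, as a sketch only. The function names are hypothetical, and "SIZE(n-sphere) → 0" is read here as the volume of the Euclidean unit ball relative to the enclosing cube [-1,1]^n (the L∞ unit ball), using the standard formula π^(n/2) / Γ(n/2 + 1) for the ball's volume.

```python
from math import comb, gamma, pi

def binomial_row(n):
    """Coefficients C(n,0), ..., C(n,n) of (b+c)^n."""
    return [comb(n, k) for k in range(n + 1)]

def ball_to_cube_ratio(n):
    """Volume of the Euclidean unit n-ball divided by the volume of the cube [-1,1]^n."""
    ball = pi ** (n / 2) / gamma(n / 2 + 1)
    return ball / 2 ** n

row = binomial_row(20)
middle = row.index(max(row))  # position of the largest coefficient
ratios = [ball_to_cube_ratio(n) for n in (2, 5, 10, 20)]
```

For n = 20 the largest coefficient sits at k = 10, and the ratios shrink steadily with n, which is the sense in which the Euclidean sphere occupies a vanishing share of the L∞ one.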
Next, I ask some questions (in quest of the “perfect” walk-based clustering method).
OPTICS walks the space by stepping to a nearby coreobject, based on some continual reordering of a temporary ordering of local nbrs. (Can Amal continue his animation to include new rings with multiple new points?) I'm still not sure how the OPTICS walk proceeds and whether there is a scalable, vertically determined walk which would be better.
Elizabeth walked to the closest point? (Elizabeth, I'm sure I don't have that quite right or complete. Could you fill in the correct details?)
The Hilbert (and Peano) walk is entirely spatial (the next step has to do with spatial arrangements only).
There must be a better way in the middle, between spatial-only and OPTICS' strange walk, which involves (it seems) a continual revision of the order_seed file and the order_file.
This is a corepoint also. Its nbr to the left then has RD given by the length of the green arrow! (Less than that shown by the red arrow.)?
Some Definitions
The generating-distance ε is the largest distance considered for clusters (the largest radius for a disk of count at least Mp points to be in a cluster, i.e., ε = largest Mp-core-radius possible). Cluster core-disk radii εi satisfy 0 ≤ εi ≤ ε.
The core-distance of a core point p is the smallest radius ε' (≤ ε) about p that encircles at least Mp points (the smallest Mp-core-radius for p).
The reachability-distance of p is the smallest distance such that p is density-reachable from a core object o.
OPTICS
OPTICS(Objects, e, M, OrderFile):
    for each unprocessed obj in Objects:
        neighbors = Objects.getNeighbors(obj, e)
        obj.setCoreDistance(neighbors, e, M)
        OrderFile.write(obj)
        if obj.coreDistance != NULL:
            orderSeeds.update(neighbors, obj)
            for obj in orderSeeds:
                neighbors = Objects.getNeighbors(obj, e)
                obj.setCoreDistance(neighbors, e, M)
                OrderFile.write(obj)
                if obj.coreDistance != NULL:
                    orderSeeds.update(neighbors, obj)
OrderSeeds::update(neighbors, centerObj):
    d = centerObj.coreDistance
    for each unprocessed obj in neighbors:
        newRdist = max(d, dist(obj, centerObj))
        if obj.RchDist == NULL:
            obj.RchDist = newRdist
            insert(obj, newRdist)
        elif newRdist < obj.RchDist:
            obj.RchDist = newRdist
            decrease(obj, newRdist)
[Figure: sample points with distance labels 6, 6, 4, 2]
OrdSeeds and Order File entries shown on the slide (Obj | RchDist): Pt2 | 4; Pt1 | UnDf; Pt2 | 4; Pt3 | 2; Pt3 | 4; Pt4 | 2; Pt4 | 4; Pt5 | 3
OPTICS, Obj = 3
[Figure: points 1 to 7 with the e and e' disks about the current object]
Order File (Obj, RD): Pt1 UnDef; Pt2 4; Pt3 2
OrdSeeds (Obj, RD): Pt4 1.2 (decreased from 2); Pt5 1.3 (decreased from 2); Pt6 3.5
OPTICS, Obj = 4
[Figure: points 1 to 7 with the e and e' disks about the current object]
Order File (Obj, RD): Pt1 UnDef; Pt2 4; Pt3 2; Pt4 1.2
OrdSeeds (Obj, RD): Pt5 1.3; Pt6 3.2 (decreased from 3.5)
OPTICS, Obj = 5
[Figure: points 1 to 7 with the e and e' disks about the current object]
Order File (Obj, RD): Pt1 UnDef; Pt2 4; Pt3 2; Pt4 1.2; Pt5 1.3
OrdSeeds (Obj, RD): Pt6 3.2; Pt7 3.1
OPTICS, Obj = 7
[Figure: points 1 to 7 with the e disk about the current object]
Order File (Obj, RD): Pt1 UnDef; Pt2 4; Pt3 2; Pt4 1.2; Pt5 1.3; Pt7 3.1
OrdSeeds (Obj, RD): Pt6 3.2
Skip the other statements after OrderFile.write(obj) because obj.coreDistance == NULL or UNDEFINED.
OPTICS, Obj = 6
[Figure: points 1 to 7 with the e and e' disks about the current object]
Order File (Obj, RD): Pt1 UnDef; Pt2 4; Pt3 2; Pt4 1.2; Pt5 1.3; Pt7 3.1; Pt6 3.2
OrdSeeds (Obj, RD): empty
Skip the update steps because all neighbors of Obj have been processed.
OPTICS walk
[Figure: the seven labeled points and the resulting reachability plot. Vertical axis: RDist (UnDf for the first point); horizontal axis: visited data points 1 to 7.]
The reachability-distance of p is the smallest distance such that p is density-reachable from a core object o.
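The OPTICS and OrderSeeds::update pseudocode used in this walkthrough can be turned into a small runnable sketch. Several things here are assumptions rather than part of the slides: the brute-force neighbor search, the use of a binary heap with lazy deletion in place of the insert/decrease operations, the convention that an object counts as its own neighbor, and the sample points.

```python
import heapq
from math import dist

def optics(points, eps, min_pts):
    """Return the order file as a list of (index, reachability); None means UNDEFINED."""
    n = len(points)
    processed = [False] * n
    reach = [None] * n
    order = []

    def neighbors(i):
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    def core_distance(nbrs, i):
        if len(nbrs) < min_pts:
            return None  # not a core object at radius eps
        return sorted(dist(points[i], points[j]) for j in nbrs)[min_pts - 1]

    def update(seeds, nbrs, center, cd):
        for j in nbrs:
            if processed[j]:
                continue
            new_r = max(cd, dist(points[center], points[j]))
            if reach[j] is None or new_r < reach[j]:
                reach[j] = new_r
                # Lazy "decrease": push a fresh entry; stale ones are skipped on pop.
                heapq.heappush(seeds, (new_r, j))

    def process(i, seeds):
        processed[i] = True
        order.append((i, reach[i]))
        nbrs = neighbors(i)
        cd = core_distance(nbrs, i)
        if cd is not None:
            update(seeds, nbrs, i, cd)

    for start in range(n):
        if processed[start]:
            continue
        seeds = []
        process(start, seeds)
        while seeds:
            _, j = heapq.heappop(seeds)
            if not processed[j]:
                process(j, seeds)
    return order

# Hypothetical sample: two well-separated groups on a line.
pts = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
order_file = optics(pts, eps=1.5, min_pts=2)
```

Each group starts with an UNDEFINED reachability and the remaining members follow with reachability 1.0, so the jump back to UNDEFINED marks the boundary between the two groups.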
Acknowledgements
• [OPTICS] Ankerst, M., Breunig, M., Kriegel, H.-P., and Sander, J. 1999. OPTICS: Ordering points to identify the clustering structure. In Proceedings of the ACM SIGMOD Conference, 49-60, Philadelphia, PA.
• OPTICS Presentation by Chris Mueller, November 4, 2004
• http://www.osl.iu.edu/~chemuell/projects/presentations/optics-v1.pdf
• DataSURG 108 Meeting on 04/26/2005 (Dr. Perrizo, Elizabeth, DongMei, Amal)
Once one has the OPTICS walk ordering and the derived attribute, RchDist, then what?
• Build a set of basic P-trees on the derived attribute, RchDist (Log2(MaxRchDist) of them). This is a one-time process, the cost of which can be amortized over the useful lifetime of the (presumably high-value) data set.
• Build out rings around MinRchDist to find clusters (in density order).
• Each time a ring has a substantial number of objects, the mask gives one the objects at that density level.
• The ordering separates that set into “Density Connected” components (using either the DC equivalence relation or the DC’ equivalence relation).
DC and DC’ equivalence relation definition and discussion
• I start with what I see as wrong with the approach as it is put forth in OPTICS (involving walks). I am not dealing with DENCLUE since it is not walk based.
• I start with an insistence that clustering IS partitioning (into mutually exclusive and collectively exhaustive subsets).
• Partitions are the mathematical dual of equivalence relations: binary relations that are reflexive ((x,x) in R for every x), symmetric (if (x,y) in R then (y,x) in R), and transitive (if (x,y) and (y,z) in R then (x,z) in R).
The duality is:
A partition defines an equivalence relation by (x,y) in R iff x and y belong to the same partition component.
An equivalence relation defines a partition into its equivalence classes.
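The duality can be checked mechanically on small examples. The sketch below is illustrative only: the function names and the three-element sample are hypothetical, and relations are represented as Python sets of ordered pairs.

```python
from itertools import product

def relation_from_partition(partition):
    """(x, y) is in R iff x and y lie in the same partition component."""
    return {(x, y) for block in partition for x, y in product(block, repeat=2)}

def is_reflexive(elements, R):
    return all((x, x) in R for x in elements)

def is_symmetric(R):
    return all((y, x) in R for x, y in R)

def is_transitive(R):
    return all((x, z) in R for x, y in R for y2, z in R if y == y2)

def equivalence_classes(elements, R):
    """Partition induced by an equivalence relation R."""
    classes = []
    for x in elements:
        cls = frozenset(y for y in elements if (x, y) in R)
        if cls not in classes:
            classes.append(cls)
    return classes

elements = ["x", "y", "z"]
R = relation_from_partition([{"x", "y"}, {"z"}])

# A relation that is reflexive and symmetric but relates x~y and y~z without x~z.
dc_like = {(e, e) for e in elements} | {("x", "y"), ("y", "x"), ("y", "z"), ("z", "y")}
```

Here `R` passes all three checks and yields back the original partition, while `dc_like` is symmetric but fails the transitivity check, so it does not define a partition.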
DC and DC’ equivalence relation definition and discussion-2
• It appears as though the OPTICS people are trying to define an equivalence relation (and thereby a partition, or clustering). They define 3 binary relations (presumably working toward one that is reflexive, symmetric, and transitive on core-objects plus border-objects). It goes as follows, assuming epsilon and M have been chosen.
• x is a core-object iff |{y : d(x,y) < epsilon}| > M.
• (x,y) in DDR (Direct Density Reachable) iff x is a core-object and d(x,y) < epsilon.
DDR is not reflexive (if x is not a core-object, then (x,x) is not in DDR).
DDR is not symmetric (e.g., if x is a core-object and y is a border-object).
DDR is not transitive (for (x,y) and (y,z) in DDR, d(x,z) may be > epsilon, thus (x,z) is not in DDR).
• So to get transitivity, they define Density Reachable, DR: (x,y) in DR iff there exists x = P1, ..., Pn = y such that each (Pi, Pi+1) is in DDR (which implies P1, ..., Pn-1 must be core-objects).
DR is not reflexive.
DR is not symmetric: for (x,y) in DR, y may be a border-object rather than a core-object, so (y,x) may not be in DR.
DR is transitive: given chains x = P1, ..., Pn = y and y = Pn, ..., Pn+m = z with each consecutive pair in DDR, the concatenated chain x = P1, ..., Pn+m = z has each consecutive pair in DDR, so (x,z) is in DR.
DC and DC’ equivalence relation definition and discussion-3
• OPTICS gets symmetry (and reflexivity on core-objects and border-objects, CB) by defining Density Connected, DC: (x,y) in DC iff there is a core-object o such that (o,x) and (o,y) are in DR.
DC is reflexive on CB (core-objects plus border-objects).
DC is symmetric on CB (use the same o for (x,y) in DC and (y,x) in DC).
DC is NOT TRANSITIVE (see the first example on the first slide; basically: b bx c b c y b c b z). In this example, (x,y) is in DC and (y,z) is in DC, but (x,z) is not in DC.
So they would put y in either the x-cluster or the z-cluster depending upon which was processed first? That doesn't seem correct (not sure they realize they lost transitivity when moving to DC).
• Or would they put y in both?
Note the pathological case on the next slide.
DC and DC’ equivalence relation definition and discussion-4
b b bx c y c z c u
[Figure: the arrangement above shown with three alternative groupings]
This could end up clustered in any of the three ways shown, or as several other clusterings.
• I think it should all be one cluster. So my solution is to redefine DC as DC' (so that it is reflexive, symmetric, and transitive on CB): (x,y) in DC' iff there is x = z1, ..., zn = y such that each (zi, zi+1) is in DC.
• Now the trick is to figure out how to actually do DC' clustering vertically and in a scalable fashion.
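DC' is the transitive closure of DC, so its classes are the connected components of the graph whose edges are the DC pairs. One conventional (horizontal, not vertical) way to compute them is a union-find structure, sketched below. The function name, the input format, and the sample pairs are assumptions, and this does not address the scalable vertical approach asked for above.

```python
def dc_prime_classes(objects, dc_pairs):
    """Group objects into DC' classes, given the pairs known to be in DC."""
    parent = {o: o for o in objects}

    def find(o):
        # Follow parent links to the class representative, halving paths as we go.
        while parent[o] != o:
            parent[o] = parent[parent[o]]
            o = parent[o]
        return o

    for x, y in dc_pairs:
        parent[find(x)] = find(y)  # merge the classes of x and y

    classes = {}
    for o in objects:
        classes.setdefault(find(o), []).append(o)
    return sorted(classes.values())

# Hypothetical sample: x~y, y~z, z~u in DC, with w unrelated to anything.
objects = ["u", "w", "x", "y", "z"]
dc_pairs = [("x", "y"), ("y", "z"), ("z", "u")]
classes = dc_prime_classes(objects, dc_pairs)
```

With these pairs, x, y, z, and u end up in one class even though (x,u) is not itself a DC pair, which is the "all one cluster" outcome argued for above.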
This is a corepoint also. Its nbr to the left then has RD given by the length of the green arrow! (Less than that shown by the red arrow.)?
Amal’s version: Yes, but this is an example of a moment during the walk where the red is the core object used to find the RDs.
Definitions
The generating-distance ε is the largest distance considered for clusters (the largest radius for a disk of count at least Mp points to be in a cluster, i.e., ε = largest Mp-core-radius possible). Cluster core-disk radii εi satisfy 0 ≤ εi ≤ ε.
The core-distance of a core point p is the smallest radius ε' (≤ ε) about p that encircles at least Mp points (the smallest Mp-core-radius for p).
The reachability-distance of p is the smallest distance such that p is density-reachable from a core object o.
OPTICS
[Figure: points 1 to 7 with e and e' disks; reachability axis ticks 1, 2, 3; distance label 2.4]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 | 3; Pt3 | 2
OrdSeeds as displayed (Obj | RchDist): Pt2 | 3; Pt3 | 2 (was 3); Pt4 | 2 (was 3); Pt5 | 2.4 (was 3); Pt6 | 3.5
OPTICS
[Figure: points 1 to 7 with e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.6]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 | 3; Pt3 | 2; Pt4 | 2
OrdSeeds as displayed (Obj | RchDist): Pt2 | 3; Pt3 | 2; Pt4 | 2; Pt5 | 1.6 (was 2.4); Pt6 | 3.5
OPTICS
[Figure: points 1 to 7 with e and e' disks; reachability axis ticks 1, 2, 3; distance label 2.5]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 | 3; Pt3 | 2; Pt4 | 2; Pt5 | 1.6
OrdSeeds as displayed (Obj | RchDist): Pt2 | 3; Pt3 | 2; Pt4 | 2; Pt5 | 1.6; Pt6 | 3.3 (was 3.5); Pt7 | 3
OPTICS
[Figure: points 1 to 7 with the e disk; reachability axis ticks 1, 2, 3; core-distance label NULL]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 | 3; Pt3 | 2; Pt4 | 2; Pt5 | 1.6; Pt7 | 3
OrdSeeds as displayed (Obj | RchDist): Pt2 | 3; Pt3 | 2; Pt4 | 2; Pt5 | 1.6; Pt7 | 3; Pt6 | 3.3
OPTICS: Done!
[Figure: points 1 to 7 with e and e' disks; reachability axis ticks 1, 2, 3; distance label 3.6]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 | 3; Pt3 | 2; Pt4 | 2; Pt5 | 1.6; Pt7 | 3; Pt6 | 3.5
OrdSeeds as displayed (Obj | RchDist): Pt2 | 3; Pt3 | 2; Pt4 | 2; Pt5 | 1.6; Pt7 | 3; Pt6 | 3.5
Amal’s L-shaped OPTICS (e=2, MinPts=3)
[Figure: L-shaped set of 13 points with point 1 labeled and the e disk about it; reachability axis ticks 1, 2, 3]
Order File (Obj | RchDist): Pt1 | UnDf
OrdSeeds (Obj | RchDist): NULL
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 5 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 | 1.25; Pt4 | 1.25; Pt5 | 2
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 6 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 | 1.25; Pt3 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 | 1.25; Pt4 | 1.25; Pt5 | 1.25 (was 2); Pt6 | 2
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 6 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt4 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 | 1.25; Pt4 | 1.25; Pt5 | 1.25; Pt6 | 1.25 (was 2)
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 6 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt5 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt6 | 1.25; Pt7 | 2
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 8 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt6 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt6 | 1.25; Pt7 | 1.25 (was 2); Pt8 | 1.5
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 9 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 2.00]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt7 | 1.25; Pt8 | 1.5; Pt9 | 2
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 11 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25; Pt8 | 1.5
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt7 | 1.25; Pt8 | 1.5; Pt9 | 1.25 (was 2); Pt10 | 1.25; Pt11 | 2
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 12 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25; Pt8 | 1.5; Pt9 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt7 | 1.25; Pt8 | 1.5; Pt9 | 1.25; Pt10 | 1.25; Pt11 | 1.25 (was 2); Pt12 | 2
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 12 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25; Pt8 | 1.5; Pt9 | 1.25; Pt10 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt11 | 1.25; Pt12 | 1.25 (was 2)
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 12 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt11 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt12 | 1.25; Pt13 | 2
OPTICS (e=2, MinPts=3)
[Figure: L-shaped point set with points 1 to 12 labeled, e and e' disks; reachability axis ticks 1, 2, 3; distance label 1.25]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt12 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt12 | 1.25; Pt13 | 1.25 (was 2)
OPTICS (e=2, MinPts=3): Done!
[Figure: L-shaped point set with points 1 to 12 labeled, e and e' disks; reachability axis ticks 1, 2, 3; OrdSeeds label NULL]
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt13 | 1.25
OrdSeeds as displayed (Obj | RchDist): Pt3 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt13 | 1.25
OPTICS
Order File (Obj | RchDist): Pt1 | UnDf; Pt2 through Pt7 | 1.25; Pt8 | 1.5; Pt9 through Pt13 | 1.25
[Figure: L-shaped point set with points 1 to 12 labeled, next to its reachability plot with axis ticks 1, 2, 3]
At e=2, 1 Cluster
At e=1.25, 2 Clusters
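The two cluster counts can be reproduced from the final Order File with a simple scan: start a new cluster whenever the reachability is undefined or exceeds the chosen threshold. This is a simplified sketch, not the full extraction procedure from the OPTICS paper, which also consults core-distances to label noise. The function name is hypothetical, and every object that starts a cluster is assumed to be a core object at the threshold.

```python
def clusters_from_order(order, threshold):
    """Split an (obj, reachability) ordering into runs separated by large reachabilities."""
    clusters = []
    for obj, rd in order:
        if rd is None or rd > threshold or not clusters:
            clusters.append([obj])  # UnDf or a jump above the threshold starts a new cluster
        else:
            clusters[-1].append(obj)
    return clusters

# Final Order File of the L-shaped example: Pt1 UnDf, Pt8 at 1.5, all others at 1.25.
order_file = (
    [("Pt1", None)]
    + [(f"Pt{i}", 1.25) for i in range(2, 8)]
    + [("Pt8", 1.5)]
    + [(f"Pt{i}", 1.25) for i in range(9, 14)]
)

at_2 = clusters_from_order(order_file, 2.0)
at_1_25 = clusters_from_order(order_file, 1.25)
```

At threshold 2 the scan never breaks, giving one cluster of 13 points. At threshold 1.25 the 1.5 at Pt8 forces a break, giving Pt1 to Pt7 and Pt8 to Pt13.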