Annotation Inference in KAnnotator

Annotation Inference: An Overview
• Annotation inference includes the following steps:
  • Load external or previously inferred annotations from the specified sources (XML files, class files, annotation logs, etc.)
  • Load descriptions of the classes to analyze
  • Invoke the inference algorithm on the loaded classes and annotations to produce a set of inferred annotations
  • Update the inferred annotations with the propagation algorithm to extend them to other methods in the inheritance hierarchy
  • Process conflicts between inferred and external annotations
• The inference process is parameterized with algorithms that implement inference for specific kinds of annotations (e.g. nullability):
  • Infer field annotations given the field's value
  • Infer method annotations given the method's bytecode

Annotations Structure
• Annotations are represented as a map from annotation positions to actual annotation values (e.g. NULLABLE/NOT_NULL)
• An annotation position consists of:
  • Class member (field/method)
  • Declaration position, the annotated component of the class member:
    • Field type position: annotates the field type
    • Return type: annotates the return type of the method
    • Parameter position: annotates the type of a method parameter (given its index)

Annotation Lattice
• Inference assumes that annotations form a lattice, hence for any pair (a, b) of annotations of the same kind (e.g. nullability) the least upper bound LUB(a, b) and the greatest lower bound GLB(a, b) are defined
  • Nullability: NOT_NULL < NULLABLE
  • Mutability: READ_ONLY < MUTABLE
• Unification: given two annotations (a, b) and a declaration position p, the unified annotation unify(a, b, p) is defined as:
  • LUB(a, b) if p is covariant
  • GLB(a, b) if p is contravariant
  • b if p is invariant (assuming a == b)
• Unification is naturally extended to annotation sets
• Annotation a (“parent”) subsumes annotation b (“child”) at position p if unify(a, b, p) == a
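The lattice rules above can be sketched in a few lines of Java. This is a minimal illustration, not KAnnotator's actual code: the class name NullabilityLattice, the Variance enum, and the choice to return false (rather than fail) when an invariant position holds two different annotations are all assumptions made for the example.

```java
/** Hypothetical model of the nullability lattice NOT_NULL < NULLABLE. */
final class NullabilityLattice {
    // Declaration order encodes the lattice order: NOT_NULL < NULLABLE.
    enum Nullability { NOT_NULL, NULLABLE }

    // Variance of a declaration position (an assumed helper type).
    enum Variance { COVARIANT, CONTRAVARIANT, INVARIANT }

    /** Least upper bound: the larger of the two annotations. */
    static Nullability lub(Nullability a, Nullability b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    /** Greatest lower bound: the smaller of the two annotations. */
    static Nullability glb(Nullability a, Nullability b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    /** unify(a, b, p) as defined on the slide. */
    static Nullability unify(Nullability a, Nullability b, Variance p) {
        switch (p) {
            case COVARIANT:
                return lub(a, b);
            case CONTRAVARIANT:
                return glb(a, b);
            default:
                // The slide assumes a == b at invariant positions.
                if (a != b) {
                    throw new IllegalArgumentException("invariant position requires a == b");
                }
                return b;
        }
    }

    /** Parent subsumes child at p if unify(parent, child, p) == parent. */
    static boolean subsumes(Nullability parent, Nullability child, Variance p) {
        if (p == Variance.INVARIANT) {
            return parent == child; // assumption: unequal annotations never subsume
        }
        return unify(parent, child, p) == parent;
    }
}
```

For instance, a NULLABLE parent subsumes a NOT_NULL child at a covariant position, while the reverse does not hold.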

Field/Method Dependency
• Field dependency is a map which associates each field description with a pair (readers, writers) where
  • readers is the set of all methods (within the given class set) which access the field value through GETFIELD or GETSTATIC instructions
  • writers is the set of all methods (within the given class set) which mutate the field value through PUTFIELD or PUTSTATIC instructions
• Method dependency is a graph with methods as vertices. The graph contains an edge (a, b) if one of the following conditions holds:
  • Method a invokes method b through one of the INVOKE*** instructions
  • There is a non-primitive field such that method a is its reader and method b is its writer
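A minimal sketch of how the two structures combine into the method dependency graph. Methods and fields are identified by plain strings here, and the FieldAccess class, the build method, and the sample names in demo are hypothetical; the sketch also assumes that fieldDep already contains only non-primitive fields.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class MethodDependencyGraph {
    /** Hypothetical (readers, writers) pair for one field. */
    static final class FieldAccess {
        final Set<String> readers = new LinkedHashSet<>();
        final Set<String> writers = new LinkedHashSet<>();
    }

    /**
     * Builds adjacency sets: edge (a, b) exists if a invokes b,
     * or if a reads some field that b writes.
     */
    static Map<String, Set<String>> build(Map<String, Set<String>> invocations,
                                          Map<String, FieldAccess> fieldDep) {
        Map<String, Set<String>> edges = new LinkedHashMap<>();
        // Invocation edges: caller -> callee.
        for (Map.Entry<String, Set<String>> e : invocations.entrySet()) {
            edges.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>()).addAll(e.getValue());
        }
        // Field edges: reader -> every writer of the same field.
        for (FieldAccess access : fieldDep.values()) {
            for (String reader : access.readers) {
                edges.computeIfAbsent(reader, k -> new LinkedHashSet<>()).addAll(access.writers);
            }
        }
        return edges;
    }

    /** Small illustrative input with made-up method and field names. */
    static Map<String, Set<String>> demo() {
        Map<String, Set<String>> invocations = new LinkedHashMap<>();
        invocations.put("A.run", new LinkedHashSet<>(Collections.singleton("A.helper")));
        FieldAccess name = new FieldAccess();
        name.readers.add("A.getName");
        name.writers.add("A.setName");
        Map<String, FieldAccess> fieldDep = new LinkedHashMap<>();
        fieldDep.put("A.name", name);
        return build(invocations, fieldDep);
    }
}
```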

Annotation Inference (without propagation)
• Input:
  • classSource: set of classes to analyze
  • externalAnn: external annotations (e.g. loaded from class files/XMLs)
  • existingAnn: previously inferred annotations
• Output:
  • inferredAnn: inferred annotations
• inferredAnn := copy of existingAnn
• fieldDep := build field dependency map for all classes in classSource
• methDep := build method dependency graph for all classes in classSource
• depComps := list of SCCs (strongly connected components) of methDep in topological order
• For each field f in fieldDep, use the annotation-specific algorithm to infer annotations from the initial value of f and copy them to inferredAnn
• For each component comp in depComps, infer annotations for the methods within comp (see below)
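The slide does not say how depComps is computed. One common way to obtain SCCs in a usable order is Tarjan's algorithm, sketched below as an assumption rather than as KAnnotator's implementation. With edges pointing from a method to the methods it depends on, Tarjan's algorithm emits each component only after all components it depends on, which is the order a dependency-first analysis needs. The class name SccOrder and the string-based graph are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SccOrder {
    private final Map<String, List<String>> graph;
    private final Map<String, Integer> index = new HashMap<>();
    private final Map<String, Integer> low = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<List<String>> components = new ArrayList<>();
    private int counter = 0;

    private SccOrder(Map<String, List<String>> graph) {
        this.graph = graph;
    }

    /** Returns SCCs so that each component follows the components it depends on. */
    static List<List<String>> components(Map<String, List<String>> graph) {
        SccOrder s = new SccOrder(graph);
        for (String v : graph.keySet()) {
            if (!s.index.containsKey(v)) {
                s.visit(v);
            }
        }
        return s.components;
    }

    // Recursive Tarjan step; fine for a sketch, deep graphs would need an explicit stack.
    private void visit(String v) {
        index.put(v, counter);
        low.put(v, counter);
        counter++;
        stack.push(v);
        onStack.add(v);
        for (String w : graph.getOrDefault(v, Collections.<String>emptyList())) {
            if (!index.containsKey(w)) {
                visit(w);
                low.put(v, Math.min(low.get(v), low.get(w)));
            } else if (onStack.contains(w)) {
                low.put(v, Math.min(low.get(v), index.get(w)));
            }
        }
        if (low.get(v).equals(index.get(v))) {
            // v is the root of a component: pop everything above it.
            List<String> comp = new ArrayList<>();
            String w;
            do {
                w = stack.pop();
                onStack.remove(w);
                comp.add(w);
            } while (!w.equals(v));
            components.add(comp);
        }
    }

    /** a depends on b; b and c depend on each other. */
    static List<List<String>> demo() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("a", Arrays.asList("b"));
        g.put("b", Arrays.asList("c"));
        g.put("c", Arrays.asList("b"));
        return components(g);
    }
}
```

In the demo graph, the mutually dependent methods b and c form one component that is listed before the component containing a.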

Infer Annotations on Dependency Graph SCC
• Input:
  • methods: set of methods which form a dependency graph SCC
  • ann: current annotations (to be updated by the inference)
• queue := new queue containing all items from methods
• while (queue is not empty)
  • m := remove first method from queue
  • cfg := build control-flow graph of m
  • inferredAnn := invoke the annotation-specific inference algorithm (e.g. nullability) for method m, graph cfg and predefined annotations ann
  • Copy all changed annotations from inferredAnn to ann
  • If (at least one annotation was changed/added/removed) then
    • Add to the queue:
      • all dependent methods of m which belong to the same SCC as m itself
      • method m
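The worklist loop above can be sketched as follows. This is a simplified illustration under several assumptions: methods, positions, and annotation values are plain strings; the infer callback is a hypothetical stand-in for the annotation-specific algorithm (including the control-flow graph construction); and only changed or added annotations are detected, not removed ones. Termination relies on the callback eventually producing no further changes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

final class SccWorklist {
    /**
     * Iterates inference over one SCC until no annotation changes.
     * dependents maps a method to the methods that depend on it.
     */
    static void inferOnScc(Set<String> methods,
                           Map<String, Set<String>> dependents,
                           Map<String, String> ann,
                           BiFunction<String, Map<String, String>, Map<String, String>> infer) {
        Deque<String> queue = new ArrayDeque<>(methods);
        while (!queue.isEmpty()) {
            String m = queue.removeFirst();
            Map<String, String> inferred = infer.apply(m, ann);
            boolean changed = false;
            // Copy changed or added annotations into ann.
            for (Map.Entry<String, String> e : inferred.entrySet()) {
                String old = ann.put(e.getKey(), e.getValue());
                if (!e.getValue().equals(old)) {
                    changed = true;
                }
            }
            if (changed) {
                // Re-enqueue dependents inside the same SCC, then m itself.
                for (String d : dependents.getOrDefault(m, Collections.<String>emptySet())) {
                    if (methods.contains(d)) {
                        queue.addLast(d);
                    }
                }
                queue.addLast(m);
            }
        }
    }

    /** Two mutually dependent methods; b's return annotation is derived from a's. */
    static Map<String, String> demo() {
        Set<String> methods = new LinkedHashSet<>(Arrays.asList("b", "a"));
        Map<String, Set<String>> dependents = new HashMap<>();
        dependents.put("a", Collections.singleton("b"));
        dependents.put("b", Collections.singleton("a"));
        Map<String, String> ann = new HashMap<>();
        inferOnScc(methods, dependents, ann, (m, current) -> {
            Map<String, String> result = new HashMap<>();
            if (m.equals("a")) {
                result.put("a.return", "NOT_NULL");
            } else if (current.containsKey("a.return")) {
                result.put("b.return", current.get("a.return"));
            }
            return result;
        });
        return ann;
    }
}
```

In the demo, b is processed first and learns nothing; once a has been annotated, b is re-enqueued and picks up the new information, which is the reason for the re-queueing step.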

Propagation: An Overview
• The propagation algorithm extends a given annotation set to methods within the same inheritance hierarchy
• Propagation proceeds in the following steps:
  • Resolve annotation conflicts
    • Parent and child methods have conflicting annotations at some position if the parent annotation does not subsume the child annotation. Conflicts are fixed by replacing the parent annotation with the unification of the child annotation and the previous parent annotation at that position
  • Unify parameter annotations
    • Methods in the same inheritance hierarchy are assigned an identical annotation at each corresponding parameter. The annotation is computed by unifying the annotations already present at that parameter
  • Apply propagation overrides
    • A propagation override is an exception to the unification algorithm which states that a given method and all its descendants must have a specific annotation at a given parameter

Propagation: Conflict Resolution
• Input:
  • leaves: set of “leaf” methods
  • lat: annotation lattice
  • ann: annotations (to be updated)
• For each method leaf in leaves
  • propagatedAnn := copy of ann
  • Perform a breadth-first traversal of the method hierarchy graph starting from leaf (moving from child to parents) and, for each traversed method m, each parent method pm of m, and each annotation position ppos in pm:
    • p := declaration position of ppos
    • pos := position corresponding to ppos in method m
    • child := propagatedAnn[pos], parent := ann[ppos]
    • If (child is defined) then
      • If (parent is defined) then ann[ppos] := lat.unify(child, parent, p)
      • else propagatedAnn[ppos] := child
      • If (p == RETURN_TYPE) then ann[ppos] := child

Propagation: Conflict Resolution Example (I)
    public class XHierarchyAnnotatedMiddle {
        public interface Top1 { @NotNull Object m(@Nullable Object x); }
        public interface Top2 { @NotNull Object m(@Nullable Object x); }
        public interface Middle extends Top1, Top2 { @Nullable Object m(Object x); }
        public interface Leaf1 extends Middle { Object m(@NotNull Object x); }
        public interface Leaf2 extends Middle { Object m(Object x); }
    }

Propagation: Conflict Resolution Example (II)
    public class XHierarchyAnnotatedMiddle {
        public interface Top1 { @Nullable Object m(@NotNull Object x); }
        public interface Top2 { @Nullable Object m(@NotNull Object x); }
        public interface Middle extends Top1, Top2 { @Nullable Object m(Object x); }
        public interface Leaf1 extends Middle { Object m(@NotNull Object x); }
        public interface Leaf2 extends Middle { Object m(Object x); }
    }

Propagation: Parameter Unification
• Input:
  • methods: set of methods
  • lat: annotation lattice
  • ann: annotations (to be updated)
• descriptors := set of method descriptors found in methods
• For each method descriptor desc in descriptors
  • descMethods := subset of methods with descriptor desc
  • For each parameter declaration position p in desc
    • paramAnn := set of all annotations from ann defined at a position pos such that its method is from descMethods and its declaration position is p
    • If (paramAnn is not empty) then
      • unifiedAnnotation := lat.unify(paramAnn, p)
      • For each method m in descMethods
        • pos := annotation position of m corresponding to declaration position p
        • ann[pos] := unifiedAnnotation
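A minimal sketch of the inner loop for a single descriptor, assuming that parameter positions are contravariant so that unification is the greatest lower bound in the lattice NOT_NULL < NULLABLE. Grouping by descriptor is omitted, and the array-per-method representation (with null meaning "unannotated") is a simplification invented for this example. The demo uses the same starting annotations as Parameter Unification Example (I).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParameterUnification {
    // Declaration order encodes the lattice order: NOT_NULL < NULLABLE.
    enum Nullability { NOT_NULL, NULLABLE }

    /** GLB, used here on the assumption that parameters are contravariant. */
    static Nullability glb(Nullability a, Nullability b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    /**
     * ann maps each method (all sharing one descriptor) to its per-parameter
     * annotations; a null entry means the parameter is unannotated.
     */
    static void unify(Map<String, Nullability[]> ann, int paramCount) {
        for (int p = 0; p < paramCount; p++) {
            Nullability unified = null;
            for (Nullability[] params : ann.values()) {
                if (params[p] != null) {
                    unified = (unified == null) ? params[p] : glb(unified, params[p]);
                }
            }
            if (unified != null) {
                // Assign the unified annotation to every method at this parameter.
                for (Nullability[] params : ann.values()) {
                    params[p] = unified;
                }
            }
        }
    }

    /** Starting annotations of parameters (x, y) as in Example (I). */
    static Map<String, Nullability[]> demo() {
        Map<String, Nullability[]> ann = new LinkedHashMap<>();
        ann.put("Top1.m", new Nullability[] {null, null});
        ann.put("Top2.m", new Nullability[] {Nullability.NOT_NULL, null});
        ann.put("Middle.m", new Nullability[] {Nullability.NULLABLE, Nullability.NULLABLE});
        ann.put("Leaf1.m", new Nullability[] {null, null});
        ann.put("Leaf2.m", new Nullability[] {null, Nullability.NULLABLE});
        unify(ann, 2);
        return ann;
    }
}
```

After unification every method carries NOT_NULL on x and NULLABLE on y, which agrees with Example (II).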

Propagation: Parameter Unification Example (I)
    public class XHierarchy {
        public interface Top1 { Object m(Object x, Object y); }
        public interface Top2 { Object m(@NotNull Object x, Object y); }
        public interface Middle extends Top1, Top2 { Object m(@Nullable Object x, @Nullable Object y); }
        public interface Leaf1 extends Middle { Object m(Object x, Object y); }
        public interface Leaf2 extends Middle { Object m(Object x, @Nullable Object y); }
    }

Propagation: Parameter Unification Example (II)
    public class XHierarchy {
        public interface Top1 { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Top2 { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Middle extends Top1, Top2 { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Leaf1 extends Middle { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Leaf2 extends Middle { Object m(@NotNull Object x, @Nullable Object y); }
    }

Propagation: Overriding Rules
• Input:
  • graph: method hierarchy graph
  • overrides: annotations specifying overriding rules
  • ann: annotations (to be updated)
• For each method annotation at position opos in overrides
  • method := method corresponding to annotation position opos
  • Perform a breadth-first traversal of the method hierarchy graph starting from method (moving from parent to children) and for each traversed method m
    • pos := position corresponding to opos in method m
    • ann[pos] := overrides[opos]

Propagation: Overriding Rule Example (I)
Rule: Top1.m(Object, Object) at 0 is NULLABLE
    public class XHierarchy {
        public interface Top1 { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Top2 { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Middle extends Top1, Top2 { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Leaf1 extends Middle { Object m(@NotNull Object x, @Nullable Object y); }
        public interface Leaf2 extends Middle { Object m(@NotNull Object x, @Nullable Object y); }
    }

Propagation: Overriding Rule Example (II)
Rule: Top1.m(Object, Object) at 0 is NULLABLE
    public class XHierarchy {
        public interface Top1 { Object m(@Nullable Object x, @Nullable Object y); }
        public interface Top2 { Object m(@Nullable Object x, @Nullable Object y); }
        public interface Middle extends Top1, Top2 { Object m(@Nullable Object x, @Nullable Object y); }
        public interface Leaf1 extends Middle { Object m(@Nullable Object x, @Nullable Object y); }
        public interface Leaf2 extends Middle { Object m(@Nullable Object x, @Nullable Object y); }
    }

Conflict Processing
• Input:
  • existingAnn: predefined annotations
  • inferredAnn: inferred annotations
  • lat: annotation lattice
  • excPositions: set of excluded annotation positions
• Output:
  • conflicts: list of triples (position, existing annotation, inferred annotation)
• conflicts := empty list
• positions := set of all positions in existingAnn
• For each annotation position pos in positions
  • inferred := inferredAnn[pos], existing := existingAnn[pos]
  • p := declaration position corresponding to pos
  • If (existing does not subsume inferred at p) then
    • If (pos in excPositions) then inferredAnn[pos] := existing
    • else add (pos, existing, inferred) to conflicts
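The conflict check can be sketched as below, as an illustration only. Positions are plain strings, the variance of a position is supplied through a hypothetical covariantPositions set (everything else is treated as contravariant), and positions without an inferred annotation are skipped, which the slide does not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ConflictProcessing {
    // Declaration order encodes the lattice order: NOT_NULL < NULLABLE.
    enum Nullability { NOT_NULL, NULLABLE }

    /** Triple (position, existing annotation, inferred annotation). */
    static final class Conflict {
        final String position;
        final Nullability existing;
        final Nullability inferred;

        Conflict(String position, Nullability existing, Nullability inferred) {
            this.position = position;
            this.existing = existing;
            this.inferred = inferred;
        }
    }

    /** unify(parent, child, p) == parent, with LUB for covariant and GLB otherwise. */
    static boolean subsumes(Nullability parent, Nullability child, boolean covariant) {
        return covariant ? parent.compareTo(child) >= 0 : parent.compareTo(child) <= 0;
    }

    static List<Conflict> process(Map<String, Nullability> existingAnn,
                                  Map<String, Nullability> inferredAnn,
                                  Set<String> covariantPositions,
                                  Set<String> excPositions) {
        List<Conflict> conflicts = new ArrayList<>();
        for (Map.Entry<String, Nullability> e : existingAnn.entrySet()) {
            String pos = e.getKey();
            Nullability existing = e.getValue();
            Nullability inferred = inferredAnn.get(pos);
            if (inferred == null) {
                continue; // assumption: nothing inferred means nothing to compare
            }
            if (!subsumes(existing, inferred, covariantPositions.contains(pos))) {
                if (excPositions.contains(pos)) {
                    inferredAnn.put(pos, existing); // excluded position: existing wins silently
                } else {
                    conflicts.add(new Conflict(pos, existing, inferred));
                }
            }
        }
        return conflicts;
    }
}
```

An existing NOT_NULL return type with an inferred NULLABLE one is reported as a conflict, while the same mismatch at an excluded position is resolved in favor of the existing annotation.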

Conflict Processing: Example
    public class XHierarchy {
        public interface Top1 { Object m(Object x, Object y); }
        public interface Top2 { Object m(@Nullable @NotNull Object x, Object y); }
        public interface Middle extends Top1, Top2 { @NotNull @Nullable Object m(@Nullable Object x, @Nullable Object y); }
        public interface Leaf1 extends Middle { Object m(Object x, Object y); }
        public interface Leaf2 extends Middle { Object m(Object x, @Nullable Object y); }
    }

Control-Flow Graph
• The method control-flow graph describes transitions between bytecode instructions
• Each instruction and transition has a corresponding frame state which describes the content of local variables and the stack
• Stack values that correspond to method parameters are called interesting
• Each instruction also has an outcome value which reflects the possible terminations of outgoing control-flow paths:
  • ONLY_RETURNS
  • ONLY_THROWS
  • RETURNS_AND_THROWS
• Instruction outcomes are computed on demand

Computation of Instruction Outcomes
• The outcome of a given instruction srcInsn is computed by a depth-first traversal of all instructions reachable from srcInsn, merging the outcomes of all visited termination instructions such that
  • The outcome of any *RETURN instruction is ONLY_RETURNS
  • The outcome of an ATHROW instruction is ONLY_THROWS
• The traversal can be stopped early if a RETURNS_AND_THROWS outcome is produced
• Outcomes are merged according to the rules:
  • a + a = a
  • a + b = RETURNS_AND_THROWS if a != b
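A small sketch of the outcome computation, assuming instructions are identified by integer indices and that the successor lists and the table of terminating instructions are given as maps (both representations are invented for this example). The method returns null when no terminating instruction is reachable, a case the slide does not cover.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class InstructionOutcomes {
    enum Outcome { ONLY_RETURNS, ONLY_THROWS, RETURNS_AND_THROWS }

    /** a + a = a; a + b = RETURNS_AND_THROWS if a != b. */
    static Outcome merge(Outcome a, Outcome b) {
        return a == b ? a : Outcome.RETURNS_AND_THROWS;
    }

    /**
     * Depth-first traversal from src. terminal maps *RETURN instructions to
     * ONLY_RETURNS and ATHROW instructions to ONLY_THROWS.
     */
    static Outcome outcomeOf(int src,
                             Map<Integer, List<Integer>> successors,
                             Map<Integer, Outcome> terminal) {
        Outcome result = null;
        Deque<Integer> stack = new ArrayDeque<>();
        Set<Integer> visited = new HashSet<>();
        stack.push(src);
        while (!stack.isEmpty()) {
            int insn = stack.pop();
            if (!visited.add(insn)) {
                continue; // already handled (loops in the CFG)
            }
            Outcome t = terminal.get(insn);
            if (t != null) {
                result = (result == null) ? t : merge(result, t);
                if (result == Outcome.RETURNS_AND_THROWS) {
                    return result; // early stop: nothing can change the result
                }
                continue;
            }
            for (int next : successors.getOrDefault(insn, Collections.<Integer>emptyList())) {
                stack.push(next);
            }
        }
        return result;
    }

    /** Instruction 0 branches to a return (1) and a throw (2). */
    static Outcome demo(int src) {
        Map<Integer, List<Integer>> successors = new HashMap<>();
        successors.put(0, Arrays.asList(1, 2));
        Map<Integer, Outcome> terminal = new HashMap<>();
        terminal.put(1, Outcome.ONLY_RETURNS);
        terminal.put(2, Outcome.ONLY_THROWS);
        return outcomeOf(src, successors, terminal);
    }
}
```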

Mutability Inference: Mutability Invocations
• Mutating invocations:
  • Collection.{add, remove, addAll, removeAll, retainAll, clear}
  • Set.{add, remove, addAll, removeAll, retainAll, clear}
  • List.{add, remove, set, addAll, removeAll, retainAll, clear}
  • Map.{put, remove, putAll, clear}
  • Map.Entry.setValue
  • Iterator.remove
• Mutability propagating invocations:
  • {Collection, Set, List}.iterator
  • List.listIterator
  • Map.{keySet, values, entrySet}

Mutability Inference
• Input:
  • method: method to be analyzed
  • cfg: control-flow graph of method
  • ann: predefined annotations set
• Output:
  • inferredAnn: inferred annotations set
• mutabilityMap := empty map from values to mutability annotations
• For each invocation instruction insn in cfg
  • If (insn is a mutating invocation of some method m) then
    • Mark each possible value of m’s receiver as MUTABLE
    • For each parameter param of m
      • pos := annotation position corresponding to param of m
      • If (ann[pos] is MUTABLE) then mark each possible value of param as MUTABLE
• For each value v in mutabilityMap which is a parameter of method
  • pos := annotation position corresponding to v in method
  • ann[pos] := convert mutabilityMap[v] to annotation value

Mutability Inference: Value Marking
• Input:
  • value: stack value
  • mutabilityMap: map from values to mutability annotations (to be updated)
• mutabilityMap[value] := MUTABLE
• If (value is created by a method invocation instruction insn and insn propagates mutability) then
  • m := method invoked by insn
  • Recursively mark each possible value of m’s receiver as MUTABLE

Nullability Inference: Nullability Values
• The inference process assigns a nullability to stack values:
  • UNKNOWN: not enough information to infer nullability
  • NULLABLE
  • NOT_NULL
  • NULL
  • CONFLICT: contradicting nullabilities (value is not realizable)
• Nullability merge rules:
  • a + a = a
  • a + CONFLICT = CONFLICT + a = a
  • a + NULL = NULL + a = NULLABLE
  • a + NULLABLE = NULLABLE + a = NULLABLE
  • NOT_NULL + UNKNOWN = UNKNOWN + NOT_NULL = UNKNOWN
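The merge rules can be written as a single function. The sketch below assumes that the rules are applied in the order listed, so an earlier rule wins where two overlap (for example, CONFLICT + NULL yields NULL by the CONFLICT rule rather than NULLABLE); the slide does not state this precedence explicitly. The class and method names are hypothetical.

```java
final class NullabilityMerge {
    enum Value { UNKNOWN, NULLABLE, NOT_NULL, NULL, CONFLICT }

    /** Merges two nullability values, applying the rules in their listed order. */
    static Value merge(Value a, Value b) {
        if (a == b) {
            return a;                       // a + a = a
        }
        if (a == Value.CONFLICT) {
            return b;                       // CONFLICT + a = a
        }
        if (b == Value.CONFLICT) {
            return a;                       // a + CONFLICT = a
        }
        if (a == Value.NULL || b == Value.NULL) {
            return Value.NULLABLE;          // a + NULL = NULLABLE
        }
        if (a == Value.NULLABLE || b == Value.NULLABLE) {
            return Value.NULLABLE;          // a + NULLABLE = NULLABLE
        }
        return Value.UNKNOWN;               // only NOT_NULL + UNKNOWN remains
    }
}
```

Under this reading the operation is commutative, and CONFLICT acts as the neutral element of the merge.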

Nullability Inference: Nullability Maps
• A nullability map keeps the association between stack values and nullability. In particular, a nullability map is computed for each instruction and transition in the control-flow graph of a method
• Additional structures:
  • Set of method annotation positions
  • Set of existing annotations (external or previously inferred)
  • Declaration index (used to look up fields and methods by their descriptors in bytecode)
  • Optional frame state:
    • For a transition-related map, it is the state AFTER the originating instruction
    • For an instruction-related map, it is the merged state BEFORE the instruction
    • Assuming the state is defined, a value that is present in the map but absent from its state is said to be lost
  • Set of spoiled values (i.e. values which are no longer associated with parameters due to assignment)

Nullability Inference: Nullability Map Lookups
• Stored: m.getStored(v)
  • Return the actual nullability previously stored in the map, or UNKNOWN if nullability is not defined
• Full: m[v]
  • If (v is lost) then return CONFLICT
  • If (some nullability x was previously stored for value v) then return x
  • If (v is created by some instruction insn) then
    • If (insn is NEW, NEWARRAY, ANEWARRAY, MULTIANEWARRAY, or LDC) then return NOT_NULL
    • If (insn is ACONST_NULL) then return NULL
    • If (insn is AALOAD) then return UNKNOWN
    • If (insn is GETFIELD or GETSTATIC) then return the nullability corresponding to the field annotation (or UNKNOWN if undefined)
    • If (insn is INVOKE***) then return the nullability corresponding to the return value of the invoked method (or UNKNOWN if undefined)
  • If (v is interesting) then return the nullability corresponding to the existing annotation at the position encoded by v
  • If (v is null) then return NULL
  • If (v is primitive) then return CONFLICT
  • Otherwise return NOT_NULL

Nullability Inference: Merging Nullability Maps
• Input:
  • srcMaps: set of nullability maps
• Output:
  • mergedMap: merged nullability map
  • mergedValues: set of values which have different nullability in at least two maps in srcMaps
• mergedMap := new empty nullability map
• mergedValues := new empty set
• affectedValues := set of all stack values in srcMaps key sets
• For each map m in srcMaps
  • Add all values from m.spoiledValues to mergedMap.spoiledValues
  • For each value v in affectedValues
    • If (v is already in mergedMap keys) then
      • If (mergedMap[v] != m[v]) then add v to mergedValues
      • mergedMap[v] := merge m[v] with mergedMap[v]
    • else mergedMap[v] := m[v]
    • If (v is lost in m and m.getStored(v) != NOT_NULL) then
      • Add v to mergedMap.spoiledValues

Nullability Inference: Infer from Field Value
• Input:
  • field: field description
• Output:
  • ann: nullability annotation
• If (field is final and field type is not primitive and field initial value is not null) then
  • ann := NOT_NULL
• else
  • ann := UNKNOWN

Nullability Inference: Infer from Method
• Input:
  • method: method to be analyzed
  • cfg: control-flow graph of method
  • ann: predefined annotations set
• Output:
  • inferredAnn: inferred annotations set
• ovrMap := new empty nullability map
• mergedMap := new empty nullability map
• returnValueInfo := UNKNOWN
• fieldInfo := new empty map from fields to nullability values
• For each instruction insn in cfg
  • insnMap := compute nullability map for insn
  • If (insn is *RETURN) then
    • Process return instruction (insn, mergedMap, returnValueInfo)
  • If (insn is PUTFIELD or PUTSTATIC) then
    • Process field write (insn, fieldInfo)
• inferredAnn := create annotations (ovrMap, mergedMap, returnValueInfo, fieldInfo)

Nullability Inference: Process Returns
• Input:
  • insn: instruction
  • mergedMap: nullability map (to be updated)
  • returnValueInfo: return value nullability (to be updated)
• Merge insnMap into mergedMap
• If (insn is ARETURN) then
  • For each possible return value v
    • retValue := if ovrMap contains v then ovrMap[v] else insnMap[v]
    • Merge retValue into returnValueInfo

Nullability Inference: Process Field Write
• Input:
  • insn: instruction
  • fieldInfo: map from fields to nullability values (to be updated)
• field := field mutated by insn
• If (field has reference type and is final) then
  • nullability := merge of all possible nullabilities of the new field value in insn
  • If (fieldInfo contains key field) then
    • fieldInfo[field] := fieldInfo[field] merge nullability
  • else
    • fieldInfo[field] := nullability

Nullability Inference: Create Annotations
• Input:
  • ovrMap: override nullability map
  • mergedMap: merged instruction nullability map
  • returnValueInfo: return value nullability
  • fieldInfo: map from fields to nullability values
• Output:
  • ann: annotations set
• ann := new empty annotations set
• ann[return type position] := convert returnValueInfo to annotation
• For each interesting value v in mergedMap.keySet
  • pos := annotation position corresponding to v
  • If (v in ovrMap.keySet) then
    • nullability := ovrMap.getStored(v)
  • else if (v in mergedMap.spoiledValues) then
    • nullability := NULLABLE
  • else nullability := mergedMap.getStored(v)
  • ann[pos] := convert nullability to annotation

Compute Instruction Nullability Map
• Input:
  • ann: existing annotations set
  • insn: instruction
  • cfg: control-flow graph
  • ovrMap: overriding nullability map (to be updated)
• Output:
  • insnMap: instruction nullability map
• insnMap, mergedValues := merge maps from incoming edges of insn
• inheritedValues := insnMap.keySet \ mergedValues
• Process dereferencing (ann, insn, insnMap, cfg, ovrMap)
• If (insn is a null check) then
  • Process null-branching (insn, insnMap, cfg, ovrMap)
• else if (insn is an equality check preceded by instanceof) then
  • Process instanceof-branching (insn, insnMap, cfg, ovrMap)
• else for each outgoing transition e of insn
  • e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state

Process Dereferencing Instruction
• Input:
  • ann: existing set of annotations
  • insn: instruction
  • insnMap: instruction nullability map (to be updated)
  • cfg: control-flow graph
  • ovrMap: overriding nullability map (to be updated)
• If (insn is an invocation of some method m) then
  • Mark each possible value of m’s receiver as NOT_NULL
  • For each parameter param of m
    • pos := annotation position corresponding to param of m
    • If (ann[pos] is NOT_NULL) then mark each possible value of param as NOT_NULL
• If (insn is GETFIELD, ARRAYLENGTH, ATHROW, MONITORENTER, MONITOREXIT, *ALOAD, *ASTORE, or PUTFIELD) then
  • Mark each possible value of insn receiver as NOT_NULL

Process NULL-Branching Instruction
• Input:
  • insn: instruction
  • insnMap: instruction nullability map (to be updated)
  • cfg: control-flow graph
  • ovrMap: overriding nullability map (to be updated)
• For the nullable transition e
  • e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state and nullability of condition subjects replaced according to the rule:
    • If CONFLICT or NOT_NULL then CONFLICT, otherwise NULL
• For the non-nullable transition e
  • Similar to the above, but the replacement rule is:
    • If CONFLICT or NULL then CONFLICT, otherwise NOT_NULL
• For each remaining transition e
  • e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state
• If (outcome of the nullable transition target is ONLY_THROWS) then
  • For each possible value v of the condition subject, ovrMap[v] := NOT_NULL
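The two replacement rules for a null check are small enough to state directly in code. The sketch below covers only the per-value refinement on each branch; copying maps, frame states, and the ovrMap update are left out, and the class and method names are made up for the example.

```java
final class NullBranchRefinement {
    enum Value { UNKNOWN, NULLABLE, NOT_NULL, NULL, CONFLICT }

    /** Branch taken when the checked value is null. */
    static Value onNullBranch(Value v) {
        // A value already known to be non-null cannot reach this branch.
        return (v == Value.CONFLICT || v == Value.NOT_NULL) ? Value.CONFLICT : Value.NULL;
    }

    /** Branch taken when the checked value is not null. */
    static Value onNonNullBranch(Value v) {
        // A value already known to be null cannot reach this branch.
        return (v == Value.CONFLICT || v == Value.NULL) ? Value.CONFLICT : Value.NOT_NULL;
    }
}
```

CONFLICT here marks a value that cannot occur on the given branch, matching its description as "not realizable" in the list of nullability values.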

Process INSTANCEOF-Branching Instruction
• Input:
  • insn: instruction (IFEQ/IFNE) preceded by INSTANCEOF
  • insnMap: instruction nullability map (to be updated)
  • cfg: control-flow graph
  • ovrMap: overriding nullability map (to be updated)
• For the instance-of (non-nullable) transition e
  • e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state and nullability of condition subjects replaced according to the rule:
    • If CONFLICT or NULL then CONFLICT, otherwise NOT_NULL
• For the not-instance-of (nullable) transition e
  • Similar to the above, but the replacement rule is:
    • If UNKNOWN then NULLABLE, otherwise do not change
• For each remaining transition e
  • e.nullabilityMap := copy of insnMap with state replaced with e’s own frame state

Nullability Value Marking
• Input:
  • value: stack value
  • inheritedValues: set of inherited values
  • insnMap: instruction nullability map (to be updated)
  • ovrMap: overriding nullability map (to be updated)
• If (insnMap.getStored(value) is neither CONFLICT nor NULL) then
  • insnMap[value] := NOT_NULL
• If (value is interesting and inheritedValues is empty and value is not in insnMap.spoiledValues) then
  • ovrMap[value] := NOT_NULL
