Semi-Supervised Entity Alignment via Knowledge Graph Embedding with Awareness of Degree Difference Shichao Pei, Lu Yu, Robert Hoehndorf, Xiangliang Zhang KAUST Group Meeting
Knowledge Graph • There are several descriptions: Ehrlinger, Lisa, and Wolfram Wöß. "Towards a Definition of Knowledge Graphs." SEMANTiCS (Posters, Demos, SuCCESS). 2016.
Knowledge Graph • A knowledge graph is a knowledge-based system that contains a knowledge base and a reasoning engine. • Essential characteristic: the collection, extraction, and integration of information from external sources extends a pure knowledge-based system with the concept of integration systems. Ehrlinger, Lisa, and Wolfram Wöß. "Towards a Definition of Knowledge Graphs." SEMANTiCS (Posters, Demos, SuCCESS). 2016.
Knowledge Graph • Interesting Definition: • A KG is usually represented as triple facts of the form (head entity, relation, tail entity). • A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge. Ehrlinger, Lisa, and Wolfram Wöß. "Towards a Definition of Knowledge Graphs." SEMANTiCS (Posters, Demos, SuCCESS). 2016.
Multi-Relational data • Directed graphs whose nodes correspond to entities and whose edges have the form (head, relation, tail). • Each edge indicates that a relationship exists between the entities head and tail. • The modeling process boils down to extracting local or global connectivity patterns between entities. • Single-relational: ad-hoc but simple modeling assumptions can be made after some descriptive analysis of the data. • Multi-relational: the notion of locality may involve relationships and entities of different types at the same time.
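As a rough illustration of the triple form described above, the sketch below stores a multi-relational graph as a list of (head, relation, tail) tuples and groups outgoing edges by head entity. The entity and relation names are made-up toy data, and `build_adjacency` is a hypothetical helper, not something from the slides.

```python
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail); not taken from the slides.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Paris", "located_in", "France"),
]


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return dict(adjacency)


adjacency = build_adjacency(triples)
# The set of relation types is what makes the graph multi-relational.
relations = {relation for _, relation, _ in triples}
```

Here "Paris" is linked to "France" by two different relation types, which is the situation a single-relational model cannot express directly.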
KG Embedding • To embed the components of a KG, including entities and relations, into continuous vector spaces. • To simplify manipulation while preserving the inherent structure of the KG. • These entity and relation embeddings can then be used to benefit many tasks, such as KG completion, relation extraction, entity classification, and entity resolution.
KG Embedding • Many works focus on knowledge representation learning. • TransE [Bordes et al., 2013] projects both entities and relations into a continuous low-dimensional vector space. • TransE assumes that in the vector space we have h + r ≃ t, which is simple and effective.
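The assumption h + r ≃ t can be turned into a score: the distance between h + r and t, which should be small for true triples. The sketch below is a minimal pure-Python version using the L2 norm and the margin-based ranking loss from the TransE paper; the two-dimensional vectors and the margin value are illustrative assumptions, not values from the slides.

```python
import math


def transe_score(h, r, t):
    """L2 distance ||h + r - t||; lower means the triple is more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(positive, negative, margin=1.0):
    """Hinge loss pushing a true triple to score lower than a corrupted one."""
    return max(0.0, margin + transe_score(*positive) - transe_score(*negative))


# Toy embeddings (assumed for illustration): t is close to h + r.
h = [0.1, 0.2]
r = [0.3, 0.1]
t = [0.4, 0.3]
t_corrupt = [-0.5, 0.9]  # a corrupted tail far from h + r

good = transe_score(h, r, t)
bad = transe_score(h, r, t_corrupt)
loss = margin_loss((h, r, t), (h, r, t_corrupt))
```

With these toy vectors the true triple scores near zero and the corrupted one scores above the margin, so the hinge loss vanishes.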
Translation-based model • Hierarchical relationships are extremely common in KBs, and translations are the natural transformations for representing them.
KG Embedding Wang, Quan, et al. "Knowledge graph embedding: A survey of approaches and applications." IEEE Transactions on Knowledge and Data Engineering 29.12 (2017): 2724-2743.
Entity Alignment - Motivation • Various methods, sources, and languages have been used to construct KGs, and most existing KGs are developed separately. • These KGs are inevitably heterogeneous in surface forms and typically complementary in content. • It is thus essential to align entities across multiple KGs and join them into a unified KG for knowledge-driven applications.
Entity Alignment Chen, Muhao, et al. "Co-training Embeddings of Knowledge Graphs and Entity Descriptions for Cross-lingual Entity Alignment." IJCAI 2018.
Entity Alignment Zhu, Hao, et al. "Iterative entity alignment via joint knowledge embeddings." Proceedings of the 26th International Joint Conference on Artificial Intelligence. AAAI Press, 2017.
Problem Definition • Knowledge in a knowledge graph is described as triples (h, r, t), in which h and t denote the head and tail entities and r denotes the relation between them. • A knowledge graph is formalized as KG = (E, R, T), where E, R, and T are the sets of entities, relations, and triples respectively. • Suppose there are multiple knowledge graphs Σ = {KGi | KGi = (Ei, Ri, Ti)} with heterogeneous and complementary triples. An entity in one KG has counterparts in other KGs, in different languages or under different surface names.
Problem Definition • Some synonymous entities among the KGs are already known; these are defined as alignment seeds. • Each pair of entities in the alignment seeds is also called a pair of aligned entities. • The task of entity alignment is to automatically find and align more synonymous entities based on the known alignment seeds.
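To make the task concrete, the sketch below extends a set of alignment seeds by greedy nearest-neighbour search. It assumes that entities of both KGs have already been embedded in one shared vector space, which is an assumption of this example rather than the method of the slides; the embeddings, entity names, and the `align` helper are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def align(emb1, emb2, seeds):
    """Greedily match each unseeded KG1 entity to its nearest unused KG2 entity."""
    aligned = dict(seeds)
    used = set(seeds.values())
    for e1, v1 in emb1.items():
        if e1 in aligned:
            continue
        candidates = [e2 for e2 in emb2 if e2 not in used]
        if not candidates:
            break
        best = min(candidates, key=lambda e2: euclidean(v1, emb2[e2]))
        aligned[e1] = best
        used.add(best)
    return aligned


# Toy embeddings for two KGs in a shared space (assumed for illustration).
emb_en = {"Paris": [0.9, 0.1], "Berlin": [0.1, 0.9], "Rome": [0.5, 0.5]}
emb_fr = {"Paris_fr": [0.88, 0.12], "Berlin_fr": [0.12, 0.88], "Rome_fr": [0.52, 0.49]}
seeds = {"Paris": "Paris_fr"}  # the known alignment seed

result = align(emb_en, emb_fr, seeds)
```

Greedy matching is only one possible decision rule; it is used here because it keeps the example short.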
Related Works • Feature Engineering. • the semantics of OWL properties [Hu et al., 2011] • compatible neighbors and attribute values of entities [Suchanek et al., 2012] • structural information of relations [Lacoste-Julien et al., 2013] • use of external lexicons, machine translation, Wikipedia links [Suchanek et al., 2012; Wang et al., 2013] • crowdsourcing [Vrandečić and Krötzsch, 2014] • well-designed hand-crafted features [Mahdisoltani et al., 2014] • These works can achieve high alignment accuracy, but the human-involved approach is time-consuming and labor-intensive, and is usually hard to extend.
Related Works • Embedding-based models: • MTransE [Chen et al., 2017] uses TransE to represent different KGs as independent embeddings, and learns transformations between KGs via five alignment models. • IPTransE [Zhu et al., 2017] employs PTransE to embed a single KG and integrates three modules (translation-based, linear transformation and parameter sharing) for jointly embedding different KGs. • JAPE [Sun et al., 2017] learns embeddings for entities and relations of different KGs in a unified embedding space. It also embeds attributes and leverages attribute correlations to refine entity embeddings. • KDCoE [Chen et al., 2018] leverages a weakly aligned multilingual KG for semi-supervised cross-lingual learning using entity descriptions. • BootEA [Sun et al., 2018] iteratively enlarges the set of labeled entity pairs using a bootstrapping strategy.
Introduction • Knowledge graphs have been constructed and widely applied to organize and represent the knowledge of different domains. • Even in the same domain, knowledge graphs are generated by different methods in different languages. • It is thus essential to connect multiple knowledge graphs in the same domain.
Introduction • The number of available prior alignments is usually a small proportion of the entities in a knowledge graph. • Most existing methods require a sufficient number of labeled entities to generalize well in downstream applications. • Our work aims to design a semi-supervised entity alignment model, which learns from both labeled and unlabeled entities.
Introduction • Knowledge graph embedding methods show significant improvement on entity alignment. • Our work also takes advantage of embedding methods to build a semi-supervised entity alignment model. • We address an important issue in the embedding process, which is caused by the degree difference of entities in different knowledge graphs.
Introduction • Issue in the embedding process • Blue: Popular Entity • Orange: Normal Entity • Yellow: Rare Entity
Analysis • Low and high degree values (in blue and red) • Normal degree values (in green) • The influence of an entity's degree is less severe after applying our model.
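The degree of an entity can be read directly off the triples. The sketch below counts how many triples each entity takes part in and assigns it to a rare, normal, or popular bucket; the thresholds `low` and `high` and the toy triples are assumptions chosen for illustration, since the slides do not state how the three groups are delimited.

```python
from collections import Counter


def entity_degrees(triples):
    """Count how many triples each entity appears in, as head or tail."""
    degrees = Counter()
    for head, _, tail in triples:
        degrees[head] += 1
        degrees[tail] += 1
    return degrees


def degree_bucket(degree, low=1, high=3):
    """Bucket a degree; the low/high thresholds are hypothetical."""
    if degree <= low:
        return "rare"
    if degree >= high:
        return "popular"
    return "normal"


# Toy graph (assumed): A is a hub, D appears only once.
toy_triples = [
    ("A", "r", "B"),
    ("A", "r", "C"),
    ("A", "r", "D"),
    ("B", "r", "C"),
]

degrees = entity_degrees(toy_triples)
buckets = {entity: degree_bucket(d) for entity, d in degrees.items()}
```

Comparing such buckets for the same real-world entity across two KGs is one simple way to see the degree difference the slides refer to.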
Degree-Aware KG embedding • Based on TransE
Degree-Aware KG embedding • Design the degree-aware KGE model by training the knowledge graph embedding in an adversarial framework.
Semi-Supervised Entity Alignment • Semi-supervised loss: • Inspired by the work on CycleGAN in computer vision. • Define a cycle-consistency loss.
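The idea borrowed from CycleGAN is that mapping an embedding from one space to the other and back should return the original point, which gives a training signal even for unlabeled entities. The sketch below is one possible reading, not the formulation of the slides: it assumes linear mappings given as plain matrices and an L1 penalty (as in CycleGAN), and all names and values are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def cycle_consistency_loss(entities, forward, backward):
    """Mean L1 distance between each embedding and its round-trip image."""
    total = 0.0
    for e in entities:
        round_trip = matvec(backward, matvec(forward, e))
        total += sum(abs(a - b) for a, b in zip(e, round_trip))
    return total / len(entities)


# Assumed toy mappings: forward swaps the two coordinates.
forward = [[0.0, 1.0], [1.0, 0.0]]
backward_good = [[0.0, 1.0], [1.0, 0.0]]  # exact inverse of forward
backward_bad = [[1.0, 0.0], [0.0, 1.0]]   # identity, not an inverse

unlabeled = [[1.0, 2.0], [3.0, 5.0]]  # embeddings of unaligned entities

loss_good = cycle_consistency_loss(unlabeled, forward, backward_good)
loss_bad = cycle_consistency_loss(unlabeled, forward, backward_bad)
```

No alignment labels appear in the loss, which is why a term of this kind can use the unaligned entities.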
Experiments • Dataset
Contributions • We propose to solve entity alignment in a semi-supervised way, using not only the given aligned entities but also the unaligned entities to enhance performance. • We investigate the impact of entities' degree difference on knowledge graph embedding, and address the problem within an adversarial training framework.