DNMKG: A method for constructing domain of nonferrous metals knowledge graph based on multiple corpus
(1. Youke Publishing Co., Ltd., Beijing 100088, China;
2. School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China)
Abstract: To address the underutilization of Chinese research materials in nonferrous metals, a method for constructing a domain of nonferrous metals knowledge graph (DNMKG) was established. Starting from a domain thesaurus, entities and relationships were mapped as resource description framework (RDF) triples to form the graph’s framework. Properties and related entities were extracted from open knowledge bases, enriching the graph. A large-scale, multi-source heterogeneous corpus of over 1×10⁹ words was compiled from recent literature to further expand DNMKG. Using the knowledge graph as prior knowledge, natural language processing techniques were applied to the corpus, generating word vectors. A novel entity evaluation algorithm was used to identify and extract real domain entities, which were added to DNMKG. A prototype system was developed to visualize the knowledge graph and support human-computer interaction. Results demonstrate that DNMKG can enhance knowledge discovery and improve research efficiency in the nonferrous metals field.
Key words: knowledge graph; nonferrous metals; thesaurus; word vector model; multi-source heterogeneous corpus
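The first step described in the abstract, mapping thesaurus entries to RDF triples, can be illustrated with a minimal sketch. It is not the authors' implementation. The relation codes (BT, NT, RT), the choice of SKOS predicates, the `http://example.org/dnmkg/` namespace, the function names, and the sample records are all assumptions made for illustration; the abstract does not state which vocabulary or identifiers DNMKG uses.

```python
# Illustrative sketch only: the relation codes, SKOS predicates, namespace,
# and sample records below are assumptions, not taken from the paper.

SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumed thesaurus relation codes mapped to SKOS predicates.
RELATION_TO_PREDICATE = {
    "BT": SKOS + "broader",   # broader term
    "NT": SKOS + "narrower",  # narrower term
    "RT": SKOS + "related",   # related term
}

BASE = "http://example.org/dnmkg/"  # placeholder namespace


def term_iri(term):
    """Build an IRI for a thesaurus term under the placeholder namespace."""
    return BASE + term.strip().replace(" ", "_")


def thesaurus_to_triples(records):
    """Convert (term, relation, target) records to (subject, predicate, object) triples.

    Records with an unrecognized relation code are skipped.
    """
    triples = []
    for term, relation, target in records:
        predicate = RELATION_TO_PREDICATE.get(relation)
        if predicate is None:
            continue
        triples.append((term_iri(term), predicate, term_iri(target)))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Made-up sample records for demonstration.
records = [
    ("copper", "BT", "nonferrous metals"),
    ("copper", "RT", "copper alloy"),
    ("copper", "XX", "ignored"),  # unknown relation code, skipped
]
triples = thesaurus_to_triples(records)
print(to_ntriples(triples))
```

Keeping the relation-to-predicate mapping in a single dictionary means that a different vocabulary could be substituted without changing the conversion logic.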