Hypernym detection and discovery are fundamental tasks in natural language processing. The former task aims to determine whether a given pair of terms holds a hypernymy relation, whereas the latter attempts to identify all possible hypernyms of a given hyponym term. Existing hypernym detection and discovery methods generally struggle to capture the hierarchical structure of hypernymy relations and to handle the polysemy of terms.
To address these problems, a research team led by Richong Zhang published their new research on 15 Apr 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed a Multi-Projection Recurrent model (MPR) for hypernym detection and discovery. MPR leverages a multi-projection mechanism to address the polysemy phenomenon and a multi-hop recurrent structure to model hierarchical hypernymy relations. By applying an attention-based aggregation module in both the multi-projection mapping block and the hierarchy-aware recurrent block, MPR integrates the output representations and predicts hypernymy relationships with a unified scoring function.
Specifically, the mapping block consists of a multi-projection unit with multiple projection functions, followed by an aggregation module that uses an attention mechanism to fuse the outputs of the different projections into one integrated representation. The attention weights act as a filter, selecting the most relevant hypernym terms and aggregating representations at the same semantic level. Because each projection has separate parameters, the multi-projection unit can extract meanings from various aspects of a term. This integrated representation allows the model to handle multiple hypernyms at the same semantic level, as arise from the polysemy phenomenon; a minimal sketch of such a block follows.
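The PyTorch sketch below illustrates one plausible reading of the mapping block: several independently parameterized linear projections of a term embedding, fused by a learned attention scorer. The class and parameter names (MultiProjectionBlock, num_projections, attn_scorer) and the use of plain linear layers are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class MultiProjectionBlock(nn.Module):
    """Illustrative mapping block: multiple projections + attention fusion.

    A minimal sketch, assuming linear projection functions and a
    single-layer attention scorer; not the paper's exact architecture.
    """

    def __init__(self, dim: int, num_projections: int = 4):
        super().__init__()
        # Each projection has separate parameters, so each can capture a
        # different sense (aspect) of the input term.
        self.projections = nn.ModuleList(
            nn.Linear(dim, dim) for _ in range(num_projections)
        )
        # Scores each projected representation; softmax over projections
        # yields attention weights that act as a sense filter.
        self.attn_scorer = nn.Linear(dim, 1)

    def forward(self, term_emb: torch.Tensor) -> torch.Tensor:
        # term_emb: (batch, dim) -> candidates: (batch, num_projections, dim)
        candidates = torch.stack([p(term_emb) for p in self.projections], dim=1)
        weights = torch.softmax(self.attn_scorer(candidates), dim=1)
        # Attention-weighted aggregation into one integrated representation.
        return (weights * candidates).sum(dim=1)  # (batch, dim)
```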
The basic idea of the hierarchy-aware recurrent module is that the hypernym at the highest semantic level can be viewed as the result of multiple hops of transformation from the hyponym term. The recurrent module executes the mapping block multiple times, so the multi-hop transformation moves from the hyponym term representation up to the hypernym representation at the highest semantic level. A cross-semantic-level aggregation module then integrates the output hidden representations into a unified representation. Since the hidden representation at each hop of the recurrent structure can be viewed as a latent hypernym at a specific semantic level, the recurrent module can model the hierarchical structure of hypernymy relations; a sketch of this module follows.
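Building on the MultiProjectionBlock sketch above, the recurrent module might reapply the same mapping block for a fixed number of hops and attend over the per-hop hidden states. The hop count, the level_scorer, and the cosine-similarity scoring at the end are hypothetical stand-ins for the paper's unified scoring function, shown only to make the data flow concrete.

```python
class HierarchyAwareRecurrent(nn.Module):
    """Illustrative recurrent module: multi-hop mapping + cross-level fusion.

    A minimal sketch under the same assumptions as MultiProjectionBlock;
    each hop's hidden state stands in for a latent hypernym at one
    semantic level.
    """

    def __init__(self, dim: int, num_hops: int = 3, num_projections: int = 4):
        super().__init__()
        # The same mapping block (shared parameters) is executed at every hop.
        self.mapping = MultiProjectionBlock(dim, num_projections)
        self.num_hops = num_hops
        self.level_scorer = nn.Linear(dim, 1)

    def forward(self, hyponym_emb: torch.Tensor) -> torch.Tensor:
        hidden, levels = hyponym_emb, []
        for _ in range(self.num_hops):
            # Each hop maps the current representation one semantic level up.
            hidden = self.mapping(hidden)
            levels.append(hidden)
        stacked = torch.stack(levels, dim=1)  # (batch, num_hops, dim)
        # Cross-semantic-level aggregation into a unified representation.
        weights = torch.softmax(self.level_scorer(stacked), dim=1)
        return (weights * stacked).sum(dim=1)


# Hypothetical usage: score a candidate hypernym by the similarity between
# the unified representation and the candidate's embedding (a stand-in for
# the paper's unified scoring function).
model = HierarchyAwareRecurrent(dim=128)
hyponym = torch.randn(2, 128)
candidate = torch.randn(2, 128)
score = torch.cosine_similarity(model(hyponym), candidate, dim=-1)
```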
Experiments on 11 benchmarks demonstrate that MPR is effective on both hypernym detection and discovery tasks.
Future work can focus on building a lightweight hypernym discovery module, exploring the combination of large language models and hypernym discovery models, and promoting the development of downstream applications with hypernyms.
DOI: 10.1007/s11704-024-3638-7