---
library_name: hierarchy-transformers
pipeline_tag: feature-extraction
tags:
- hierarchy-transformers
- feature-extraction
- hierarchy-encoding
- subsumption-relationships
- transformers
license: apache-2.0
language:
- en
metrics:
- precision
- recall
- f1
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
# Hierarchy-Transformers/HiT-MiniLM-L6-WordNetNoun
A **Hi**erarchy **T**ransformer Encoder (HiT) model that explicitly encodes entities according to their hierarchical relationships.
### Model Description
<!-- Provide a longer summary of what this model is. -->
HiT-MiniLM-L6-WordNetNoun is a HiT model trained on WordNet's subsumption (hypernym) hierarchy of noun entities.
- **Developed by:** [Yuan He](https://www.yuanhe.wiki/), Zhangdie Yuan, Jiaoyan Chen, and Ian Horrocks
- **Model type:** Hierarchy Transformer Encoder (HiT)
- **License:** Apache license 2.0
- **Hierarchy**: WordNet's subsumption (hypernym) hierarchy of noun entities.
- **Training Dataset**: [Hierarchy-Transformers/WordNetNoun](https://huggingface.co/datasets/Hierarchy-Transformers/WordNetNoun)
- **Pre-trained model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- **Training Objectives**: Jointly optimised on *Hyperbolic Clustering* and *Hyperbolic Centripetal* losses (see definitions in the [paper](https://arxiv.org/abs/2401.11374); a rough sketch follows below)
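Roughly, writing $d$ for the hyperbolic distance, $e_c, e_p, e_n$ for child, parent, and negative entity embeddings, and $\|e\| = d(e, \mathbf{0})$ for the hyperbolic norm, the two objectives are hinge losses with margins $\alpha, \beta$ (this is a sketch; see the paper for the exact formulation):

$$
\mathcal{L}_{cluster} = \sum_{(c,p,n)} \max\big(0,\, d(e_c, e_p) - d(e_c, e_n) + \alpha\big),
\qquad
\mathcal{L}_{centri} = \sum_{(c,p)} \max\big(0,\, \|e_p\| - \|e_c\| + \beta\big)
$$

The clustering loss pulls child entities closer to their parents than to negative entities, while the centripetal loss pushes parents closer to the manifold origin than their children.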
### Model Versions
| **Version** | **Model Revision** | **Note** |
|------------|---------|----------|
|v1.0 (Random Negatives)| `main` or `v1-random-negatives`| The variant trained on random negatives, as detailed in the [paper](https://arxiv.org/abs/2401.11374).|
|v1.0 (Hard Negatives)| `v1-hard-negatives` | The variant trained on hard negatives, as detailed in the [paper](https://arxiv.org/abs/2401.11374). |
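To load a particular variant, pin the corresponding model revision. The snippet below assumes that `from_pretrained` forwards a `revision` argument to the Hugging Face Hub (as the underlying `sentence-transformers` loader does); check the library documentation if your version differs:

```python
from hierarchy_transformers import HierarchyTransformer

# Assumption: `revision` is passed through to the Hub download.
model = HierarchyTransformer.from_pretrained(
    'Hierarchy-Transformers/HiT-MiniLM-L6-WordNetNoun',
    revision='v1-hard-negatives',  # or 'main' / 'v1-random-negatives'
)
```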
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** https://github.com/KRR-Oxford/HierarchyTransformers
- **Paper:** [Language Models as Hierarchy Encoders](https://arxiv.org/abs/2401.11374)
## Usage
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
HiT models are used to encode entities (presented as text) and predict their hierarchical relationships in hyperbolic space.
### Get Started
Install `hierarchy_transformers` via `pip` or from source (see our [repository](https://github.com/KRR-Oxford/HierarchyTransformers)), then use the code below to get started with the model.
```python
from hierarchy_transformers import HierarchyTransformer
# load the model
model = HierarchyTransformer.from_pretrained('Hierarchy-Transformers/HiT-MiniLM-L6-WordNetNoun')
# entity names to be encoded.
entity_names = ["computer", "personal computer", "fruit", "berry"]
# get the entity embeddings
entity_embeddings = model.encode(entity_names)
```
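`model.encode` returns one embedding per input name; for this MiniLM-L6 backbone each is a 384-dimensional point on the model's hyperbolic manifold (a Poincaré ball in the paper), which the library exposes as `model.manifold` and which the probing example below relies on. A quick sanity check:

```python
print(entity_embeddings.shape)  # (4, 384): one 384-dim embedding per entity name
print(model.manifold)           # the hyperbolic (Poincaré ball) manifold
```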
### Default Probing for Subsumption Prediction
Use the entity embeddings to predict the subsumption relationships between them.
```python
# suppose we want to compare "personal computer" and "computer", "berry" and "fruit"
child_entity_embeddings = model.encode(["personal computer", "berry"], convert_to_tensor=True)
parent_entity_embeddings = model.encode(["computer", "fruit"], convert_to_tensor=True)

# compute the hyperbolic distances between child-parent pairs,
# and the hyperbolic norms (distances to the manifold origin) of the embeddings
dists = model.manifold.dist(child_entity_embeddings, parent_entity_embeddings)
child_norms = model.manifold.dist0(child_entity_embeddings)
parent_norms = model.manifold.dist0(parent_entity_embeddings)

# empirical scoring function for subsumption prediction proposed in the paper;
# `centri_score_weight` (like the decision threshold below) must be tuned on a validation set
centri_score_weight = 1.0  # placeholder value, not a tuned one
subsumption_scores = - (dists + centri_score_weight * (parent_norms - child_norms))
```
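Higher scores indicate a higher chance that the first entity is subsumed by the second. To turn scores into binary predictions, compare them against a threshold selected on a validation set (see the paper); the value below is purely illustrative:

```python
threshold = -2.0  # illustrative placeholder; select on a labelled validation set
predictions = subsumption_scores > threshold
pairs = [("personal computer", "computer"), ("berry", "fruit")]
for (child, parent), pred in zip(pairs, predictions):
    print(f"'{child}' subsumed by '{parent}': {bool(pred)}")
```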
### Train Your Own Models
Use the example scripts in our [repository](https://github.com/KRR-Oxford/HierarchyTransformers/tree/main/scripts) to reproduce existing models and train/evaluate your own models.
## Full Model Architecture
```
HierarchyTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
)
```
## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
*Yuan He, Zhangdie Yuan, Jiaoyan Chen, Ian Horrocks.* **Language Models as Hierarchy Encoders.** Advances in Neural Information Processing Systems 37 (NeurIPS 2024).
```
@article{he2024language,
title={Language models as hierarchy encoders},
author={He, Yuan and Yuan, Moy and Chen, Jiaoyan and Horrocks, Ian},
journal={Advances in Neural Information Processing Systems},
volume={37},
pages={14690--14711},
year={2024}
}
```
## Model Card Contact
For any queries or feedback, please contact Yuan He (`yuan.he(at)cs.ox.ac.uk`). |