
Commit 520a268

Merge pull request #350 from dice-group/develop
Develop
2 parents e39ac25 + 167b108 commit 520a268

12 files changed: 996 additions & 79 deletions

README.md

Lines changed: 29 additions & 29 deletions
Large diffs are not rendered by default.

docs/usage/01_introduction.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Ontolearn

-**Version:** ontolearn 0.6.1
+**Version:** ontolearn 0.7.0

 **GitHub repository:** [https://github.com/dice-group/Ontolearn](https://github.com/dice-group/Ontolearn)

examples/clip_notebook.ipynb

Lines changed: 234 additions & 0 deletions
@@ -0,0 +1,234 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "blond-letter",
+   "metadata": {},
+   "source": [
+    "# CLIP Notebook\n",
+    "This is a Jupyter notebook for running [CLIP](ontolearn.concept_learner.CLIP) and generating predictions. We recommend reading the [concept learners](../docs/usage/06_concept_learners.md) guide before running it.\n",
+    "If you have not already done so, run the dataset commands listed [here](https://ontolearn-docs-dice-group.netlify.app/usage/02_installation#download-external-files) from the main directory \"Ontolearn\" to download the datasets."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "japanese-ivory",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "Warning: SQLite3 version 3.40.0 and 3.41.2 have huge performance regressions; please install version 3.41.1 or 3.42!\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "import json\n",
+    "from ontolearn.knowledge_base import KnowledgeBase\n",
+    "from ontolearn.concept_learner import CLIP\n",
+    "from ontolearn.refinement_operators import ExpressRefinement\n",
+    "from ontolearn.learning_problem import PosNegLPStandard\n",
+    "from owlapy.model import OWLNamedIndividual, IRI\n",
+    "from ontolearn.utils import setup_logging\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "pending-coast",
+   "metadata": {},
+   "source": [
+    "Open `uncle_lp.json` where we have stored the learning problem for the concept of 'Uncle' and the path to the 'family' ontology."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "beginning-syntax",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "with open('uncle_lp.json') as json_file:\n",
+    "    settings = json.load(json_file)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "humanitarian-heating",
+   "metadata": {},
+   "source": [
+    "Create an instance of the class `KnowledgeBase` using the path stored in `settings`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "caroline-indiana",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "kb = KnowledgeBase(path=settings['data_path'])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "lucky-activation",
+   "metadata": {},
+   "source": [
+    "Retrieve the IRIs of the positive and negative examples of Uncle from `settings` and create an instance of `PosNegLPStandard`. (More information [here](../docs/usage/06_concept_learners.md#configure-the-learning-problem).)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "processed-patrick",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "examples = settings['Uncle']\n",
+    "p = set(examples['positive_examples'])\n",
+    "n = set(examples['negative_examples'])\n",
+    "typed_pos = set(map(OWLNamedIndividual, map(IRI.create, p)))\n",
+    "typed_neg = set(map(OWLNamedIndividual, map(IRI.create, n)))\n",
+    "lp = PosNegLPStandard(pos=typed_pos, neg=typed_neg)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "mechanical-latin",
+   "metadata": {},
+   "source": [
+    "Create a [CLIP](ontolearn.concept_learner.CLIP) model and fit it to the learning problem."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "171d1aa4-6c12-42c0-b7e9-8cf2dce85ff9",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "op = ExpressRefinement(knowledge_base=kb, use_inverse=False,\n",
+    "                       use_numeric_datatypes=False)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "binding-moderator",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      " Loaded length predictor!\n",
+      "\n",
+      " Loaded length predictor!\n",
+      "\n",
+      " Loaded length predictor!\n",
+      "\n",
+      " Loaded length predictor!\n",
+      "\n",
+      "***** Predicted length: 5 *****\n",
+      "\n",
+      "***** Predicted length: 5 *****\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<ontolearn.concept_learner.CLIP at 0x7f762ae039a0>"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "model = CLIP(knowledge_base=kb, path_of_embeddings=\"../CLIPData/family/embeddings/ConEx_entity_embeddings.csv\",\n",
+    "             refinement_operator=op, load_pretrained=True, max_runtime=200)\n",
+    "model.fit(lp)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d981f2b9-3489-494e-825d-6a72ee480d4f",
+   "metadata": {},
+   "source": [
+    "## Retrieve the top 3 hypotheses and print them."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "c6a90b21-3594-441d-bed0-eb822db5f993",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<class 'ontolearn.search.OENode'> at 0x0304774\tMale ⊓ (∀ hasParent.Grandparent)\tQuality:0.90476\tHeuristic:0.40407\tDepth:2\tH_exp:6\t|RC|:7\t|Indv.|:None\n",
+      "<class 'ontolearn.search.OENode'> at 0x0ca154a\tMale ⊓ (∀ hasChild.Grandchild)\tQuality:0.90476\tHeuristic:0.36919\tDepth:1\tH_exp:7\t|RC|:7\t|Indv.|:None\n",
+      "<class 'ontolearn.search.OENode'> at 0x2adbb89\tMale ⊓ (∀ hasChild.(¬Grandfather))\tQuality:0.88889\tHeuristic:0.39044\tDepth:3\tH_exp:6\t|RC|:0\t|Indv.|:None\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "[None, None, None]"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "hypotheses = list(model.best_hypotheses(n=3))\n",
+    "[print(_) for _ in hypotheses]"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "onto",
+   "language": "python",
+   "name": "onto"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.18"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
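
The notebook reads `uncle_lp.json` but the file itself is not part of this diff. The following is a minimal sketch of the layout the notebook's cells appear to expect, inferred only from the keys they access (`data_path`, `Uncle`, `positive_examples`, `negative_examples`). The ontology path and the IRIs are hypothetical placeholders, and the overlap check is an extra safeguard that the notebook does not perform.

```python
import json

# Hypothetical stand-in for uncle_lp.json. Only the key names come from the
# notebook; the path and the IRIs below are invented for illustration.
raw = """
{
  "data_path": "path/to/family.owl",
  "Uncle": {
    "positive_examples": ["http://example.org/family#a", "http://example.org/family#b"],
    "negative_examples": ["http://example.org/family#c"]
  }
}
"""

settings = json.loads(raw)
examples = settings["Uncle"]
p = set(examples["positive_examples"])
n = set(examples["negative_examples"])

# Extra check (not in the notebook): an individual listed as both positive
# and negative would make the learning problem contradictory.
overlap = p & n
if overlap:
    raise ValueError(f"Individuals listed as both positive and negative: {sorted(overlap)}")

print(f"{len(p)} positive and {len(n)} negative examples for 'Uncle'")
```

In the notebook, these plain strings are then wrapped with `IRI.create` and `OWLNamedIndividual` before being passed to `PosNegLPStandard`.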

examples/concept_learning_cv_evaluation.py

Lines changed: 51 additions & 13 deletions
@@ -12,7 +12,8 @@
 import time
 import pandas as pd
 from ontolearn.knowledge_base import KnowledgeBase
-from ontolearn.concept_learner import CELOE, OCEL, EvoLearner, NCES
+from ontolearn.concept_learner import CELOE, OCEL, EvoLearner, NCES, CLIP
+from ontolearn.refinement_operators import ExpressRefinement
 from ontolearn.learners import Drill, TDL
 from ontolearn.learning_problem import PosNegLPStandard
 from ontolearn.metrics import F1
@@ -32,27 +33,41 @@ def dl_concept_learning(args):
         settings = json.load(json_file)

     kb = KnowledgeBase(path=args.kb)
-    ocel = OCEL(knowledge_base=KnowledgeBase(path=args.kb), quality_func=F1(),
+    ocel = OCEL(knowledge_base=kb, quality_func=F1(),
                 max_runtime=args.max_runtime)
-    celoe = CELOE(knowledge_base=KnowledgeBase(path=args.kb), quality_func=F1(),
+    celoe = CELOE(knowledge_base=kb, quality_func=F1(),
                   max_runtime=args.max_runtime)
-    drill = Drill(knowledge_base=KnowledgeBase(path=args.kb), path_pretrained_kge=args.path_pretrained_kge,
+    drill = Drill(knowledge_base=kb, path_pretrained_kge=args.path_pretrained_kge,
                   quality_func=F1(), max_runtime=args.max_runtime)
-    tdl = TDL(knowledge_base=KnowledgeBase(path=args.kb),
+    tdl = TDL(knowledge_base=kb,
               dataframe_triples=pd.DataFrame(
                   data=sorted([(str(s), str(p), str(o)) for s, p, o in Graph().parse(args.kb)], key=lambda x: len(x)),
                   columns=['subject', 'relation', 'object'], dtype=str),
               kwargs_classifier={"random_state": 0},
               max_runtime=args.max_runtime)
     nces = NCES(knowledge_base_path=args.kb, quality_func=F1(), path_of_embeddings=args.path_of_nces_embeddings,
                 pretrained_model_name=["LSTM", "GRU", "SetTransformer"], num_predictions=5)
+
+    express_rho = ExpressRefinement(kb, use_inverse=False, use_numeric_datatypes=False)
+    clip = CLIP(knowledge_base=kb, refinement_operator=express_rho, quality_func=F1(),
+                max_num_of_concepts_tested=int(1e9), max_runtime=args.max_runtime,
+                path_of_embeddings=args.path_of_clip_embeddings,
+                pretrained_predictor_name=["LSTM", "GRU", "SetTransformer", "CNN"], load_pretrained=True)

     # dictionary to store the data
     data = dict()
-    for str_target_concept, examples in settings['problems'].items():
+    if "problems" in settings:
+        problems = settings['problems'].items()
+        positives_key = "positive_examples"
+        negatives_key = "negative_examples"
+    else:
+        problems = settings.items()
+        positives_key = "positive examples"
+        negatives_key = "negative examples"
+    for str_target_concept, examples in problems:
         print('Target concept: ', str_target_concept)
-        p = examples['positive_examples']
-        n = examples['negative_examples']
+        p = examples[positives_key]
+        n = examples[negatives_key]

         kf = StratifiedKFold(n_splits=args.folds, shuffle=True, random_state=args.random_seed)
         X = np.array(p + n)
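
The hunk above lets the script accept two learning-problem layouts: one nested under a `problems` key with underscore-separated example keys, and one flat with space-separated keys. As a rough illustration, the same branching can be isolated in a small helper. `iter_learning_problems` is a hypothetical name that does not exist in the repository, and the sample data is invented.

```python
from typing import Iterator, List, Tuple


def iter_learning_problems(settings: dict) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (target concept, positives, negatives) for either JSON layout.

    Hypothetical helper mirroring the branching in the diff; not part of Ontolearn.
    """
    if "problems" in settings:
        # Nested layout: {"problems": {name: {"positive_examples": [...], ...}}}
        problems = settings["problems"]
        positives_key, negatives_key = "positive_examples", "negative_examples"
    else:
        # Flat layout: {name: {"positive examples": [...], ...}}
        problems = settings
        positives_key, negatives_key = "positive examples", "negative examples"
    for name, examples in problems.items():
        yield name, list(examples[positives_key]), list(examples[negatives_key])


# Invented sample data covering both layouts.
nested = {"problems": {"Uncle": {"positive_examples": ["a", "b"], "negative_examples": ["c"]}}}
flat = {"Uncle": {"positive examples": ["a", "b"], "negative examples": ["c"]}}

for layout in (nested, flat):
    for name, pos, neg in iter_learning_problems(layout):
        print(name, pos, neg)
```

Resolving the keys once up front is also what allows the later sanity-check assertions in the script to use `positives_key` and `negatives_key` instead of hard-coded strings.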
@@ -67,16 +82,16 @@ def dl_concept_learning(args):
             train_neg = {neg_individual for neg_individual in X[train_index][y[train_index] == 0]}

             # Sanity checking for individuals used for training.
-            assert train_pos.issubset(examples['positive_examples'])
-            assert train_neg.issubset(examples['negative_examples'])
+            assert train_pos.issubset(examples[positives_key])
+            assert train_neg.issubset(examples[negatives_key])

             # () Extract positive and negative examples from test fold
             test_pos = {pos_individual for pos_individual in X[test_index][y[test_index] == 1]}
             test_neg = {neg_individual for neg_individual in X[test_index][y[test_index] == 0]}

             # Sanity checking for individuals used for testing.
-            assert test_pos.issubset(examples['positive_examples'])
-            assert test_neg.issubset(examples['negative_examples'])
+            assert test_pos.issubset(examples[positives_key])
+            assert test_neg.issubset(examples[negatives_key])
             train_lp = PosNegLPStandard(pos=set(map(OWLNamedIndividual, map(IRI.create, train_pos))),
                                         neg=set(map(OWLNamedIndividual, map(IRI.create, train_neg))))

@@ -217,6 +232,28 @@ def dl_concept_learning(args):
             print(f"NCES Train Quality: {train_f1_nces:.3f}", end="\t")
             print(f"NCES Test Quality: {test_f1_nces:.3f}", end="\t")
             print(f"NCES Runtime: {rt_nces:.3f}")
+
+
+            print("CLIP starts..", end="\t")
+            start_time = time.time()
+            pred_clip = clip.fit(train_lp).best_hypotheses(n=1)
+            rt_clip = time.time() - start_time
+            print("CLIP ends..", end="\t")
+            # () Quality on the training data
+            train_f1_clip = compute_f1_score(individuals={i for i in kb.individuals(pred_clip.concept)},
+                                             pos=train_lp.pos,
+                                             neg=train_lp.neg)
+            # () Quality on test data
+            test_f1_clip = compute_f1_score(individuals={i for i in kb.individuals(pred_clip.concept)},
+                                            pos=test_lp.pos,
+                                            neg=test_lp.neg)
+
+            data.setdefault("Train-F1-CLIP", []).append(train_f1_clip)
+            data.setdefault("Test-F1-CLIP", []).append(test_f1_clip)
+            data.setdefault("RT-CLIP", []).append(rt_clip)
+            print(f"CLIP Train Quality: {train_f1_clip:.3f}", end="\t")
+            print(f"CLIP Test Quality: {test_f1_clip:.3f}", end="\t")
+            print(f"CLIP Runtime: {rt_clip:.3f}")

     df = pd.DataFrame.from_dict(data)
     df.to_csv(args.report, index=False)
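
The CLIP block above scores the predicted concept by passing its retrieved individuals, together with the positive and negative examples, to `compute_f1_score`, whose definition is not shown in this diff. The sketch below illustrates one plausible way such a score could be computed from three sets. `f1_from_retrieval` is a hypothetical name, and the real `compute_f1_score` may differ in edge-case handling or rounding.

```python
def f1_from_retrieval(individuals: set, pos: set, neg: set) -> float:
    """F1 of a retrieved set against positive and negative examples.

    Hypothetical illustration only; not the library's compute_f1_score.
    """
    tp = len(pos & individuals)   # positives that the concept retrieves
    fp = len(neg & individuals)   # negatives that the concept retrieves
    fn = len(pos - individuals)   # positives that the concept misses
    if tp == 0:
        # With no true positives, precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example: two of three positives retrieved, one negative retrieved.
retrieved = {"a", "b", "c"}
score = f1_from_retrieval(retrieved, pos={"a", "b", "d"}, neg={"c", "e"})
print(f"{score:.3f}")
```

Because the retrieved set depends only on the predicted concept and not on the fold, it could be computed once and reused for both the train and test scores.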
@@ -227,12 +264,13 @@ def dl_concept_learning(args):
 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description='Description Logic Concept Learning')
     parser.add_argument("--max_runtime", type=int, default=10, help="Max runtime")
-    parser.add_argument("--lps", type=str, required=True, help="Path fto the learning problems")
+    parser.add_argument("--lps", type=str, required=True, help="Path to the learning problems")
     parser.add_argument("--folds", type=int, default=10, help="Number of folds of cross validation.")
     parser.add_argument("--kb", type=str, required=True,
                         help="Knowledge base")
     parser.add_argument("--path_pretrained_kge", type=str, default=None)
     parser.add_argument("--path_of_nces_embeddings", type=str, default=None)
+    parser.add_argument("--path_of_clip_embeddings", type=str, default=None)
     parser.add_argument("--report", type=str, default="report.csv")
     parser.add_argument("--random_seed", type=int, default=1)
     dl_concept_learning(parser.parse_args())

examples/example_knowledge_base.py

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
 print('*' * 100)

 # Direct concept hierarchy from Top to Bottom.
-for concept in kb.class_hierarchy().items():
+for concept in kb.class_hierarchy.items():
     print(f'{concept.get_iri().as_str()} => {[c.get_iri().as_str() for c in kb.get_direct_sub_concepts(concept)]}')
 print('*' * 100)
4444
