diff --git a/docs/conf.py b/docs/conf.py
Lines changed: 2 additions & 1 deletion
diff --git a/docs/index.rst b/docs/index.rst
Lines changed: 1 addition & 0 deletions
diff --git a/docs/usage/01_introduction.md b/docs/usage/01_introduction.md
Lines changed: 3 additions & 2 deletions
diff --git a/docs/usage/04_knowledge_base.md b/docs/usage/04_knowledge_base.md
Lines changed: 133 additions & 4 deletions
diff --git a/docs/usage/05_reasoner.md b/docs/usage/05_reasoner.md
Lines changed: 25 additions & 43 deletions
diff --git a/docs/usage/06_concept_learners.md b/docs/usage/06_concept_learners.md
Lines changed: 15 additions & 13 deletions
@@ -36,7 +36,8 @@
 ]
 
 # autoapi for ontolearn and owlapy. for owlapy we need to refer to its path in GitHub Action environment
-autoapi_dirs = ['../ontolearn', '/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/owlapy']
+autoapi_dirs = ['../ontolearn', '/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/owlapy',
+                '/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/ontosample']
 
 # by default all are included but had to reinitialize this to remove private members from shoing
 autoapi_options = ['members', 'undoc-members', 'show-inheritance', 'show-module-summary', 'special-members',
 
@@ -18,6 +18,7 @@ Ontolearn is an open-source software library for explainable structured machine
    usage/09_further_resources
    autoapi/ontolearn/index
    autoapi/owlapy/index
+   autoapi/ontosample/index
 
 
 .. raw:: latex
 
@@ -1,6 +1,6 @@
 # Ontolearn
 
-**Version:** ontolearn 0.6.2
+**Version:** ontolearn 0.6.1
 
 **GitHub repository:** [https://github.com/dice-group/Ontolearn](https://github.com/dice-group/Ontolearn)
 
@@ -31,12 +31,13 @@ Owlready2 library made it possible to build more complex algorithms.
 
 ---------------------------------------
 
-**Ontolearn (including owlapy) can do the following:**
+**Ontolearn (including owlapy and ontosample) can do the following:**
 
 - Load/save ontologies in RDF/XML, OWL/XML.
 - Modify ontologies by adding/removing axioms.
 - Access individuals/classes/properties of an ontology (and a lot more).
 - Define learning problems.
+- Sample ontologies.
 - Construct class expressions.
 - Use concept learning algorithms to classify positive examples in a learning problem.
 - Use local datasets or datasets that are hosted on a triplestore server, for the learning task.
 
@@ -49,7 +49,7 @@ kb = KnowledgeBase(path="file://KGs/father.owl")
 What happens in the background is that the ontology located in this path will be loaded
 in the `OWLOntology` object of `kb` as done [here](03_ontologies.md#loading-an-ontology).
 
-In our recent version you can also initialize the KnowledgeBase using a dataset hosted in a triplestore.
+In our recent version you can also initialize a knowledge base using a dataset hosted in a triplestore.
 Since that knowledge base is mainly used for executing a concept learner, we cover that matter more in depth 
 in _[Use Triplestore Knowledge Base](06_concept_learners.md#use-triplestore-knowledge-base)_ 
 section of _[Concept Learning](06_concept_learners.md)_.
@@ -253,11 +253,140 @@ You can now:
     print(evaluated_concept.ic) # 3
     ```
 
+## Obtaining axioms
+
+You can retrieve Tbox and Abox axioms by using the `tbox` and `abox` methods respectively.
+Let us take them one at a time. The `tbox` method has 2 parameters, `entities` and `mode`.
+`entities` specifies the OWL entity for which we want to obtain the Tbox axioms. It can be 
+a single entity, an `Iterable` of entities, or `None`. 
+
+The allowed types of entities are: 
+- OWLClass
+- OWLObjectProperty
+- OWLDataProperty
+
+Only the Tbox axioms related to the given entity or entities will be returned. If no entities are 
+passed, then it returns all the Tbox axioms.
+The second parameter `mode` _(str)_ sets the return format type. It can have the
+following values:
+1) `'native'` -> triples are represented as tuples of owlapy objects.
+2) `'iri'` -> triples are represented as tuples of IRIs as strings.
+3) `'axiom'` -> triples are represented as owlapy axioms.
+
+For the `abox` method the idea is similar. Instead of the parameter `entities`, there is the parameter 
+`individuals`, which accepts an object of type `OWLNamedIndividual` or `Iterable[OWLNamedIndividual]`.
+
+If you want to obtain all the axioms (Tbox + Abox) of the knowledge base, you can use the method `triples`. It
+requires only the `mode` parameter.
+
+> **NOTE**: The results of these methods are limited to named and direct entities. 
+> In particular, axioms that contain anonymous OWL objects (objects that don't have an IRI)
+> will not be part of the result set. For example, if there is a Tbox T={ C ⊑ (A ⊓ B), C ⊑ D }, 
+> only the latter subsumption axiom will be returned.
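A toy illustration of the NOTE above, written in plain Python rather than with owlapy objects. The `Named` and `Intersection` classes, the helper `named_subsumptions` and the `http://example.com/toy#` IRIs are hypothetical stand-ins introduced only for this sketch; they are not part of Ontolearn or owlapy, and the real `tbox` method may work differently internally.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Named:
    """Stand-in for a named class, identified by its IRI."""
    iri: str


@dataclass(frozen=True)
class Intersection:
    """Stand-in for an anonymous class expression (it has no IRI of its own)."""
    operands: Tuple[Named, ...]


ClassExpr = Union[Named, Intersection]
Subsumption = Tuple[ClassExpr, ClassExpr]  # (sub class, super class)


def named_subsumptions(axioms: List[Subsumption]) -> List[Subsumption]:
    """Keep only the subsumptions in which both sides are named classes."""
    return [(sub, sup) for sub, sup in axioms
            if isinstance(sub, Named) and isinstance(sup, Named)]


# Toy Tbox from the NOTE: T = { C ⊑ (A ⊓ B), C ⊑ D }
A, B, C, D = (Named(f"http://example.com/toy#{name}") for name in "ABCD")
toy_tbox = [(C, Intersection((A, B))), (C, D)]

result = named_subsumptions(toy_tbox)
print(result)  # only the C ⊑ D pair survives the filter
```

The axiom with the anonymous intersection on the right-hand side is dropped, which mirrors the behaviour the NOTE describes.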
+
+
+## Sampling the Knowledge Base
+
+Sometimes ontologies, and therefore knowledge bases, can get very large and our
+concept learners become inefficient in terms of runtime. Sampling is an approach
+to extract a portion of the whole knowledge base without changing its semantics, while
+remaining expressive enough to yield results with as little loss of quality as 
+possible. [OntoSample](https://github.com/alkidbaci/OntoSample/tree/main) is 
+a library that we use to perform the sampling process. It offers different sampling 
+techniques which fall into the following categories:
+
+- Node-based samplers
+- Edge-based samplers
+- Exploration-based samplers
+
+Almost every sampler is offered in 3 modes:
+
+- Classic
+- Learning problem first (LPF)
+- Learning problem centered (LPC)
+
+You can check them [here](ontosample).
+
+When operated on its own, OntoSample uses a light version of Ontolearn (`ontolearn_light`) 
+to reason over ontologies, but when both packages are installed in the same environment 
+it uses the `ontolearn` module instead. This is done for compatibility reasons.
+
+OntoSample treats the knowledge base as a graph where nodes are individuals
+and edges are object properties. OntoSample also supports sampling of 
+data properties, although they are not considered _"edges"_.
+
+#### Sampling steps:
+1. Initialize the sampler using a `KnowledgeBase` object. If you are using an LPF or LPC
+   sampler, then you also need to pass the set of learning problem individuals (`lp_nodes`).
+2. To perform the sampling, use the `sample` method. It takes the number of nodes that you
+   want to sample (`nodes_number`), the share of data properties that you want to sample
+   (`data_properties_percentage`, a float from 0 to 1) and, for samplers that use
+   "jumping", the jump probability (`jump_prob`). Jumping is a technique to avoid
+   infinite loops during sampling.
+3. The `sample` method returns the sampled knowledge base, which you can store in a
+   variable, use directly in the code, or save locally by using the static method
+   `save_sample`.
+
+Let's see an example where we use [RandomNodeSampler](ontosample.classic_samplers.RandomNodeSampler) to sample a 
+knowledge base:
+
+```python
+from ontolearn.knowledge_base import KnowledgeBase
+from ontosample.classic_samplers import RandomNodeSampler
+
+# 1. Initialize KnowledgeBase object using the path of the ontology
+kb = KnowledgeBase(path="KGs/Family/family-benchmark_rich_background.owl")
+
+# 2. Initialize the sampler and generate the sample
+sampler = RandomNodeSampler(kb)
+sampled_kb = sampler.sample(30) # will generate a sample with 30 nodes
+
+# 3. Save the sampled ontology
+sampler.save_sample(kb=sampled_kb, filename="some_name")
+```
+
+Here is another example where this time we use an LPC sampler:
+
+```python
+from ontolearn.knowledge_base import KnowledgeBase
+from ontosample.lpc_samplers import RandomWalkerJumpsSamplerLPCentralized
+from owlapy.model import OWLNamedIndividual, IRI
+import json
+
+# 0. Load json that stores the learning problem
+with open("examples/uncle_lp2.json") as json_file:
+    examples = json.load(json_file)
+
+# 1. Initialize KnowledgeBase object using the path of the ontology
+kb = KnowledgeBase(path="KGs/Family/family-benchmark_rich_background.owl")
+
+# 2. Initialize learning problem (required only for LPF and LPC samplers)
+pos = set(map(OWLNamedIndividual, map(IRI.create, set(examples['positive_examples']))))
+neg = set(map(OWLNamedIndividual, map(IRI.create, set(examples['negative_examples']))))
+lp = pos.union(neg)
+
+# 3. Initialize the sampler and generate the sample
+sampler = RandomWalkerJumpsSamplerLPCentralized(graph=kb, lp_nodes=lp)
+sampled_kb = sampler.sample(nodes_number=40, jump_prob=0.15)
+
+# 4. Save the sampled ontology
+sampler.save_sample(kb=sampled_kb, filename="some_other_name")
+```
+
+> WARNING! Random Walker and Random Walker with Prioritization are two samplers that may 
+> not terminate if the ontology contains nodes that point to each other and 
+> form an inescapable loop for the "walker". In this scenario you can use their "jumping" 
+> version to make the "walker" escape these loops and ensure termination.
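The warning above can be illustrated with a minimal, self-contained sketch of a random walk with jumps over a toy graph. This is not OntoSample's implementation: the function `random_walk_with_jumps`, the `seed` argument and the toy graph are assumptions made for illustration, and only the parameter names `nodes_number` and `jump_prob` are borrowed from the text. In the toy graph, `a` and `b` point only to each other, so a walker without jumps that starts at `a` could never collect more than two nodes.

```python
import random


def random_walk_with_jumps(graph, start, nodes_number, jump_prob, seed=0):
    """Collect `nodes_number` nodes by walking `graph`, occasionally jumping.

    `graph` maps each node to the list of nodes it points to.
    """
    rng = random.Random(seed)  # seeded so that the run is reproducible
    all_nodes = list(graph)
    target = min(nodes_number, len(all_nodes))
    current = start
    sampled = {start}
    while len(sampled) < target:
        neighbours = graph[current]
        # Jump to a random node with probability `jump_prob`, or when stuck
        # at a dead end; otherwise follow a random outgoing edge.
        if not neighbours or rng.random() < jump_prob:
            current = rng.choice(all_nodes)
        else:
            current = rng.choice(neighbours)
        sampled.add(current)
    return sampled


# "a" and "b" form an inescapable loop for a walker that cannot jump.
toy_graph = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"], "e": []}
sample = random_walk_with_jumps(toy_graph, start="a", nodes_number=4, jump_prob=0.15)
print(sorted(sample))
```

With `jump_prob=0` the loop above would spin forever on this graph, which is the non-termination case the warning describes; any positive jump probability lets the walker leave the `a`/`b` cycle eventually.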
+
+To see how to use a sampled knowledge base for the task of concept learning, check
+`sampling_example.py` in the [examples](https://github.com/dice-group/Ontolearn/tree/develop/examples) 
+folder. You will find descriptive comments in that script that will help you understand it better.
+
+For more details about OntoSample you can see [this paper](https://dl.acm.org/doi/10.1145/3583780.3615158).
+
 -----------------------------------------------------------------------------------------------------
 
-See [KnowledgeBase API documentation](ontolearn.knowledge_base.KnowledgeBase)
-to check all the methods that this class has to offer. You will find methods to 
-access the class/property hierarchy, convenient methods that use the reasoner indirectly and 
+Since we cannot cover everything here in detail, see [KnowledgeBase API documentation](ontolearn.knowledge_base.KnowledgeBase)
+to check all the methods that this class has to offer. You will find convenient methods to 
+access the class/property hierarchy, methods that use the reasoner indirectly and 
 a lot more.
 
 Speaking of the reasoner, it is important that an ontology 
 
@@ -34,9 +34,6 @@ from. Currently, there are the following reasoners available:
     The structural reasoner requires an ontology ([OWLOntology](owlapy.model.OWLOntology)).
   The second argument is `isolate` argument which isolates the world (therefore the ontology) where the reasoner is
   performing the reasoning. More on that on _[Reasoning Details](07_reasoning_details.md#isolated-world)_.
-  The remaining argument, `triplestore_address`, is used in case you want to
-  retrieve instances from a triplestore (go to 
-    [_Using a Triplestore for Reasoning Tasks_](#using-a-triplestore-for-reasoning-tasks) for details).
 
 
 
@@ -60,8 +57,7 @@ from. Currently, there are the following reasoners available:
     which is just an enumeration with two possible values: `BaseReasoner_Owlready2.HERMIT` and `BaseReasoner_Owlready2.PELLET`.
   You can set the `infer_property_values` argument to `True` if you want the reasoner to infer
   property values. `infer_data_property_values` is an additional argument when the base reasoner is set to 
-    `BaseReasoner_Owlready2.PELLET`. The rest of the arguments `isolated` and `triplestore_address` 
-    are inherited from the base class.
+    `BaseReasoner_Owlready2.PELLET`. The argument `isolated` is inherited from the base class.
 
 
 - [**OWLReasoner_FastInstanceChecker**](ontolearn.base.fast_instance_checker.OWLReasoner_FastInstanceChecker) **(FIC)**
@@ -87,6 +83,29 @@ from. Currently, there are the following reasoners available:
     `sub_properties` is another boolean argument to specify whether you want to take sub properties in consideration
   for `instances()` method.
 
+
+- [**TripleStoreReasoner**](ontolearn.triple_store.TripleStoreReasoner)
+  
+  Triplestores are known for their efficiency in retrieving data, and they can be queried using SPARQL.
+  Making this functionality available in Ontolearn makes it possible to use concept learners that
+  operate fully on datasets hosted on triplestores. Although that is the main goal, the reasoner can be used
+  independently for reasoning tasks.
+
+  In Ontolearn, we have implemented `TripleStoreReasoner` to query triplestore endpoints using SPARQL queries.
+  It has only one required parameter:
+    - `ontology` - a [TripleStoreOntology](ontolearn.triple_store.TripleStoreOntology) that can be instantiated 
+  using a string that contains the URL of the triplestore host/server. 
+  
+  This reasoner inherits from `OWLReasoner`, and therefore you can use it like any other reasoner.
+  
+  **Initialization:**
+
+  ```python
+  from ontolearn.triple_store import TripleStoreReasoner, TripleStoreOntology
+  
+  reasoner = TripleStoreReasoner(TripleStoreOntology("http://some_domain/some_path/sparql"))
+  ```
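To give a rough idea of what "querying a triplestore endpoint using SPARQL" involves, here is a small standard-library sketch that builds an instance-retrieval query and the corresponding GET request URL (the SPARQL protocol passes the query text in the `query` parameter). The helpers `instances_query` and `request_url` and the class IRI are hypothetical; the queries that `TripleStoreReasoner` actually sends may look different, and nothing is sent over the network here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instances_query(class_iri: str) -> str:
    """Build a SPARQL query that selects the asserted instances of a class."""
    # In SPARQL, `a` is shorthand for rdf:type.
    return f"SELECT DISTINCT ?x WHERE {{ ?x a <{class_iri}> . }}"


def request_url(endpoint: str, query: str) -> str:
    """Build the URL of a SPARQL-protocol GET request (the query is URL-encoded)."""
    return f"{endpoint}?{urlencode({'query': query})}"


endpoint = "http://localhost:3030/father/sparql"
query = instances_query("http://example.com/father#male")  # hypothetical class IRI
url = request_url(endpoint, query)
print(query)
print(url)
```

Note that this query only returns asserted instances; whether and how inferred instances are retrieved depends on the reasoner and the triplestore configuration.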
+
 ## Usage of the Reasoner
 All the reasoners available in the Ontolearn library inherit from the
 class: [OWLReasonerEx](ontolearn.base.ext.OWLReasonerEx). This class provides some 
@@ -139,7 +158,7 @@ You can get all the types of a certain individual using `types` method:
 <!--pytest-codeblocks:cont-->
 
 ```python
-anna = list( onto.individuals_in_signature()).pop()
+anna = list(onto.individuals_in_signature()).pop()
 
 anna_types = ccei_reasoner.types(anna)
 ```
@@ -229,43 +248,6 @@ for ind in male_individuals:
     print(ind)
 ```
 
-### Using a Triplestore for Reasoning Tasks
-
-As we mentioned earlier, OWLReasoner has an argument for enabling triplestore querying:
-- `triplestore_address` - a string that contains the URL of the triplestore host/server. If specified, it tells
-the reasoner that for its operations it should query the triplestore hosted on the given address.
-
-Triplestores are known for their efficiency in retrieving data, and they can be queried using SPARQL.
-Making this functionality available for reasoners in Ontolearn makes it possible to use concept learners that
-fully operates in datasets hosted on triplestores. Although that is the main goal, the reasoner can be used
-independently for reasoning tasks. Therefore, you can initialize a reasoner to use triplestore as follows:
-
-```python
-from ontolearn.base import OWLReasoner_Owlready2
-
-reasoner = OWLReasoner_Owlready2(onto, triplestore_address="http://some_domain/some_path/sparql")
-```
-
-Now you can use the reasoner methods as you would normally do:
-
-```python
-# Retrieving the male instances using `male` variable that we declared earlier
-males = reasoner.instances(male, direct=False)
-```
-
-**Some important notice are given below:**
-
-> Not all the methods of the reasoner are implemented to use triplestore but the main methods 
-> such as 'instance' and those used to get sub/super classes/properties will work just fine.
-
-> **You cannot pass the triplestore argument directly to FIC constructor.** 
-> Because of the way it is implemented, if the base reasoner is set to use triplestore,
-> then FIC's is considered to using triplestore.
-
-> When using triplestore all methods, including `instances` method **will default to the base
-> implementation**. This means that no matter which type of reasoner you are using, the results will be always 
-> the same.
-
 -----------------------------------------------------------------------
 
 In this guide we covered the main functionalities of the reasoners in Ontolearn. More
 
@@ -323,23 +323,24 @@ Let's see what it takes to make use of it.
 First of all you need a server which should host the triplestore for your ontology. If you don't
 already have one, see [Loading and Launching a Triplestore](#loading-and-launching-a-triplestore) below.
 
-Now you can simply initialize the `KnowledgeBase` object that will server as an input for your desired 
+Now you can simply initialize a `TripleStoreKnowledgeBase` object that will serve as an input for your desired 
 concept learner as follows:
 
 ```python
-from ontolearn.knowledge_base import KnowledgeBase
+from ontolearn.triple_store import TripleStoreKnowledgeBase
 
-kb = KnowledgeBase(triplestore_address="http://your_domain/some_path/sparql")
+kb = TripleStoreKnowledgeBase("http://your_domain/some_path/sparql")
 ```
 
-Notice that we did not provide a value for the `path` argument. When using triplestore, it is not required. Keep
-in mind that the `kb` will create a default reasoner that uses the triplestore. Passing a custom
-reasoner will not make any difference, because they all behave the same when using the triplestore.
-You may wonder what happens to the `Ontology` object of the `kb` since no path was given. A default ontology 
-object is created that will also use the triplestore for its processes. Basically every querying process concerning
-concept learning is now using the triplestore.
+Notice that the triplestore endpoint is the only argument that you need to pass.
+Also keep in mind that this knowledge base contains a 
+[TripleStoreOntology](ontolearn.triple_store.TripleStoreOntology) 
+and a [TripleStoreReasoner](ontolearn.triple_store.TripleStoreReasoner), which means that
+every querying process concerning concept learning is now using the triplestore.
 
 > **Important notice:** The performance of a concept learner may differentiate when using triplestore.
+>  This happens because some SPARQL queries may not yield the exact same results as the local querying methods.
+
 
 ## Loading and Launching a Triplestore
 
@@ -401,14 +402,15 @@ you pass this url to `triplestore_address` argument, you have to add the
 `/sparql` sub-path indicating to the server that we are querying via SPARQL queries. Full path now should look like:
 `http://localhost:3030/father/sparql`.
 
-You can now create a knowledge base or a reasoner object that uses this URL for their 
+You can now create a triplestore knowledge base or a reasoner that use this URL for their 
 operations:
 
 ```python
-from ontolearn.knowledge_base import KnowledgeBase
+from ontolearn.triple_store import TripleStoreKnowledgeBase
+
+father_kb = TripleStoreKnowledgeBase("http://localhost:3030/father/sparql")
 
-father_kb = KnowledgeBase(triplestore_address="http://localhost:3030/father/sparql")
-# ** Execute the learning algorithm as you normally would. ** .
+# ** Continue to execute the learning algorithm as you normally do. ** .
 ```
 
 -------------------------------------------------------------------