
Conversation

@fatualux

Description

The context

As always, thanks for your work, and the time you may dedicate to my proposal.

Currently, we cannot avoid storing interactions between the user and the Cat in "episodic memory".

However, for our use case, we can work around this limitation in CheshireCat.

Our workaround

Currently, the method _store_user_message_in_episodic_memory offers no way to skip storing a user message in episodic memory:

def _store_user_message_in_episodic_memory(self, user_message_text: str):
    doc = Document(
        page_content=user_message_text,
        metadata={"source": self.user_id, "when": time.time()},
    )
    doc = self.mad_hatter.execute_hook(
        "before_cat_stores_episodic_memory", doc, cat=self
    )
    # store user message in episodic memory
    # TODO: vectorize and store also conversation chunks
    #   (not raw dialog, but summarization)
    user_message_embedding = self.embedder.embed_documents([user_message_text])
    _ = self.memory.vectors.episodic.add_point(
        doc.page_content,
        user_message_embedding[0],
        doc.metadata,
    )

So, for now, we are using the before_cat_stores_episodic_memory hook to overwrite the Document attributes with empty ones:

@hook(priority=0)
def before_cat_stores_episodic_memory(doc: Document, cat) -> Document:
    log.warning("Overwriting episodic memory with an empty document.")
    doc = Document(page_content="", metadata={"source": None, "when": None})
    return doc

This "trick" works as intended, but it comes with some inconveniences.
For instance, we need to run a cron job to periodically clean up the episodic memory.
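To make the residue concrete, here is a minimal, self-contained sketch (VectorStore, empty_doc_hook and store_user_message are stand-ins invented for illustration, not the real Cat classes): even though the hook empties the document, core still writes a point unconditionally, which is what forces the periodic cleanup.

```python
import time

class VectorStore:
    """Toy stand-in for the episodic vector collection."""
    def __init__(self):
        self.points = []

    def add_point(self, content, metadata):
        self.points.append((content, metadata))

def empty_doc_hook(doc):
    # mimics our plugin hook: overwrite with an empty document
    return {"page_content": "", "metadata": {"source": None, "when": None}}

def store_user_message(store, text):
    doc = {"page_content": text, "metadata": {"source": "user", "when": time.time()}}
    doc = empty_doc_hook(doc)
    # core stores unconditionally, just like _store_user_message_in_episodic_memory
    store.add_point(doc["page_content"], doc["metadata"])

store = VectorStore()
store_user_message(store, "hello")
print(len(store.points))  # prints 1 -- an empty point was stored anyway
```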

Type of change

At first, we thought returning None inside the before_cat_stores_episodic_memory hook and adding a check in _store_user_message_in_episodic_memory would work.

However, this approach does not work, because of this logic (/cat/mad_hatter/mad_hatter.py):

# run hooks
for hook in self.hooks[hook_name]:
    try:
        # pass tea_cup to the hooks, along other args
        # hook has at least one argument, and it will be piped
        log.debug(
            f"Executing {hook.plugin_id}::{hook.name} with priority {hook.priority}"
        )
        tea_spoon = hook.function(
            deepcopy(tea_cup), *deepcopy(args[1:]), cat=cat
        )
        # log.debug(f"Hook {hook.plugin_id}::{hook.name} returned {tea_spoon}")
        if tea_spoon is not None:
            tea_cup = tea_spoon
    except Exception:
        log.error(f"Error in plugin {hook.plugin_id}::{hook.name}")
        plugin_obj = self.plugins[hook.plugin_id]
        log.warning(plugin_obj.plugin_specific_error_message())
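The blocking line is the `if tea_spoon is not None` check. A toy reduction of that piping logic (not the real MadHatter) shows why a hook returning None cannot cancel anything: the None is silently discarded and the previous value survives.

```python
from copy import deepcopy

def execute_hook(hooks, tea_cup):
    """Toy reduction of MadHatter's hook-piping loop."""
    for hook in hooks:
        tea_spoon = hook(deepcopy(tea_cup))
        if tea_spoon is not None:  # a None return is silently ignored
            tea_cup = tea_spoon
    return tea_cup

def refuse_to_store(doc):
    return None  # an attempt to cancel storage

doc = {"page_content": "hello"}
result = execute_hook([refuse_to_store], doc)
print(result)  # {'page_content': 'hello'} -- the original doc survives
```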

Our proposal is to use pattern matching in /cat/looking_glass/stray_cat.py, line 656: if the hook returns the tuple (None, True) instead of a Document, we skip storing it in episodic memory:

    def _store_user_message_in_episodic_memory(self, user_message_text: str):
        doc = Document(
            page_content=user_message_text,
            metadata={"source": self.user_id, "when": time.time()},
        )
        doc = self.mad_hatter.execute_hook(
            "before_cat_stores_episodic_memory", doc, cat=self
        )
        match doc:
            case (None, True):
                log.info("Document is None; skipping storing in episodic memory.")
                return
        # store user message in episodic memory
        # TODO: vectorize and store also conversation chunks
        #   (not raw dialog, but summarization)
        user_message_embedding = self.embedder.embed_documents([user_message_text])
        _ = self.memory.vectors.episodic.add_point(
            doc.page_content,
            user_message_embedding[0],
            doc.metadata,
        )

And here is the before_cat_stores_episodic_memory hook in our plugin:

@hook(priority=0)
def before_cat_stores_episodic_memory(doc: Document, cat) -> tuple:
    log.info("Doc set to None to skip storing episodic memory.")
    return None, True
  • Bug fix (non-breaking change which fixes an issue)

  • New feature (non-breaking change which adds functionality)

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@pieroit
Member

pieroit commented Aug 17, 2025

Hi @fatualux , thank you for the clear explanation and proposal

The problem with the None strategy is that a plugin can break the hook pipe (other plugins may be working on the data structure).

This is an inherent problem with hooks: they can read and change data, but they cannot really deactivate the process.

In v2 (see branch) I am moving the vector memory into a plugin, so we can add more hooks and plugin-level options to activate and deactivate any collection (or even add new ones) without making core more complicated.

Basically core will not save anything about episodic memory; it will just run a hook, and plugins will decide what to do with the passed data. The same will happen with retrieval (so you can use vectors, triples, SQL, or whatever).
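That direction could look roughly like the following sketch (all names here — Core, VectorMemoryPlugin, run_hook — are hypothetical, invented for illustration, and not the actual v2 API): core only fires a hook, and a plugin with an admin-facing setting decides whether anything gets stored.

```python
class Core:
    """Hypothetical core: fires hooks, stores nothing itself."""
    def __init__(self):
        self.hooks = []

    def run_hook(self, name, payload):
        for hook in self.hooks:
            hook(name, payload)

class VectorMemoryPlugin:
    """Hypothetical plugin owning episodic storage."""
    def __init__(self, enabled=True):
        self.enabled = enabled  # plugin-level setting the admin toggles
        self.saved = []

    def __call__(self, name, payload):
        if name == "user_message" and self.enabled:
            self.saved.append(payload)

core = Core()
plugin = VectorMemoryPlugin(enabled=False)  # admin deactivates episodic memory
core.hooks.append(plugin)
core.run_hook("user_message", "hello")
print(plugin.saved)  # [] -- nothing stored, and core never had to know
```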

I am closing this PR; hopefully we can talk about this again in v2, in the vector_memory plugin.

Thanks again! Appreciated

@pieroit pieroit closed this Aug 17, 2025
@fatualux
Author

fatualux commented Aug 17, 2025 via email

@lucapiccinelli

Hi @pieroit,

Thank you for your response. While we acknowledge and accept your choice, I’d like to take a moment to reinforce @fatualux’s point in hopes that you might reconsider his proposal.

The approach of checking the value of doc returned from the hook against a tuple (rather than solely checking for None) was designed to ensure backward compatibility with all existing plugins. Here's the reasoning:


1. Backward compatibility with existing code

By allowing the hook to return a tuple, we can bypass the specific check in /cat/mad_hatter/mad_hatter.py:

if tea_spoon is not None:
    tea_cup = tea_spoon

Since a tuple is not None, tea_spoon is still assigned to tea_cup and the pipe keeps flowing, without requiring modifications to mad_hatter, which is more susceptible to breaking changes.


2. Low Probability of breaking existing plugins

The likelihood of an existing plugin already returning a tuple in the before_cat_stores_episodic_memory hook is almost zero. This assumption is based on the current code in /cat/looking_glass/stray_cat.py:

doc = self.mad_hatter.execute_hook(
    "before_cat_stores_episodic_memory", doc, cat=self
)

# store user message in episodic memory
# TODO: vectorize and store also conversation chunks
#   (not raw dialog, but summarization)
user_message_embedding = self.embedder.embed_documents([user_message_text])
_ = self.memory.vectors.episodic.add_point(
    doc.page_content,
    user_message_embedding[0],
    doc.metadata,
)

If doc were a tuple, doc.page_content would fail due to the missing attribute. This implies that a tuple was never a valid return value unless the plugin author was intentionally making it fail.
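This is easy to verify: attribute access on a tuple raises immediately, so any pre-existing plugin returning a tuple here would already be crashing today.

```python
doc = (None, True)  # what a hook would have to return
try:
    doc.page_content
except AttributeError as exc:
    print(exc)  # 'tuple' object has no attribute 'page_content'
```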


3. Expressing the intent

The proposed implementation explicitly indicates the intent behind returning a tuple like (None, True). Specifically:

  • Returning None with a True flag communicates the intention:
    “I’m returning no document, and I want to skip its storage.”

Maybe this intention would be more readable if expressed this way, instead of with pattern matching:

def _store_user_message_in_episodic_memory(self, user_message_text: str):
    doc = Document(
        page_content=user_message_text,
        metadata={"source": self.user_id, "when": time.time()},
    )
    doc = self.mad_hatter.execute_hook(
        "before_cat_stores_episodic_memory", doc, cat=self
    )
    
    if isinstance(doc, tuple) and len(doc) == 2:
        doc, skip_storage = doc
        if doc is None and skip_storage:
            return
    
    # store user message in episodic memory
    # TODO: vectorize and store also conversation chunks
    #   (not raw dialog, but summarization)
    user_message_embedding = self.embedder.embed_documents([user_message_text])
    _ = self.memory.vectors.episodic.add_point(
        doc.page_content,
        user_message_embedding[0],
        doc.metadata,
    )

To sum up: I see the point about breaking the plugin hook pipe, but this happens only on the explicit intention of the plugin author, and only when they decide to return (None, True). Any other (doc, True/False) preserves the same behaviour as just returning doc, since we can assume that, up to now, a tuple has been an invalid return value that made the Cat throw a missing-attribute exception.
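The compatibility claim can be checked against a toy version of the isinstance-based check (should_skip is an illustrative helper, not code from the PR):

```python
def should_skip(doc):
    """Returns (doc, skip): toy version of the isinstance-based check."""
    if isinstance(doc, tuple) and len(doc) == 2:
        inner, skip_storage = doc
        if inner is None and skip_storage:
            return None, True
        doc = inner
    return doc, False

# only the explicit (None, True) sentinel skips storage
assert should_skip((None, True)) == (None, True)
# a plain Document-like value keeps the old behaviour
assert should_skip({"page_content": "x"}) == ({"page_content": "x"}, False)
# any other tuple just unwraps to its doc and is stored
assert should_skip(("doc", True)) == ("doc", False)
```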

If you still believe it’s better to reject this proposal, we sincerely thank you for your time and consideration.

Best regards

@pieroit
Member

pieroit commented Aug 19, 2025

Hi @lucapiccinelli thanks for chiming in.
This discussion has happened many times; plugin authors typically want to be able to break the pipe, but they usually do not want other plugins to do it.

My view on the topic is:

  • admins decide if something happens or not, with a plugin-level setting (e.g. deactivating episodic memory)
  • plugins either ignore the data structure or change it

As already stated, I am moving the vector memory into a plugin, in branch v2. Once it is there, we can return to this point (I can confirm right now that you will be able to deactivate episodic memory, or even the whole vector memory altogether).

@lucapiccinelli

Ok @pieroit thank you for the answer
