Disk management: Global GC stop the object solution

Stop the object

Stopping dataClay entirely might not be a good option. Instead, this solution proposes another way to clean up objects.

Currently, we are not able to propose a solution that avoids any kind of blocking, because:

We always have to ask the other nodes whether they hold references to the object we want to delete. While this ‘query’ is in progress, the references could change.

For example: DS1 asks DS2 if it has any reference to object O1. DS2 says no. DS1 proceeds to ask DS3. During this query, DS3 sends object O1 to DS2, so DS3 no longer has a reference to O1 but DS2 now does. DS3 says no. DS1 proceeds to delete the object. Wrong!
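
The race in this example can be replayed with a toy model. This is only an illustrative sketch: the `nodes` dictionary and the `has_reference` and `send_object` helpers are hypothetical names invented here, not part of dataClay.

```python
# Hypothetical in-memory model: each node is just the set of object IDs it references.
nodes = {"DS2": set(), "DS3": {"O1"}}

def has_reference(node, oid):
    """Answer the query 'do you hold a reference to oid?'."""
    return oid in nodes[node]

def send_object(src, dst, oid):
    """Move the only reference to oid from src to dst."""
    nodes[src].discard(oid)
    nodes[dst].add(oid)

answers = []
answers.append(has_reference("DS2", "O1"))  # DS2 says no
send_object("DS3", "DS2", "O1")             # happens while DS1 moves on to DS3
answers.append(has_reference("DS3", "O1"))  # DS3 says no

# DS1 concludes that O1 is garbage, although DS2 now references it.
safe_to_delete = not any(answers)
still_referenced = any("O1" in refs for refs in nodes.values())
print(safe_to_delete, still_referenced)     # True True
```

Both values are true at the same time, which is exactly the inconsistency that motivates blocking the object.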

That’s why it is necessary to stop the creation of new references to O1. We call this solution ‘stop the object’.

Each node has:

A table of the objects sent to and received from each client session (as in the previous solution).
A count of the references pointing to each object stored in the node, including references held by currently alive threads.

From time to time, the node checks whether any object has 0 references, i.e. no entry in the client table and no reference from any object or GC root in the node itself.
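
The per-node state and the local check described above could look roughly like the following. This is a minimal sketch under assumptions: `NodeState`, `client_table`, `ref_counts` and `local_candidates` are hypothetical names, and the real bookkeeping may be organised differently.

```python
from dataclasses import dataclass, field

@dataclass
class NodeState:
    # session ID -> object IDs sent to or received from that client session
    client_table: dict = field(default_factory=dict)
    # object ID -> number of references from local objects, GC roots and alive threads
    ref_counts: dict = field(default_factory=dict)

    def in_client_table(self, oid):
        """True if any client session still lists the object."""
        return any(oid in oids for oids in self.client_table.values())

    def local_candidates(self):
        """Objects with 0 local references and no entry in the client table."""
        return sorted(
            oid for oid, count in self.ref_counts.items()
            if count == 0 and not self.in_client_table(oid)
        )

node = NodeState(
    client_table={"session-1": {"O2"}},
    ref_counts={"O1": 0, "O2": 0, "O3": 2},
)
candidates = node.local_candidates()
print(candidates)  # ['O1']
```

O2 is kept because a client session still lists it, and O3 because it is still referenced locally. Only O1 becomes a candidate, and it still has to pass the two rounds with the other nodes.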

First round: For any object with 0 local references, it is necessary to check whether it actually has 0 references by asking ALL nodes:

If any node has the object in memory, pointed to by any of its objects or listed in its client table, then the requesting node cannot remove the object.
Otherwise, the queried node records the ID of the object and blocks any future interaction with it. For example, if the node receives the object as a parameter, the request is held until the cleaning process has finished. This block is required to avoid the same problem during the second round.

Second round: If all nodes answered that they don’t have any reference, we ask them again. This is necessary to confirm that their answers are still correct. If any node now has a reference to the blocked object, the object cannot be deleted. Otherwise the object can be deleted. During this round the objects are unblocked.

The broadcast process can be improved by ‘grouping’ objects. The main cost here is the number of communications, but this is a background process.
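
The two rounds can be sketched as follows. This is a simplified, single-process illustration, not dataClay code: the `Node` class, its methods and `try_delete` are hypothetical names, and a blocked request is simply refused here, whereas a real node would hold it until the cleaning process finishes.

```python
class Node:
    def __init__(self, name, refs=()):
        self.name = name
        self.refs = set(refs)   # object IDs this node references
        self.blocked = set()    # object IDs whose interactions are currently blocked

    def first_round(self, oid):
        """Answer True if referenced; otherwise block the object and answer False."""
        if oid in self.refs:
            return True
        self.blocked.add(oid)
        return False

    def second_round(self, oid):
        """Unblock the object and answer again whether it is referenced."""
        self.blocked.discard(oid)
        return oid in self.refs

    def receive(self, oid):
        """Incoming request carrying oid; refused while the object is blocked."""
        if oid in self.blocked:
            return False
        self.refs.add(oid)
        return True


def try_delete(oid, others):
    """Run both rounds against the other nodes; True means oid can be deleted."""
    asked = []
    for node in others:
        asked.append(node)
        if node.first_round(oid):
            # A reference exists: release every block taken so far and give up.
            for n in asked:
                n.blocked.discard(oid)
            return False
    # Ask every node again; each one is unblocked as it answers.
    answers = [node.second_round(oid) for node in others]
    return not any(answers)


print(try_delete("O1", [Node("DS2"), Node("DS3")]))                # True
print(try_delete("O1", [Node("DS2"), Node("DS3", refs={"O1"})]))   # False
```

With the block in place, the transfer from the earlier example cannot slip through: once DS2 has answered no, `receive("O1")` on DS2 is refused, so the reference stays at DS3, where the first round finds it.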

Calculating the performance impact here is also difficult. We must quantify how long an object remains blocked as a result of the first round. Let’s say that:

We have N nodes.
For each node, the time spent answering during the first round is Mx, where x is the number of the node.
The time between the first and second rounds is T.

The worst case is the node that is asked first in the broadcast, because it blocks the object first. The maximum time the object remains blocked is then:

MaxTime = M1 + M2 + … + MN + T
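
As a quick worked example of this bound, the snippet below sums the per-node answer times and adds the gap between rounds. The function name `max_block_time` and all the numbers are made up for illustration; they are not measurements.

```python
def max_block_time(answer_times, gap):
    """Upper bound on how long the first-asked node keeps the object blocked."""
    return sum(answer_times) + gap

# Made-up values in milliseconds for N = 3 nodes.
m = [5, 8, 4]   # M1, M2, M3
t = 20          # T, time between the first and second rounds
print(max_block_time(m, t))  # 37
```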

How can we compare this with the first solution? We could compare the number of communications needed in each case, but the fact that one solution stops everything and the other does not means that there are many factors we cannot calculate. For example, stopping the universe might take 1 second and stopping the object 2 seconds, but while the object is stopped, other requests continue working.