The past two weeks has seen two additions to mtriage:
- Parallelisation by default of
Analyser.analyse
andSelector.retrieve
. - A generic and optional 'carry' flag that can be passed via analyser config to copy files from an element's base folder to its destination.
Huge thanks to @ivansafrin for the major part of this PR. In my mind,
parallelising the two major computationally intensive operations,
retrieve
for the selecting phase and analyse
for the analysis phase, adds
a real reason to adopt mtriage as a framework, rather than writing your own
custom scripts.
Applying retrieve_element
from a selector, or analyse_element
from an analyser
is, because of the way mtriage is conceived, always self-contained; and
therefore easy to parallelise. The idea of an element as a folder that contains
a set of similarly typed elements is the geist of mtriage as a framework.
Selectors are functions that create elements, and analysers are functions
that process them (to create new elements).
When looking to apply computational logic on media at scale, packing media into elements through mtriage allows developers to focus on the important and innovative logic that is being applied, and forget about the redundant code that reads and writes files in for loops.
Parallelising these operations means that now, not only does mtriage take the burden of necessary redundancy from the developer, it also does so in a way that enables code to run a lot more efficiently across multiple CPUs. This is a huge boon for us at FA.
This is a continuation of work I had been doing before introducing these updates making mtriage's type system more flexible, and so less coupled to its inaugural (computer vision) workflow.
Prior to this PR, if an analyser further down the chain in a workflow needed a
file in a selector's original element, the first analysers had to encode that
logic in their analyse_element
function, and copy the files over.
Not only did this make analyser encapsulation bad, it also meant that analysers
tended towards convoluted out types such as JsonAnnotatedImageArray
.
The carry flag solves both of these problems by offloading the work and specification of copying files during analysis to mtriage config, which makes a lot more sense that baking it into analysers themselves.
This cycle begins February 16th, and will end on February 29th. There have been a couple of community contributions that I am looking to merge. Otherwise, this cycle will focus on improving developer experience in general, and on writing templates and documentation for creating new components (analysers and selectors) in particular, as well as fixing some critical bugs in the Youtube selector.