First version of mask metadata #55

Merged
merged 8 commits into from
Jul 17, 2020
118 changes: 102 additions & 16 deletions spec.md
@@ -5,22 +5,58 @@ scripts provided by this repository will support one or more versions of this
file, but they should all be considered internal investigations, not intended
for public re-use.

## Basic layout
## On-disk (or in-cloud) layout

```

. # Root folder, potentially in S3,
├── 123.zarr # with a flat list of images by image ID.
└── 456.zarr #
├── .zattrs # Group level metadata.
├── .zgroup # Each image is a Zarr group with multscale metadata.
└── 0 # Each multiscale level is stored as a separate Zarr array.
├── .zarray #
├── 0.0.0.0.0 # Chunks are stored with the flat directory layout.
└── t.c.z.y.x # All image arrays are 5-dimensional
# with dimension order (t, c, z, y, x).
│ # with a flat list of images by image ID.
├── 123.zarr # One image (id=123) converted to Zarr.
└── 456.zarr # Another image (id=456) converted to Zarr.
├── .zgroup # Each image is a Zarr group, or a folder, of other groups and arrays.
├── .zattrs # Group level attributes are stored in the .zattrs file and include
│ # "multiscales" and "omero" (see below).
├── 0 # Each multiscale level is stored as a separate Zarr array,
│ ... # which is a folder containing chunk files which compose the array.
├── n # The name of the array is arbitrary, with the ordering defined by
│ │ # the "multiscales" metadata, but is often a sequence starting at 0.
│ │
│ ├── .zarray # All image arrays are 5-dimensional
│ │ # with dimension order (t, c, z, y, x).
│ │
│ ├── 0.0.0.0.0 # Chunks are stored with the flat directory layout.
│ │ ... # Each dotted component of the chunk file name represents
│ └── t.c.z.y.x # a zero-based "chunk coordinate", where the maximum coordinate
│ # will be `ceil(dimension_size / chunk_size) - 1`.
└── masks
├── .zgroup # The masks group is a container which holds a list
├── .zattrs # of masks to make the objects easily discoverable.
│ # All masks will be listed in `.zattrs` e.g. `{ "masks": [ "original/0" ] }`
│ # Each dimension of the mask `(t, c, z, y, x)` should be either the same as the
│ # corresponding dimension of the image, or `1` if that dimension of the mask
│ # is irrelevant.
└── original # Intermediate folders are permitted but not necessary
│ # and currently contain no extra metadata.
└── 0
├── .zarray # Each mask itself is a 5D array matching the highest resolution
@jburel (Member), Jul 10, 2020:
that means that we won't be able to store masks generated at a different resolution

Member Author:
Not yet, no. But then ome-zarr-py code likely wouldn't support that either, nor omero-cli-zarr that output. Let's file that as an issue on one or several of the repositories and then we can work through the implications. I'm trying to get this PR into a state that it can be merged before I leave so no one is blocked on me.

Member:
understood.
mainly a problem I am facing with the large DBV files that I have been segmenting at a resolution that is not the highest one but we can have that in another round

Member:
@jburel can you file an issue about the mask resolutions?

└── .zattrs # of the related image and has an extra key, "color", with display information.


```
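
The chunk naming described in the layout can be sketched in Python. This is a minimal illustration, not code from this repository: the function names, the image shape, and the chunk shape are all hypothetical. Because chunk coordinates are zero-based in this sketch, the largest index along a dimension works out to `ceil(dimension_size / chunk_size) - 1`.

```python
import math
from itertools import product


def chunk_grid(shape, chunks):
    """Number of chunks along each dimension (ceiling division)."""
    return tuple(math.ceil(size / chunk) for size, chunk in zip(shape, chunks))


def chunk_keys(shape, chunks):
    """Yield flat chunk file names such as '0.0.0.0.0' in (t, c, z, y, x) order."""
    grid = chunk_grid(shape, chunks)
    for index in product(*(range(n) for n in grid)):
        yield ".".join(str(i) for i in index)


# Hypothetical 5D image: 1 timepoint, 2 channels, 1 z-section, 1000 x 1000 pixels.
shape = (1, 2, 1, 1000, 1000)
chunks = (1, 1, 1, 256, 256)
keys = list(chunk_keys(shape, chunks))
print(keys[0], keys[-1], len(keys))
```

With these assumed shapes the y and x dimensions each need four chunks, so the last chunk file is `0.1.0.3.3`.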

## "multiscales" metadata
## Metadata

The various `.zattrs` files throughout the above array hierarchy may contain metadata
keys as specified below for discovering certain types of data, especially images.

### "multiscales" metadata

Metadata about the multiple resolution representations of the image can be
found under the "multiscales" key in the group-level metadata.
@@ -43,7 +79,7 @@ if not datasets:
The subresolutions in each multiscale are ordered from highest-resolution
to lowest.

## "omero" metadata
### "omero" metadata

Information specific to the channels of an image and how to render it
can be found under the "omero" key in the group-level metadata:
@@ -74,13 +110,63 @@ can be found under the "omero" key in the group-level metadata:
}
```


See https://docs.openmicroscopy.org/omero/5.6.1/developers/Web/WebGateway.html#imgdata
for more information.

### "masks"

The special group "masks" found under an image Zarr has a `masks` key in its `.zattrs`
that lists the paths to the mask objects found underneath the group:

```
{
"masks": [
"orphaned/0"
]
}
```

Unlisted groups MAY be masks.
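
As a rough sketch of how a reader might discover the listed masks, the following uses only the Python standard library and assumes the image is on a local filesystem (not S3). The helper name `list_mask_paths` is hypothetical, and the demo builds a throwaway folder that mirrors the layout above.

```python
import json
import tempfile
from pathlib import Path


def list_mask_paths(image_root):
    """Return the mask folders listed under the `masks` key of <image>/masks/.zattrs."""
    masks_group = Path(image_root) / "masks"
    attrs_file = masks_group / ".zattrs"
    if not attrs_file.is_file():
        return []  # no masks group, or no attributes written for it
    attrs = json.loads(attrs_file.read_text())
    return [masks_group / rel for rel in attrs.get("masks", [])]


# Demo on a temporary folder standing in for an image Zarr.
with tempfile.TemporaryDirectory() as tmp:
    group = Path(tmp) / "456.zarr" / "masks"
    (group / "orphaned" / "0").mkdir(parents=True)
    (group / ".zattrs").write_text(json.dumps({"masks": ["orphaned/0"]}))
    found = [p.relative_to(tmp).as_posix() for p in list_mask_paths(Path(tmp) / "456.zarr")]
    print(found)
```

Since unlisted groups may also be masks, a stricter reader would additionally walk the group; this sketch only follows the explicit listing.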

### "color"

The `color` key defines an image that is "labeled", i.e. every unique value in the image
Member:
We also export color with style=--split. The mask only has 1 values (not labelled) but it does have a color.

Member Author:
Hmmm.... reading it, I'm not sure just having only one label makes something not labeled. The example also only shows one value ;)

Member:
Hmm, so both --style=labelled and --style=split produce masks that are labelled. Maybe this is a bit less confusing if we ignore the --style=labelled option (since that is the default). If we also consider removing the non-compliant 6d option then all masks are "labelled". Are any masks NOT "labelled"? So I think you can remove that whole sentence "The color key defines an image that is labeled, i.e. every unique value in the image represents a unique, non-overlapping object within the image." since even without any 'color', that statement is still true.

represents a unique, non-overlapping object within the image. The value associated with
the `color` key is another JSON object in which the key is the pixel value of the image and
the value is an RGBA color (4 byte, `0-255` per channel) for representing the object:

```
{
"color": {
"1": 8388736,
...
```
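
A possible way to turn the integer color values into per-channel components is sketched below. The byte order is an assumption: the text only says "RGBA, 4 byte", so this sketch takes red as the most significant byte and alpha as the least significant one, and masks the value to 32 bits so that signed integers also decode. The name `unpack_rgba` is hypothetical.

```python
def unpack_rgba(value):
    """Split a packed 32-bit integer into (r, g, b, a), each in 0-255.

    Assumes red is the most significant byte and alpha the least significant.
    """
    value &= 0xFFFFFFFF  # treat negative (signed) integers as unsigned 32-bit
    return (
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )


# Attributes shaped like the example above; keys are pixel values stored as strings.
color_attrs = {"color": {"1": 8388736}}
palette = {int(label): unpack_rgba(packed) for label, packed in color_attrs["color"].items()}
print(palette)
```

Under a different byte order the same integer would decode to different channels, so the convention should be confirmed against the writer before relying on it.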
### "image"

@manics (Member), Jul 7, 2020:
Suggested change
### "image"
The `image` key is an optional dictionary which contains information on the image the mask is associated with.
If included, it must contain a key `array` whose value is either:
- A relative path to a Zarr image array, for example:
```
{
"image": {
"array": "../../0"
}
}
```
- A URL to a Zarr image array (use this if the mask is stored separately from the image Zarr), for example:
```
{
"image": {
"array": "https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/0"
}
}
```

Member:
See also open discussion at ome/omero-cli-zarr#19 (comment) about the specification of the image key

The `image` key is an optional dictionary which contains information on the image the mask is associated with.
If included, it must contain a key `array` whose value is either:
- A relative path to a Zarr image array, for example:
```
{
"image": {
"array": "../../0"
}
}
```
- A URL to a Zarr image array (use this if the mask is stored separately from the image Zarr), for example:
```
{
"image": {
"array": "https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/0"
}
}
```
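
The two forms of the `array` value could be told apart roughly as follows. This is a sketch under two assumptions that the text does not settle: a relative path is resolved against the mask's own folder, and only `http`/`https` values count as URLs. The function name and the mask location `456.zarr/masks/0` are hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def resolve_image_array(mask_location, array_ref):
    """Return the location of the image array a mask refers to.

    URLs are returned unchanged; relative paths are joined to the mask's
    own folder (an assumption) and normalized.
    """
    if urlparse(array_ref).scheme in ("http", "https"):
        return array_ref
    return posixpath.normpath(posixpath.join(mask_location, array_ref))


# Hypothetical mask stored directly under the masks group of image 456.
print(resolve_image_array("456.zarr/masks/0", "../../0"))
print(resolve_image_array(
    "456.zarr/masks/0",
    "https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001240.zarr/0",
))
```

Note that with an intermediate folder such as `masks/original/0`, the same `../../0` value would resolve inside the masks group rather than to the image array, so the base for relative paths matters.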



| Revision | Date | Description |
| ---------- | ------------ | ------------------------------------------ |
| - | 2020-05-07 | Add description of "omero" metadata |
| - | 2020-05-06 | Add info on the ordering of resolutions |
| 0.1 | 2020-04-20 | First version for internal demo |
| 0.1.3 | 2020-07-07 | Add mask metadata |
| 0.1.2 | 2020-05-07 | Add description of "omero" metadata |
| 0.1.1 | 2020-05-06 | Add info on the ordering of resolutions |
| 0.1.0 | 2020-04-20 | First version for internal demo |