Sequence objects

class sima.Sequence

Object containing sequentially acquired imaging data.

Sequences are created with a call to the create method.

>>> from sima import Sequence
>>> from sima.misc import example_hdf5
>>> path = example_hdf5()
>>> seq = Sequence.create('HDF5', path, 'yxt')

Sequences are array-like, and can be converted to numpy arrays or passed as arguments to numpy functions that take arrays.

>>> import numpy as np
>>> arr = np.array(seq)
>>> time_avg = np.mean(seq, axis=0)
>>> np.shape(seq) == seq.shape
True

Note, however, that applying numpy functions to Sequence objects may force the entire sequence to be loaded into memory at once. Depending on the size of the data and the available memory, this may result in memory errors. If possible, consider slicing the Sequence prior to applying the numpy function.
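
One way to keep memory use bounded is to reduce the data in blocks of frames rather than all at once. The following is a minimal sketch, not part of SIMA: it uses a plain numpy array as a stand-in for a Sequence so that it runs without imaging data, and the names chunked_time_average and chunk_size are hypothetical.

```python
import numpy as np

def chunked_time_average(seq, chunk_size=100):
    """Average over the frame axis, reading at most chunk_size frames at a time."""
    num_frames = seq.shape[0]
    total = np.zeros(seq.shape[1:], dtype=float)
    for start in range(0, num_frames, chunk_size):
        # Only this slice is converted to an in-memory array.
        block = np.asarray(seq[start:start + chunk_size])
        total += block.sum(axis=0, dtype=float)
    return total / num_frames

# Stand-in data with the (frames, planes, rows, columns, channels) layout.
data = np.random.rand(25, 1, 8, 8, 1)
avg = chunked_time_average(data, chunk_size=10)
```

The same function should apply to a real Sequence, assuming that slicing behaves as described under __getitem__ below. The sketch does not treat NaN values specially.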

shape

tuple

(num_frames, num_planes, num_rows, num_columns, num_channels)

__array__()

Used to convert the Sequence to a numpy array.

>>> import sima
>>> import numpy as np
>>> data = np.ones((10, 3, 16, 16, 2))
>>> seq = sima.Sequence.create('ndarray', data)
>>> np.all(data == np.array(seq))
True

__getitem__(indices)

Create a new Sequence by slicing this Sequence.

__iter__()

Iterate over the frames of the Sequence.

The yielded structures are numpy arrays of the shape (num_planes, num_rows, num_columns, num_channels).
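
Frame-by-frame iteration suits computations that need only one frame in memory at a time. The sketch below computes a mean-intensity trace. It uses a numpy array as a stand-in for a Sequence (iterating over a 5-dimensional array also yields one frame at a time), and the helper name frame_means is hypothetical.

```python
import numpy as np

def frame_means(frames):
    """Return the mean intensity of each frame, ignoring NaN values."""
    return [float(np.nanmean(frame)) for frame in frames]

# Stand-in data: 3 frames, 2 planes, 4 rows, 4 columns, 1 channel.
data = np.arange(3 * 2 * 4 * 4 * 1, dtype=float).reshape(3, 2, 4, 4, 1)
trace = frame_means(data)
```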

classmethod create(fmt, *args, **kwargs)

Create a Sequence object.

Parameters:
  • fmt ({‘HDF5’, ‘TIFF’, ‘TIFFs’, ‘ndarray’}) – The format of the data used to create the Sequence.
  • *args
  • **kwargs

    Additional arguments depending on the data format. See Notes below.

Notes

Below are explanations of the arguments for each format.

HDF5

path : str
The HDF5 filename, typically with .h5 extension.
dim_order : str
Specification of the order of the dimensions. This string can contain the letters ‘t’, ‘x’, ‘y’, ‘z’, and ‘c’, representing time, column, row, plane, and channel, respectively. For example, ‘tzyxc’ indicates that the HDF5 data dimensions represent time (t), plane (z), row (y), column (x), and channel (c), respectively. The string ‘tyx’ indicates that data for a single imaging plane and single channel has been stored in an HDF5 dataset with three dimensions representing time (t), row (y), and column (x), respectively.
group : str, optional
The HDF5 group containing the imaging data. Defaults to using the root group ‘/’.
key : str, optional
The key for indexing the HDF5 dataset containing the imaging data. This can be omitted if the HDF5 group contains only a single key.

>>> from sima import Sequence
>>> from sima.misc import example_hdf5
>>> path = example_hdf5()
>>> seq = Sequence.create('HDF5', path, 'yxt')
>>> seq.shape == (20, 1, 128, 256, 1)
True
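
To illustrate how a dim_order string relates a dataset's shape to the Sequence shape, here is a small pure-Python sketch. It is not SIMA's implementation. The helper name tzyxc_shape is hypothetical, and the dataset shape (128, 256, 20) is an assumption inferred from the result of the example above.

```python
def tzyxc_shape(data_shape, dim_order):
    """Map a dataset shape and its dim_order string to a (t, z, y, x, c) shape."""
    if len(data_shape) != len(dim_order):
        raise ValueError("dim_order must have one letter per dataset dimension")
    sizes = dict(zip(dim_order, data_shape))
    # Dimensions absent from dim_order are treated as having length 1.
    return tuple(sizes.get(letter, 1) for letter in "tzyxc")

shape = tzyxc_shape((128, 256, 20), "yxt")
```

Here shape is (20, 1, 128, 256, 1): 20 frames, 1 plane, 128 rows, 256 columns, and 1 channel.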

Warning

Moving the HDF5 file may make this Sequence unusable when the ImagingDataset is reloaded. The HDF5 file can only be moved if the ImagingDataset path is also moved such that they retain the same relative position.

TIFF

This format is appropriate when the imaging data is stored in a single multi-page TIFF file, with each page containing a different frame. If the TIFF file contains interleaved data from multiple planes or channels, then the number of planes or channels should be specified.

path : str
The path to the file storing the imaging data.
num_planes : int, optional
The number of interleaved planes. Default: 1.
num_channels : int, optional
The number of interleaved channels. Default: 1.
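
The idea of interleaving can be sketched with numpy alone. The example below assumes that channels cycle fastest and planes next within each frame. That page ordering is an assumption made for illustration and is not stated in this documentation, so check it against your acquisition software. The helper name deinterleave is hypothetical.

```python
import numpy as np

def deinterleave(pages, num_planes=1, num_channels=1):
    """Reshape (num_pages, rows, cols) into (frames, planes, rows, cols, channels).

    Assumes channels cycle fastest, then planes (an illustrative assumption).
    """
    num_pages, rows, cols = pages.shape
    per_frame = num_planes * num_channels
    if num_pages % per_frame:
        raise ValueError("page count is not a multiple of planes * channels")
    stacked = pages.reshape(
        num_pages // per_frame, num_planes, num_channels, rows, cols)
    # Move the channel axis to the end to obtain the tzyxc layout.
    return np.moveaxis(stacked, 2, -1)

# Twelve stand-in pages of 4 x 5 pixels: 2 frames x 2 planes x 3 channels.
pages = np.arange(12 * 4 * 5).reshape(12, 4, 5)
frames = deinterleave(pages, num_planes=2, num_channels=3)
```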

Warning

Moving the TIFF file may make this Sequence unusable when the ImagingDataset is reloaded. The TIFF file can only be moved if the ImagingDataset path is also moved such that they retain the same relative position.

Warning

Due to a limitation in the PIL image-reading function, multi-page TIFF files larger than roughly 4-5 GB will fail to initialize.

TIFFs

This format is appropriate when the imaging data is stored in single-page TIFF files, with each file containing the data from a single frame.

paths : list of list of str
The string paths[i][j] is a Unix-style expression for the filenames for plane i and channel j. See glob for details on how to format such a string.

>>> from sima import Sequence
>>> seq = Sequence.create('TIFFs', [['example/example_??.tif']])
>>> seq = Sequence.create('TIFFs', [['example/example_*.tif']])
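
The two patterns above differ in which filenames they accept. A quick way to check a pattern before creating a Sequence is the standard library's fnmatch module, which implements the same Unix-style wildcards that glob uses. The filenames below are hypothetical.

```python
from fnmatch import fnmatch

names = ["example_01.tif", "example_02.tif", "example_100.tif", "notes.txt"]

# '?' matches exactly one character, so '??' requires a two-character index.
two_chars = [n for n in names if fnmatch(n, "example_??.tif")]
# '*' matches any run of characters, including an empty one.
any_length = [n for n in names if fnmatch(n, "example_*.tif")]
```

In this sketch two_chars excludes example_100.tif, whereas any_length includes it.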

Warning

Moving the TIFF files may make this Sequence unusable when the ImagingDataset is reloaded. The TIFF files can only be moved if the ImagingDataset path is also moved such that they retain the same relative position.

ndarray

This format allows for sequences to be created from numpy arrays of shape (num_frames, num_planes, num_rows, num_columns, num_channels), i.e. tzyxc. If your array has a different shape or axis order, you can reorganize it using numpy.transpose() to reorder the axes and numpy.newaxis to insert any missing axes.

array : numpy.ndarray
A numpy array of shape (num_frames, num_planes, num_rows, num_columns, num_channels)
path : str, optional
Instead of directly passing in an array, a path to a saved numpy .npy file may be used to initialize the sequence.
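
As a concrete sketch of the reorganization described above, suppose a stack is stored as (rows, columns, frames), with a single plane and a single channel. The array and its dimensions here are hypothetical.

```python
import numpy as np

# Hypothetical stack stored as (rows, columns, frames), i.e. 'yxt'.
stack = np.zeros((128, 256, 20))

# Reorder the axes to (frames, rows, columns).
tyx = np.transpose(stack, (2, 0, 1))
# Insert singleton plane and channel axes to reach tzyxc.
tzyxc = tyx[:, np.newaxis, :, :, np.newaxis]
```

The resulting array has shape (20, 1, 128, 256, 1) and could then be passed as the array argument for the 'ndarray' format.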

export(filenames, fmt='TIFF16', fill_gaps=False, channel_names=None, compression=None, scale_values=False)

Save frames to the indicated filenames.

This function stores a multipage TIFF file for each channel.

Parameters:
  • filenames (str or list of list of str) – The names of the output files. For HDF5 files, this must be a single string. For TIFF formats, this should be a list of list of strings, such that filenames[i][j] corresponds to the ith plane and the jth channel.
  • fmt ({‘HDF5’, ‘TIFF16’, ‘TIFF8’}) – The output file format.
  • fill_gaps (bool, optional) – Whether to fill in missing data with pixel intensities from adjacent frames. Default: False.
  • channel_names (list of str, optional) – List of labels for the channels to be saved if using HDF5 format.
  • compression ({None, ‘gzip’, ‘lzf’, ‘szip’}, optional) – If not None and ‘fmt’ is ‘HDF5’, compress the data with the specified lossless compression filter. See h5py docs for details on each compression filter.
  • scale_values (bool, optional) – Whether to scale the values to use the full range of the output format. Defaults to False. Channels are scaled separately.
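
For the TIFF formats, the nested filenames list can be built with a list comprehension. This is only a sketch: the plane and channel counts and the naming scheme are hypothetical.

```python
num_planes, num_channels = 2, 3  # hypothetical dataset dimensions

# filenames[i][j] is the output path for plane i and channel j.
filenames = [
    ["plane{}_channel{}.tif".format(i, j) for j in range(num_channels)]
    for i in range(num_planes)
]
```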

static join(*sequences)

Join together sequences representing different channels.

Parameters: sequences (sima.Sequence) – Each argument is a Sequence representing a different channel.
Returns: joined_sequence – A single sequence with multiple channels.
Return type: sima.Sequence

Examples

>>> from sima import Sequence
>>> from sima.misc import example_hdf5
>>> path = example_hdf5()
>>> seq = Sequence.create('HDF5', path, 'yxt')
>>> joined = Sequence.join(seq, seq)
>>> joined.shape[4] == 2 * seq.shape[4]  # twice as many channels
True
>>> joined.shape[:4] == seq.shape[:4]  # the frame shape is unchanged
True
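
By analogy, joining behaves like concatenating arrays along the last (channel) axis. The numpy sketch below shows the shape arithmetic only. It is not how SIMA implements join, and the array names are hypothetical.

```python
import numpy as np

# Hypothetical single-channel data in tzyxc layout.
green = np.zeros((4, 1, 8, 8, 1))
red = np.ones((4, 1, 8, 8, 1))

# Stack along the channel axis; all other axes must match.
joined = np.concatenate([green, red], axis=-1)
```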

mask(masks)

Apply a mask to the sequence.

Masked values will be represented as numpy.NaN.

Parameters: masks (list of tuple) – Each element of the list is a tuple describing a mask. Each mask tuple can have the form (frames, zyx-mask, channels) or (frames, planes, yx-mask, channels). The frames and channels elements of the tuple are lists of the frames and channels to which the mask is to be applied. The yx-mask element is a binary array whose True values indicate the pixels to be masked. If any of the entries is set to None, then the mask will be fully applied along that dimension.

Examples

Mask out frame 3 entirely:

>>> import numpy as np
>>> from sima import Sequence
>>> from sima.misc import example_hdf5
>>> path = example_hdf5()
>>> seq = Sequence.create('HDF5', path, 'yxt')
>>> masked_seq = seq.mask([(3, None, None)])
>>> [np.all(np.isnan(frame))
...  for i, frame in enumerate(masked_seq) if i < 5]
[False, False, False, True, False]

Mask out plane 0 of frame 3:

>>> masked_seq = seq.mask([(3, 0, None, None)])

Mask out certain pixels at all times in channel 0:

>>> mask = np.random.binomial(1, 0.5, seq.shape[1:-1])
>>> masked_seq = seq.mask([(None, mask, 0)])
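
The effect of a yx-mask can be pictured with plain numpy indexing. The sketch below is an illustration of the semantics described above, not SIMA's implementation. It is comparable in spirit to a mask tuple of the form (None, None, yx_mask, [0]), and the array sizes are hypothetical.

```python
import numpy as np

data = np.ones((5, 1, 4, 6, 2))  # stand-in tzyxc data
yx_mask = np.zeros((4, 6), dtype=bool)
yx_mask[0, :] = True  # mask the top row of every plane

masked = data.copy()
# All frames, all planes, the selected pixels, channel 0 only.
masked[:, :, yx_mask, 0] = np.nan
```

Pixels where yx_mask is True become NaN in channel 0, while channel 1 and the unmasked rows are left unchanged.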