Tutorial

Here we provide some basic usage examples for the SIMA Python package. These examples can be run in a standard Python shell, or with IPython. Users new to Python may wish to consult the Python documentation.

For more details on the classes and methods that comprise the SIMA package, please consult the SIMA API.

Importing SIMA

Like all Python packages, the SIMA package must be imported prior to use. Here we show a simple example of importing the SIMA package, which results in printing the docstring containing view basic information about the package.

>>> import sima
SIMA: Python package for sequential image analysis.
Developed by Patrick Kaifosh, Jeffrey Zaremba, Nathan Danielson.
Copyright (C) 2014 The Trustees of Columbia University in the City of New York.
Licensed under the GNU GPL version 2 or later.
Documentation: http://www.losonczylab.org/sima
Version 0.3.2

In all future examples, we assume that the SIMA package has been imported as shown above.

Submodules of the SIMA package also need to be imported before use. For example, the motion correction module can be imported as follows.

>>> import sima.motion

Individual classes or functions can be imported from submodules. For example, we can import the iterable object for use with multi-page TIFF files with the following command:

>>> from sima.iterables import MultiPageTIFF

For more details on importing, consult the Python documentation.

Example data

The SIMA package comes with a small amount of example data that, although insufficient to allow for reasonable results from the motion correction and segmentation algorithms, can at least be used to run the functions. For the purposes of this tutorial, we copy the example data into the current working directory.

>>> from shutil import copy, copytree
>>> import sima.misc
>>> copytree(sima.misc.example_data(), 'example.sima')
>>> copy(sima.misc.example_tiff(), 'example.tif')
>>> copy(sima.misc.example_tiff(), 'example_Ch1.tif')
>>> copy(sima.misc.example_tiff(), 'example_Ch2.tif')
>>> copy(sima.misc.example_hdf5(), 'example.h5')
>>> copy(sima.misc.example_imagej_rois(), 'imageJ_ROIs.zip')

Creating an ImagingDataset object

The SIMA package is centers around the ImagingDataset object class. A single ImagingDataset object can contain imaging data from multiple simultaneously recorded optical channels, as well as from multiple cycles (i.e. continuous imaging epochs/trials) acquired at the same imaging location during the same imaging session. Accordingly, the raw imaging data used to initialize the ImagingDataset object must be packaged into a list of lists, whose first index runs over the cycles and whose second index runs over the channels. The subsections below provide examples of how to initialize ImagingDataset objects using raw data in a variety of formats, including Numpy arrays, TIFF files, and HDF5 files.

The ImagingDataset object is permanently stored in the location (ending with extension .sima) specified during initialization. Results of segmentation, signal extraction, and other alterations to the ImagingDataset object are automatically saved to this location.

Numpy arrays

To begin with, we create some Numpy arrays containing random data.

>>> import numpy as np
>>> cycle1_channel1 = np.random.rand(100, 128, 128)
>>> cycle1_channel2 = np.random.rand(100, 128, 128)
>>> cycle2_channel1 = np.random.rand(100, 128, 128)
>>> cycle2_channel2 = np.random.rand(100, 128, 128)

Once we have the Numpy arrays containing the imaging data, we create the ImagingDataset object as follows.

>>> iterables = [
...     [cycle1_channel1, cycle1_channel2],
...     [cycle2_channel1, cycle2_channel2]
... ]
>>> dataset = sima.ImagingDataset(
...     iterables, 'example_np.sima', channel_names=['green', 'red'])

Multipage TIFF files

For simplicity, we consider the case of only a single cycle and channel.

>>> from sima.iterables import MultiPageTIFF
>>> iterables = [[MultiPageTIFF('example_Ch1.tif')]]
>>> dataset = sima.ImagingDataset(iterables, 'example_TIFF.sima')

HDF5 files

The argument ‘yxt’ specifies that the first index of the HDF5 array corresponds to the row, the second to the column, and the third to the time.

>>> from sima.iterables import HDF5
>>> iterables = [[HDF5('example.h5', 'yxt')]]
>>> dataset = sima.ImagingDataset(iterables, 'example_HDF5.sima')

Loading ImagingDataset objects

A dataset object can also be loaded from a saved path with the .sima extension.

>>> dataset = sima.ImagingDataset.load('example.sima')

Motion correction

In the following example, the SIMA package is used for motion correction based on a hidden Markov model (HMM). The sima.motion.hmm() function takes the same arguments as are used to initialize an imaging dataset object, as well as some additional optional arguments. In the example below, an optional argument is used to indicate that the maximum possible displacement is 20 rows and 30 columns.

>>> import sima.motion
>>> from sima.iterables import MultiPageTIFF
>>> iterables = [[MultiPageTIFF('example_Ch1.tif')]]
>>> dataset = sima.motion.hmm(iterables, 'example_mc.sima',
...                           max_displacement=[20,30], verbose=False)

When the signal is of interest is very sparse or highly dynamic, it is sometimes helpful to use a second static channel to estimate the displacements for motion correction. The example below is for the case where the first channel contains a dynamic GCaMP signal whose large variations would confuse the motion correction alogorithm, and the second channel contains a static tdTomato signal that provides a stable reference.

>>> import sima.motion
>>> from sima.iterables import MultiPageTIFF
>>> iterables = [[MultiPageTIFF('example_Ch1.tif'),
...               MultiPageTIFF('example_Ch2.tif')]]
>>> dataset = sima.motion.hmm(
...     iterables, 'example_mc2.sima', max_displacement=[20,30],
...     channel_names=['GCaMP', 'tdTomato'],
...     correction_channels=['tdTomato'], verbose=False)

When motion correction is invoked as above, only the tdTomato channel is used for estimating the displacements, which are then applied to both channels.

Segmentation and ROIs

Automated segmentation

An ImagingDataset object can be automatically segmented with a call to its segment() method. The arguments of the segment() method specify the segmentation approach to be used, an optional label for the resulting set of ROIs, and additional arguments specific to the particular segmentation approach. In the example below, an imaging dataset object is segmented with the 'ca1pc' method designed for segmenting CA1 pyramidal cells.

>>> dataset = sima.ImagingDataset.load('example.sima')
>>> rois = dataset.segment('ca1pc', 'auto_ROIs', num_pcs=5)
>>> dataset.ROIs.keys()  # view the labels of the existing ROILists
['auto_ROIs']

Editing, creating, and registering ROIs with ROI Buddy

Note that the an ImagingDataset object can be loaded with the ROI Buddy graphical user interface (GUI) for manual editing of existing the ROI lists, creation of new ROI lists, or registration of ROI lists across multiple experiments in which the same field of view is imaged. For more details, consult the ROI Buddy documentation.

Importing ROIs from ImageJ

ROIS can also be imported from ImageJ, as shown in the following example.

>>> from sima.ROI import ROIList
>>> dataset = sima.ImagingDataset.load('example.sima')
>>> rois = ROIList.load('imageJ_ROIs.zip', fmt='ImageJ')
>>> dataset.add_ROIs(rois, 'from_ImageJ')
>>> dataset.ROIs.keys()  # view the labels of the existing ROILists
['from_ImageJ', 'auto_ROIs']

Mapping ROIs between datasets

Sometimes, for example when imaging the same field of view over multiple days, one wishes to segment the same structures in separate ImagingDataset objects. If all of the ImagingDataset objects have been segmented, then the results of the segmentations can be registered with the ROI Buddy GUI as mentioned previously. If, however, only one of the datasets has been segmented, the results of the segmentation can be applied to the other datasets by applying to each ROI the affine transformation necessary to map one imaged field of view onto the other. This can be done either with the ROI Buddy GUI or with a call to the import_transformed_ROIs() method, whose arguments allow for specification of the channels used to align the two datasets, the label of the :obj:ROIList to be transformed from one dataset to the other, the label that will be applied to the new ROIList, and whether to copy the properties of the ROIs as well as their shapes.

>>> source_dataset = sima.ImagingDataset.load('example.sima')
>>> target_dataset = sima.ImagingDataset.load('example_mc2.sima')
>>> target_dataset.ROIs.keys()
[]
>>> target_dataset.import_transformed_ROIs(
...     source_dataset, source_channel='green', target_channel='GCaMP',
...     source_label='from_ImageJ', target_label='transformed',
...     copy_properties='True')
>>> target_dataset.ROIs.keys()
['transformed']

This approach allows the user to focus on careful manual curation of the segmentation for a single ImagingDataset, with the results of this segmentation then applied to all datasets acquired at the same field of view.

Accessing stored ROIs

Whenever ROIs are created or imported, they are permanently stored as part of the ImagingDataset object. The ROIs can be recovered at any time using the label specified at the time when the ROIs were created.

>>> dataset = sima.ImagingDataset.load('example.sima')
>>> dataset.ROIs.keys()  # view the labels of the existing ROILists
['from_ImageJ', 'auto_ROIs']
>>> rois = dataset.ROIs['auto_ROIs']

Extraction

Once the ROIs have been edited and registered, the dataset can be loaded, and then dynamic fluorescence signals can be extracted from the ROIs with the extract() method.

>>> dataset = sima.ImagingDataset.load('example.sima')
>>> dataset.ROIs.keys()
['from_ImageJ', 'auto_ROIs']
>>> rois = dataset.ROIs['from_ImageJ']
>>> dataset.channel_names
['red', 'green']
>>> signals = dataset.extract(
...     rois, signal_channel='green', label='green_signal')

The extracted signals are permanently saved with the ImagingDataset object and can be accessed at any time with the command signals() method.

>>> dataset = sima.ImagingDataset.load('example.sima')
>>> signals = dataset.signals(channel='green')['green_signal']

Exporting data

Data can be exported from the SIMA ImagingDataset objects at various stages of the analysis. This allows SIMA to be used for early stages of data analysis, and then for the exported data to be analyzed with separate software. If, however, further analysis is to be performed with Python, such exporting may not be necessary. The subsections below contain examples showing how to export image data and signal data.

Image data

The ImagingDataset class has two methods for exporting image data, export_frames() and export_averages(), which export either all the frames or the time averages of each channel, respectively. These methods can be used to view the results of motion correction, as shown in the following example.

>>> import sima.motion
>>> from sima.iterables import MultiPageTIFF
>>> iterables = [[MultiPageTIFF('example_Ch1.tif')]]
>>> dataset = sima.motion.hmm(iterables, 'example_mc3.sima',
...                           max_displacement=[20,30], verbose=False)
>>> dataset.export_averages(['exported_frames.tif'], fmt='TIFF16')
>>> dataset.export_frames([['exported_frames.tif']], fmt='TIFF16')

The paths to which the exported data are saved are organized as a list with one filename per channel for the export_averages() method, or as a list of lists (organized analogously to the iterables used to initialize an ImagingDataset object) for the export_frames() method. If however, the export format is specified to HDF5, then the filenames for export_frames() should be organized into a list with one filename per cycle, since both channels are combined into a single HDF5 file.

>>> dataset.export_frames('exported_frames.h5', fmt='HDF5')

Signal data

For users wishing to analyze the extracted signals with an external program, these signals can be exported to a CSV file.

>>> dataset = sima.ImagingDataset.load('example.sima')
>>> dataset.export_signals('example_signals.csv', channel='green')

The resulting CSV file contains the id, label, and tags for each ROI, and the extracted signal from each ROI at each frame time.

Complete example

Below are the contents of workflow.py in the examples directory provided with the SIMA source code.

#! /usr/bin/env python
"""
This file provides a demonstration of how to use the SIMA package.

In order to run this file, you will need to download the file
http://www.losonczylab.org/workflow_data.zip and extract it in your
current working directory.

"""

##############################################################################
#                                                                            #
#   PART 0: Import SIMA and necessary submodules.                            #
#                                                                            #
##############################################################################

import sima
import sima.motion
from sima.iterables import MultiPageTIFF

##############################################################################
#                                                                            #
#   PART 1: Preparing the iterables.                                         #
#                                                                            #
##############################################################################

# Generate the filenames with Python list comprehensions.
tiff_filenames = [
    ['workflow_data/Cycle{n1:02d}_Ch{n2}.tif'.format(n1=cycle, n2=channel)
     for channel in range(1, 3)] for cycle in range(1, 16)
]

# The resulting filenames are printed for clarification.
print "TIFF filenames:\n", tiff_filenames

# We set the clip to remove the first 20 columns from each image because
# of a problem with the imaging system that produced the images.
clip = ((0, 0), (20, 0))

# Finally, we construct a MultiPageTIFF iterable using each of the filenames.
iterables = [
    [MultiPageTIFF(chan, clip) for chan in cycle] for cycle in tiff_filenames
]

##############################################################################
#                                                                            #
#   PART 2: Running motion correction to create the dataset, and exporting   #
#           the corrected image data.                                        #
#                                                                            #
##############################################################################

dataset_path = 'workflow_data/dataset.sima'
dataset = sima.motion.hmm(
    iterables, dataset_path, ['tdTomato', 'GCaMP'], num_states_retained=30,
    max_displacement=[20, 30], trim_criterion=0.95
)

# Export the time averages for a manuscript figure.
dataset.export_averages(['workflow_data/tdTomato.tif',
                         'workflow_data/GCaMP.tif'])

# Generate the output filenames with Python list comprehensions.
output_filenames = [
    [channel.replace('.tif', '_corrected.tif') for channel in cycle]
    for cycle in tiff_filenames
]

# The resulting filenames are printed for clarification.
print "Output filenames:\n", output_filenames

# Export the corrected frames for a presentation.
dataset.export_frames(output_filenames, fill_gaps=True)

# At this point, one may wish to inspect the exported image data to evaluate
# the quality of the motion correction before continuing.
while True:
    input_ = raw_input("Continue? (y/n): ")
    if input_ == 'n':
        exit()
    elif input_ == 'y':
        break

##############################################################################
#                                                                            #
#   PART 3: Running automated segmentation and editing results with the ROI  #
#           Buddy GUI.                                                       #
#                                                                            #
##############################################################################

# Segment the field of view into ROIs using the method for CA1 pyramidal cells
# and parameters that were determined based on the imaging magnification.
dataset.segment(
    'ca1pc',
    channel='GCaMP',
    num_pcs=30,
    max_dist=(3, 6),
    spatial_decay=(3, 6),
    cut_max_pen=0.10,
    cut_min_size=50,
    cut_max_size=150,
    x_diameter=14,
    y_diameter=7,
    circularity_threhold=.5,
    min_roi_size=20,
    min_cut_size=40
)

# At this point, one may wish to edit the automatically segmented ROIs using
# the ROI Buddy GUI before performing signal extraction.
while True:
    input_ = raw_input("Continue? (y/n): ")
    if input_ == 'n':
        exit()
    elif input_ == 'y':
        break

##############################################################################
#                                                                            #
#   PART 4: Extracting fluorescence signals from the ROIs.                   #
#                                                                            #
##############################################################################

# Reload the dataset in case any changes have been made with ROI Buddy
dataset = sima.ImagingDataset.load(dataset_path)

# Extract the signals. By default, the most recently created ROIs are used.
dataset.extract(signal_channel='GCaMP', label='GCaMP_signals')

# Export the extracted signals to a CSV file.
dataset.export_signals('example_signals.csv', channel='GCaMP',
                       signals_label='GCaMP_signals')

##############################################################################
#                                                                            #
#   PART 5: Visualizing data using Python.                                   #
#                                                                            #
##############################################################################

# import necessary functions from matplotlib
from matplotlib.pyplot import plot, show

# plot the signal from an ROI object, with a different color for each cycle
raw_signals = dataset.signals('GCaMP')['GCaMP_signals']['raw']
for cycle in range(3):  # plot data from the first 3 cycles
    plot(raw_signals[cycle][3])  # plot the data from ROI #3
show(block=True)