SIMA Example workflow

The following notebook will walk you through a basic SIMA workflow.

It will demonstarate how to: 1. Import raw data in to SIMA. 2. Correct for motion artifacts using SIMA's HMM algorithm. 3. Perform automatic segmentation/source extraction/ROI detection. 4. Extract signals from segmented ROIs. 5. Plot data!

In [1]:
# iPython magic function to setup plotting
%matplotlib inline
import matplotlib.pyplot as plt
In [2]:
# Download workflow example data if needed
from StringIO import StringIO
from zipfile import ZipFile
from urllib import urlopen
from os.path import isdir

data_url = 'http://www.losonczylab.org/workflow_data.zip'

if not isdir('workflow_data'):
    url = urlopen(data_url)
    zipfile = ZipFile(StringIO(url.read()))
    zipfile.extractall()

Part 0: Import SIMA and necessary submodules

In [3]:
import sima
import sima.segment

sima.__version__
Out[3]:
'1.3.2'

Part 1: Importing data in to SIMA and preparing sequences

In [4]:
# Generate the filenames with Python list comprehensions.

tiff_filenames = [
    ['workflow_data/Cycle{n1:02d}_Ch{n2}.tif'.format(n1=cycle, n2=channel)
     for channel in range(1, 3)] for cycle in range(1, 16)
]

tiff_filenames
Out[4]:
[['workflow_data/Cycle01_Ch1.tif', 'workflow_data/Cycle01_Ch2.tif'],
 ['workflow_data/Cycle02_Ch1.tif', 'workflow_data/Cycle02_Ch2.tif'],
 ['workflow_data/Cycle03_Ch1.tif', 'workflow_data/Cycle03_Ch2.tif'],
 ['workflow_data/Cycle04_Ch1.tif', 'workflow_data/Cycle04_Ch2.tif'],
 ['workflow_data/Cycle05_Ch1.tif', 'workflow_data/Cycle05_Ch2.tif'],
 ['workflow_data/Cycle06_Ch1.tif', 'workflow_data/Cycle06_Ch2.tif'],
 ['workflow_data/Cycle07_Ch1.tif', 'workflow_data/Cycle07_Ch2.tif'],
 ['workflow_data/Cycle08_Ch1.tif', 'workflow_data/Cycle08_Ch2.tif'],
 ['workflow_data/Cycle09_Ch1.tif', 'workflow_data/Cycle09_Ch2.tif'],
 ['workflow_data/Cycle10_Ch1.tif', 'workflow_data/Cycle10_Ch2.tif'],
 ['workflow_data/Cycle11_Ch1.tif', 'workflow_data/Cycle11_Ch2.tif'],
 ['workflow_data/Cycle12_Ch1.tif', 'workflow_data/Cycle12_Ch2.tif'],
 ['workflow_data/Cycle13_Ch1.tif', 'workflow_data/Cycle13_Ch2.tif'],
 ['workflow_data/Cycle14_Ch1.tif', 'workflow_data/Cycle14_Ch2.tif'],
 ['workflow_data/Cycle15_Ch1.tif', 'workflow_data/Cycle15_Ch2.tif']]
In [5]:
# Finally, we construct a TIFF sequence using each of the TIFF stacks.

all_sequences = [
    [sima.Sequence.create('TIFF', chan) for chan in cycle]
    for cycle in tiff_filenames]

# Combine channels that were saved as separte TIFF stacks
sequences = [
    sima.Sequence.join(*channel_sequences) for channel_sequences in all_sequences]

sequences
Out[5]:
[<sima.sequence._Joined_Sequence at 0x7f6097082650>,
 <sima.sequence._Joined_Sequence at 0x7f609683bd10>,
 <sima.sequence._Joined_Sequence at 0x7f60981c7390>,
 <sima.sequence._Joined_Sequence at 0x7f6096818790>,
 <sima.sequence._Joined_Sequence at 0x7f6096818850>,
 <sima.sequence._Joined_Sequence at 0x7f6096818810>,
 <sima.sequence._Joined_Sequence at 0x7f60965c5610>,
 <sima.sequence._Joined_Sequence at 0x7f60965c5810>,
 <sima.sequence._Joined_Sequence at 0x7f60965c5850>,
 <sima.sequence._Joined_Sequence at 0x7f60965c5890>,
 <sima.sequence._Joined_Sequence at 0x7f60965c58d0>,
 <sima.sequence._Joined_Sequence at 0x7f60965c5910>,
 <sima.sequence._Joined_Sequence at 0x7f60965c5950>,
 <sima.sequence._Joined_Sequence at 0x7f60965c5990>,
 <sima.sequence._Joined_Sequence at 0x7f60965c59d0>]

Part 2: Running motion correction to create the dataset, and exporting the corrected image data.

In [6]:
dataset_path = 'workflow_data/dataset.sima'

# Remove an old dataset if there is one
from os.path import isdir
if isdir(dataset_path):
    from shutil import rmtree
    rmtree(dataset_path)

correction_approach = sima.motion.HiddenMarkov2D(num_states_retained=30,
                                                 max_displacement=[20, 30],
                                                 verbose=True)
dataset = correction_approach.correct(
    sequences, dataset_path, channel_names=['tdTomato', 'GCaMP'],
    trim_criterion=0.95)
Estimating model parameters.
Estimating displacements for cycle  0
Estimating displacements for cycle  1
Estimating displacements for cycle  2
Estimating displacements for cycle  3
Estimating displacements for cycle  4
Estimating displacements for cycle  5
Estimating displacements for cycle  6
Estimating displacements for cycle  7
Estimating displacements for cycle  8
Estimating displacements for cycle  9
Estimating displacements for cycle  10
Estimating displacements for cycle  11
Estimating displacements for cycle  12
Estimating displacements for cycle  13
Estimating displacements for cycle  14

In [7]:
# Check the output
dataset
Out[7]:
<ImagingDataset: num_sequences=15, frame_shape=(1, 125, 252, 2), num_frames=1800>
In [8]:
# Export the time averages for a manuscript figure.
dataset.export_averages(['workflow_data/tdTomato.tif',
                         'workflow_data/GCaMP.tif'])
In [9]:
# Display the time averages of each channel
_, axs = plt.subplots(1, 2)
axs[0].imshow(dataset.time_averages[0, :, :, 0], cmap='gray', interpolation='none', aspect=2)
axs[1].imshow(dataset.time_averages[0, :, :, 1], cmap='gray', interpolation='none', aspect=2)

axs[0].set_title('tdTomato')
axs[0].tick_params(labelleft=False, labelbottom=False)

axs[1].set_title('GCaMP')
axs[1].tick_params(labelleft=False, labelbottom=False)
In [10]:
# Generate the output filenames with Python list comprehensions.
# NOTE: Exporting the motion corrected TIFF stacks is not necessary
# if you plan to extract signals with SIMA.
output_filenames = [
    [[channel.replace('.tif', '_corrected.tif') for channel in cycle]]
    for cycle in tiff_filenames
]

output_filenames
Out[10]:
[[['workflow_data/Cycle01_Ch1_corrected.tif',
   'workflow_data/Cycle01_Ch2_corrected.tif']],
 [['workflow_data/Cycle02_Ch1_corrected.tif',
   'workflow_data/Cycle02_Ch2_corrected.tif']],
 [['workflow_data/Cycle03_Ch1_corrected.tif',
   'workflow_data/Cycle03_Ch2_corrected.tif']],
 [['workflow_data/Cycle04_Ch1_corrected.tif',
   'workflow_data/Cycle04_Ch2_corrected.tif']],
 [['workflow_data/Cycle05_Ch1_corrected.tif',
   'workflow_data/Cycle05_Ch2_corrected.tif']],
 [['workflow_data/Cycle06_Ch1_corrected.tif',
   'workflow_data/Cycle06_Ch2_corrected.tif']],
 [['workflow_data/Cycle07_Ch1_corrected.tif',
   'workflow_data/Cycle07_Ch2_corrected.tif']],
 [['workflow_data/Cycle08_Ch1_corrected.tif',
   'workflow_data/Cycle08_Ch2_corrected.tif']],
 [['workflow_data/Cycle09_Ch1_corrected.tif',
   'workflow_data/Cycle09_Ch2_corrected.tif']],
 [['workflow_data/Cycle10_Ch1_corrected.tif',
   'workflow_data/Cycle10_Ch2_corrected.tif']],
 [['workflow_data/Cycle11_Ch1_corrected.tif',
   'workflow_data/Cycle11_Ch2_corrected.tif']],
 [['workflow_data/Cycle12_Ch1_corrected.tif',
   'workflow_data/Cycle12_Ch2_corrected.tif']],
 [['workflow_data/Cycle13_Ch1_corrected.tif',
   'workflow_data/Cycle13_Ch2_corrected.tif']],
 [['workflow_data/Cycle14_Ch1_corrected.tif',
   'workflow_data/Cycle14_Ch2_corrected.tif']],
 [['workflow_data/Cycle15_Ch1_corrected.tif',
   'workflow_data/Cycle15_Ch2_corrected.tif']]]
In [11]:
# Export the corrected frames for a presentation.
dataset.export_frames(output_filenames, fill_gaps=True)

Part 3: Running automated segmentation/source detection.

In [12]:
# Segment the field of view into ROIs using the method for CA1 pyramidal cells
# and parameters that were determined based on the imaging magnification.

segmentation_approach = sima.segment.PlaneCA1PC(
    channel='GCaMP',
    num_pcs=30,
    max_dist=(3, 6),
    spatial_decay=(3, 6),
    cut_max_pen=0.10,
    cut_min_size=50,
    cut_max_size=150,
    x_diameter=14,
    y_diameter=7,
    circularity_threhold=.5,
    min_roi_size=20,
    min_cut_size=40
)
In [13]:
# ROIs will both be returned and also saved under the label given (in this case 'auto_ROIs')

auto_rois = dataset.segment(segmentation_approach, label='auto_ROIs')
In [14]:
# Check the results

plt.imshow(dataset.time_averages[0, :, :, 1], cmap='gray', interpolation='none', aspect=2)
xlim, ylim = plt.xlim(), plt.ylim()

for roi in dataset.ROIs['auto_ROIs']:
    for poly in roi.coords:
        plt.plot(poly[:, 0], poly[:, 1])

plt.xlim(xlim)
plt.ylim(ylim)
plt.tick_params(labelleft=False, labelbottom=False)

Part 4: Extracting fluorescence signals from the ROIs

At this point you can also check/edit ROIs with the ROIBuddy GUI (https://github.com/losonczylab/roibuddy)

In [15]:
# Reload the dataset in case any changes have been made.
dataset = sima.ImagingDataset.load(dataset_path)
In [16]:
# Extract the signals. By default, the most recently created ROIs are used.
# Signals will also be saved under the given label ('GCaMP_signals' in this case)
extracted_signals = dataset.extract(signal_channel='GCaMP', label='GCaMP_signals')
In [17]:
# Check the ouput
"""Each extraction produces a dictionary with the following keys:
    mean_frame -- the time averaged image of the extracted channel (zyx)
    overlap -- coordinates (of flattened image) of pixels that are contained in multiple ROIs
    raw -- raw extracted signal
    rois -- a copy of the ROI objects used for the extraction
    signal_channel -- index of channel that was extracted
    timestamp -- date and time of the extraction
    
"""

signals = dataset.signals('GCaMP')['GCaMP_signals']
signals.keys()
Out[17]:
['mean_frame', 'rois', 'timestamp', 'overlap', 'raw', 'signal_channel']
In [18]:
# List of arrays, one array per cylcle/trial...
len(signals['raw'])
Out[18]:
15
In [19]:
# Each array is number of ROIS by number of frames
signals['raw'][0].shape
Out[19]:
(231, 120)
In [20]:
# Export the extracted signals to a CSV file.
# As with the corrected TIFF stacks, this step is not necessary if you are planning to
# continue the analysis in Python
dataset.export_signals('example_signals.csv', channel='GCaMP',
                       signals_label='GCaMP_signals')
In [21]:
# Perform a simple dF/F caluclation and plot the signal from an ROI,
# with a different color for each cycle
from builtins import range
from numpy import nanmean

def dff(trace):
    return (trace - nanmean(trace)) / nanmean(trace)

raw_signals = dataset.signals('GCaMP')['GCaMP_signals']['raw']
for sequence in range(3):  # plot data from the first 3 cycles
    plt.plot(dff(raw_signals[sequence][34])) # plot the data from the 35th ROI

plt.xlabel('Frame')
plt.ylabel(r'$\Delta$F/F')
plt.title('Signal from ROI #35 for first 3 trials');
In [21]: