Tutorial about example datasets

import sys
from pathlib import Path
import shutil
import logging

import requests

import locan as lc
lc.show_versions(system=False, dependencies=False, verbose=False)
Locan:
   version: 0.20.0.dev41+g755b969

Python:
   version: 3.11.6
logging.basicConfig(stream=sys.stdout, level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger()

Load SMLM data from ShareLoc.XYZ

SMLM data can be found, for example, on ShareLoc.XYZ, an open platform for sharing single-molecule localization microscopy (SMLM) data.

Copy the download link of a specific SMLM file:

url = "https://zenodo.org/records/7182242/files/UniWue_Tubulin_AF647_3/data.smlm"
url
'https://zenodo.org/records/7182242/files/UniWue_Tubulin_AF647_3/data.smlm'
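# Stream the response so that the whole file is not held in memory at once.
response = requests.get(url, stream=True)
print("Response is ok: ", response.status_code == requests.codes.ok)
Response is ok:  True
file_path = Path.home() / Path(url).name

# Write the download to disk in 1 MiB chunks.
with open(file_path, 'wb') as file:
    for chunk in response.iter_content(chunk_size=1024 * 1024):
        file.write(chunk)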
        
file_path
PosixPath('/home/docs/data.smlm')
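
The download step above can also be wrapped in a small reusable function. The following is a minimal sketch that uses only the standard library (urllib.request and shutil) instead of requests; the name download_file and its parameters are hypothetical and not part of locan.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, target: Path, chunk_size: int = 1024 * 1024) -> Path:
    """Stream the resource at `url` to `target` in chunks (hypothetical helper)."""
    # Create the destination directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # urlopen returns a file-like response; copyfileobj copies it chunk by chunk,
    # so the file is never held in memory as a whole.
    with urllib.request.urlopen(url) as response, open(target, "wb") as file:
        shutil.copyfileobj(response, file, length=chunk_size)
    return target
```

With this helper, a call such as download_file(url, Path.home() / Path(url).name) would replace the request and the write loop. urlopen raises an exception for HTTP error responses, so no separate status check is needed in this sketch.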

Load data and visualize

locdata = lc.load_SMLM_file(file_path)
Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.

Print information about the data:

print('Data head:')
print(locdata.data.head(), '\n')
print('Summary:')
locdata.print_summary()
print('Properties:')
print(locdata.properties)
Data head:
   original_index    position_x  local_background  chi_square     intensity  \
0               1   1653.339966        733.021973   6224440.0  13324.799805   
1               2   5672.879883        798.614990   2066740.0  10348.000000   
2               3   7117.830078        909.901978   1787550.0   6864.439941   
3               4   3707.459961        804.187012   2542450.0   9038.549805   
4               5  14038.200195        815.343994   1842030.0   8774.750000   

   frame    position_y  
0      0   8879.339844  
1      0   9851.900391  
2      0  11888.099609  
3      0  15224.799805  
4      0   5597.879883   

Summary:
identifier: "1"
comment: ""
source: EXPERIMENT
state: RAW
element_count: 4582761
frame_count: 45000
file {
  type: SMLM
  path: "/home/docs/data.smlm"
}
creation_time {
  2024-03-14T11:08:25.215283Z
}

Properties:
{'localization_count': 4582761, 'position_x': 15788.771019381189, 'uncertainty_x': 4.646714958816508, 'position_y': 16751.12161338221, 'uncertainty_y': 4.13308495156872, 'intensity': 10823655000.0, 'local_background': 380.9028, 'frame': 0, 'region_measure_bb': 1078728300.0, 'localization_density_bb': 0.004248299516230371, 'subregion_measure_bb': 131376.0}
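
As a quick plausibility check, the density in the properties can be recomputed by hand. The sketch below assumes that localization_density_bb is the localization count divided by the bounding-box measure region_measure_bb; the printed values are consistent with that reading. The numbers are copied from the output above.

```python
# Values copied from the printed properties of the ShareLoc dataset.
properties = {
    "localization_count": 4582761,
    "region_measure_bb": 1078728300.0,
    "localization_density_bb": 0.004248299516230371,
}

# Assumption: density = number of localizations / area of the bounding box.
density = properties["localization_count"] / properties["region_measure_bb"]

print(f"recomputed density: {density:.6f}")
print(f"reported density:   {properties['localization_density_bb']:.6f}")
```

Small differences in the trailing digits are expected because the printed region measure is rounded.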
lc.render_2d(locdata, bin_size=100, rescale=lc.Trafo.EQUALIZE_0P3);
lc.render_2d(locdata, bin_size=10, rescale=lc.Trafo.EQUALIZE_0P3,
             bin_range=((15_000, 20_200), (10_000, 15_000)));
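
The two render calls differ in bin_size and bin_range. Assuming that render_2d builds a 2D histogram with square bins of edge length bin_size (in coordinate units) over bin_range, the size of the resulting image can be estimated as sketched below. The helper bins_per_axis is hypothetical and not part of locan.

```python
def bins_per_axis(bin_range, bin_size):
    """Number of bins along each axis for a given range and bin size."""
    return tuple(round((high - low) / bin_size) for low, high in bin_range)


# Zoomed view from the second render call above.
zoom = bins_per_axis(bin_range=((15_000, 20_200), (10_000, 15_000)), bin_size=10)
print(zoom)  # (520, 500)
```

A smaller bin_size on a restricted bin_range therefore gives a finer image of a subregion without producing an excessively large pixel array.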

Load SMLM data from LocanDatasets

Selected example datasets are provided in a separate repository called LocanDatasets.

These datasets can be loaded with ready-to-use utility functions.

Set up a datasets directory

lc.DATASETS_DIR = Path.home() / 'LocanDatasets'
lc.DATASETS_DIR.mkdir(exist_ok=True)
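
The download cells in the following sections fetch each file every time they are run. A minimal sketch of a helper that downloads a dataset only when it is missing from the datasets directory could look like this; ensure_dataset and its download argument are hypothetical names and not part of locan.

```python
from pathlib import Path
from typing import Callable


def ensure_dataset(
    file_name: str,
    datasets_dir: Path,
    download: Callable[[Path], None],
) -> Path:
    """Return the local path of a dataset, downloading it only if it is missing."""
    datasets_dir.mkdir(parents=True, exist_ok=True)
    target = datasets_dir / file_name
    if not target.exists():
        # The caller decides how to fetch the file, e.g. with requests.
        download(target)
    return target
```

The download argument is any function that writes the file to the given path, so the request and write loop used in this tutorial can be passed in unchanged.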

Load dSTORM data of nuclear pore complexes

This is a rather large 2D dataset with more than 2 million localizations.

url = "https://raw.github.com/super-resolution/LocanDatasets/main/smlm_data/npc_gp210.asdf"
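# Stream the response so that the whole file is not held in memory at once.
response = requests.get(url, stream=True)
print("Response is ok: ", response.status_code == requests.codes.ok)
Response is ok:  True
file_path = lc.DATASETS_DIR / 'npc_gp210.asdf'

# Write the download to disk in 1 MiB chunks.
with open(file_path, 'wb') as file:
    for chunk in response.iter_content(chunk_size=1024 * 1024):
        file.write(chunk)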
        
file_path
PosixPath('/home/docs/LocanDatasets/npc_gp210.asdf')
dat = lc.load_npc()

Print information about the data:

print('Data head:')
print(dat.data.head(), '\n')
print('Summary:')
dat.print_summary()
print('Properties:')
print(dat.properties)
Data head:
     position_x    position_y  frame     intensity  two_kernel_improvement  \
0   5768.129883  20242.400391      0  83745.398438                     0.0   
1  21402.800781  18154.599609      0  67648.296875                     0.0   
2  11410.700195   3155.639893      0  73358.398438                     0.0   
3  15570.599609  15854.599609      0  65827.898438                     0.0   
4  22235.500000  12840.900391      0  56347.398438                     0.0   

   chi_square  local_background  
0   4355630.0       1511.459961  
1   8383720.0       1637.000000  
2   2420450.0       1480.380005  
3   2539930.0       1540.670044  
4  16152800.0       1572.040039   

Summary:
identifier: "13"
comment: ""
source: EXPERIMENT
state: RAW
element_count: 2285189
frame_count: 24999
file {
  type: ASDF
  path: "/home/docs/LocanDatasets/npc_gp210.asdf"
}
creation_time {
  2024-03-14T11:08:30.030663Z
}

Properties:
{'localization_count': 2285189, 'position_x': 14570.820174404367, 'uncertainty_x': 5.195636170641695, 'position_y': 12028.219687728053, 'uncertainty_y': 4.523940296807998, 'intensity': 30375694000.0, 'local_background': 1124.8801, 'frame': 0, 'region_measure_bb': 655392800.0, 'localization_density_bb': 0.0034867473545268047, 'subregion_measure_bb': 103168.0}
lc.render_2d(dat, bin_size=100, rescale=lc.Trafo.EQUALIZE_0P3);
lc.render_2d(dat, bin_size=10, rescale=lc.Trafo.EQUALIZE_0P3,
             bin_range=((0, 5000), (0, 5000)));

Load dSTORM data of microtubules

This is a rather large 2D dataset with about 1.5 million localizations.

url = "https://raw.github.com/super-resolution/LocanDatasets/main/smlm_data/tubulin_cos7.asdf"
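# Stream the response so that the whole file is not held in memory at once.
response = requests.get(url, stream=True)
print("Response is ok: ", response.status_code == requests.codes.ok)
Response is ok:  True
file_path = lc.DATASETS_DIR / 'tubulin_cos7.asdf'

# Write the download to disk in 1 MiB chunks.
with open(file_path, 'wb') as file:
    for chunk in response.iter_content(chunk_size=1024 * 1024):
        file.write(chunk)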
        
file_path
PosixPath('/home/docs/LocanDatasets/tubulin_cos7.asdf')
dat = lc.load_tubulin()

Print information about the data:

print('Data head:')
print(dat.data.head(), '\n')
print('Summary:')
dat.print_summary()
print('Properties:')
print(dat.properties)
Data head:
     position_x    position_y  frame     intensity  chi_square  \
0   9937.400391  16751.300781      0  40501.601562   3744920.0   
1   9998.709961  12022.799805      0  36280.300781  14295400.0   
2   9566.769531   8078.229980      0  29984.000000  12302200.0   
3  15492.500000  10120.400391      0  38488.300781   3219820.0   
4   6381.459961  16057.700195      0  37093.300781   1620450.0   

   local_background  
0        709.413025  
1        800.455017  
2        890.807007  
3        495.067993  
4        476.035004   

Summary:
identifier: "7"
comment: ""
source: EXPERIMENT
state: RAW
element_count: 1506568
frame_count: 74969
file {
  type: ASDF
  path: "/home/docs/LocanDatasets/tubulin_cos7.asdf"
}
creation_time {
  2024-03-14T11:08:36.248853Z
}

Properties:
{'localization_count': 1506568, 'position_x': 8833.605738466662, 'uncertainty_x': 4.053253266247036, 'position_y': 10446.826432598386, 'uncertainty_y': 3.278383909327508, 'intensity': 19965508000.0, 'local_background': 284.22076, 'frame': 0, 'region_measure_bb': 289612320.0, 'localization_density_bb': 0.005202016267816231, 'subregion_measure_bb': 68072.0}
lc.render_2d(dat, bin_size=100, rescale=lc.Trafo.EQUALIZE_0P3);
lc.render_2d(dat, bin_size=10, rescale=lc.Trafo.EQUALIZE_0P3,
             bin_range=((0, 4000), (6000, 10_000)));