Tutorial about loading localization data from file#
from pathlib import Path
import locan as lc
lc.show_versions(system=False, dependencies=False, verbose=False)
Locan:
version: 0.20.0.dev41+g755b969
Python:
version: 3.11.6
Localization data is typically provided as a text or binary file whose format depends on the fitting software. Locan provides functions for loading various localization file formats.
All available functions can be looked up in the API documentation.
Locan provides functions for the file types listed in the enum FileType:
list(lc.FileType._member_names_)
['UNKNOWN_FILE_TYPE',
'CUSTOM',
'RAPIDSTORM',
'ELYRA',
'THUNDERSTORM',
'ASDF',
'NANOIMAGER',
'RAPIDSTORMTRACK',
'SMLM',
'DECODE',
'SMAP']
Currently the following io functions are available:
[name for name in dir(lc.locan_io) if not name.startswith("__")]
['Files',
'annotations',
'convert_property_names',
'convert_property_types',
'files',
'find_file_upstream',
'load_Elyra_file',
'load_Elyra_header',
'load_Nanoimager_file',
'load_Nanoimager_header',
'load_SMAP_file',
'load_SMAP_header',
'load_SMLM_file',
'load_SMLM_header',
'load_SMLM_manifest',
'load_asdf_file',
'load_decode_file',
'load_decode_header',
'load_locdata',
'load_rapidSTORM_file',
'load_rapidSTORM_header',
'load_rapidSTORM_track_file',
'load_rapidSTORM_track_header',
'load_thunderstorm_file',
'load_thunderstorm_header',
'load_txt_file',
'locdata',
'manifest_file_info_from_locdata',
'manifest_format_from_locdata',
'manifest_from_locdata',
'save_SMAP_csv',
'save_SMLM',
'save_asdf',
'save_thunderstorm_csv',
'utilities']
Throughout this manual it might be helpful to use pathlib to provide path information. In all cases a string path is also usable.
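The relation between a pathlib object and a string path can be sketched with the standard library alone. The relative path below only mirrors the test data location used in this tutorial; it is not resolved against any installation.

```python
from pathlib import Path

# Relative path built with the "/" operator (illustration only, not resolved).
data_dir = Path("tests") / "test_data"
path_obj = data_dir / "rapidSTORM_dstorm_data.txt"

# The same location expressed as a plain string.
path_str = str(path_obj)

# Converting the string back yields an equal Path object.
print(Path(path_str) == path_obj)

# Path objects give convenient access to parts of the path.
print(path_obj.name)
print(path_obj.suffix)
```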
Load rapidSTORM data file#
Here we identify some data in the test_data directory and provide a path using pathlib (lc.ROOT_DIR is a pathlib object):
path = lc.ROOT_DIR / 'tests/test_data/rapidSTORM_dstorm_data.txt'
print(path, '\n')
/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/rapidSTORM_dstorm_data.txt
The data is then loaded from a rapidSTORM localization file. The file header is read to provide correct property names. The number of localizations to be read can be limited with nrows.
dat = lc.load_rapidSTORM_file(path=path, nrows=10)
Print information about the data:
print('Data head:')
print(dat.data.head(), '\n')
print('Summary:')
dat.print_summary()
print('Properties:')
print(dat.properties)
Data head:
position_x position_y frame intensity chi_square local_background
0 9657.40 24533.5 0 33290.10 1192250.0 767.732971
1 16754.90 18770.0 0 21275.40 2106810.0 875.460999
2 14457.60 18582.6 0 20748.70 526031.0 703.369995
3 6820.58 16662.8 0 8531.77 3179190.0 852.789001
4 19183.20 22907.2 0 14139.60 448631.0 662.770020
Summary:
identifier: "1"
comment: ""
source: EXPERIMENT
state: RAW
element_count: 10
frame_count: 1
file {
type: RAPIDSTORM
path: "/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/rapidSTORM_dstorm_data.txt"
}
creation_time {
2024-03-14T11:08:51.023763Z
}
Properties:
{'localization_count': 10, 'position_x': 16878.898, 'uncertainty_x': 2252.823069869743, 'position_y': 18209.502, 'uncertainty_y': 1569.5233876858572, 'intensity': 147591.47, 'local_background': 707.37335, 'frame': 0, 'region_measure_bb': 378484578.7175999, 'localization_density_bb': 2.6421155741358055e-08, 'subregion_measure_bb': 80399.79999999999}
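As a quick check on how to read the printed properties: the values above are consistent with localization_density_bb being localization_count divided by region_measure_bb. The sketch below only redoes that arithmetic with the printed numbers; the interpretation is an observation from this output, not a statement about locan's implementation.

```python
import math

# Values copied from the printed properties above.
localization_count = 10
region_measure_bb = 378484578.7175999
localization_density_bb = 2.6421155741358055e-08

# Density recomputed as count per bounding-box measure.
density = localization_count / region_measure_bb

# Compare with a relative tolerance rather than exact float equality.
print(math.isclose(density, localization_density_bb, rel_tol=1e-9))
```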
Column names are exchanged with standard locan property names according to the following mapping. If no mapping is defined a warning is issued and the original column name is kept.
lc.RAPIDSTORM_KEYS
{'Position-0-0': 'position_x',
'Position-1-0': 'position_y',
'Position-2-0': 'position_z',
'ImageNumber-0-0': 'frame',
'Amplitude-0-0': 'intensity',
'FitResidues-0-0': 'chi_square',
'LocalBackground-0-0': 'local_background',
'TwoKernelImprovement-0-0': 'two_kernel_improvement',
'Position-0-0-uncertainty': 'uncertainty_x',
'Position-1-0-uncertainty': 'uncertainty_y',
'Position-2-0-uncertainty': 'uncertainty_z'}
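The renaming behaviour described above (translate known names, warn about and keep unknown ones) can be sketched in plain Python. The helper map_column_names and the column name MyColumn-0-0 are hypothetical and only illustrate the idea; this is not locan's implementation.

```python
import warnings

# Subset of the rapidSTORM mapping shown above.
RAPIDSTORM_KEYS_SUBSET = {
    "Position-0-0": "position_x",
    "Position-1-0": "position_y",
    "ImageNumber-0-0": "frame",
    "Amplitude-0-0": "intensity",
}


def map_column_names(names, mapping):
    """Hypothetical helper: translate column names, keep unmapped ones."""
    converted = []
    for name in names:
        if name in mapping:
            converted.append(mapping[name])
        else:
            # No mapping defined: warn and keep the original name.
            warnings.warn(f"Column {name} has no standard property name.")
            converted.append(name)
    return converted


# Record warnings so that they can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_names = map_column_names(
        ["Position-0-0", "Position-1-0", "MyColumn-0-0"],
        RAPIDSTORM_KEYS_SUBSET,
    )

print(new_names)
print(len(caught))
```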
Load Zeiss Elyra data file#
The Elyra super-resolution microscopy system from Zeiss uses a slightly different file format. Elyra column names are exchanged with locan property names upon loading the data.
path_Elyra = lc.ROOT_DIR / 'tests/test_data/Elyra_dstorm_data.txt'
print(path_Elyra, '\n')
/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/Elyra_dstorm_data.txt
dat_Elyra = lc.load_Elyra_file(path=path_Elyra, nrows=10)
print('Data head:')
print(dat_Elyra.data.head(), '\n')
print('Summary:')
dat_Elyra.print_summary()
print('Properties:')
print(dat_Elyra.properties)
Data head:
original_index frame frames_number frames_missing position_x \
0 1 1 1 0 15850.6
1 2 1 1 0 25617.3
2 3 1 1 0 20155.8
3 4 1 1 0 10776.9
4 5 1 1 0 28966.9
position_y uncertainty intensity local_background_sigma chi_square \
0 23502.1 8.6 472.0 5.33 0.28
1 24310.2 9.5 529.0 4.38 0.31
2 24039.1 13.0 306.0 3.06 0.23
3 10047.4 13.4 369.0 3.98 0.25
4 8731.6 18.1 428.0 14.73 0.41
psf_half_width channel slice_z
0 110.000000 1 1.0
1 129.800003 1 1.0
2 131.100006 1 1.0
3 143.000000 1 1.0
4 150.100006 1 1.0
Summary:
identifier: "2"
comment: ""
source: EXPERIMENT
state: RAW
element_count: 10
frame_count: 1
file {
type: ELYRA
path: "/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/Elyra_dstorm_data.txt"
}
creation_time {
2024-03-14T11:08:51.060046Z
}
Properties:
{'localization_count': 10, 'position_x': 19610.811087722697, 'uncertainty_x': 2109.405021031108, 'position_y': 18319.131814537763, 'uncertainty_y': 2608.8543142720196, 'intensity': 3145.0, 'frame': 1, 'region_measure_bb': 351167887.24, 'localization_density_bb': 2.847640790447807e-08, 'subregion_measure_bb': 75072.8}
Localization data from a custom text file#
Other custom text files can be read with a function that wraps the pandas.read_table() method.
path_csv = lc.ROOT_DIR / 'tests/test_data/five_blobs.txt'
print(path_csv, '\n')
/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/five_blobs.txt
Here data is loaded from a comma-separated values file. Column names are read from the first line, and a warning is given if the naming does not comply with locan conventions. Column names can also be provided through columns. The separator, e.g. a tab '\t', can be provided through sep.
dat_csv = lc.load_txt_file(path=path_csv, sep=',', columns=None, nrows=10)
print('Data head:')
print(dat_csv.data.head(), '\n')
print('Summary:')
dat_csv.print_summary()
print('Properties:')
print(dat_csv.properties)
Data head:
index position_x position_y cluster_label frame
0 0 624.0 919.0 3 0
1 1 611.0 873.0 3 0
2 2 388.0 1015.0 0 0
3 3 209.0 465.0 2 0
4 4 1001.0 851.0 4 0
Summary:
identifier: "3"
comment: ""
source: EXPERIMENT
state: RAW
element_count: 10
frame_count: 1
file {
type: CUSTOM
path: "/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/five_blobs.txt"
}
creation_time {
2024-03-14T11:08:51.086592Z
}
Properties:
{'localization_count': 10, 'position_x': 517.5, 'uncertainty_x': 87.40648971583543, 'position_y': 815.3, 'uncertainty_y': 65.24586832385123, 'frame': 0, 'region_measure_bb': 488950.0, 'localization_density_bb': 2.0451988955925965e-05, 'subregion_measure_bb': 2878.0}
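The effect of a separator and a row limit can be tried directly in pandas on an in-memory text. This is a minimal sketch using pandas.read_csv, a close relative of pandas.read_table. The tab-separated content is made up for illustration, and the correspondence to the sep and nrows arguments of load_txt_file is an assumption.

```python
import io

import pandas as pd

# Made-up tab-separated content standing in for a localization file.
text = (
    "position_x\tposition_y\tframe\n"
    "624.0\t919.0\t0\n"
    "611.0\t873.0\t0\n"
    "388.0\t1015.0\t0\n"
)

# sep selects the delimiter; nrows limits the number of data rows read.
df = pd.read_csv(io.StringIO(text), sep="\t", nrows=2)

print(df.columns.tolist())
print(len(df))
```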
Load localization data file#
A general function for loading localization data is provided. Specific localization file formats are targeted through the file_type parameter.
path = lc.ROOT_DIR / 'tests/test_data/rapidSTORM_dstorm_data.txt'
print(path, '\n')
/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/rapidSTORM_dstorm_data.txt
dat = lc.load_locdata(path=path, file_type=lc.FileType.RAPIDSTORM, nrows=10)
dat.print_summary()
identifier: "4"
comment: ""
source: EXPERIMENT
state: RAW
element_count: 10
frame_count: 1
file {
type: RAPIDSTORM
path: "/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/rapidSTORM_dstorm_data.txt"
}
creation_time {
2024-03-14T11:08:51.111785Z
}
The file type can be specified with the enum class FileType; tab completion helps in choosing a member.
list(lc.FileType._member_names_)
['UNKNOWN_FILE_TYPE',
'CUSTOM',
'RAPIDSTORM',
'ELYRA',
'THUNDERSTORM',
'ASDF',
'NANOIMAGER',
'RAPIDSTORMTRACK',
'SMLM',
'DECODE',
'SMAP']
lc.FileType.RAPIDSTORM
<FileType.RAPIDSTORM: 2>
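For readers less familiar with Python enums, the lookups below show the usual ways to reach a member. The class is a reduced stand-in that mirrors the first member names listed above; only the value 2 for RAPIDSTORM is confirmed by the output above, the other values are assumed from the listing order.

```python
from enum import Enum


class FileType(Enum):
    """Reduced stand-in for illustration (not locan's class)."""

    UNKNOWN_FILE_TYPE = 0  # value assumed from listing order
    CUSTOM = 1  # value assumed from listing order
    RAPIDSTORM = 2  # value shown in the output above
    ELYRA = 3  # value assumed from listing order


# Three equivalent ways to obtain the same member.
by_attribute = FileType.RAPIDSTORM
by_name = FileType["RAPIDSTORM"]
by_value = FileType(2)

print(by_attribute is by_name is by_value)
print(by_attribute.name, by_attribute.value)
```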
Adjust data types#
By default, all load functions adjust the data types of localization properties to the following standard types:
lc.PROPERTY_KEYS
{'index': 'integer',
'original_index': 'integer',
'position_x': 'float',
'position_y': 'float',
'position_z': 'float',
'frame': 'integer',
'frames_number': 'integer',
'frames_missing': 'integer',
'time': 'float',
'intensity': 'float',
'local_background': 'float',
'local_background_sigma': 'float',
'signal_noise_ratio': 'float',
'signal_background_ratio': 'float',
'chi_square': 'float',
'two_kernel_improvement': 'float',
'psf_amplitude': 'float',
'psf_width': 'float',
'psf_width_x': 'float',
'psf_width_y': 'float',
'psf_width_z': 'float',
'psf_half_width': 'float',
'psf_half_width_x': 'float',
'psf_half_width_y': 'float',
'psf_half_width_z': 'float',
'psf_sigma': 'float',
'psf_sigma_x': 'float',
'psf_sigma_y': 'float',
'psf_sigma_z': 'float',
'uncertainty': 'float',
'uncertainty_x': 'float',
'uncertainty_y': 'float',
'uncertainty_z': 'float',
'channel': 'integer',
'slice_z': 'float',
'plane': 'integer',
'cluster_label': 'integer'}
If this is not what you want, pass convert=False.
path = lc.ROOT_DIR / 'tests/test_data/rapidSTORM_dstorm_data.txt'
print(path, '\n')
/home/docs/checkouts/readthedocs.org/user_builds/locan/envs/latest/lib/python3.11/site-packages/locan/tests/test_data/rapidSTORM_dstorm_data.txt
locdata = lc.load_locdata(path=path, file_type=lc.FileType.RAPIDSTORM, nrows=10, convert=False)
locdata.data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 position_x 10 non-null float64
1 position_y 10 non-null float64
2 frame 10 non-null int64
3 intensity 10 non-null float64
4 chi_square 10 non-null float64
5 local_background 10 non-null float64
dtypes: float64(5), int64(1)
memory usage: 612.0 bytes
Alternatively, adjust the types of selected localization properties only.
other_types = {"frame": float}
df = lc.convert_property_types(locdata.data, types=other_types)
locdata.update(dataframe=df)
locdata.data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 position_x 10 non-null float64
1 position_y 10 non-null float64
2 frame 10 non-null float64
3 intensity 10 non-null float64
4 chi_square 10 non-null float64
5 local_background 10 non-null float64
dtypes: float64(6)
memory usage: 612.0 bytes
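The same kind of selective conversion can be sketched in plain pandas with DataFrame.astype. This is an illustration on a small hand-made table with values taken from the data head shown earlier, not a description of how lc.convert_property_types works internally.

```python
import pandas as pd

# Small hand-made table; values taken from the rapidSTORM data head.
df = pd.DataFrame(
    {
        "position_x": [9657.40, 16754.90],
        "frame": [0, 0],
    }
)

# Map selected column names to the desired types.
other_types = {"frame": float}

# astype returns a new DataFrame; columns not listed keep their dtype.
converted = df.astype(other_types)

print(df.dtypes["frame"])
print(converted.dtypes["frame"])
```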