{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tutorial about loading localization data from file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "import locan as lc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lc.show_versions(system=False, dependencies=False, verbose=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# A path in which test data can be found:\n",
    "TEST_DIR: Path = Path.cwd().parents[2] / \"tests\"\n",
    "TEST_DIR"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Localization data is typically provided as text or binary file with different formats depending on the fitting software. Locan provides functions for loading various localization files. \n",
    "\n",
    "All available functions can be looked up in the [API documentation](https://locan.readthedocs.io/en/latest/source/generated/locan.locan_io.locdata.html#module-locan.locan_io.locdata)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In locan there are functions availabel to deal with file types according to the constant enum `FileType`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "list(lc.FileType._member_names_)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Currently the following io functions are available:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "[name for name in dir(lc.locan_io) if not name.startswith(\"__\")]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Throughout this manual it might be helpful to use pathlib to provide path information. In all cases a string path is also usable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load rapidSTORM data file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we identify some data in the test_data directory and provide a path using pathlib (a pathlib object is returned by `lc.ROOT_DIR`):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "path = TEST_DIR / 'test_data/rapidSTORM_dstorm_data.txt'\n",
    "print(path, '\\n')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data is then loaded from a rapidSTORM localization file. The file header is read to provide correct property names. The number of localisations to be read can be limited by *nrows*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dat = lc.load_rapidSTORM_file(path=path, nrows=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print information about the data: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Data head:')\n",
    "print(dat.data.head(), '\\n')\n",
    "print('Summary:')\n",
    "dat.print_summary()\n",
    "print('Properties:')\n",
    "print(dat.properties)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Column names are exchanged with standard locan property names according to the following mapping. If no mapping is defined a warning is issued and the original column name is kept."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lc.RAPIDSTORM_KEYS"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Zeiss Elyra data file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Elyra super-resolution microscopy system from Zeiss uses as slightly different file format. Elyra column names are exchanged with locan property names upon loading the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "path_Elyra = TEST_DIR / 'test_data/Elyra_dstorm_data.txt'\n",
    "print(path_Elyra, '\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dat_Elyra = lc.load_Elyra_file(path=path_Elyra, nrows=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Data head:')\n",
    "print(dat_Elyra.data.head(), '\\n')\n",
    "print('Summary:')\n",
    "dat_Elyra.print_summary()\n",
    "print('Properties:')\n",
    "print(dat_Elyra.properties)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Localization data from a custom text file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Other custom text files can be read with a function that wraps the pandas.read_table() method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "path_csv = TEST_DIR / 'test_data/five_blobs.txt'\n",
    "print(path_csv, '\\n')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here data is loaded from a comma-separated-value file. Column names are read from the first line and a warning is given if the naming does not comply with locan conventions. Column names can also be provided as *column*. The separater, e.g. a tab '\\t' can be provided as *sep*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dat_csv = lc.load_txt_file(path=path_csv, sep=',', columns=None, nrows=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Data head:')\n",
    "print(dat_csv.data.head(), '\\n')\n",
    "print('Summary:')\n",
    "dat_csv.print_summary()\n",
    "print('Properties:')\n",
    "print(dat_csv.properties)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load localization data file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A general function for loading localization data is provided. Targeting specific localization file formats is done through the `file_format` parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "path = TEST_DIR / 'test_data/rapidSTORM_dstorm_data.txt'\n",
    "print(path, '\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dat = lc.load_locdata(path=path, file_type=lc.FileType.RAPIDSTORM, nrows=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dat.print_summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The file type can be specified by using the enum class `FileType` and use tab control to make a choice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "list(lc.FileType._member_names_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lc.FileType.RAPIDSTORM"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Adjust data types"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data types of localization proparties are adjusted in all load functions by default to the following standdard types: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "lc.PROPERTY_KEYS"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If this is not what you want, add `convert = False`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "path = TEST_DIR / 'test_data/rapidSTORM_dstorm_data.txt'\n",
    "print(path, '\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "locdata = lc.load_locdata(path=path, file_type=lc.FileType.RAPIDSTORM, nrows=10, convert=False)\n",
    "locdata.data.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Maybe adjust types for selected localization properties."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "other_types = {\"frame\": float}\n",
    "df = lc.convert_property_types(locdata.data, types=other_types)\n",
    "locdata.update(dataframe=df)\n",
    "locdata.data.info()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.6"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}