{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial about metadata" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from google.protobuf import json_format, text_format\n", "\n", "import locan as lc" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lc.show_versions(system=False, dependencies=False, verbose=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Metadata definition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have defined a canonical set of metadata to accompany localization data.\n", "\n", "Metadata is described by protobuf messages. Google's protobuf format helps enforce metadata definitions that can be easily attached to various file formats, exchanged with other programs, and implemented in different programming languages.\n", "\n", "Metadata is instantiated through messages defined in the `locan.data.metadata_pb2` module." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(lc.data.metadata_pb2.DESCRIPTOR.message_types_by_name.keys())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each message class holds a logically related set of information and is integrated into the main class `Metadata`.\n", "\n", "`Metadata` contains the following fields:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata = lc.data.metadata_pb2.Metadata()\n", "list(metadata.DESCRIPTOR.fields_by_name.keys())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each field has a predefined type and can be set to appropriate values:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata.comment = \"This is a comment\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Assigning a value of the wrong type raises an exception.\n", "try:\n", "    metadata.comment = 1\n", "except Exception as e:\n", "    print(e)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Metadata values, including default values, can be shown in JSON format or as a dictionary:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "json_format.MessageToDict(metadata)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Include default values (except empty fields with repeated message classes)\n", "json_format.MessageToDict(metadata, always_print_fields_with_no_presence=True, preserving_proto_field_name=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "json_format.MessageToJson(metadata, always_print_fields_with_no_presence=True, preserving_proto_field_name=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To print metadata with timestamp and duration in a 
well-formatted string use:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lc.metadata_to_formatted_string(metadata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set metadata fields " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Repeated fields " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To set selected fields, instantiate the appropriate messages. For repeated (list-like) fields, use `message.add()`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata = lc.data.metadata_pb2.Metadata()\n", "\n", "ou = metadata.experiment.setups.add().optical_units.add()\n", "ou.detection.camera.electrons_per_count = 13.26\n", "\n", "metadata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Timestamp fields " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Timestamp fields represent a point in time (date and time in UTC) and are of type `google.protobuf.Timestamp`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "metadata = lc.data.metadata_pb2.Metadata()\n", "metadata.creation_time.GetCurrentTime()\n", "metadata.creation_time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata.creation_time.FromJsonString('2022-05-14T06:58:00.514893Z')\n", "metadata.creation_time.ToJsonString()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Duration fields represent a time interval and are of type `google.protobuf.Duration`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata.experiment.setups.add().optical_units.add().detection.camera.integration_time.FromMilliseconds(20)\n", "metadata.experiment.setups[0].optical_units[0].detection.camera.integration_time.ToMilliseconds()\n", "# metadata.experiment.setups[0].optical_units[0].detection.camera.integration_time.ToJsonString()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To print metadata with timestamp and duration in a well-formatted string use:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lc.metadata_to_formatted_string(metadata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Metadata scheme" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The overall scheme can be instantiated and visualized:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "metadata = lc.data.metadata_pb2.Metadata()\n", "scheme = lc.message_scheme(metadata)\n", "scheme" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Metadata from TOML file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can provide metadata in a [TOML](https://toml.io) file." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata_toml = \\\n", "\"\"\"\n", "# Define the class (message) instances.\n", "\n", "[[messages]]\n", "name = \"metadata\"\n", "module = \"locan.data.metadata_pb2\"\n", "class_name = \"Metadata\"\n", "\n", "\n", "# Fill metadata attributes\n", "# Headings must be a message name or valid attribute.\n", "# Use [[]] to add repeated elements.\n", "# Use string '2022-05-14T06:58:00Z' for Timestamp elements.\n", "# Use int in nanoseconds for Duration elements.\n", "\n", "[metadata]\n", "identifier = \"123\"\n", "comment = \"my comment\"\n", "ancestor_identifiers = [\"1\", \"2\"]\n", "production_time = '2022-05-14T06:58:00Z'\n", "\n", "[[metadata.experiment.experimenters]]\n", "first_name = \"First name\"\n", "last_name = \"Last name\"\n", "\n", "[[metadata.experiment.experimenters.affiliations]]\n", "institute = \"Institute\"\n", "department = \"Department\"\n", "\n", "[[metadata.experiment.setups]]\n", "identifier = \"1\"\n", "\n", "[[metadata.experiment.setups.optical_units]]\n", "identifier = \"1\"\n", "\n", "[metadata.experiment.setups.optical_units.detection.camera]\n", "identifier = \"1\"\n", "name = \"camera name\"\n", "model = \"camera model\"\n", "electrons_per_count = 3.1\n", "integration_time = 10_000_000\n", "\n", "[metadata.localizer]\n", "software = \"rapidSTORM\"\n", "\n", "[[metadata.relations]]\n", "identifier = \"1\"\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "toml_out = lc.metadata_from_toml_string(metadata_toml)\n", "for k, v in toml_out.items():\n", " print(k, \":\\n\\n\", v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To load from file:" ] }, { "cell_type": "raw", "metadata": {}, "source": [ "with open(path, mode=\"r\", encoding=\"utf-8\") as file:\n", " toml_out = lc.load_metadata_from_toml(file)" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ 
"## Metadata for LocData" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Metadata is instantiated for each LocData object and accessible through the `LocData.meta` attribute." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sample data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.DataFrame(\n", " {\n", " 'position_x': np.arange(0,10),\n", " 'position_y': np.random.random(10),\n", " 'frame': np.arange(0,10),\n", " })\n", "locdata = lc.LocData.from_dataframe(dataframe=df)\n", "\n", "locdata.meta" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fields can also be printed as well formatted string (using `lc.metadata_to_formatted_string`):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "locdata.print_meta()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A summary of the most important metadata is printed as:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "locdata.print_summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Metadata fields can be printed and changed individually:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(locdata.meta.comment)\n", "locdata.meta.comment = 'user comment'\n", "print(locdata.meta.comment)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Metadata can also be added at instantiation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "locdata_2 = lc.LocData.from_dataframe(dataframe=df, meta={'identifier': 'myID_1', \n", " 'comment': 'my own user comment'})\n", "locdata_2.print_summary()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": 
"ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.6" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }