{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial about coordinate-based colocalization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Coordinate-based colocalization of two selections is computed as introduced by Malkusch et al. (1). A local density of locdata A is compared with the local density of locdata B within a varying radius for each localization in selection A. Local densities at various radii are compared by Spearman-rank-correlation and weighted by an exponential factor including the Euclidean distance to the nearest neighbor for each localization. The colocalization coefficient can take a value between -1 and 1 with -1 indicating anti-correlation (i.e. nearby localizations of selection B), 0 no colocalization, 1 strong colocalization.\n", "\n", "The colocalization coefficient is provided as property for each localization within the corresponding dataset. The property key refers to colocalizing locdata A with locdata B." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "\n", "%matplotlib inline\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import matplotlib.colors as colors\n", "\n", "import locan as lc" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lc.show_versions(system=False, dependencies=False, verbose=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Synthetic data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Provide synthetic data made of clustered localizations that are normal distributed around their center positions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rng = np.random.default_rng(seed=1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def set_centers(n_centers_1d=3, feature_range = (0, 1000)):\n", " dist = (feature_range[1] - feature_range[0])/(n_centers_1d + 1)\n", " centers = np.mgrid[feature_range[0] + dist : feature_range[1] : dist, feature_range[0] + dist : feature_range[1] : dist].reshape(2, -1).T\n", " return centers" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n_centers_1d = 3\n", "feature_range = (0, 1000)\n", "centers = set_centers(n_centers_1d, feature_range)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "offspring = rng.normal(loc=0, scale=20, size=(len(centers), 100, 2))\n", "locdata = lc.simulate_cluster(centers=centers, region=[feature_range] * 2, offspring=offspring, clip=True, shuffle=True, seed=rng)\n", "\n", "locdata.print_summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A second dataset is provided by shifting the first dataset by an offset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "locdata_trans = lc.transform_affine(locdata, offset=(20,0))\n", "locdata_trans.print_summary()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(nrows=1, ncols=1)\n", "locdata.data.plot.scatter(x='position_x', y='position_y', ax=ax, color='Blue', label='origin')\n", "locdata_trans.data.plot.scatter(x='position_x', y='position_y', ax=ax, color='Red', label='transform')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CBC computation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cbc = lc.CoordinateBasedColocalization(radius=100, n_steps=10).compute(locdata=locdata, other_locdata=locdata_trans)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cbc.results.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "points = locdata.coordinates\n", "color = cbc.results['colocalization_cbc_2'].values\n", "\n", "fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 5))\n", "cax = axes[0].scatter(x=points[:,0], y=points[:,1], marker='.', c=color, cmap='coolwarm', norm= colors.Normalize(-1., 1.), label='points')\n", "axes[0].set_title('CBC coefficients for original data')\n", "\n", "# axes[1].scatter(x=points[:,0], y=points[:,1], marker='.', color='Blue', label='points')\n", "# axes[1].scatter(x=points_trans[:,0], y=points_trans[:,1], marker='o', color='Red', label='transformed points')\n", "locdata.data.plot.scatter(x='position_x', y='position_y', ax=axes[1], color='Blue', label='origin')\n", "locdata_trans.data.plot.scatter(x='position_x', y='position_y', ax=axes[1], color='Red', label='transform')\n", "axes[1].set_title('Transformed points')\n", "plt.colorbar(cax, ax=axes[0])\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CBC for various shifts" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n_centers_1d = 3 \n", "feature_range = (0, 2000)\n", "centers = set_centers(n_centers_1d, feature_range)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "offspring = rng.normal(loc=0, scale=20, size=(len(centers), 100, 2))\n", "locdata = lc.simulate_cluster(centers=centers, region=[feature_range] * 2, offspring=offspring, clip=True, shuffle=True, seed=rng)\n", "\n", "locdata.print_summary()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "offsets = [0, 50, 100, 200]\n", "locdata_trans_list = [lc.transform_affine(locdata, offset=(offset, 0)) for offset in offsets]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cbc_list = [lc.CoordinateBasedColocalization(radius=100, n_steps=10).compute(locdata=locdata, other_locdata=other_locdata) for other_locdata in locdata_trans_list]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "points = locdata.coordinates\n", "n_rows = 1\n", "n_cols = 4\n", "\n", "fig = plt.figure(figsize=(15, 5))\n", "for n, (cbc, offset) in enumerate(zip(cbc_list, offsets)):\n", " ax = fig.add_subplot(n_rows, n_cols, n+1)\n", " color = cbc.results.iloc[:, 0].values\n", " ax.scatter(x=points[:,0], y=points[:,1], marker='.', c=color, cmap='coolwarm', norm=colors.Normalize(-1., 1.))\n", " ax.set_title(f'offset: {offset}')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CBC on various length scales (for small cluster)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n_centers_1d = 3 \n", "feature_range = (0, 2000)\n", "centers = set_centers(n_centers_1d, feature_range)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "offspring = rng.normal(loc=0, scale=20, size=(len(centers), 100, 2))\n", "locdata = lc.simulate_cluster(centers=centers, region=[feature_range] * 2, offspring=offspring, clip=True, shuffle=True, seed=rng)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "offsets = [0, 50, 100, 200]\n", "locdata_trans_list = [lc.transform_affine(locdata, offset=(offset,0)) for offset in offsets]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "radii = [50, 100, 150, 200, 250, 300, 350, 400]\n", "cbc_list = [lc.CoordinateBasedColocalization(radius=radius, n_steps=10).compute(locdata=locdata, other_locdata=other_locdata) for radius in radii \n", " for other_locdata in locdata_trans_list\n", " ]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "points = locdata.coordinates\n", "params = [(radius, offset) for radius in radii for offset in offsets]\n", "\n", "n_rows = len(radii)\n", "n_cols = len(offsets)\n", "\n", "fig = plt.figure(figsize=(15, 15))\n", "for n, (cbc, (radius, offset)) in enumerate(zip(cbc_list, params)):\n", " ax = fig.add_subplot(n_rows, n_cols, n+1)\n", " color = cbc.results.iloc[:, 0].values\n", " if not all(np.isnan(color)):\n", " ax.scatter(x=points[:,0], y=points[:,1], marker='.', c=color, cmap='coolwarm', norm=colors.Normalize(-1., 1.))\n", " ax.set_title(f'offset: {offset}, radius: {radius}')\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CBC on various length scales (for larger cluster)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n_centers_1d = 3 \n", "feature_range = (0, 10_000)\n", "centers = set_centers(n_centers_1d, feature_range)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "offspring = rng.normal(loc=0, scale=100, size=(len(centers), 100, 2))\n", "locdata = lc.simulate_cluster(centers=centers, region=[feature_range] * 2, offspring=offspring, clip=True, shuffle=True, seed=rng)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "offsets = [0, 50, 100, 200]\n", "locdata_trans_list = [lc.transform_affine(locdata, offset=(offset,0)) for offset in offsets]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "radii = [50, 100, 150, 200, 250, 300, 350, 400]\n", "cbc_list = [lc.CoordinateBasedColocalization(radius=radius, n_steps=10).compute(locdata=locdata, other_locdata=other_locdata) for radius in radii \n", " for other_locdata in locdata_trans_list\n", " ]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "points = locdata.coordinates\n", "params = [(radius, offset) for radius in radii for offset in offsets]\n", "\n", "n_rows = len(radii)\n", "n_cols = len(offsets)\n", "\n", "fig = plt.figure(figsize=(15, 15))\n", "for n, (cbc, (radius, offset)) in enumerate(zip(cbc_list, params)):\n", " ax = fig.add_subplot(n_rows, n_cols, n+1)\n", " color = cbc.results.iloc[:, 0].values\n", " ax.scatter(x=points[:,0], y=points[:,1], marker='.', c=color, cmap='coolwarm', norm=colors.Normalize(-1., 1.))\n", " ax.set_title(f'offset: {offset}, radius: {radius}')\n", "plt.tight_layout()\n", "plt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }