{ "cells": [ { "cell_type": "markdown", "id": "efcfc806-08c7-4ee5-894c-92b6235ebe91", "metadata": {}, "source": [ "## Run cell-based matching\n", "\n", "This takes a set of input metadetect catalogs and matches them using the cell-based `ShearMatch` matcher." ] }, { "cell_type": "markdown", "id": "8b9fb907-ce60-43d5-a06a-8d7d75291ec2", "metadata": {}, "source": [ "#### Standard imports" ] }, { "cell_type": "code", "execution_count": null, "id": "b3fe89f9-cf35-486c-b15b-0bd9566a64d6", "metadata": {}, "outputs": [], "source": [ "import hpmcm\n", "import glob\n", "import os\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "ed7edf2e-e25e-4b28-b5ea-8dd0e806566c", "metadata": {}, "source": [ "#### Set up the configuration" ] }, { "cell_type": "code", "execution_count": null, "id": "f0ceddec-40d2-42d4-98a3-88b88dcdbd81", "metadata": {}, "outputs": [], "source": [ "DATADIR = \"test_data\" # Input data directory\n", "shear_st = \"0p01\" # Applied shear as a string\n", "shear = 0.01 # Decimal version of applied shear\n", "shear_type = \"wmom\" # Which object characterization to use\n", "tract = 10463 # Which tract to study\n", "\n", "SOURCE_TABLEFILES = sorted(glob.glob(os.path.join(DATADIR, f\"shear_{shear_type}_{shear_st}_uncleaned_{tract}_*.pq\")))\n", "SOURCE_TABLEFILES.reverse()\n", "VISIT_IDS = np.arange(len(SOURCE_TABLEFILES))\n", "\n", "PIXEL_R2CUT = 4. 
# Cut at distance**2 = 4 pixels**2\n", "PIXEL_MATCH_SCALE = 1 # Use pixel scale to do matching" ] }, { "cell_type": "markdown", "id": "fcecff53-e644-4e0a-b568-3061cd81cecd", "metadata": {}, "source": [ "#### Make the matcher, reduce the data" ] }, { "cell_type": "code", "execution_count": null, "id": "5662f9fc-40b9-4c75-b1a6-a66dec54ff43", "metadata": {}, "outputs": [], "source": [ "matcher = hpmcm.ShearMatch.createShearMatch(pixelR2Cut=PIXEL_R2CUT, pixelMatchScale=PIXEL_MATCH_SCALE, deshear=-1*shear)" ] }, { "cell_type": "code", "execution_count": null, "id": "5f2b49c5-36cd-47a6-9480-c207fadfe552", "metadata": {}, "outputs": [], "source": [ "matcher.reduceData(SOURCE_TABLEFILES, VISIT_IDS)" ] }, { "cell_type": "markdown", "id": "22d7092b-b4a2-4f43-bd46-5c8c04c708e3", "metadata": {}, "source": [ "#### This should have made 200 x 200 cells" ] }, { "cell_type": "code", "execution_count": null, "id": "e4508b9e-ee8e-4d44-94c5-21dd594b49a1", "metadata": {}, "outputs": [], "source": [ "matcher.n_cell" ] }, { "cell_type": "markdown", "id": "3eb110bf-913a-4292-9c90-d5a8191aa01b", "metadata": {}, "source": [ "#### Run the data\n", "\n", "Note the option to run all the cells. By default we only run a small subset for testing." ] }, { "cell_type": "code", "execution_count": null, "id": "70f180a5-a400-49c7-97c3-f5bac3204da5", "metadata": {}, "outputs": [], "source": [ "do_partial = True\n", "if do_partial:\n", "    x_range = range(50, 70)\n", "    y_range = range(170, 190)\n", "    # x_range = [55]\n", "    # y_range = [170]\n", "    matcher.analysisLoop(x_range, y_range)\n", "else:\n", "    matcher.analysisLoop()" ] }, { "cell_type": "markdown", "id": "46e95402-fbda-423e-83ec-c7d2adb60100", "metadata": {}, "source": [ "#### Show the source counts map for a single cell\n", "\n", "The x and y axes here are in the cell frame.\n", "The color scale shows the number of sources per pixel.\n", "The analysis looks for clusters of adjacent pixels with nonzero counts." 
] }, { "cell_type": "code", "execution_count": null, "id": "791146c8-ec2d-493e-b18a-3907565138d8", "metadata": {}, "outputs": [], "source": [ "cell = matcher.cell_dict[matcher.getCellIdx(50, 170)]\n", "od = cell.analyze(None, 4)\n", "_ = plt.imshow(od['counts_map'], origin='lower')\n", "_ = plt.colorbar(label=\"n sources / pixel\")\n", "_ = plt.xlabel(r\"$x_{\\rm cell}$ [pixels]\")\n", "_ = plt.ylabel(r\"$y_{\\rm cell}$ [pixels]\")" ] }, { "cell_type": "markdown", "id": "8dab7b77-96f0-4fe2-96fb-0284ce8f0b07", "metadata": {}, "source": [ "#### Show a single cluster\n", "\n", "The x and y axes here are in the cluster frame for a single cluster.\n", "The color scale shows the number of sources per pixel.\n", "\n", "The `x` markers are the original source positions. The `o` markers are the desheared positions.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "7bb87ecf-0aa7-4006-8e8b-8eaafcf3a8db", "metadata": {}, "outputs": [], "source": [ "cluster = list(cell.cluster_dict.values())[0]\n", "fig = hpmcm.viz_utils.showCluster(od['image'], cluster, cell)\n", "_ = fig.axes[0].set_xlim(-1, 1)\n", "_ = fig.axes[0].set_ylim(0, 2)\n", "_ = fig.axes[0].set_xlabel(r\"$x_{\\rm cluster}$ [pixels]\")\n", "_ = fig.axes[0].set_ylabel(r\"$y_{\\rm cluster}$ [pixels]\")" ] }, { "cell_type": "markdown", "id": "66548e0b-89a1-4b54-9ade-4bf7401e2079", "metadata": {}, "source": [ "#### Extract the output of the matching\n", "\n", "A few empty cells are left at the end of the notebook for exploring the output data.\n", "\n", "`stats` and `shear_stats` are both tuples of `pandas.DataFrame` objects." ] }, { "cell_type": "code", "execution_count": null, "id": "22dbca58-4278-43c0-b3bf-751f47a093ad", "metadata": {}, "outputs": [], "source": [ "stats = matcher.extractStats()\n", "shear_stats = matcher.extractShearStats()\n", "obj_shear = shear_stats[1]" ] }, { "cell_type": "code", "execution_count": null, "id": "ddabe9fb-427f-4899-b5de-816abacbea45", "metadata": {}, "outputs": [], "source": [ "stats[0]" ] }, { 
"cell_type": "markdown", "id": "cf3ec9b9-f3b5-4aed-acb5-a2465510d32e", "metadata": {}, "source": [ "#### Get the offsets between the cluster centroid and the sources\n", "\n", "This is to check that the deshearing is correctly applied." ] }, { "cell_type": "code", "execution_count": null, "id": "4b9466f5-a16f-401d-b91f-2096ba841655", "metadata": {}, "outputs": [], "source": [ "def get_offsets(matcher):\n", "    n = 0\n", "    dd = {i: dict(dx=[], dy=[], x=[], y=[]) for i in range(5)}\n", "    for cellData in matcher.cell_dict.values():\n", "        n += len(cellData.data[0])\n", "        for obj in cellData.object_dict.values():\n", "            # Keep only objects with exactly one source from each of the 5 catalogs\n", "            if not (obj.n_unique == 5 and obj.n_src == 5):\n", "                continue\n", "            for iCat in range(5):\n", "                mask = obj.catalog_id == iCat\n", "                if mask.sum() == 0:\n", "                    continue\n", "                for dx, dy in zip((obj.x_pix[mask] - obj.x_cent), (obj.y_pix[mask] - obj.y_cent)):\n", "                    dd[iCat][\"dx\"].append(dx)\n", "                    dd[iCat][\"dy\"].append(dy)\n", "                    dd[iCat][\"x\"].append(float(obj.data[mask].iloc[0].x_cell))\n", "                    dd[iCat][\"y\"].append(float(obj.data[mask].iloc[0].y_cell))\n", "\n", "    for i in range(5):\n", "        for key in ('dx', 'dy', 'x', 'y'):\n", "            dd[i][key] = np.array(dd[i][key])\n", "    print(n)\n", "    return dd" ] }, { "cell_type": "code", "execution_count": null, "id": "0fe094c7-efeb-49d0-b08d-0b1823bcedad", "metadata": {}, "outputs": [], "source": [ "dd = get_offsets(matcher)" ] }, { "cell_type": "markdown", "id": "c4c85519-e236-49b9-b068-ebf8aa1571b9", "metadata": {}, "source": [ "#### Plot the residuals; they should be flat" ] }, { "cell_type": "code", "execution_count": null, "id": "dadfaaa4-11ac-4418-b802-5cd0c5320c0a", "metadata": {}, "outputs": [], "source": [ "_ = plt.scatter(dd[4]['x'], dd[4]['dx'])" ] }, { 
"cell_type": "markdown", "id": "552fbfde-e913-48ba-b81e-160e22571e9b", "metadata": {}, "source": [ "#### Look at how the sources lie within the cells" ] }, { "cell_type": "code", "execution_count": null, "id": "c3b637c4-290d-4ef4-a592-68ad8146a4d8", "metadata": {}, "outputs": [], "source": [ "_ = plt.hist(matcher.full_data[0].x_cell_coadd, bins=np.linspace(-100, 100, 201))" ] }, { "cell_type": "code", "execution_count": null, "id": "53ecf906-9d0b-403a-8c8e-240b3e3ff1d8", "metadata": {}, "outputs": [], "source": [ "_ = plt.hist(matcher.full_data[0].y_cell_coadd, bins=np.linspace(-100, 100, 201))" ] }, { "cell_type": "markdown", "id": "92c67824-dc58-42fe-b787-01048fa4a7d0", "metadata": {}, "source": [ "#### Classify the objects by match type\n", "\n", "This looks at the characteristics of the matched objects and categorizes them." ] }, { "cell_type": "code", "execution_count": null, "id": "79c084b7-923e-41e3-813a-411a53ecdcad", "metadata": {}, "outputs": [], "source": [ "obj_lists = hpmcm.classify.classifyObjects(matcher, snr_cut=10.)\n", "hpmcm.classify.printObjectTypes(obj_lists)" ] }, { "cell_type": "markdown", "id": "8fa170d6-b454-4e1e-929b-daf63b773daf", "metadata": {}, "source": [ "#### Measure the matching efficiency for objects above the SNR cut" ] }, { "cell_type": "code", "execution_count": null, "id": "24bdd872-fad4-467a-950e-68e6a0998175", "metadata": {}, "outputs": [], "source": [ "n_good = len(obj_lists['ideal'])\n", "bad_list = ['edge_mixed', 'edge_missing', 'edge_extra', 'orphan', 'missing', 'two_missing', 'many_missing', 'extra', 'caught']\n", "n_bad = np.sum([len(obj_lists[x]) for x in bad_list])\n", "effic = n_good/(n_good+n_bad)\n", "effic_err = np.sqrt(effic*(1-effic)/(n_good+n_bad))\n", "print(f\"Effic: {effic:.5f} +- {effic_err:.5f}\")" ] }, { "cell_type": "markdown", "id": "65c36b4b-901a-4e21-9196-d90f5645a055", "metadata": {}, "source": [ "#### Classify the clusters by match type\n", "\n", "This looks at the characteristics of the matched 
clusters and categorizes them." ] }, { "cell_type": "code", "execution_count": null, "id": "ee22a960-3193-4d5f-bb76-0a6aa193784b", "metadata": {}, "outputs": [], "source": [ "cluster_lists = hpmcm.classify.classifyClusters(matcher, snr_cut=10.)\n", "hpmcm.classify.printClusterTypes(cluster_lists)" ] }, { "cell_type": "markdown", "id": "1552d538-dcc1-49ff-a865-685283a9fccf", "metadata": {}, "source": [ "#### Display a few objects\n", "\n", "The various markers show the sources from different shear catalogs: `ns = .`, `1m = <`, `1p = >`, `2m = ^`, `2p = v`." ] }, { "cell_type": "code", "execution_count": null, "id": "96301ea4-49bc-4109-8f19-726f9ccd3e27", "metadata": {}, "outputs": [], "source": [ "_ = hpmcm.viz_utils.showShearObjs(matcher, cluster_lists['ideal'][5])" ] }, { "cell_type": "code", "execution_count": null, "id": "4cb7f95b-122f-43d4-918c-0de1f5a620ae", "metadata": {}, "outputs": [], "source": [ "_ = hpmcm.viz_utils.showShearObj(matcher, obj_lists['many_missing'][0])" ] }, { "cell_type": "code", "execution_count": null, "id": "9e6b438f-8256-4173-9b0a-70ff89335084", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "1a480019-b559-46eb-bb58-a357c550361b", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "5fcf73b8-8152-45c4-b022-8906883d4755", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" } }, "nbformat": 4, "nbformat_minor": 5 }