Compare metadetect catalog types

This is a simple script to measure the match efficiency between different metadetect shear catalogs.

Standard imports

[1]:
import hpmcm
import glob
import os
import numpy as np

Set up the configuration

[2]:
DATADIR = "test_data"   # Input data directory
shear_st = "0p01"       # Applied shear as a string
shear = 0.01            # Decimal version of applied shear
shear_type = "wmom"     # Which object characterization to use
tract = 10463           # Which tract to study

SOURCE_TABLEFILES = sorted(glob.glob(os.path.join(DATADIR, f"shear_{shear_type}_{shear_st}_uncleaned_{tract}_*.pq")))
SOURCE_TABLEFILES.reverse()
VISIT_IDS = np.arange(len(SOURCE_TABLEFILES))

PIXEL_R2CUT = 4.         # Cut at distance**2 = 4, i.e. a matching radius of 2 pixels
PIXEL_MATCH_SCALE = 1    # Use pixel scale to do matching
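
To illustrate what a squared-distance cut of 4 means, here is a minimal NumPy sketch. It does not use hpmcm and does not claim to reproduce its internal matching; the positions are hypothetical, and treating the cut as a strict inequality is an assumption.

```python
import numpy as np

# Hypothetical pixel positions, not taken from the test data
ref = np.array([100.0, 200.0])
candidates = np.array([
    [100.5, 200.5],   # d**2 = 0.25 + 0.25 = 0.5
    [101.0, 201.5],   # d**2 = 1.0 + 2.25 = 3.25
    [102.0, 200.0],   # d**2 = 4.0, exactly on the cut
    [103.0, 201.0],   # d**2 = 9.0 + 1.0 = 10.0
])
r2_cut = 4.0          # same value as PIXEL_R2CUT

# Comparing squared distances avoids taking a square root
d2 = np.sum((candidates - ref) ** 2, axis=1)
within = d2 < r2_cut  # assumption: strict cut, boundary excluded

print(d2)
print(within)
```

Only the first two candidates lie inside the 2-pixel radius under this assumption.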

Make the matcher, reduce the data

[3]:
matcher = hpmcm.ShearMatch.createShearMatch(pixel_r2_cut=PIXEL_R2CUT, pixel_match_scale=PIXEL_MATCH_SCALE)
matcher.reduceData(SOURCE_TABLEFILES, VISIT_IDS)

Run the data

Note the option to run over all the cells. By default we only run a small subset for testing.

[4]:
do_partial = True
if do_partial:
    x_range = range(50, 70)
    y_range = range(170, 190)
    matcher.analysisLoop(x_range, y_range)
else:
    matcher.analysisLoop()
 50!
......... 60!
......... Done!

Classify the objects by match type

This looks at the characteristics of the matched objects and categorizes them.

[5]:
obj_lists = hpmcm.classify.classifyObjects(matcher)
hpmcm.classify.printObjectTypes(obj_lists)
All Objects:                                    2771
cut 1                                           384
cut 2                                           38
Used:                                           2349
good (n source from n catalogs):                1327
good faint                                      832
faint (< n sources, snr < cut):                 171
mixed (n source from < n catalogs):             0
edge_mixed (mixed near edge of cell):           0
edge_missing (< n sources, near edge of cell):  2
edge_extra (> n sources, near edge of cell):    0
faint (< n sources, snr < cut):                 171
orphan (split off from larger cluster           2
one missing (n-1 sources, not near edge):       1
two missing (n-2 sources, not near edge):       2
many missing (< n-2 sources, not near edge):    11
extra (> n sources, not near edge):             1
[6]:
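# Objects in the 'ideal' category count as good matches
n_good = len(obj_lists['ideal'])
# Categories counted as failed matches
bad_list = ['edge_mixed', 'edge_missing', 'edge_extra', 'orphan', 'missing', 'two_missing', 'many_missing', 'extra', 'caught']
n_bad = np.sum([len(obj_lists[x]) for x in bad_list])
[7]:
# Match efficiency: fraction of good matches among good + bad
n_good/(n_good+n_bad)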
[7]:
np.float64(0.9858841010401189)
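
As a cross-check, the efficiency can be recomputed by hand from the counts printed by `printObjectTypes`. The sketch below assumes that the `'ideal'` key corresponds to the "good (n source from n catalogs)" row and that the `'caught'` category, which does not appear in the printed table, is empty. The dictionary labels are copied from the printed table and are not necessarily the keys of `obj_lists`.

```python
# Counts copied from the printed classification table
n_good = 1327  # "good (n source from n catalogs)"

# Rows assumed to correspond to the entries of bad_list
bad_counts = {
    "edge_mixed": 0,
    "edge_missing": 2,
    "edge_extra": 0,
    "orphan": 2,
    "one missing": 1,
    "two missing": 2,
    "many missing": 11,
    "extra": 1,
}

n_bad = sum(bad_counts.values())
efficiency = n_good / (n_good + n_bad)
print(n_bad, efficiency)
```

Under these assumptions the result agrees with the value reported above. The "good faint", "faint", and "mixed" rows enter neither the numerator nor the denominator.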