hpmcm.match module
- class hpmcm.match.Match(**kwargs)
Bases:
object
Class to do N-way matching
Uses a provided WCS to define a Skymap that covers the full region being matched.
Uses that WCS to assign pixel locations to all sources in the input catalogs.
Iterates over cells and does source clustering in each cell using Footprint detection on a Skymap of source counts per pixel.
Assigns each input source to a cluster.
At that stage the clusters are not the final product, as they can include more than one source from a given catalog.
Loops over clusters and processes each cluster to resolve confusion.
If a cluster does not contain a unique source per catalog, redo the clustering with half-size pixels to try to split the cluster (down to the minimum pixel scale).
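The counts-map and half-size-pixel steps described above can be illustrated with a small sketch. This is not the hpmcm implementation: it uses NumPy only, swaps Footprint detection for a simple 8-connected flood fill, and the function names `counts_map` and `label_clusters` are hypothetical.

```python
import numpy as np


def counts_map(x_pix, y_pix, n_side):
    """Count sources per pixel on an n_side x n_side grid (hypothetical helper)."""
    counts = np.zeros((n_side, n_side), dtype=int)
    ix = np.floor(x_pix).astype(int)
    iy = np.floor(y_pix).astype(int)
    # Ignore sources that fall outside the grid
    inside = (ix >= 0) & (ix < n_side) & (iy >= 0) & (iy < n_side)
    # np.add.at accumulates correctly when several sources share a pixel
    np.add.at(counts, (iy[inside], ix[inside]), 1)
    return counts


def label_clusters(counts):
    """Label 8-connected groups of non-empty pixels.

    A stand-in for Footprint detection, used here only for illustration.
    """
    labels = np.zeros(counts.shape, dtype=int)
    n_found = 0
    for seed in zip(*np.nonzero(counts)):
        if labels[seed]:
            continue  # pixel already belongs to a cluster
        n_found += 1
        labels[seed] = n_found
        stack = [seed]
        while stack:
            row, col = stack.pop()
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    r, c = row + d_row, col + d_col
                    if (0 <= r < counts.shape[0] and 0 <= c < counts.shape[1]
                            and counts[r, c] > 0 and labels[r, c] == 0):
                        labels[r, c] = n_found
                        stack.append((r, c))
    return labels, n_found


# Two sources in neighbouring coarse pixels merge into a single cluster
x = np.array([2.1, 3.9])
y = np.array([3.5, 3.5])
coarse = counts_map(x, y, n_side=8)
_, n_coarse = label_clusters(coarse)

# Halving the pixel size doubles the pixel coordinates and the grid size;
# an empty pixel now separates the two sources, so the cluster splits
fine = counts_map(2 * x, 2 * y, n_side=16)
_, n_fine = label_clusters(fine)
```

In this toy case `n_coarse` is 1 and `n_fine` is 2. Two sources that fall in the same coarse pixel can never be separated by a single halving under 8-connectivity, which is one reason repeated sub-division may be needed.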
- Parameters:
kwargs (Any)
- pix_size
Pixel size in arcseconds
- Type:
float
- n_pix_side
Number of pixels in the match region
- Type:
int
- cell_size
Number of pixels in a Cell
- Type:
int
- cell_buffer
Number of overlapping pixels in a Cell
- Type:
int
- cell_max_object
Max number of objects in a cell, used to make unique IDs
- Type:
int
- max_sub_division
Maximum number of cell sub-divisions
- Type:
int
- pixel_r2_cut
Distance cut for Object membership, in pixels**2
- Type:
float
- n_cell
Number of cells in match region
- Type:
np.ndarray
- full_data
Full input DataFrames
- Type:
list[DataFrame]
- red_data
Reduced DataFrames with only the columns needed for matching
- Type:
list[DataFrame]
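Because pixel_r2_cut is expressed in pixels**2, a membership test can compare squared distances directly and avoid a square root. The sketch below only illustrates that idea; the function name `within_r2_cut` and the use of a single reference position per object are assumptions, not part of the documented API.

```python
def within_r2_cut(x_pix, y_pix, x_ref, y_ref, pixel_r2_cut):
    """Return one bool per source: True if its squared distance from the
    reference position (in pixels**2) is within the cut.

    Hypothetical helper; how hpmcm defines the reference position is not
    stated in this page.
    """
    return [
        (x - x_ref) ** 2 + (y - y_ref) ** 2 <= pixel_r2_cut
        for x, y in zip(x_pix, y_pix)
    ]


# Three sources tested against a reference at (10, 10) with a cut of 4 pixels**2
members = within_r2_cut(
    x_pix=[10.5, 13.0, 10.0],
    y_pix=[10.5, 10.0, 12.0],
    x_ref=10.0,
    y_ref=10.0,
    pixel_r2_cut=4.0,
)
```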
Notes
This expects a list of parquet files containing pandas DataFrames. The expected columns depend on which sub-class of Match is being used.
Four output tables are produced:
Key
Class
_cluster_assoc
_cluster_stats
_object_assoc
_object_stats
- analysisLoop(x_range=None, y_range=None)
Does matching for all cells.
This stores the results, but does not write or return them.
- Return type:
None
- Parameters:
x_range (Iterable | None) – Range of cells to analyze in X. None -> Entire range.
y_range (Iterable | None) – Range of cells to analyze in Y. None -> Entire range.
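The "None -> Entire range" convention for x_range and y_range can be sketched as follows. This is an illustration under assumptions, not the library's code: `iter_cells` is a hypothetical helper, and `n_cell` is taken to be a two-element (nx, ny) sequence.

```python
from itertools import product


def iter_cells(n_cell, x_range=None, y_range=None):
    """Yield (ix, iy) cell indices; None selects every cell along that axis.

    Hypothetical helper; n_cell is assumed to hold (n_cells_x, n_cells_y).
    """
    xs = range(n_cell[0]) if x_range is None else x_range
    ys = range(n_cell[1]) if y_range is None else y_range
    yield from product(xs, ys)


# Entire 3 x 2 grid of cells
all_cells = list(iter_cells((3, 2)))

# Only the cells with ix in {1, 2} and iy == 0
sub_cells = list(iter_cells((3, 2), x_range=range(1, 3), y_range=[0]))
```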
- analyzeCell(ix, iy, full_data=False)
Analyze a single cell
- Return type:
dict|None
- Parameters:
ix (int) – Cell index in x-coord
iy (int) – Cell index in y-coord
full_data (bool)
- Returns:
Output of cell analysis
Notes
cell_data : CellData : The analysis data for the Cell
image : np.ndarray : Image of cell source counts map
countsMap : np.ndarray : Numpy array with cell source counts
clusters : FootprintSet : Clusters as detected by finding FootprintSet on source counts map
clusterKey : np.ndarray : Map of cell with pixels filled with index of associated Footprints
If full_data is False, only cell_data will be returned
- extraCols: list[str] = []
- extractStats()
Extracts cluster statistics
- Return type:
list[DataFrame]
- Returns:
DataFrames with matching info
- getCellIdx(ix, iy)
Get the Index to use for a given cell
- Return type:
int
- Parameters:
ix (int)
iy (int)
- getCluster(i_k)
Get a particular cluster
- Return type:
- Parameters:
i_k (tuple[int, int]) – CellId, ClusterId
- Returns:
Requested cluster
- getIdOffset(ix, iy)
Get the ID offset to use for a given cell
- Return type:
int
- Parameters:
ix (int)
iy (int)
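One common way to turn a per-cell offset into unique IDs is to reserve a block of cell_max_object IDs for each cell. The sketch below shows that scheme under stated assumptions: the row-major cell index, the `n_cell_y` argument, and all three function names are hypothetical, and the actual formulas used by getCellIdx and getIdOffset are not given in this page.

```python
def cell_index(ix, iy, n_cell_y):
    """Flatten (ix, iy) into a single integer (assumed layout, hypothetical)."""
    return ix * n_cell_y + iy


def id_offset(ix, iy, n_cell_y, cell_max_object):
    """First ID of the block reserved for cell (ix, iy) (hypothetical)."""
    return cell_index(ix, iy, n_cell_y) * cell_max_object


def global_id(local_id, ix, iy, n_cell_y, cell_max_object):
    """Combine a within-cell ID with the cell's offset (hypothetical)."""
    if not 0 <= local_id < cell_max_object:
        # A larger local ID would spill into the next cell's block
        raise ValueError("local_id must be in [0, cell_max_object)")
    return id_offset(ix, iy, n_cell_y, cell_max_object) + local_id


# Offsets for a 3 x 4 grid of cells with room for 100000 objects per cell
offsets = {
    (ix, iy): id_offset(ix, iy, n_cell_y=4, cell_max_object=100000)
    for ix in range(3)
    for iy in range(4)
}
example_id = global_id(42, ix=1, iy=2, n_cell_y=4, cell_max_object=100000)
```

With this layout, IDs from different cells cannot collide as long as no cell holds more than cell_max_object objects.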
- getObject(i_k)
Get a particular object
- Return type:
- Parameters:
i_k (tuple[int, int]) – CellId, ObjectId
- Returns:
Requested object
- inputTableClass
alias of SourceTable
- pixToWorld(x_pix, y_pix)
Convert local coords in pixels to world coordinates (RA, DEC)
- Return type:
tuple[ndarray, ndarray]
- Parameters:
x_pix (ndarray)
y_pix (ndarray)
- reduceData(input_files, catalog_id)
Read input files and filter out only the columns we need
Each input file should have an associated catalog_id. This is used to test whether we have more than one source per input catalog.
If the input files have pre-defined IDs associated with them, those can be used. Otherwise it is fine to give a range from 0 to nInputs.
- Return type:
None
- Parameters:
input_files (list[str])
catalog_id (list[int])
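The role of catalog_id can be illustrated with a short sketch: build default IDs as a range over the inputs, then check whether any catalog contributes more than one source to a cluster. Both helper names and the file names are hypothetical, and this is not how hpmcm itself performs the check.

```python
from collections import Counter


def default_catalog_ids(input_files):
    """Assign IDs 0..nInputs-1 when the inputs carry no pre-defined ID."""
    return list(range(len(input_files)))


def catalogs_with_duplicates(cluster_catalog_ids):
    """Return the catalog IDs that appear more than once in one cluster."""
    counts = Counter(cluster_catalog_ids)
    return sorted(cid for cid, n in counts.items() if n > 1)


# Placeholder file names, used only to show the ID assignment
input_files = ["cat_a.parquet", "cat_b.parquet", "cat_c.parquet"]
catalog_id = default_catalog_ids(input_files)

# One source per catalog: nothing to resolve
clean = catalogs_with_duplicates([0, 1, 2])

# Catalog 1 contributes two sources, so this cluster is confused
confused = catalogs_with_duplicates([0, 1, 1, 2])
```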