astromodule.table.selfmatch

selfmatch(table: Table | DataFrame | str | Path | BufferedIOBase | RawIOBase | TextIOBase, radius: float | Quantity, action: Literal['identify', 'keep0', 'keep1', 'wide2', 'wideN'] = 'keep1', ra: str | None = None, dec: str | None = None, fmt: Literal['fits', 'csv', 'parquet'] = 'parquet') → DataFrame | None

Performs a selfmatch on a table (a crossmatch of the table against itself) using STILTS [2] as the backend (the same backend used by TOPCAT [1]). This is useful for removing duplicates, detecting groups, etc. This function spawns a subprocess that invokes the tmatch1 [3] tool of the STILTS executable.
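As a rough illustration of the subprocess call described above, the sketch below assembles a STILTS tmatch1 command line for a sky self-match. The helper name build_tmatch1_command, the file names, and the assumption that a stilts launcher is on the PATH are hypothetical; the arguments this function actually passes to STILTS are not shown here and may differ.

```python
import shlex


def build_tmatch1_command(in_path, out_path, ra, dec, radius_arcsec,
                          action="keep1", stilts="stilts"):
    """Assemble an argument list for a STILTS tmatch1 sky self-match.

    Hypothetical helper for illustration; not part of astromodule.
    """
    return [
        stilts,                      # assumed launcher name, expected on PATH
        "tmatch1",                   # STILTS internal (single-table) match task
        "matcher=sky",               # match on sky coordinates
        f"values={ra} {dec}",        # RA and Dec column names, in degrees
        f"params={radius_arcsec}",   # maximum separation in arcsec
        f"action={action}",          # form of the output table
        f"in={in_path}",
        f"out={out_path}",
    ]


cmd = build_tmatch1_command("input.fits", "output.fits", "RA", "DEC", 1.0)
# Print a shell-quoted version; the list could be handed to subprocess.run(cmd)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems with the values argument, which contains a space.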

Parameters:
table : TableLike | PathOrFile

The table that will be crossmatched. This parameter accepts a table-like object (pandas dataframe, astropy table), a path to a file represented as a str or pathlib.Path object, or a file object (BinaryIO, StringIO, file-descriptor, etc).

radius : float | u.Quantity

The maximum error radius of the crossmatch. This function accepts either a float, which is interpreted as a value in arcseconds, or an astropy.units.Quantity.
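To make the role of the radius concrete, here is a minimal sketch of the test a sky match applies to a pair of rows: compute the great-circle separation and compare it with the radius in arcseconds. The function name angular_separation_arcsec and the sample coordinates are illustrative assumptions; the actual matching is done inside STILTS, not with this code.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation, in arcsec, of two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


radius = 1.0  # a plain float is read as arcsec
# Two positions offset by 0.5 arcsec in declination
sep = angular_separation_arcsec(150.0, 2.0, 150.0, 2.0 + 0.5 / 3600.0)
is_match = sep <= radius
print(is_match)
```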

action : Literal['identify', 'keep0', 'keep1', 'wide2', 'wideN'], optional

Determines the form of the table which will be output as a result of the internal match.

  • identify: The output table is the same as the input table except that it contains two additional columns, GroupID and GroupSize, following the input columns. Each group of rows which matched is assigned a unique integer, recorded in the GroupID column, and the size of each group is recorded in the GroupSize column. Rows which don’t match any others (singles) have null values in both these columns.

  • keep0: The result is a new table containing only “single” rows, that is ones which don’t match any other rows in the table. Any other rows are thrown out.

  • keep1: The result is a new table in which only one row (the first in the input table order) from each group of matching ones is retained. A subsequent intra-table match with the same criteria would therefore show no matches.

  • wideN: The result is a new “wide” table consisting of matched rows in the input table stacked next to each other. Only groups of exactly N rows in the input table are used to form the output table; each row of the output table consists of the columns of the first group member, followed by the columns of the second group member and so on. The output table therefore has N times as many columns as the input table. The column names in the new table have _1, _2, … appended to them to avoid duplication.
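The effect of the identify, keep0 and keep1 actions can be sketched on row indices alone. The toy code below assumes that matched pairs are linked transitively into groups (a friends-of-friends grouping, which is an assumption about the backend) and uses hypothetical helper names; it mimics the semantics described above and is not how STILTS is implemented. The wide actions are left out for brevity.

```python
from collections import defaultdict


def group_rows(pairs, n_rows):
    """Link matched row pairs into groups with a small union-find."""
    parent = list(range(n_rows))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smallest row index as the group root
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = defaultdict(list)
    for i in range(n_rows):
        groups[find(i)].append(i)
    # Only groups with more than one member count as matches
    return [members for members in groups.values() if len(members) > 1]


def apply_action(n_rows, groups, action):
    """Return the surviving row indices, or (row, GroupID, GroupSize) tuples."""
    grouped = {i for g in groups for i in g}
    if action == "keep0":
        # Singles only
        return [i for i in range(n_rows) if i not in grouped]
    if action == "keep1":
        # Singles plus the first row of each group
        firsts = {g[0] for g in groups}
        return [i for i in range(n_rows) if i not in grouped or i in firsts]
    if action == "identify":
        # None stands in for the null values given to singles
        info = {i: (gid, len(g)) for gid, g in enumerate(groups, start=1) for i in g}
        return [(i, *info.get(i, (None, None))) for i in range(n_rows)]
    raise ValueError(f"action not covered by this sketch: {action}")


# Five rows; row 0 matches row 2 and row 2 matches row 4
groups = group_rows([(0, 2), (2, 4)], n_rows=5)
print(apply_action(5, groups, "keep0"))
print(apply_action(5, groups, "keep1"))
print(apply_action(5, groups, "identify"))
```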

ra : str | None, optional

The name of the Right Ascension (RA) column. If None is passed, this function will try to guess the RA column name based on predefined patterns using the function guess_coords_columns, see this function’s documentation for more details.

dec : str | None, optional

The name of the Declination (DEC) column. If None is passed, this function will try to guess the DEC column name based on predefined patterns using the function guess_coords_columns, see this function’s documentation for more details.
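Pattern-based guessing of coordinate columns can be pictured with the sketch below. The regular expressions and the name guess_columns_sketch are hypothetical and chosen only for illustration; the patterns actually used by guess_coords_columns are described in that function's own documentation and may differ.

```python
import re

# Hypothetical patterns, for illustration only
RA_PATTERN = re.compile(r"^(ra|raj2000|ra_?deg|alpha)(_\d+)?$", re.IGNORECASE)
DEC_PATTERN = re.compile(r"^(dec|dej2000|decj2000|dec_?deg|delta)(_\d+)?$", re.IGNORECASE)


def guess_columns_sketch(columns):
    """Return the first column names that look like RA and Dec."""
    ra = next((c for c in columns if RA_PATTERN.match(c)), None)
    dec = next((c for c in columns if DEC_PATTERN.match(c)), None)
    if ra is None or dec is None:
        raise ValueError("could not guess the RA/Dec columns; pass them explicitly")
    return ra, dec


print(guess_columns_sketch(["ID", "RA", "DEC", "mag_r"]))
```

Passing ra and dec explicitly avoids any ambiguity when a table holds several coordinate-like columns.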

fmt : Literal['fits', 'csv', 'parquet'], optional

This function writes the input table to an intermediate file before passing it to the STILTS backend. This parameter sets the type of that intermediate file. FITS is faster; the default in the signature is 'parquet'.

Returns:
pd.DataFrame | None

A table containing the result of the selfmatch.

References