astromodule.table.crossmatch

crossmatch(table1: ~astropy.table.table.Table | ~pandas.core.frame.DataFrame | str | ~pathlib.Path | ~io.BufferedIOBase | ~io.RawIOBase | ~io.TextIOBase, table2: ~astropy.table.table.Table | ~pandas.core.frame.DataFrame | str | ~pathlib.Path | ~io.BufferedIOBase | ~io.RawIOBase | ~io.TextIOBase, ra1: str | None = None, dec1: str | None = None, ra2: str | None = None, dec2: str | None = None, radius: float | ~astropy.units.quantity.Quantity = <Quantity 1. arcsec>, join: ~typing.Literal['1and2', '1or2', 'all1', 'all2', '1not2', '2not1', '1xor2'] = '1and2', find: ~typing.Literal['all', 'best', 'best1', 'best2'] = 'best', fixcols: ~typing.Literal['dups', 'all', 'none'] = 'dups', suffix1: str = '_1', suffix2: str = '_2', scorecol: str | None = 'xmatch_sep', fmt: ~typing.Literal['fits', 'csv', 'parquet'] = 'parquet') -> DataFrame | None

Performs a crossmatch between two tables using STILTS [1] as the backend. This function spawns a subprocess invoking the tmatch2 [2] tool of the STILTS executable.

Parameters:
table1 : TableLike | PathOrFile

The first table that will be crossmatched. This parameter accepts a table-like object (pandas dataframe, astropy table), a path to a file represented as a string or pathlib.Path object, or a file object (BinaryIO, StringIO, file-descriptor, etc).

table2 : TableLike | PathOrFile

The second table that will be crossmatched. This parameter accepts a table-like object (pandas dataframe, astropy table), a path to a file represented as a string or pathlib.Path object, or a file object (BinaryIO, StringIO, file-descriptor, etc).

ra1 : str | None, optional

The name of the Right Ascension (RA) column in the first table. If None is passed, this function will try to guess the RA column name based on predefined patterns using the function guess_coords_columns; see that function’s documentation for more details.

dec1 : str | None, optional

The name of the Declination (DEC) column in the first table. If None is passed, this function will try to guess the DEC column name based on predefined patterns using the function guess_coords_columns; see that function’s documentation for more details.

ra2 : str | None, optional

The name of the Right Ascension (RA) column in the second table. If None is passed, this function will try to guess the RA column name based on predefined patterns using the function guess_coords_columns; see that function’s documentation for more details.

dec2 : str | None, optional

The name of the Declination (DEC) column in the second table. If None is passed, this function will try to guess the DEC column name based on predefined patterns using the function guess_coords_columns; see that function’s documentation for more details.
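The idea of guessing coordinate columns from name patterns can be sketched as below. This is only an illustration: the regular expressions and the helper name sketch_guess_coords are hypothetical and are not the patterns or the API used by guess_coords_columns, whose own documentation is authoritative.

```python
import re

# Hypothetical patterns, chosen for illustration only. They are NOT the
# patterns used by astromodule's guess_coords_columns.
RA_PATTERN = re.compile(r"^(ra|raj2000|ra_deg)$", re.IGNORECASE)
DEC_PATTERN = re.compile(r"^(dec|de|dej2000|decj2000|dec_deg)$", re.IGNORECASE)


def sketch_guess_coords(columns):
    """Return the first column names that look like RA and DEC."""
    ra = next((c for c in columns if RA_PATTERN.match(c)), None)
    dec = next((c for c in columns if DEC_PATTERN.match(c)), None)
    if ra is None or dec is None:
        raise ValueError("could not guess the RA/DEC columns; pass them explicitly")
    return ra, dec


guessed = sketch_guess_coords(["id", "RA", "DEC", "mag_r"])
```

When the column names are unusual, passing ra1, dec1, ra2 and dec2 explicitly avoids relying on any guess.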

radius : float | u.Quantity, optional

The maximum separation allowed between two positions for them to count as a match. A float is interpreted as a value in arcseconds; a Quantity carries its own unit. The default is 1 arcsec.
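To make the role of the radius concrete, here is a minimal sketch of an on-sky separation test using the haversine formula. It assumes coordinates in degrees and a radius given as a plain float in arcseconds. The helper names are illustrative, and whether the backend treats the boundary as inclusive is an assumption made here.

```python
import math

ARCSEC_PER_DEG = 3600.0


def angular_separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation of two sky positions, in arcseconds."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * ARCSEC_PER_DEG


def within_radius(sep_arcsec, radius_arcsec=1.0):
    """True if the separation does not exceed the radius (inclusive: an assumption)."""
    return sep_arcsec <= radius_arcsec


# Two positions offset by 0.5 arcsec in declination.
sep = angular_separation_arcsec(10.0, -30.0, 10.0, -30.0 + 0.5 / ARCSEC_PER_DEG)
is_match = within_radius(sep, radius_arcsec=1.0)
```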

join : Literal['1and2', '1or2', 'all1', 'all2', '1not2', '2not1', '1xor2'], optional

Determines which rows are included in the output table. The matching algorithm determines which of the rows from the first table correspond to which rows from the second. This parameter determines what to do with that information. Perhaps the most obvious thing is to write out a table containing only rows which correspond to a row in both of the two input tables. However, you may also want to see the unmatched rows from one or both input tables, or rows present in one table but unmatched in the other, or other possibilities. The options are:

  • 1and2: An output row for each row represented in both input tables (INNER JOIN)

  • 1or2: An output row for each row represented in either or both of the input tables (FULL OUTER JOIN)

  • all1: An output row for each matched or unmatched row in table 1 (LEFT OUTER JOIN)

  • all2: An output row for each matched or unmatched row in table 2 (RIGHT OUTER JOIN)

  • 1not2: An output row only for rows which appear in the first table but are not matched in the second table

  • 2not1: An output row only for rows which appear in the second table but are not matched in the first table

  • 1xor2: An output row only for rows represented in one of the input tables but not the other one
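The join options above can be sketched in plain Python as a selection over matched and unmatched row indices. This is an illustration of the semantics only, not the STILTS implementation: the function name select_rows, the (i1, i2) pair representation and the output row order are all assumptions.

```python
def select_rows(join, n1, n2, pairs):
    """Return output rows as (i1, i2) tuples; None marks a missing side.

    n1, n2: number of rows in table 1 and table 2.
    pairs:  matched (row in table 1, row in table 2) index pairs.
    """
    matched1 = {i for i, _ in pairs}
    matched2 = {j for _, j in pairs}
    both = list(pairs)
    only1 = [(i, None) for i in range(n1) if i not in matched1]
    only2 = [(None, j) for j in range(n2) if j not in matched2]
    options = {
        "1and2": both,                  # INNER JOIN
        "1or2": both + only1 + only2,   # FULL OUTER JOIN
        "all1": both + only1,           # LEFT OUTER JOIN
        "all2": both + only2,           # RIGHT OUTER JOIN
        "1not2": only1,
        "2not1": only2,
        "1xor2": only1 + only2,
    }
    if join not in options:
        raise ValueError(f"unknown join mode: {join!r}")
    return options[join]


# Three rows per table; only row 0 of table 1 matches row 1 of table 2.
pairs = [(0, 1)]
inner = select_rows("1and2", 3, 3, pairs)
left = select_rows("all1", 3, 3, pairs)
xor = select_rows("1xor2", 3, 3, pairs)
```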

find : Literal['all', 'best', 'best1', 'best2'], optional

Determines what happens when a row in one table can be matched by more than one row in the other table. The options are:

  • all: All matches. Every match between the two tables is included in the result. Rows from both of the input tables may appear multiple times in the result.

  • best: Best match, symmetric. The best pairs are selected in a way which treats the two tables symmetrically. Any input row which appears in one result pair is disqualified from appearing in any other result pair, so each row from both input tables will appear in at most one row in the result.

  • best1: Best match for each Table 1 row. For each row in table 1, only the best match from table 2 will appear in the result. Each row from table 1 will appear a maximum of once in the result, but rows from table 2 may appear multiple times.

  • best2: Best match for each Table 2 row. For each row in table 2, only the best match from table 1 will appear in the result. Each row from table 2 will appear a maximum of once in the result, but rows from table 1 may appear multiple times.

The differences between best, best1 and best2 are a bit subtle. In cases where it’s obvious which object in each table is the best match for which object in the other, choosing between these options will not affect the result. However, in crowded fields (where the distance between objects within one or both tables is typically similar to or smaller than the specified match radius) it will make a difference. In this case one of the asymmetric options (best1 or best2) is usually more appropriate than best, but you’ll have to think about which of them suits your requirements. The performance (time and memory usage) of the match may also differ between these options, especially if one table is much bigger than the other.
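A small sketch can make the crowded-field case concrete. It assumes the candidate pairs within the radius are already known as (i1, i2, separation) tuples and resolves them greedily, smallest separation first. The greedy strategy for best is an assumption made for illustration; this page does not describe how STILTS selects the symmetric best pairs, and resolve_matches is a hypothetical helper.

```python
def resolve_matches(find, candidates):
    """Filter candidate (i1, i2, sep) pairs according to the find mode."""
    if find not in ("all", "best", "best1", "best2"):
        raise ValueError(f"unknown find mode: {find!r}")
    ranked = sorted(candidates, key=lambda c: c[2])  # closest pairs first
    if find == "all":
        return ranked
    used1, used2, result = set(), set(), []
    for i1, i2, sep in ranked:
        # best and best1 allow each table 1 row at most once.
        if find in ("best", "best1") and i1 in used1:
            continue
        # best and best2 allow each table 2 row at most once.
        if find in ("best", "best2") and i2 in used2:
            continue
        result.append((i1, i2, sep))
        used1.add(i1)
        used2.add(i2)
    return result


# Crowded field: table 1 rows 0 and 1 both lie close to table 2 row 0.
candidates = [(0, 0, 0.2), (1, 0, 0.3), (1, 1, 0.8)]
best = resolve_matches("best", candidates)
best1 = resolve_matches("best1", candidates)
```

In this example best pairs row 1 of table 1 with its second-closest neighbour, because its closest neighbour is already taken, while best1 keeps the closest neighbour and lets row 0 of table 2 appear twice.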

fixcols : Literal['dups', 'all', 'none'], optional

Determines how input columns are renamed before use in the output table. The choices are:

  • none: columns are not renamed

  • dups: columns which would otherwise have duplicate names in the output will be renamed to indicate which table they came from

  • all: all columns will be renamed to indicate which table they came from

If columns are renamed, the new names are determined by the suffix1 and suffix2 parameters.

suffix1 : str, optional

If the fixcols parameter is set so that input columns are renamed for insertion into the output table, this parameter determines how the renaming is done. It gives a suffix which is appended to all renamed columns from table 1.

suffix2 : str, optional

If the fixcols parameter is set so that input columns are renamed for insertion into the output table, this parameter determines how the renaming is done. It gives a suffix which is appended to all renamed columns from table 2.
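The interaction between fixcols and the two suffixes can be sketched as follows. This is an illustration of the renaming rule as described above, not the backend's code: rename_columns is a hypothetical helper, and it compares column names case-sensitively, which is an assumption.

```python
def rename_columns(cols1, cols2, fixcols="dups", suffix1="_1", suffix2="_2"):
    """Return the output column names for table 1 and table 2."""
    if fixcols not in ("none", "dups", "all"):
        raise ValueError(f"unknown fixcols mode: {fixcols!r}")
    # Names present in both tables (case-sensitive comparison: an assumption).
    dups = set(cols1) & set(cols2)

    def rename(cols, suffix):
        if fixcols == "none":
            return list(cols)
        if fixcols == "all":
            return [c + suffix for c in cols]
        # "dups": only names that would collide get a suffix.
        return [c + suffix if c in dups else c for c in cols]

    return rename(cols1, suffix1), rename(cols2, suffix2)


cols1 = ["ra", "dec", "mag_r"]
cols2 = ["ra", "dec", "z"]
dups_result = rename_columns(cols1, cols2, fixcols="dups")
all_result = rename_columns(cols1, cols2, fixcols="all")
```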

scorecol : str | None, optional

Gives the name of a column in the output table to contain the “match score” for each pairwise match. The meaning of this column is dependent on the chosen matcher, but it typically represents a distance of some kind between the two matching points. If None is chosen, no score column will be inserted in the output table. The default is xmatch_sep.

fmt : Literal['fits', 'csv', 'parquet'], optional

This function writes the two input tables to intermediate files that are passed to the STILTS backend. This parameter sets the format of those intermediate files. fits is faster than csv. The signature lists parquet as the default.

Returns:
pd.DataFrame | None

The result table as a pandas dataframe

References