All Known Implementing Classes: AbstractCartesianMatchEngine, AbstractSkyMatchEngine, AnisotropicCartesianMatchEngine, CombinedMatchEngine, CuboidCartesianMatchEngine, EllipseCartesianMatchEngine, EllipseSkyMatchEngine, EqualsMatchEngine, ErrorCartesianMatchEngine, ErrorSkyMatchEngine, FixedSkyMatchEngine, IsotropicCartesianMatchEngine, SphericalPolarMatchEngine
public interface MatchEngine
getTupleInfos()
method. Typically a tuple will be a list of coordinates,
such as RA and Dec.
The business end of the interface consists of two methods. One tests whether two tuples count as matching or not, and assigns a closeness score if they do (in practice, this is likely to compare corresponding elements of the two submitted tuples, allowing for some error in each one). The second is a bit more subtle: it must identify a set of bins into which possible matches for the tuple might fall. For the case of coordinate matching with errors, you would need to chop the whole possible space into a discrete set of zones, each with a given key, and return the key for each zone near enough to the submitted tuple (point) that it might contain a match for it.
Formally, the requirements for correct implementations of this interface are as follows:
It may help to think of all this as a sort of fuzzy hash.
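The "fuzzy hash" idea can be sketched with a toy one-dimensional matcher. This is only an illustration under stated assumptions: the class `Toy1dMatcher` and its `maxError` field are hypothetical, it does not implement the real `MatchEngine` interface, and it assumes that a negative score means "no match". The point it tries to show is that any two tuples that match also share at least one bin key.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * Toy 1-d matcher, independent of the real MatchEngine interface.
 * All names here (Toy1dMatcher, maxError) are hypothetical.
 */
final class Toy1dMatcher {

    /** Stand-in for the NO_BINS convenience constant. */
    static final Object[] NO_BINS = new Object[0];

    private final double maxError;

    Toy1dMatcher(double maxError) {
        this.maxError = maxError;
    }

    /** Returns the separation if it is within maxError, otherwise -1 (assumed "no match"). */
    double matchScore(Object[] tuple1, Object[] tuple2) {
        Double x1 = toDouble(tuple1[0]);
        Double x2 = toDouble(tuple2[0]);
        if (x1 == null || x2 == null) {
            return -1;
        }
        double d = Math.abs(x1 - x2);
        return d <= maxError ? d : -1;
    }

    /** Returns the key of every zone of width maxError that overlaps [x - maxError, x + maxError]. */
    Object[] getBins(Object[] tuple) {
        Double x = toDouble(tuple[0]);
        if (x == null) {
            return NO_BINS;   // a blank coordinate can never match
        }
        long lo = (long) Math.floor((x - maxError) / maxError);
        long hi = (long) Math.floor((x + maxError) / maxError);
        Set<Long> bins = new LinkedHashSet<>();
        for (long i = lo; i <= hi; i++) {
            bins.add(i);
        }
        return bins.toArray();
    }

    /** Converts a tuple element to a Double, or null if it is blank or not numeric. */
    private static Double toDouble(Object o) {
        if (o instanceof Number) {
            double d = ((Number) o).doubleValue();
            if (!Double.isNaN(d)) {
                return Double.valueOf(d);
            }
        }
        return null;
    }
}
```

With `maxError` set to 0.5, the tuples (1.0) and (1.25) score 0.25 and both report the zone keys 1, 2 and 3, while (3.0) reports 5, 6 and 7 and so is never compared with either of them.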
Modifier and Type | Field | Description |
---|---|---|
static java.lang.Object[] | NO_BINS | Convenience constant - it's a zero-length array of objects, suitable for returning from getBins(java.lang.Object[]) if no match can result. |

Modifier and Type | Method | Description |
---|---|---|
boolean | canBoundMatch() | Indicates that the getMatchBounds(uk.ac.starlink.table.join.NdRange[], int) method can be invoked to provide some sort of useful result. |
java.lang.Object[] | getBins(java.lang.Object[] tuple) | Returns a set of keys for bins into which possible matches for a given tuple might fall. |
NdRange | getMatchBounds(NdRange[] inRanges, int index) | Given a range of tuple values, returns a range outside which no match to anything within that range can result. |
DescribedValue[] | getMatchParameters() | Returns a set of DescribedValue objects whose values can be modified to modify the matching criteria. |
ValueInfo | getMatchScoreInfo() | Returns a description of the value returned by the matchScore(java.lang.Object[], java.lang.Object[]) method. |
double | getScoreScale() | Returns a scale value for the match score. |
DescribedValue[] | getTuningParameters() | Returns a set of DescribedValue objects whose values can be modified to tune the performance of the match. |
ValueInfo[] | getTupleInfos() | Returns a set of ValueInfo objects indicating what is required for the elements of each tuple. |
double | matchScore(java.lang.Object[] tuple1, java.lang.Object[] tuple2) | Indicates whether two tuples count as matching each other, and if so how closely. |
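
static final java.lang.Object[] NO_BINS
Convenience constant - it's a zero-length array of objects, suitable for returning from getBins(java.lang.Object[]) if no match can result.

java.lang.Object[] getBins(java.lang.Object[] tuple)
Returns a set of keys for bins into which possible matches for a given tuple might fall.
tuple - tuple

double matchScore(java.lang.Object[] tuple1, java.lang.Object[] tuple2)
Indicates whether two tuples count as matching each other, and if so how closely.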
If there's no reason to do otherwise, the range 0..1 is recommended for successful matches. However, if the result has some sort of physical meaning (such as a distance in real space) that may be used instead.
tuple1 - one tuple
tuple2 - the other tuple

ValueInfo getMatchScoreInfo()
Returns a description of the value returned by the matchScore(java.lang.Object[], java.lang.Object[]) method. The content class should be numeric (though need not be Double), and the name, description and units should be descriptive of whatever the physical significance of the value is. If the result of matchScore is not interesting (for instance, if it's always either 0 or -1), null may be returned.

double getScoreScale()
Returns a scale value for the match score, such that matchScore/getScoreScale() is of order unity, and is thus comparable between different match engines.
As a general rule, the result should be the maximum value ever returned from the matchScore method, corresponding to the least good successful match. For binary MatchEngine implementations (all matches are either score=0 or failures) a value of 1 is recommended. If nothing reliable can be said about the scale, NaN may be returned.
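As a rough sketch of how a caller might use the scale, the helper below divides a raw score by the scale to get a value of order unity. The class and method names are hypothetical, and the treatment of negative scores as "no match" is an assumption based on the 0 or -1 example above, not something this helper takes from the library.

```java
/** Hypothetical helper; not part of the MatchEngine API. */
final class ScoreScaleSketch {

    /**
     * Returns score / scoreScale for a successful match.
     * Returns NaN when the score signals no match (assumed: negative)
     * or when the scale is unknown (NaN) or not positive.
     */
    static double normalise(double score, double scoreScale) {
        if (Double.isNaN(score) || score < 0) {
            return Double.NaN;   // assumed convention: negative means no match
        }
        if (Double.isNaN(scoreScale) || scoreScale <= 0) {
            return Double.NaN;   // nothing reliable known about the scale
        }
        return score / scoreScale;
    }
}
```

For an engine whose worst successful match scores 0.5, a raw score of 0.25 normalises to 0.5; for a binary engine with the recommended scale of 1, a successful match normalises to 0.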

ValueInfo[] getTupleInfos()
Returns a set of ValueInfo objects indicating what is required for the elements of each tuple.

DescribedValue[] getMatchParameters()
Returns a set of DescribedValue objects whose values can be changed to modify the matching criteria, using DescribedValue.setValue(java.lang.Object) on the returned objects.

DescribedValue[] getTuningParameters()
Returns a set of DescribedValue objects whose values can be changed to tune the performance of the match, using DescribedValue.setValue(java.lang.Object) on the returned objects. Changing these values will make no difference to the output of matchScore(java.lang.Object[], java.lang.Object[]), but may change the output of getBins(java.lang.Object[]). This may change the CPU and memory requirements of the match, but will not change the result. The default value should be something sensible, so that setting the value of these parameters is not in general required.
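The difference between a match parameter and a tuning parameter can be illustrated with a toy one-dimensional matcher. Everything here is hypothetical (the class `TunableToyMatcher`, the `maxError` match parameter and the `binFactor` tuning parameter), and plain fields stand in for DescribedValue objects. The sketch only aims to show that changing the tuning value alters the bins but not the scores.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical 1-d matcher: maxError is a match parameter, binFactor a tuning parameter. */
final class TunableToyMatcher {

    private final double maxError;   // changes which pairs count as matches
    private double binFactor = 1.0;  // changes only the bin layout; sensible default

    TunableToyMatcher(double maxError) {
        this.maxError = maxError;
    }

    /** Sets the zone width as a multiple of maxError. */
    void setBinFactor(double binFactor) {
        if (!(binFactor > 0)) {
            throw new IllegalArgumentException("binFactor must be positive");
        }
        this.binFactor = binFactor;
    }

    /** Separation if within maxError, otherwise -1 (assumed "no match"); ignores binFactor. */
    double matchScore(double x1, double x2) {
        double d = Math.abs(x1 - x2);
        return d <= maxError ? d : -1;
    }

    /** Keys of all zones of width maxError * binFactor overlapping [x - maxError, x + maxError]. */
    List<Long> getBins(double x) {
        double width = maxError * binFactor;
        long lo = (long) Math.floor((x - maxError) / width);
        long hi = (long) Math.floor((x + maxError) / width);
        List<Long> bins = new ArrayList<>();
        for (long i = lo; i <= hi; i++) {
            bins.add(i);
        }
        return bins;
    }
}
```

In this sketch, larger zones mean fewer keys per tuple but more candidate pairs per zone, and smaller zones mean the opposite. That is the kind of CPU and memory trade-off a tuning parameter controls.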

NdRange getMatchBounds(NdRange[] inRanges, int index)
Given a range of tuple values, returns a range outside which no match to anything within that range can result.
Both the input and output rectangles are specified by tuples representing their opposite corners; equivalently, they are the minimum and maximum values of each tuple element. In either the input or output min/max tuples, any element may be null to indicate that no information is available on the bounds of that tuple element (coordinate).
An array of n-dimensional ranges is given, though only one of them
(specified by the index
value) forms the basis for
the output range. The other ranges in the input array may in some
cases be needed as context in order to do the calculation.
If the match error is fixed, only the single input n-d range is needed
to work out the single output range. However, if the errors are
obtained by looking at the tuples themselves (match errors are per-row)
then in general the broadening has to be done using the maximum
error of any of the tables involved in the match,
not just the one to be broadened.
For a long time, I didn't realise this, so versions of this software
up to STIL v3.0-14 (Oct 2015) were not correctly broadening these
ranges, leading to potentially missed associations near the edge
of bounded regions.
This method can be used by match algorithms which know in advance the range of coordinates they will match against and wish to reduce workload by not attempting matches which are bound to fail.
For example, a 2-d Cartesian match engine with an isotropic match error 0.5 would turn input values of ((0,200),(10,210)) into output values ((-0.5,199.5),(10.5,210.5)).
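The arithmetic of that example can be sketched as follows, treating each corner as a two-element tuple. This is a minimal illustration only: `BoundsSketch` is a hypothetical class, it uses plain `Double[]` corner tuples instead of NdRange objects, and it assumes a fixed isotropic error, so the other input ranges are not needed as context.

```java
/** Hypothetical illustration of range broadening; does not use the real NdRange class. */
final class BoundsSketch {

    /**
     * Broadens a rectangle given by its minimum and maximum corner tuples
     * by a fixed error in every dimension.
     * A null element means "bound unknown" and stays null in the output.
     * Returns a two-element array: { broadened mins, broadened maxs }.
     */
    static Double[][] broaden(Double[] mins, Double[] maxs, double err) {
        if (mins.length != maxs.length) {
            throw new IllegalArgumentException("corner tuples differ in length");
        }
        int n = mins.length;
        Double[] outMins = new Double[n];
        Double[] outMaxs = new Double[n];
        for (int i = 0; i < n; i++) {
            outMins[i] = mins[i] == null ? null : Double.valueOf(mins[i] - err);
            outMaxs[i] = maxs[i] == null ? null : Double.valueOf(maxs[i] + err);
        }
        return new Double[][] { outMins, outMaxs };
    }
}
```

Calling `BoundsSketch.broaden(new Double[] { 0.0, 200.0 }, new Double[] { 10.0, 210.0 }, 0.5)` gives minimum corner (-0.5, 199.5) and maximum corner (10.5, 210.5), as in the example.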
This method will only be called if canBoundMatch() returns true. Thus engines that cannot provide any useful information along these lines (for instance because none of their tuple elements is Comparable) do not need to implement it in a meaningful way.
inRanges - array of input ranges for the tables on which the match will take place; each element bounds the values for each tuple element in its corresponding table in a possible match (to put it another way - each element gives the coordinates of the opposite corners of a tuple-space rectangle covered by one input table)
index - the element of the inRanges array for which the broadened output value is required
Returns: inRanges[index] broadened by errors
See Also: canBoundMatch()

boolean canBoundMatch()
Indicates that the getMatchBounds(uk.ac.starlink.table.join.NdRange[], int) method can be invoked to provide some sort of useful result.

Copyright © 2018 Central Laboratory of the Research Councils. All Rights Reserved.