public class MatchStarTables
extends java.lang.Object
Modifier and Type | Field | Description |
---|---|---|
static ValueInfo | GRP_ID_INFO | Defines the characteristics of a table column which represents the ID of a group of matched row objects. |
static ValueInfo | GRP_SIZE_INFO | Defines the characteristics of a table column which represents the number of matched row objects in a given group (with the same group ID). |
Constructor | Description |
---|---|
MatchStarTables() | |
Modifier and Type | Method | Description |
---|---|---|
static java.util.Map | findGroups(LinkSet links) | |
static StarTable | makeInternalMatchTable(int iTable, LinkSet rowLinks, long rowCount) | Analyses a set of RowLinks to mark as linked rows of a given table. |
static StarTable | makeJoinTable(StarTable[] tables, LinkSet rowLinks, boolean addGroups, JoinFixAction[] fixActs, ValueInfo matchScoreInfo) | Constructs a table made out of a set of constituent tables joined together according to a LinkSet describing row matches. |
static StarTable | makeJoinTable(StarTable table1, StarTable table2, LinkSet pairs, JoinType joinType, boolean addGroups, JoinFixAction[] fixActs, ValueInfo matchScoreInfo) | |
static StarTable | makeParallelMatchTable(StarTable table, int iTable, LinkSet links, int width, int minSize, int maxSize, JoinFixAction[] fixActs) | Constructs a new wide table from a single given base table and a set of RowLinks. |
static StarTable | makeSequentialJoinTable(StarTable[] tables, LinkSet rowLinks, JoinFixAction[] fixActs, ValueInfo matchScoreInfo) | Constructs a non-random table made out of a set of possibly non-random constituent tables joined together according to a LinkSet. |
public static final ValueInfo GRP_ID_INFO
public static final ValueInfo GRP_SIZE_INFO
public static StarTable makeJoinTable(StarTable table1, StarTable table2, LinkSet pairs, JoinType joinType, boolean addGroups, JoinFixAction[] fixActs, ValueInfo matchScoreInfo)
Constructs a table made out of a pair of constituent tables
joined together according to a LinkSet describing row matches and
a flag determining what conditions on a RowLink
give you an output row.
The columns of the resulting table are made by appending the
columns of the constituent tables side by side.
The tables array determines which tables' columns appear in the output table. It must have (at least) as many elements as the highest table index in the RowLink set. Table data will be picked from the nth table in this array for RowRef elements with a tableIndex of n. If the nth element is null, the corresponding columns will not appear in the output table.
The matchScoreInfo parameter is optional.
If it is non-null, then an additional column, described by matchScoreInfo, will be added to the table containing the score values from any RowLink2s in pairs.
The content class of matchScoreInfo should be Number or one of its subclasses.
This is a convenience method which calls the other makeJoinTable method.
table1 - first input table
table2 - second input table
pairs - set of links each representing a matched pair of rows between table1 and table2. Contents of this set may be modified by this routine
joinType - describes how the input list of matched pairs is used to generate an output sequence of rows
addGroups - flag which indicates whether the output table should, if appropriate, include GRP_ID_INFO and GRP_SIZE_INFO columns
fixActs - actions to take for deduplicating column names (array of the same length as tables)
matchScoreInfo - may supply information about the meaning of the match scores
public static StarTable makeJoinTable(StarTable[] tables, LinkSet rowLinks, boolean addGroups, JoinFixAction[] fixActs, ValueInfo matchScoreInfo)
Constructs a table made out of a set of constituent tables
joined together according to a LinkSet describing
row matches.
The columns of the resulting table are made by appending the
columns of the constituent tables side by side.
Each row in the resulting table corresponds to one RowLink
entry in the set rowLinks; if that RowLink
contains a row from one of the tables being joined here,
the columns corresponding to that table are filled in.
If it contains multiple rows from that table, an arbitrary one
of them is filled in.
The tables array determines which tables' columns appear in the output table. It must have (at least) as many elements as the highest table index in the RowLink set. Table data will be picked from the nth table in this array for RowRef elements with a tableIndex of n. If the nth element is null, the corresponding columns will not appear in the output table.
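The row-per-link behaviour described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the STIL implementation: `JoinRowsSketch`, its `join` method, and the array-based representation of tables and links are all hypothetical stand-ins for `StarTable`, `RowLink` and `RowRef`. Taking the first row found for a table is just one way to make the "arbitrary" choice.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; none of these names are part of STIL. */
final class JoinRowsSketch {

    /**
     * Joins tables side by side, one output row per link.
     *
     * @param tables tables[t] is a rectangular array of rows, or null to omit table t
     * @param links  each link is an array of {tableIndex, rowIndex} pairs
     */
    static List<Object[]> join(Object[][][] tables, List<int[][]> links) {
        // Work out where each table's columns start in an output row.
        int[] offset = new int[tables.length];
        int width = 0;
        for (int t = 0; t < tables.length; t++) {
            offset[t] = width;
            if (tables[t] != null && tables[t].length > 0) {
                width += tables[t][0].length;
            }
        }
        List<Object[]> out = new ArrayList<>();
        for (int[][] link : links) {
            Object[] row = new Object[width];          // unfilled cells stay null
            boolean[] filled = new boolean[tables.length];
            for (int[] ref : link) {
                int t = ref[0];
                // Skip omitted tables; if a table appears twice, keep the first row seen.
                if (tables[t] == null || filled[t]) {
                    continue;
                }
                Object[] src = tables[t][ref[1]];
                System.arraycopy(src, 0, row, offset[t], src.length);
                filled[t] = true;
            }
            out.add(row);
        }
        return out;
    }
}
```

In this sketch a null entry in `tables` contributes no columns at all, while a table that is present but not referenced by a given link contributes null cells for that row.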
The matchScoreInfo parameter is optional.
If it is non-null, then an additional column, described by matchScoreInfo, will be added to the table containing the score values from the RowLinks in rowLinks.
The content class of matchScoreInfo should be Number or one of its subclasses.
tables - array of constituent tables
rowLinks - set of RowLink objects which define which rows in one table are associated with which rows in the others
addGroups - flag which indicates whether the output table should, if appropriate, include GRP_ID_INFO and GRP_SIZE_INFO columns
fixActs - actions to take for deduplicating column names (array of the same length as tables)
matchScoreInfo - may supply information about the meaning of the link scores
public static StarTable makeSequentialJoinTable(StarTable[] tables, LinkSet rowLinks, JoinFixAction[] fixActs, ValueInfo matchScoreInfo)
Constructs a non-random table made out of a set of possibly non-random constituent tables joined together according to a LinkSet.
tables - array of constituent tables
rowLinks - link set defining the match
fixActs - actions to take for deduplicating column names (array of the same size as tables)
matchScoreInfo - may supply information about the meaning of the match scores, if present
public static StarTable makeInternalMatchTable(int iTable, LinkSet rowLinks, long rowCount)
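Analyses a set of RowLinks to mark as linked rows of a given table.
The result is a table containing the columns GRP_ID_INFO and GRP_SIZE_INFO.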
Rows of the table linked together
by rowLinks are assigned the same integer value in
the new GRP_ID_INFO column, and the GRP_SIZE_INFO column
indicates how many rows are linked together in this way.
Each group corresponds to a single RowLink; if a row is part of
more than one RowLink then only one of them will be recorded
in the new columns.
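A rough sketch of how such group ID and group size columns could be filled is given below. It is not the STIL code: `InternalMatchSketch` and `groupColumns` are hypothetical names, each link is reduced to an array of row indices within the one table of interest, and an `int` row count is used for simplicity. Numbering groups from 1, ignoring links with fewer than two rows, and letting a later link overwrite an earlier one are assumptions made for this illustration.

```java
import java.util.List;

/** Illustrative sketch only; none of these names are part of STIL. */
final class InternalMatchSketch {

    /**
     * Builds two columns of length rowCount.
     * Element [0] holds group IDs and element [1] holds group sizes;
     * rows that belong to no group are left as null in both.
     */
    static Integer[][] groupColumns(List<long[]> links, int rowCount) {
        Integer[] ids = new Integer[rowCount];
        Integer[] sizes = new Integer[rowCount];
        int nextId = 0;
        for (long[] rows : links) {
            if (rows.length < 2) {
                continue;               // assumption: a lone row does not form a group
            }
            nextId++;                   // assumption: group IDs count up from 1
            for (long r : rows) {
                int i = (int) r;
                ids[i] = nextId;        // a row in several links keeps only the last one
                sizes[i] = rows.length;
            }
        }
        return new Integer[][] { ids, sizes };
    }
}
```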
Any rows linked in rowLinks which do not refer to
table have null entries in these columns.
iTable - the index of the table in which internal matches are to be sought
rowLinks - a collection of RowLink objects linking groups of rows together
rowCount - number of rows in the returned table (must be large enough to accommodate the indices in rowLinks)
public static StarTable makeParallelMatchTable(StarTable table, int iTable, LinkSet links, int width, int minSize, int maxSize, JoinFixAction[] fixActs)
Constructs a new wide table from a single given base table and a set of RowLinks.
table - input table
iTable - index corresponding to this table in the links set
links - collection of RowLink objects describing the matches. This collection is modified on exit
width - width of the output table as a multiple of the width of the input table
minSize - minimum number of entries in a RowLink to count as an output row
maxSize - maximum number of entries in a RowLink to count as an output row; also the width of the output table (as a multiple of the width of the input table)
fixActs - actions to take for deduplicating column names (width-element array, or null)
public static java.util.Map findGroups(LinkSet links)
Returns a mapping from RowLinks to LinkGroups
which describes connected groups of links in the input LinkSet.
A related group is one in which the RowRefs of its constituent
RowLinks form a connected graph in which RowRefs are the nodes
and RowLinks are the edges.
A LinkGroup with a link count of more than one therefore
represents an ambiguous match, that is one in which one or more
of its RowRefs is contained in more than one RowLink in the
original LinkSet.
The returned map contains entries only for non-trivial LinkGroups, that is ones which contain more than one link.
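The connected-graph idea above can be illustrated with a small union-find sketch. This is an assumption-laden illustration rather than the library's algorithm: `LinkGroupSketch`, `Ref`, and the representation of a link as a non-empty list of refs are hypothetical stand-ins for `LinkSet`, `RowRef`, `RowLink` and `LinkGroup`, and the result maps each link's index to the indices of all links in its group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Illustrative sketch only; none of these names are part of STIL. */
final class LinkGroupSketch {

    /** Hypothetical stand-in for a RowRef: a (table index, row index) pair. */
    static final class Ref {
        final int table;
        final long row;
        Ref(int table, long row) { this.table = table; this.row = row; }
        @Override public boolean equals(Object o) {
            return o instanceof Ref && ((Ref) o).table == table && ((Ref) o).row == row;
        }
        @Override public int hashCode() { return Objects.hash(table, row); }
    }

    /** Finds the representative of x, compressing the path as it goes. */
    static Ref find(Map<Ref, Ref> parent, Ref x) {
        Ref p = parent.get(x);
        if (p.equals(x)) {
            return x;
        }
        Ref root = find(parent, p);
        parent.put(x, root);
        return root;
    }

    /**
     * Maps each link index to the list of link indices in its connected group.
     * Only groups with more than one link are included. Links must be non-empty.
     */
    static Map<Integer, List<Integer>> findGroups(List<List<Ref>> links) {
        Map<Ref, Ref> parent = new HashMap<>();
        for (List<Ref> link : links) {
            for (Ref r : link) {
                parent.putIfAbsent(r, r);
            }
            // A link acts as an edge: merge all of its refs into one component.
            Ref first = find(parent, link.get(0));
            for (Ref r : link) {
                parent.put(find(parent, r), first);
            }
        }
        // Collect links by the component of their first ref.
        Map<Ref, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < links.size(); i++) {
            Ref root = find(parent, links.get(i).get(0));
            byRoot.computeIfAbsent(root, k -> new ArrayList<>()).add(i);
        }
        Map<Integer, List<Integer>> result = new LinkedHashMap<>();
        for (List<Integer> group : byRoot.values()) {
            if (group.size() > 1) {                 // keep non-trivial groups only
                for (Integer i : group) {
                    result.put(i, group);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Ref>> links = Arrays.asList(
            Arrays.asList(new Ref(0, 1), new Ref(1, 7)),
            Arrays.asList(new Ref(0, 1), new Ref(1, 8)),   // shares row (0, 1) with the first link
            Arrays.asList(new Ref(0, 2), new Ref(1, 9)));  // unambiguous, so left out of the result
        System.out.println(findGroups(links));
    }
}
```

In the example in `main`, the first two links share a row and so fall into one ambiguous group, while the third link is isolated and does not appear in the returned map.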
links - link set representing a set of matches
Copyright © 2018 Central Laboratory of the Research Councils. All Rights Reserved.