DNA Overlay Charts are User Items that present DNA evidence in a modified descendant chart format where lines of descent are arranged in rows and columns. The goal is to review and analyze DNA evidence within a tree structure supported by traditional genealogical evidence. Boxes in the chart contain either a name and lifespan or a kit number; the kit number replaces the name when the person has an associated DNA test result, unless the user overrides that behavior (see Names and Kit Numbers).

The chart also includes a "DNA summary" box that displays DNA-derived or user-specified content. The chart also supports a popup "detail panel" to show DNA signatures and other information.

Some example charts are shown here. See the help page for more information.

Example DNA Overlay Chart

Top to Bottom, Optimized

[Chart: three color-coded lines of descent, labeled "Main line", "Aaron (1701) line", and "John (1671) line". Text recovered from the chart boxes, in page order; where a DNA summary is shown it follows the name or kit number.]

  lpq{qr}{lm}
  lpqql
  kpqql
  Aaron Example (1733 - )
  Kit1001, kpqql
  Aaron Example (1701 - ), lopql
  Kevin Example (1784 - )
  Aaron Example (1818 - )
  Keith Example (1890 - )
  Kit1002, lopql
  John Example (1671 - ), lpqrm
  Asa Example (1700 - )
  John Example (1749 - )
  James Example (1772 - )
  Allan Example (1889 - )
  David Example (1921 - a 1950)
  Kit1003, lpqrm

[Detail panels: five panels, each listing the 37 marker names with the full signature in numeric and alphabetic form. Alphabetic signatures recovered, in page order:]

  lxnknplllmmCsijkkyosCopqqkjswoo{qr}rKL{lm}l
  lxnknplllmmCsijkkyosCopqqkjswooqrKLll
  lxnknpllkmmCsijkkyosCopqqkjswooqrKLll
  lxnknplllmmCsijkkyosCoopqkjswooqrKLll
  lxnknplllmmCsijkkyosCopqqkjswoorrKLml

[Panel text: "This is the text from the Y-DNA Detail event. This is where one could explain why his DNA signature is "lpqql" rather than "ql"." Three panels also show "Haplogroup: R1b".]

This chart shows three lines of descent. The user selects the lines by choosing the progenitor, also known as the most recent common ancestor (MRCA), and several descendants. The progenitor for this chart was Thomas Example; the descendants are not named but are identified by their DNA test kit identifiers "Kit1001", "Kit1002", and "Kit1003".

This chart uses an Accent feature to control the color of various boxes in the chart. In this case, the boxes were categorized by line of descent.

DNA Test Results

The test results used in the chart above were as follows. The same test results were used for all the example Y-DNA Overlay charts.

DNA Test Results (marker grid)

Marker order (1-37): 393, 390, 19/394, 391, 385a, 385b, 426, 388, 439, 389-1, 392, 389-2, 458, 459a, 459b, 455, 454, 447, 437, 448, 449, 464a, 464b, 464c, 464d, 460, Y GATA H4, YCA II a, YCA II b, 456, 607, 576, 570, CDY a, CDY b, 442, 438

Modal    12 24 14 11 14 16 12 12 12 13 13 29 19  9 10 11 11 25 15 19 29 15 16 17 17 11 10 19 23 15 15 17 18 37 38 12 12
Kit1001  12 24 14 11 14 16 12 12 11 13 13 29 19  9 10 11 11 25 15 19 29 15 16 17 17 11 10 19 23 15 15 17 18 37 38 12 12
Kit1002  12 24 14 11 14 16 12 12 12 13 13 29 19  9 10 11 11 25 15 19 29 15 15 16 17 11 10 19 23 15 15 17 18 37 38 12 12
Kit1003  12 24 14 11 14 16 12 12 12 13 13 29 19  9 10 11 11 25 15 19 29 15 16 17 17 11 10 19 23 15 15 18 18 37 38 13 12

Other Examples

Overview

As you can see, a DNA Overlay Chart is quite different from the "marker grids" that are often used to display DNA test results. Marker grids are very useful for analyzing sets of test results, but they do not meet the goal of presenting DNA evidence in the context of a lineage supported by traditional genealogical evidence.

The marker grid above was created using the DNA Grid User Item. The DNA Overlay Chart and DNA Grid User Item are complementary and together they make Second Site an excellent platform for presenting DNA evidence.

Multiple DNA Overlay Charts may be placed on a single page, and other content may also appear on the page. This page was made with Second Site, for example, and it includes a DNA Overlay Chart and a DNA Grid.

The DNA Overlay Chart is similar to a descendant chart in that it starts with a progenitor and proceeds down through generations of descendants. Unlike a full descendant chart, however, the user specifies one or more descendants in addition to the progenitor, and the chart is constrained to the lines of descent between the progenitor and those descendants. This makes the chart far more sparse than a typical descendant chart.
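
The constraint described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Second Site's implementation: the `parent_of` mapping, the function name, and the numeric person IDs are all hypothetical.

```python
def lines_of_descent(parent_of, mrca, descendants):
    """Return everyone on the paths from the MRCA down to each descendant.

    parent_of is a hypothetical mapping from a person ID to the ID of the
    parent through whom the line of descent runs.
    """
    kept = {mrca}
    for person in descendants:
        path = []
        current = person
        # Walk up the tree until we reach the MRCA or run out of ancestors.
        while current is not None and current != mrca:
            path.append(current)
            current = parent_of.get(current)
        if current != mrca:
            raise ValueError(f"{person} does not descend from {mrca}")
        kept.update(path)
    return kept


# Hypothetical IDs: 1 is the MRCA; 5, 6 and 8 are the chosen descendants.
parent_of = {2: 1, 3: 1, 4: 2, 5: 4, 6: 2, 7: 3, 8: 7, 9: 3, 10: 9}
chart_people = lines_of_descent(parent_of, 1, [5, 6, 8])
print(sorted(chart_people))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

People 9 and 10 descend from the MRCA but are not on a path to any chosen descendant, so they are left out. That is what makes the chart sparse.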

In the DNA Overlay Chart, the progenitor is called the MRCA, the most recent common ancestor.

Typically, the user specifies the ID numbers of descendants who have DNA test results in TMG's DNA Log. The user may choose to add other descendants when proposing a theory about the DNA of a descendant or demonstrating how having DNA from a specific branch of the tree would help the research project.

The example charts listed above were all created by specifying the ID numbers for the MRCA ("Thomas Example"), and 3 of his descendants. I made 4 variations to show how the "Direction" and "Optimize Chart" options affect the chart.

The DNA Overlay Chart includes an option to add DNA test results to the tree. Second Site summarizes the DNA evidence by constructing an abbreviated DNA signature that includes only the markers that vary between tests. That abbreviated signature is shown in the chart. The full DNA signature is available via a popup "detail panel".

The DNA Overlay Chart includes an option to derive DNA signatures for ancestors from the test results of their descendants. See the DNA Signatures section for more information.

Users may add content to the DNA Overlay Chart via "chart events", custom TMG events that are used to provide comments about the DNA evidence. See the Chart Events section for more information.

Names and Kit Numbers

By default, Second Site shows the names of people in the chart unless the person has an associated DNA test result. When a person has a test result, Second Site shows the kit number only. You can override that behavior and show the person's name by attaching a special directive to the DNA test result. For demonstration purposes, the directive was added to the test for "Peter Example".

Detail Panels

The DNA Overlay Chart includes a Detail Panel that pops up to reveal more details than are shown in the chart proper. Click the [+] button to open the panel and click the [X] button to close the panel. The detail panel includes both contents specified by the user via Chart Events and the full DNA Signature of the given person.

Optimization

In the unoptimized charts, you see all the people in the lineage between the MRCA and the specified descendants. In the optimized charts, Second Site has removed people who do not add significantly to the chart.

  • The MRCA is always retained, as are the descendants who were specifically identified by the user.
  • Parents who have more than one child in the chart are always retained, as are all of those children.
  • People who have one of the DNA "chart events" will always be retained.
  • People who have DNA test results of the proper type (Y-DNA or mtDNA) will always be retained.

When generations are removed during optimization, Second Site displays the number of generations that were removed.
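
The retention rules above could be expressed roughly as follows. This is a sketch under assumptions, not the actual Second Site logic: `children_of` is a hypothetical mapping that already contains only the people in the chart's lines of descent, and the function and parameter names are invented for the example.

```python
def optimize(children_of, mrca, selected, with_event=(), with_test=()):
    """Return the set of people retained in an optimized chart."""
    # The MRCA, the user's chosen descendants, and anyone with a chart
    # event or a DNA test result of the proper type are always retained.
    keep = {mrca, *selected, *with_event, *with_test}
    # Parents with more than one child in the chart are retained,
    # together with all of those children.
    for parent, kids in children_of.items():
        if len(kids) > 1:
            keep.add(parent)
            keep.update(kids)
    return keep


def removed_above(parent_of, keep, person):
    """Count the removed generations between a person and the nearest
    retained ancestor."""
    removed = 0
    current = parent_of.get(person)
    while current is not None and current not in keep:
        removed += 1
        current = parent_of.get(current)
    return removed


# Hypothetical chart: 1 is the MRCA; 5, 6 and 8 are tested descendants.
children_of = {1: [2, 3], 2: [4, 6], 4: [5], 3: [7], 7: [8]}
parent_of = {kid: parent for parent, kids in children_of.items() for kid in kids}
keep = optimize(children_of, mrca=1, selected=[5, 6, 8], with_test=[5, 6, 8])
print(sorted(keep))                        # [1, 2, 3, 4, 5, 6, 8]
print(removed_above(parent_of, keep, 8))   # 1
```

Person 7 is dropped because none of the retention rules applies to them, and the count of 1 is the kind of figure the chart would show in place of the removed generation.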

Chart Events

Users have the option of using two special event types to add information to the chart. Both event types are custom tags that users must add to their TMG projects via the Master Tag Type List. If a person that appears in the chart has a primary event of the given type, Second Site will include information from that event in the chart, as explained below.

Y-DNA Summary Event

The M1 segment of the memo of a "Y-DNA Summary" event is displayed in a special "DNA summary" box that appears beneath the typical name/lifespan box. The DNA summary box is small and so the summary text should be short. If there is no "Y-DNA Summary" event for a given person, Second Site will display an abbreviated DNA signature in the DNA summary box.

The M1 segment of the memo of a "Y-DNA Summary" event is displayed in the detail panel explained above.

Y-DNA Detail Event

The M1 segment of the memo of a "Y-DNA Detail" event is displayed in the detail panel explained above.

In the examples, only one person has a chart event, Timothy Example (b. 1666). Click the [+] button to see the text from his Y-DNA Detail event.

DNA Signatures

The Y-DNA variation of the DNA Overlay Chart processes DNA signatures. Signatures include actual marker values from test results entered by the user as well as marker values derived by Second Site using logic described below. Second Site displays an abbreviated form of the signature which shows only marker values that vary among the tests included in the chart.
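
As an illustration of how an abbreviated signature can be obtained, the sketch below takes the three test results from the example chart, finds the marker positions whose values differ between tests, and prints only those positions using the alphabetic code. The function names are hypothetical and this is not necessarily how Second Site computes it.

```python
import string

# Alphabetic code: a=1 ... z=26, A=27, B=28, ...
LETTERS = string.ascii_lowercase + string.ascii_uppercase


def parse(text):
    return [int(value) for value in text.split()]


# The three 37-marker results used in the example chart.
results = {
    "Kit1001": parse("12 24 14 11 14 16 12 12 11 13 13 29 19 9 10 11 11 25 15 19 "
                     "29 15 16 17 17 11 10 19 23 15 15 17 18 37 38 12 12"),
    "Kit1002": parse("12 24 14 11 14 16 12 12 12 13 13 29 19 9 10 11 11 25 15 19 "
                     "29 15 15 16 17 11 10 19 23 15 15 17 18 37 38 12 12"),
    "Kit1003": parse("12 24 14 11 14 16 12 12 12 13 13 29 19 9 10 11 11 25 15 19 "
                     "29 15 16 17 17 11 10 19 23 15 15 18 18 37 38 13 12"),
}


def varying_positions(results):
    """Zero-based positions where the tests do not all agree."""
    rows = list(results.values())
    return [i for i in range(len(rows[0])) if len({row[i] for row in rows}) > 1]


def abbreviate(values, positions):
    """Alphabetic signature restricted to the given positions."""
    return "".join(LETTERS[values[i] - 1] for i in positions)


positions = varying_positions(results)
print([i + 1 for i in positions])  # [9, 23, 24, 32, 36]
for kit, values in results.items():
    print(kit, abbreviate(values, positions))
# Kit1001 kpqql
# Kit1002 lopql
# Kit1003 lpqrm
```

Only five of the 37 markers vary in this data set, which is why the signatures shown in the chart are five characters long.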

Derived Signatures

The Y-DNA Overlay chart includes an option to compute a DNA signature for parents in the chart. For the MRCA, a derived signature is often called a modal signature and becomes the standard to which other test results are compared. The word modal is not accurate in this context because the DNA Overlay Chart in Second Site does not derive DNA signatures using the statistical mode, i.e., it does not choose the values that occur the most frequently in the set of markers. It uses a maximum parsimony method as described below.

Second Site analyzes DNA results within the context of the lines of descent to compute the DNA signature of fathers in the chart. The rules are designed to choose marker values that would produce the observed test results of descendants via the minimum number of mutations. The rules are as follows (see Note 2).

  1. A father with only one son (no branching) is assumed to have the same marker value as his son. If the son's marker value is (un)certain, then so is the father's.
  2. A father at a branch point is assumed to have the value derived from the values of his sons that minimizes mutation probabilities. If this is (un)certain, then the father's value is (un)certain.

The result of employing those rules is shown in the example.

Using rule #1, Kit1001's DNA test result is pushed up the tree to Bradford and becomes Bradford's DNA signature. That does not mean that Bradford actually had those marker values; we move the values up to the earliest branch in the tree where we have an opportunity to compare results of siblings. Some mutations may have occurred between Bradford and Kit1001, but without more test results we cannot tell where. Luckily, the exact location is not required.

Similarly, Kit1002's DNA test result is pushed up the tree to Aaron and Peter's DNA test result is pushed up the tree to John.

The DNA signature for Thomas is based on rule #2 and the values derived for his sons Timothy and John.

The DNA signature for Timothy is a little more involved. Based solely on his sons' DNA signatures, Timothy's DNA signature would be {kl}{op}{pq}ql, where the values in braces are possible values for a given marker. The DNA signature for his father Thomas is the result of combining {kl}{op}{pq}ql and lpqrm. Note that if one son has {kl} ("k or l") for a marker, and the other son has a value l for that marker, then the father's likely value is l, the value that requires the fewest mutations. Thomas's DNA signature is therefore lpq{qr}{lm}. {qr} and {lm} remain uncertain because there is a tie: one son has ql and the other has rm.

After Thomas's signature is derived, Second Site reviews the DNA signatures for his descendants and at that point, the uncertainties in Timothy's DNA signature are resolved using values that minimize mutations. This is an extension of rule #2.
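
The walkthrough above can be reproduced with a small two-pass procedure. The sketch below uses the classic Fitch-style small-parsimony approach as a stand-in: a bottom-up pass that applies rules #1 and #2 per marker, then a top-down pass that resolves a son's uncertain values against his father's. It is an illustration under assumptions rather than Second Site's code; the function names are hypothetical, single-son generations are collapsed, and the tree uses the names from the prose (Thomas, Timothy, John, Bradford, Aaron).

```python
from collections import Counter


def bottom_up(children, observed, person, derived):
    """Derive candidate value sets for each marker, from the tests upward."""
    if person in observed:
        derived[person] = [{value} for value in observed[person]]
        return derived[person]
    sons = [bottom_up(children, observed, son, derived) for son in children[person]]
    signature = []
    for marker_sets in zip(*sons):
        # Keep the values supported by the most sons. With one son this
        # copies his set (rule #1); ties stay uncertain (rule #2).
        counts = Counter(value for s in marker_sets for value in s)
        best = max(counts.values())
        signature.append({value for value, n in counts.items() if n == best})
    derived[person] = signature
    return signature


def top_down(children, person, derived):
    """Resolve a son's uncertain markers using his father's values."""
    for son in children.get(person, []):
        derived[son] = [(s & f) or s for s, f in zip(derived[son], derived[person])]
        top_down(children, son, derived)


def show(signature):
    parts = []
    for values in signature:
        text = "".join(sorted(values))
        parts.append(text if len(values) == 1 else "{" + text + "}")
    return "".join(parts)


children = {
    "Thomas": ["Timothy", "John"],
    "Timothy": ["Bradford", "Aaron"],
    "Bradford": ["Kit1001"],
    "Aaron": ["Kit1002"],
    "John": ["Kit1003"],
}
observed = {"Kit1001": "kpqql", "Kit1002": "lopql", "Kit1003": "lpqrm"}

derived = {}
bottom_up(children, observed, "Thomas", derived)
timothy_first_pass = show(derived["Timothy"])
print(timothy_first_pass)        # {kl}{op}{pq}ql
print(show(derived["Thomas"]))   # lpq{qr}{lm}

top_down(children, "Thomas", derived)
print(show(derived["Timothy"]))  # lpqql
```

After the second pass Timothy's signature is fully resolved, while Thomas keeps {qr} and {lm} because nothing above him in the chart can break the tie.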

Second Site derives probable DNA signatures. It is possible for unlikely mutation sequences to occur, such as multiple mutations occurring in the same generation and thus obscuring the DNA signature of the father. All methods for deriving DNA signatures are subject to uncertainty, but the maximum parsimony method used by Second Site reduces the possibility that a bias in the test population (many people tested from one branch of the family) unduly influences the derived DNA signature of the MRCA.

Alphabetic or Numeric Marker Values

Second Site expects that marker values in TMG's DNA Log will be numeric. By default, Second Site will display those values using an alphabetic code where a=1, b=2, ... z=26, A=27, B=28, etc. The alphabetic code is much more concise and lends itself well to both display and analysis. For users who want to use the more traditional—and more lengthy—numeric values, the DNA Overlay Chart includes a Marker Format property that can be set to "Numeric". See the "Left to Right, UnOptimized" Chart for an example of the numeric format.
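
To make the two formats concrete, here is a small sketch that renders a run of marker values either alphabetically or numerically, wrapping markers with more than one candidate value in braces. The function names and the `alphabetic` flag are hypothetical and merely stand in for the Marker Format property; the sample values are the last eight markers of the MRCA's signature in the example chart.

```python
import string

# a=1 ... z=26, A=27 ... Z=52; values above 52 are not handled in this sketch.
LETTERS = string.ascii_lowercase + string.ascii_uppercase


def format_marker(values, alphabetic=True):
    """Format one marker; `values` holds one or more candidate values."""
    if alphabetic:
        text = "".join(LETTERS[v - 1] for v in values)
    else:
        text = ",".join(str(v) for v in values)
    return text if len(values) == 1 else "{" + text + "}"


def format_signature(markers, alphabetic=True):
    # Letters can be run together; numbers need a separator to stay readable.
    separator = "" if alphabetic else " "
    return separator.join(format_marker(m, alphabetic) for m in markers)


mrca_tail = [[15], [15], [17, 18], [18], [37], [38], [12, 13], [12]]
print(format_signature(mrca_tail))                    # oo{qr}rKL{lm}l
print(format_signature(mrca_tail, alphabetic=False))  # 15 15 {17,18} 18 37 38 {12,13} 12
```

The alphabetic form uses one character per certain marker, which is what keeps the signatures in the chart boxes short.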

Test Result Conversion

Note that two descendants had Family Tree DNA 67-marker test results, and one had a Family Tree DNA 37-marker test result. The chart was configured to favor the 37-marker test, and so Second Site converted the 67-marker tests to 37-marker tests.

Second Site has logic for converting test results from one test to another, and includes logic for adjusting values based on differences between test lab practices. In some cases, the user may find it necessary to create a test result of the desired type and manually compute the marker values.


Notes and Acknowledgements

Alvy Ray Smith and Robert Charles Anderson were instrumental in the design of the DNA Overlay Chart.

  1. Robert Charles Anderson first described to me the basic form of a chart where DNA evidence was overlaid on a descendant-type chart.
  2. Alvy Ray Smith described the DNA signature derivation algorithm in his article "The Probable Genetic Signature of Thomas1 Riggs, Immigrant to Gloucester, Massachusetts, by 1658", The New England Historical and Genealogical Register, volume 164, April, 2010. Mr. Smith described the two rules (described above) that Second Site adopted to push marker values up the tree from descendants to ancestors, including using maximum parsimony to choose a marker value when there are multiple candidate values.

    Second Site has also implemented two other practices recommended by Mr. Smith: (A) using alphabetic codes for marker values and (B) using "{" and "}" to indicate markers with multiple possible values.