By: LaKisha David and Leia Jones
Criteria of Relatedness
For the first time in history, technology is available to identify living members of African ancestral family groups. For African Americans, this means that they can identify the specific African ancestral family groups from whom their enslaved African ancestors were taken during the Transatlantic Slave Trade. For our Kassena participants from northern Ghana, this is a new social context in which to make kinship meanings. Our emphasis here is not on ancestral geography or ethnic group identification but on the ability to identify and communicate with direct living African and African diaspora relatives within 10 generations of shared ancestral great grandparents. This challenges the common narrative that African American separation from Africa was too long ago for family history or relatives to be discovered, but it also presents new questions about the meaning of family.
For our criteria, we drew from work on cryptic distant relatives (Henn et al., 2012). We searched for 4th to 9th cousin genetic relatedness between Ghanaians and people of African descent, meaning that the Ghanaian and the person of African descent share a common ancestor within the last 5 to 10 generations and within the last 300 years (Henn et al., 2012). Genetic relatedness was measured using the DNA segment sharing algorithms on GEDmatch. We used “the length of DNA segments that are consistent with identity by descent (IBD) from a common ancestor” (S. R. Browning & Browning, 2007; Gusev et al., 2009; Henn et al., 2012, p. 1), measured in centiMorgans (cMs), as the genetic similarity metric. Matching is based on similarity of autosomal single nucleotide polymorphisms (SNPs, pronounced “snips”). The amount of DNA shared in a cousin dyad depends on the number of shared ancestors and the number of generations between the cousins and the shared ancestors. The greater the number of shared ancestors or the shorter the generational distance, the greater the amount of shared DNA in cMs between the cousins. The amount shared between cousins varies greatly, such that at certain generational distances cousins will not show matching DNA at the SNP locations even though they are biologically related (Henn et al., 2012).
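To illustrate how quickly expected sharing falls with generational distance, the following minimal sketch computes the expected shared cMs for full n-th cousins from the standard coefficient of relationship, (1/2)^(2n+1). The total genome length of 7,168 cM is an assumption made here only because it reproduces the 14 cM and 0.055 cM endpoints attributed to Henn et al. (2012) elsewhere in this section; the function names are illustrative and are not part of GEDmatch or of the cited work.

```python
# Expected autosomal sharing between full n-th cousins (illustrative sketch).

ASSUMED_GENOME_CM = 7168.0  # assumption: total map length counted over both genome copies


def relationship_coefficient(cousin_degree: int) -> float:
    """Expected fraction of DNA shared by full n-th cousins.

    Full n-th cousins share two ancestors, and each ancestor is connected to
    the pair by 2(n + 1) meioses: 2 * (1/2)**(2(n + 1)) = (1/2)**(2n + 1).
    """
    return 0.5 ** (2 * cousin_degree + 1)


def expected_shared_cm(cousin_degree: int, genome_cm: float = ASSUMED_GENOME_CM) -> float:
    """Expected total shared cMs for full n-th cousins under the assumed genome length."""
    return genome_cm * relationship_coefficient(cousin_degree)


if __name__ == "__main__":
    for degree in range(4, 10):
        print(f"{degree}th cousins: {expected_shared_cm(degree):.3f} cM expected")
```

These are averages only. Because IBD segments are inherited in discrete pieces, many distant cousin pairs share no detectable segment at all, while others share a single segment well above the average.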
To ensure IBD segments, we used family-based phased matching. This means that our final results consist of segments that match the parent and progeny of both our Ghanaian participants and the unknown relatives in the database such that all four matched and indicated that they shared common ancestors within 10 generations. We show that there is genetic evidence that Ghanaians and people of African descent show relatedness within 10 generations, supporting the claim that families that were separated during the Transatlantic Slave Trade are reuniting.
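The idea behind family-based phasing can be sketched at the level of a single SNP: a parent's genotype is used to work out which of the progeny's two alleles came from that parent. The code below is a simplified illustration under stated assumptions (unordered two-allele genotypes, one genotyped parent, no genotyping errors). It is not the GEDmatch Phasing tool's implementation, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Genotype = Tuple[str, str]  # two unordered alleles at one SNP, e.g. ("A", "G")


def transmitted_allele(parent: Genotype, child: Genotype) -> Optional[str]:
    """Return the allele the child inherited from this parent, or None if unresolved."""
    shared = set(parent) & set(child)
    if len(shared) == 1:
        # Exactly one allele is common to both genotypes, so it must be the
        # one that was passed from parent to child.
        return next(iter(shared))
    # No shared allele (inconsistent data) or both heterozygous for the same
    # two alleles (ambiguous with only one parent): leave the site unphased.
    return None


def phase_with_parent(parent_snps: List[Genotype], child_snps: List[Genotype]) -> List[Optional[str]]:
    """Phase a child's genotypes SNP by SNP against one parent."""
    return [transmitted_allele(p, c) for p, c in zip(parent_snps, child_snps)]


if __name__ == "__main__":
    parent = [("A", "A"), ("A", "G"), ("A", "G"), ("G", "G")]
    child = [("A", "G"), ("A", "A"), ("A", "G"), ("A", "A")]
    print(phase_with_parent(parent, child))
```

Sites that cannot be resolved stay unphased rather than being guessed, which is one reason matching on parent-derived haplotypes is more conservative than matching on unphased genotypes.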
According to Henn et al. (2012), 4th to 8th cousins share 14 to 0.055 cMs of DNA. Their threshold was set to a minimum of 7 cMs based on their ability to find progeny-other and parent-other segments where the progeny also matched the parent (i.e., IBD for the parent and progeny) for 90% of the segments. They used unphased data in their work (Henn et al., 2012). Our minimum segment threshold was also set to 7 cMs. However, for the phased data that we used, we could have lowered the threshold to 2 cMs (S. R. Browning & Browning, 2007; Henn et al., 2012). Although there are benefits to being able to identify genetic matches using computationally phased data between two persons, our use of family-based phasing between two parent-progeny dyads ensures greater accuracy (Roach et al., 2010). Every segment recognized as a genetic match will be IBD because matching both parent and progeny is inherent in family-based phased data. Our method also enables the person of African descent to learn more about their genetic family history than data from only two persons would allow. For example, with our phased data, a person of African descent could learn that they, through their mother, are related to a person born in Ghana through the Ghanaian person’s father. This additional information discovered using family-based phased data is of value to people of African descent testing to learn about their relatedness with Africans.
Confirming Parent-Progeny Dyads
We did a one-to-one comparison (GEDmatch) between parent and progeny before creating the phased data files to ensure that each dyad consisted of a biological parent and progeny. Parent-progeny dyads share at least 3,400 cMs of DNA 100% of the time (Bettinger & Perl, 2017), so we expected the members of each dyad to share at least that amount. Members of the dyads shared 3,538.7 to 3,568.7 cMs, indicating that each dyad consisted of a biological parent-progeny pair (see Table 1). We then used the GEDmatch Phasing tool to create one phased profile for each dyad.
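The dyad check described above amounts to a simple threshold test. As a minimal sketch, the snippet below applies the 3,400 cM minimum to the total shared cMs reported in Table 1; the variable and function names are illustrative only, and the values were typed in from the table rather than retrieved from GEDmatch.

```python
PARENT_PROGENY_MIN_CM = 3400.0  # minimum from Bettinger & Perl (2017), as cited in the text

# Total shared DNA (cMs) per dyad, copied from Table 1.
table1_shared_cm = {
    ("Parent 1", "Offspring 1.1"): 3568.7,
    ("Parent 2", "Offspring 2.1"): 3554.8,
    ("Parent 3", "Offspring 3.1"): 3547.4,
    ("Parent 4", "Offspring 4.1"): 3560.4,
    ("Parent 5", "Offspring 5.1"): 3553.0,
    ("Parent 6", "Offspring 6.1"): 3551.6,
    ("Parent 7", "Offspring 7.1"): 3559.2,
    ("Parent 8", "Offspring 8.1"): 3543.5,
    ("Parent 8", "Offspring 8.2"): 3538.7,
}


def is_parent_progeny(shared_cm: float, minimum_cm: float = PARENT_PROGENY_MIN_CM) -> bool:
    """True when the shared total meets the parent-progeny minimum."""
    return shared_cm >= minimum_cm


confirmed = {pair: is_parent_progeny(cm) for pair, cm in table1_shared_cm.items()}

if __name__ == "__main__":
    for (parent, child), ok in confirmed.items():
        print(f"{parent} / {child}: {'confirmed' if ok else 'not confirmed'}")
```

A shared total at this level is also what duplicate uploads or identical twins would show, so the threshold alone does not rule those cases out.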
Table 1: Shared DNA between parent and progeny
|Parent||Offspring||Total DNA Shared (cMs)|| || |
|Parent 1||Offspring 1.1||3,568.7||511,616||151.8|
|Parent 2||Offspring 2.1||3,554.8||515,810||151.8|
|Parent 3||Offspring 3.1||3,547.4||514,166||151.8|
|Parent 4||Offspring 4.1||3,560.4||513,923||151.8|
|Parent 5||Offspring 5.1||3,553.0||515,325||151.8|
|Parent 6||Offspring 6.1||3,551.6||515,063||151.8|
|Parent 7||Offspring 7.1||3,559.2||514,105||151.8|
|Parent 8||Offspring 8.1||3,543.5||458,832||151.8|
|Parent 8||Offspring 8.2||3,538.7||459,252||131.7|
Note: As of 7 July 2019
Identifying Matching Parent-Progeny Dyads within GEDmatch
We used the phased profile for the rest of the matching. We used GEDmatch’s one-to-many tool to find all profiles within the database that matched at least one of our phased profiles at a minimum of 7 cMs on a single segment. The number of matching profiles for each participant dyad ranged from 7 to 50. These resultant matching profiles were unphased and so we were uncertain if the segments in the matching database profiles were actually IBD. To resolve this, we searched for parent-progeny dyads within the results for each participant phased profile.
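The one-to-many step can be thought of as a filter on the largest single shared segment. The sketch below assumes the results have already been copied into simple records; the `Match` class, the kit identifiers, and the cM values are invented for illustration and do not reflect GEDmatch's export format or our participants' results.

```python
from dataclasses import dataclass
from typing import List

MIN_SEGMENT_CM = 7.0  # minimum single-segment threshold used in this study


@dataclass(frozen=True)
class Match:
    kit_id: str                # hypothetical identifier for a database profile
    largest_segment_cm: float  # longest single segment shared with the phased profile


def filter_matches(matches: List[Match], minimum_cm: float = MIN_SEGMENT_CM) -> List[Match]:
    """Keep only profiles whose largest shared segment meets the threshold."""
    return [m for m in matches if m.largest_segment_cm >= minimum_cm]


if __name__ == "__main__":
    # Made-up example values.
    candidates = [
        Match("KIT-A", 9.7),
        Match("KIT-B", 6.4),
        Match("KIT-C", 7.0),
    ]
    for match in filter_matches(candidates):
        print(match.kit_id, match.largest_segment_cm)
```

Profiles that pass this filter are still unphased, so passing it does not by itself establish that the segment is IBD.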
We used the GEDmatch 3-D Chromosome Browser to identify matching profiles that also matched each other by sharing at least 3,400 cMs, indicating parent-progeny relatedness (Bettinger & Perl, 2017). Each parent-progeny dyad had 2 to 6 matches, except for the siblings (i.e., Progeny 8.1 and Progeny 8.2), who had 20 and 28 matches, respectively (see Table 2). Every 2 matches made up 1 identified dyad. For example, the phased profile for Parent 1 and Progeny 1 matched 2 identified parent-progeny dyads in the database. Parent 5 and Progeny 5 matched 3 identified dyads in the database.
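Finding dyads among the matching profiles is a pairwise search: any two matches that share at least 3,400 cMs with each other are treated as a candidate parent-progeny pair. The following sketch assumes the pairwise totals have been transcribed by hand into a dictionary; the kit identifiers and values are made up, and the function is not a GEDmatch feature.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple

PARENT_PROGENY_MIN_CM = 3400.0


def find_dyads(
    match_ids: Iterable[str],
    shared_cm: Dict[FrozenSet[str], float],
    minimum_cm: float = PARENT_PROGENY_MIN_CM,
) -> List[Tuple[str, str]]:
    """Return pairs of matching profiles that share enough DNA to be parent and progeny."""
    dyads = []
    for a, b in combinations(sorted(match_ids), 2):
        # Pairs with no recorded comparison are treated as sharing 0 cM.
        if shared_cm.get(frozenset((a, b)), 0.0) >= minimum_cm:
            dyads.append((a, b))
    return dyads


if __name__ == "__main__":
    # Hypothetical example: four matching profiles forming two dyads.
    matches = ["K1", "K2", "K3", "K4"]
    pairwise = {
        frozenset(("K1", "K2")): 3550.0,
        frozenset(("K3", "K4")): 3561.2,
        frozenset(("K1", "K3")): 12.4,
    }
    print(find_dyads(matches, pairwise))
```

In this invented example, 4 matching profiles resolve into 2 dyads, which mirrors the 2-matches-per-dyad pattern described above.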
We then did a one-to-one comparison between the phased profile and each member of the discovered dyad to confirm that the same segment matched in each comparison. This is useful in ensuring that the segment found in the GEDmatch database is also identical-by-descent for the discovered dyad. The Total DNA Shared (cMs, single segment) in Table 2 is the amount on which all four match each other (i.e., the Ghanaian parent-progeny dyad and the parent-progeny dyad found in the database). In each matching, the dyad set matched on a single segment, such that the shared cM is the amount shared on a single segment matching all four in the dyad set. For example, Parent 1 and Progeny 1 shared 9.7 cMs with a parent-progeny dyad found in the database. We regarded each dyad match as having an IBD segment. This means that the DNA segment was inherited by the Ghanaian parent and progeny and by the parent and progeny found in the database from a common ancestor within 10 generations, making them biological relatives.
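The four-way requirement can be expressed as an interval intersection: the segment shared by the phased Ghanaian profile with the database parent and the segment shared with the database progeny must overlap, and the overlap is the part common to all four. The sketch below assumes segment boundaries are given in cM map coordinates so that length is simply end minus start; the `Segment` type and the example coordinates are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence

MIN_SEGMENT_CM = 7.0


class Segment(NamedTuple):
    chromosome: int
    start_cm: float  # assumed genetic-map position where the match begins
    end_cm: float    # assumed genetic-map position where the match ends


def common_segment(segments: Sequence[Segment]) -> Optional[Segment]:
    """Return the region shared by every segment, or None if they do not all overlap."""
    if not segments:
        return None
    chromosome = segments[0].chromosome
    if any(s.chromosome != chromosome for s in segments):
        return None
    start = max(s.start_cm for s in segments)
    end = min(s.end_cm for s in segments)
    if end <= start:
        return None
    return Segment(chromosome, start, end)


def length_cm(segment: Segment) -> float:
    """Length of a segment in cMs under the map-coordinate assumption."""
    return segment.end_cm - segment.start_cm


if __name__ == "__main__":
    # Made-up coordinates: phased profile vs. database parent, then vs. database progeny.
    with_parent = Segment(7, 40.0, 50.5)
    with_progeny = Segment(7, 41.0, 52.0)
    overlap = common_segment([with_parent, with_progeny])
    if overlap is not None and length_cm(overlap) >= MIN_SEGMENT_CM:
        print(f"Shared by all four: chr{overlap.chromosome}, {length_cm(overlap):.1f} cM")
```

Because the phased profile already contains only DNA that the Ghanaian progeny inherited from the Ghanaian parent, two comparisons against the discovered dyad are enough to cover all four individuals.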
Table 2: Number of Genetic Matches in GEDmatch Database
|Parent||Offspring||Number of Genetically Matching||Total DNA Shared (cMs, single segment)|| |
|Parent 1||Offspring 1.1||4||9.7||819|
|Parent 3||Offspring 3.1||2||8.3||751|
|Parent 4||Offspring 4.1||2||7.0||883|
|Parent 5||Offspring 5.1||6||7.1 – 9.8||522 – 1,236|
|Parent 6||Offspring 6.1||2||10.4||950|
|Parent 8||Offspring 8.1||20||8.1 – 8.3||614 – 893|
|Parent 8||Offspring 8.2||28||8.1 – 11.1||617 – 938|
Note: As of 7 July 2019
The study of family identity among the Kassena people of Ghana toward their diaspora relatives is linked to the phenomenon of ancestral families separated during the Transatlantic Slave Trade reuniting using genetic genealogy. With such a reunification claim, we seek to illuminate the methods used in our study to identify extended relatives. For research, policy, and programs involving family reunification as an intervention, there is a need to develop methods vetted through the disciplines of population genetics and genetic anthropology for determining and interpreting relatedness with tools that are readily available for use in the general population.
Using the tools provided by GEDmatch, we were able to first confirm that our participants consisted of parent-progeny dyads and then to identify parent-progeny dyads within the GEDmatch database who were related to our participants. For discovered relatives in the database who are a part of the African diaspora, this provides evidence that African American and Ghanaian members of ancestral families that were separated during the Transatlantic Slave Trade can be identified and reunited.
After identifying matching parent-progeny dyads within the GEDmatch database, additional steps must be taken to learn more about the dyad’s ancestral history, including using GEDmatch’s admixture tools and contacting the match’s representative using the email information that the representative provided. The match’s representative also needs to confirm that the discovered parent-progeny dyad is in fact a parent and progeny and not two DNA profiles of one potential genetic match (e.g., a duplicate upload or profiles from two different companies) or twins.
Bettinger, B., & Perl, J. (2017). Shared cM Project 3.0 Tool v4 with relationship probabilities. Retrieved from https://dnapainter.com/tools/sharedcmv4
Browning, S. R., & Browning, B. L. (2007). Rapid and Accurate Haplotype Phasing and Missing-Data Inference for Whole-Genome Association Studies By Use of Localized Haplotype Clustering. The American Journal of Human Genetics, 81(5), 1084–1097. https://doi.org/10.1086/521987
Gusev, A., Lowe, J. K., Stoffel, M., Daly, M. J., Altshuler, D., Breslow, J. L., Friedman, J. M., & Pe’er, I. (2009). Whole population, genome-wide mapping of hidden relatedness. Genome Research, 19(2), 318–326. https://doi.org/10.1101/gr.081398.108
Henn, B. M., Hon, L., Macpherson, J. M., Eriksson, N., Saxonov, S., Pe’er, I., & Mountain, J. L. (2012). Cryptic Distant Relatives Are Common in Both Isolated and Cosmopolitan Genetic Samples. PLoS ONE, 7(4). https://doi.org/10.1371/journal.pone.0034267
Roach, J. C., Glusman, G., Smit, A. F., Huff, C. D., Hubley, R., Shannon, P. T., Rowen, L., Pant, K. P., Goodman, N., Bamshad, M., Shendure, J., Drmanac, R., Jorde, L. B., Hood, L., & Galas, D. J. (2010). Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science, 328(5978), 636–639.