Genetic Genealogical Methods Used to Identify African American Diaspora Relatives in the Study of Family Identity among Ghanaian Members of the Kassena Ethnic Group (Part 4)

By: LaKisha David and Leia Jones

Part 1: Introduction
Part 2: Genetic Genealogy Information Type for Genetic Matching
Part 3: Criteria of Relatedness for this Study, Results, and Discussion

Materials and Methods

Participants were 9 parent-offspring dyads (n = 18) and 2 parent-offspring dyads in which the same parent was paired with each of 2 siblings (n = 3), for a total of 11 parent-offspring dyads and 21 individual participants. All participants were at least 18 years of age and self-identified as members of the Kassena ethnic group residing in Paga, Ghana. Because 2 DNA kits for offspring failed to process, those offspring and their parents were removed from the sample, leaving a subsample of 9 parent-offspring dyads consisting of 17 individuals. Parents consisted of 4 men and 4 women with an age range of 47 to 80 (M = 64.88, SD = 12.30). Their offspring consisted of 5 men and 4 women with an age range of 19 to 39 (M = 29.44, SD = 8.49). The mean age of the subsample is 46.12 (SD = 20.84).
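As a quick consistency check on the figures above, the subsample mean can be recovered from the two group means weighted by group size. The sketch below uses only the reported values (8 parents, 9 offspring); the function name is ours and purely illustrative.

```python
def pooled_mean(groups):
    """Combine group means, weighting each by its group size."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# (n, mean age) as reported: 8 parents and 9 offspring
subsample = [(8, 64.88), (9, 29.44)]
overall_mean = round(pooled_mean(subsample), 2)
print(overall_mean)  # 46.12, matching the reported subsample mean
```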

The (former) Pikworo Slave Camp is located in the Nania village of Paga, Ghana, about 10 km north of Bolgatanga. The camp, used primarily from the 1500s to the 1800s, is associated with both the Transatlantic Slave Trade and the African Slave Trade as a site of bondage before captured people entered slave markets. It is also used for local memorial practices. Although elders in nearby villages still hold memories of the local slave trade, the tour guide emphasizes the site’s connection to people taken to the dungeons along the southern coast and eventually to diaspora locations such as North and South America and the Caribbean Islands. African Americans familiar with the site regard it as having great historical significance to their own ancestry and family narratives. Genetic genealogy could be used to support claims of relatedness and contemporary biological connections to the African diaspora.

Paga is a town that borders the country of Burkina Faso. It is the capital of the Kassena-Nankana West district. According to the 2010 Housing & Population Census, the Paramount Chief is Paga Pio. The main ethnic groups of the region are Mole-Dagbon, Grusi (of which the Kassena ethnic group is a part), Mande-Busanga, and Gruma. It has a patrilineal system of inheritance. As of 2010, the population of the Kassena-Nankana West district was 70,667 individuals, 14.0% of whom live in urban areas. The median age is 20 and the average age is 26, with 96.7 males per 100 females. The average household size is 5.0 for urban areas and 5.6 for rural areas. Among those 11 years and older, 47.8% are literate in English (some of whom are also literate in a Ghanaian language and/or French) and 49.8% are not literate in any language. Among those 15 years and older, 72.2% are employed.

Saliva Sample Collection

Saliva samples were collected in June – July 2018 and 2016 using collection tubes containing a DNA stabilizing solution. Participants who tested in 2016 (n = 3) were randomly selected by residents of the neighborhood. Offspring (n = 4) of the participants who tested in 2016 were purposively selected in 2018 based on our need to have parent-offspring dyads in the study. The remaining participants (n = 14) who tested in 2018 were randomly selected from a list of potential participants created by one resident of the neighborhood who was not a participant of the study. Potential participants were listed based on their willingness to have both a parent and offspring participate in the study. Saliva samples were collected in one public group gathering in 2016 and one in 2018 at the project site located at the former Pikworo Slave Camp in Paga, Ghana.

Overview of Project Procedures

In June 2018 we organized a community event to continue rapport building and to explain the project to the community. Because the project was developed in consultation with a community member, the emphasis was on being transparent and continuing to build rapport. The next day, selected participants gathered at the project site. Members of the research team explained the project, provided time to answer any questions participants had, and gathered informed consent. We then gathered the saliva samples, conducted round 1 (July 2018) of focus group discussions about family meanings and the diaspora, and provided a communal project mobile phone for participants to use to communicate with genetic matches. We returned to the U.S. with the saliva samples and sent them to a commercial lab for processing. After the DNA was genotyped, meaning that the specific variations of gene markers (alleles) were identified, we identified relatives within the GEDmatch database, provided the diaspora genetic matches with the contact information of their genetic matches in Paga, and provided the project coordinator selected from the community in Paga with the email address of each newly discovered diaspora relative. In March 2019, the community project coordinator collected round 2 of focus group discussion data about family meanings and the diaspora, which was analyzed using inductive and deductive thematic analyses.

Procedures for Identifying Genetic Matches in GEDmatch

The essential task was to determine which persons of African descent within a database are related to the participants residing in Ghana, supporting the claim that families that were separated during the Transatlantic Slave Trade are reuniting using autosomal genetic genealogy. We used several tools provided by the web platform GEDmatch (Software Version May 19 2019 00:02:33, Build 37) to identify, and then contact, genetic matches within its database.

Step one was to obtain genetic information in a text datafile for each participant. To obtain genetic information, we sent our saliva samples to Ancestry to process the DNA and create a DNA text datafile. Beyond genotyping accuracy, we selected AncestryDNA services for two main reasons: (1) the level of accuracy of GERMLINE-based matching and (2) access to millions of consumers to potentially match and connect with after the study ends.

Rather than reading the entire genome, AncestryDNA reads the DNA sequence at approximately 700,000 locations, called single nucleotide polymorphisms (SNPs), along the genome (Ball et al., 2016). Along with the company specific products such as the ethnicity estimate and DNA Matches, AncestryDNA provides the raw DNA text datafile that contains the Reference SNP cluster ID (rsID), chromosome and position of allele, and the unordered values of the alleles for up to approximately 700,000 SNPs. This raw DNA text datafile can be downloaded and used in other applications.
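To make the structure of such a raw datafile concrete, here is a minimal parsing sketch. It assumes a tab-separated layout with `#` comment lines followed by a header row of rsid, chromosome, position, allele1, and allele2; that layout, the sample rsIDs, and the names `Snp` and `read_raw_dna` are assumptions for illustration and should be checked against an actual download.

```python
import csv
import io
from typing import NamedTuple


class Snp(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    alleles: tuple  # unordered pair, stored sorted, e.g. ("A", "G")


def read_raw_dna(handle):
    """Yield one Snp per data row, skipping comment lines, blanks, and the header."""
    data_lines = (
        line for line in handle
        if line.strip() and not line.startswith("#")
    )
    rows = csv.reader(data_lines, delimiter="\t")
    next(rows)  # discard the header row (assumed layout)
    for rsid, chrom, pos, a1, a2 in rows:
        yield Snp(rsid, chrom, int(pos), tuple(sorted((a1, a2))))


# Hypothetical two-SNP file, not real participant data
sample = io.StringIO(
    "#raw data download (hypothetical example)\n"
    "rsid\tchromosome\tposition\tallele1\tallele2\n"
    "rs0000001\t1\t1000\tG\tA\n"
    "rs0000002\t1\t2000\tT\tT\n"
)
snps = list(read_raw_dna(sample))
```

Storing the two alleles as a sorted pair reflects that the file reports them unordered: at this stage nothing indicates which allele came from which parent.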

Step two was to create phased profiles for each of the 9 participant parent-offspring dyads. Although computational methods are improving, family-based phasing is the only certain way to align the allele datafile by biological parent (Roach et al., 2010; Tewhey et al., 2011), which gave us greater confidence that the DNA segments were identical-by-descent (IBD). IBD segments are segments that the offspring truly inherited from a parent. IBD segments are needed to identify genetic matches among unknown testers. To create the phased datafiles, we used GEDmatch’s Phasing tool, which compared the DNA datafile of the offspring participant with their parents’ DNA datafiles. This created two phased datafiles. One phased datafile consists of the DNA segments shared between the offspring and the biological mother (denoted with “M1”). The other phased datafile consists of the DNA segments shared between the offspring and the biological father (denoted with “P1”). We expected each parent-offspring dyad to share at least 3,400 cM of DNA. We used the phased datafile created between the offspring and the participating parent who contributed saliva for the study and deleted the phased datafile created for the non-participating parent.
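The logic of family-based phasing with a single genotyped parent can be sketched at the level of one SNP. This is a simplified illustration of the general idea, not a description of how GEDmatch’s Phasing tool is implemented; the function name and the two-letter genotype strings are assumptions.

```python
def phase_with_one_parent(child, parent):
    """Split a child's unordered genotype by parental origin.

    child, parent: two-letter unordered genotypes such as "AG".
    Returns (allele_from_this_parent, allele_from_other_parent),
    or None when the site cannot be resolved from one parent alone.
    """
    shared = set(child) & set(parent)
    if len(shared) != 1:
        # Two shared alleles: child and parent are the same heterozygote,
        # so either allele could have come from this parent.
        # No shared allele: the genotypes are inconsistent with inheritance
        # (for example, a genotyping error).
        return None
    from_parent = shared.pop()
    from_other = child.replace(from_parent, "", 1)  # the remaining allele
    return from_parent, from_other


resolved = phase_with_one_parent("AG", "AA")   # ("A", "G")
ambiguous = phase_with_one_parent("AG", "AG")  # None
```

Sites that return None are exactly where having only one parent limits certainty; resolving them would require the other parent's genotype or information from neighboring SNPs.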

Step three was to identify potential genetic matches within the GEDmatch database. To do this, we used the one-to-many comparison tool individually for each of our parent-offspring phased profiles. This provided us with a list of unphased profiles that shared at least 7 cM with our parent-offspring phased participant profiles. When using the one-to-many comparison tool, we had the option to adjust the threshold for the minimum amount of DNA that unphased profiles in the database must share with the phased profile of the participating parent-offspring dyads. As the length of an IBD segment decreases, the tools become less accurate in identifying genetic matches. While several sources claimed that a minimum of 4 or 5 cM is the appropriate cutoff, we conservatively set the cutoff to a minimum of 7 cM for a single segment.
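The thresholding described above amounts to a simple filter on the largest shared segment. The sketch below assumes the one-to-many results have been transcribed into records by hand; the `Match` fields, kit identifiers, and cM values are hypothetical and do not reflect a GEDmatch export format or study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    kit_id: str                # hypothetical identifier
    largest_segment_cm: float  # longest single shared segment
    total_cm: float            # total shared DNA


MIN_SEGMENT_CM = 7.0  # the conservative single-segment cutoff used in this study


def filter_matches(matches, min_cm=MIN_SEGMENT_CM):
    """Keep only matches whose largest segment meets the cutoff."""
    return [m for m in matches if m.largest_segment_cm >= min_cm]


# Illustrative values only
candidates = [
    Match("kit_A", 12.4, 12.4),
    Match("kit_B", 5.1, 9.8),   # total exceeds 7 cM, but no single segment does
    Match("kit_C", 7.0, 7.0),
]
kept = filter_matches(candidates)
```

Note that kit_B is excluded even though its total exceeds 7 cM, because the cutoff applies to a single segment rather than to the sum of several short ones.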

Step four was to identify parent-offspring dyads among the list of potential genetic matches for each of our parent-offspring phased datafiles. Selecting parent-offspring dyads among the unphased DNA matching datafiles is necessary to ensure that the matching segment that was IBD for our participant dyad was also IBD for the discovered dyad, thereby further reducing the chances of false-positive matches. To identify parent-offspring dyads among the potential genetic matches, we used GEDmatch’s 3D chromosome browser. For each phased profile, we viewed the resultant matrix, which compared the potential genetic matches with each other and displayed how much DNA each pair of potential genetic matches shared. We selected potential genetic matches that shared at least 3,400 cM with another potential genetic match, indicating a parent-offspring dyad, identical twins, or a second profile for the same person. When these criteria were met, this produced at least one set of matching parent-offspring dyads consisting of four individuals: the parent and their offspring from Ghana who were participants in this study and the parent and their offspring newly discovered within the GEDmatch database.
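Reading the pairwise matrix for dyads can be expressed as a scan over all pairs of potential matches. The sketch below assumes the matrix values have been transcribed into a dictionary keyed by kit pair; the kit identifiers and cM values are hypothetical and serve only to illustrate the 3,400 cM criterion.

```python
from itertools import combinations

PARENT_OFFSPRING_MIN_CM = 3400.0  # threshold used in this study


def find_candidate_dyads(kits, shared_cm, min_cm=PARENT_OFFSPRING_MIN_CM):
    """Return pairs of kits sharing at least min_cm with each other.

    shared_cm maps frozenset({kit_a, kit_b}) to total shared cM;
    pairs missing from the mapping are treated as sharing nothing.
    """
    pairs = []
    for a, b in combinations(sorted(kits), 2):
        if shared_cm.get(frozenset((a, b)), 0.0) >= min_cm:
            pairs.append((a, b))
    return pairs


# Illustrative matrix for three potential matches of one phased profile
kits = ["kit_A", "kit_B", "kit_C"]
shared_cm = {
    frozenset(("kit_A", "kit_B")): 3562.0,  # parent-offspring range
    frozenset(("kit_A", "kit_C")): 41.3,
    frozenset(("kit_B", "kit_C")): 38.9,
}
dyads = find_candidate_dyads(kits, shared_cm)
```

A pair above the threshold is only a candidate: as noted above, the same amount of sharing is also produced by identical twins or by two profiles uploaded for one person.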

Step five was to provide additional evidence that the four individuals within a matching set were related to each other by having overlapping segments, meaning that their DNA matched at the same locations along the genome. Observing the same IBD segment in all four samples indicates that the four individuals share a recent common ancestor within 10 generations based on the use of autosomal DNA, SNPs, and current technology. To provide supporting evidence of relatedness, we used GEDmatch’s one-to-one comparison tool, which provides the chromosome, start and end location on the genome, amount of shared DNA in cM, and number of shared SNPs for each segment shared by the two profiles in the comparison. We compared each of the phased participant DNA profiles to both individual profiles in the discovered parent-offspring dyad and confirmed that they had overlapping segments. Those meeting this criterion were listed as genetic matches to our participants.
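Checking for overlapping segments reduces to intersecting genomic intervals on the same chromosome. The sketch below assumes each one-to-one comparison has been reduced to a single segment record; the `Segment` type and the coordinates are hypothetical and are not taken from study data.

```python
from typing import NamedTuple, Optional, Sequence


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position on the chromosome
    end: int    # end position on the chromosome


def common_overlap(segments: Sequence[Segment]) -> Optional[Segment]:
    """Return the region covered by every segment, or None if there is none."""
    if not segments:
        return None
    chromosomes = {s.chromosome for s in segments}
    if len(chromosomes) != 1:
        return None  # segments on different chromosomes cannot overlap
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    if start >= end:
        return None
    return Segment(chromosomes.pop(), start, end)


# Illustrative: the phased participant profile compared with the
# discovered parent and with the discovered offspring
vs_parent = Segment("7", 10_000_000, 22_000_000)
vs_offspring = Segment("7", 12_500_000, 21_000_000)
overlap = common_overlap([vs_parent, vs_offspring])
```

A non-empty result means both comparisons match the participant profile at the same location, which is the pattern expected when all four individuals carry the same inherited segment.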

Step six was to contact the administrator (sometimes referred to as the manager) of the parent and offspring profiles using the email address provided in the GEDmatch database. For each initial contact, we provided information about the Ghanaian parent and offspring genetic matches, the project, and the contact information for the project phone held by the research team member in Ghana. That team member was copied on each email. For newly identified genetic matches who replied, we confirmed that the non-participant members of the genetic match were an African American parent and offspring. These newly discovered persons were recorded as being genetic matches with the Ghanaian participant parent and offspring, sharing a common ancestor within 10 generations.

Part 5: Conclusion and Reflections and Thoughts


Ball, C. A., Barber, M. J., Byrnes, J., Carbonetto, P., Chahine, K. G., Curtis, R. E., … Wilmore, L. (2016, March 31). AncestryDNA Matching White Paper: Discovering genetic matches across a massive, expanding genetic database. Retrieved from

Roach, J. C., Glusman, G., Smit, A. F., Huff, C. D., Hubley, R., Shannon, P. T., Rowen, L., Pant, K. P., Goodman, N., Bamshad, M., Shendure, J., Drmanac, R., Jorde, L. B., Hood, L., & Galas, D. J. (2010). Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science, 328(5978), 636–639.

Tewhey, R., Bansal, V., Torkamani, A., Topol, E. J., & Schork, N. J. (2011). The importance of phase information for human genomics. Nature Reviews Genetics, 12(3), 215–223.
