An Overview of DNA Comparison Matrices in Multiple Kit Analysis

Video Transcription

It’s another video about the Multiple Kit Analysis tool, and today we are going to be talking all about matrices.

Howdy, I’m Andy Lee with Family History Fanatics, where we help you understand your DNA, climb your family tree, and write your ancestor story along the way. In GEDmatch, there is a tool called the Multiple Kit Analysis tool, and as part of that tool there is a function where we can look at multiple people compared to multiple other people all at the same time. Now, the Multiple Kit Analysis tool is a Tier 1 tool, so you do have to subscribe to the website in order to use it, but there are multiple ways that we can look at multiple people. Now I’m gonna just go over to my tag group selection. I’m going to select one of my tag groups and I’m going to visualize that group so I can start using the Multiple Kit Analysis tool on the visualization options. The one we’re talking about today is the matrices tab.

You can see that there are six different options. We can look with the autosomal matrix, we can look with the generations matrix. We can look with the XDNA matrix, the overlap matrix, the fully identical SNP matrix, and the fully identical regions centimorgan (cM) matrix comparison. Each one of these comparisons has some useful information that you may want to use in your research. Let me start with the autosomal matrix comparison. I’m gonna leave all of the presets just the same as they are, and I’m going to look at this tag group of some family members of mine. After a few seconds or so, depending on how many people you have, this is the page that you’ll be seeing. Now in this you can see that it has a list of all the people in that group on the side, and it has a list of the exact same people along the top. At each one of these intersections, the box shows the amount of DNA that those two people share.

And you can see that where the same person intersects with themselves, you have this gray box, and that’s going to just be a diagonal all the way across your matrix because those are the same people and we’re really more concerned with how that person matches somebody else. Now these are color coded, so the more closely related, in other words, if they are parents or siblings, then they’re going to be a green or even a dark green. If they are more like first cousins or second cousins or third cousins, we’re going to see oranges to yellows. And if they’re really distant, then they’re going to be red. And if there’s no shared DNA, then the block is going to be blank. Now, before auto clustering, this is a tool that I would use to actually begin to create clusters. I download this to a spreadsheet and then move everything around to create my own clusters.
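The spreadsheet clustering described above can be sketched in a few lines of Python. This is only a minimal illustration, not how GEDmatch or any auto-clustering tool works: the kit names, the shared centimorgan values, and the 50 cM cutoff are all made-up assumptions, and the grouping rule (any two kits sharing at least the cutoff end up in the same cluster) is a simple stand-in for moving rows and columns around by hand.

```python
from collections import defaultdict


def cluster_kits(shared_cm, threshold_cm=50.0):
    """Group kits so that kits linked by a match >= threshold_cm share a cluster.

    shared_cm maps a pair of kit names to shared centimorgans.
    The threshold is an arbitrary illustrative cutoff, not a GEDmatch setting.
    """
    kits = sorted({kit for pair in shared_cm for kit in pair})

    # Build a simple "who matches whom" lookup above the cutoff.
    neighbors = defaultdict(set)
    for (a, b), cm in shared_cm.items():
        if a != b and cm >= threshold_cm:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Walk the lookup to collect connected groups of kits.
    seen = set()
    clusters = []
    for kit in kits:
        if kit in seen:
            continue
        seen.add(kit)
        stack = [kit]
        group = []
        while stack:
            current = stack.pop()
            group.append(current)
            for other in neighbors[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(group))
    return clusters


# Hypothetical matrix cells (kit names and values are invented for illustration).
example = {
    ("A", "B"): 3500.0,
    ("A", "C"): 850.0,
    ("B", "C"): 420.0,
    ("D", "E"): 95.0,
    ("A", "D"): 0.0,
    ("C", "E"): 12.0,
}

clusters = cluster_kits(example)
print(clusters)
```

Raising or lowering the cutoff changes how the groups split or merge, which is roughly what happens when you decide by eye which boxes in the matrix count as belonging together.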

And I still use that spreadsheet approach because a lot of this DNA data, the amount of shared centimorgans, is very, very valuable. You can start to see, for instance, this person right here, well, they have a parent-child relationship. It looks like a sibling relationship, a sibling relationship, probably a first cousin relationship. So already, just by looking at the centimorgan information, I can see a lot of data about many of these people only because I’m comparing it all together. So that was the autosomal matrix comparison. Now, one thing that is very similar to that and provides a lot of the same data is the generations matrix comparison. Let’s take a look at that one. The generations matrix looks almost identical. It is the list of people on the side, the list of people across the top, and the intersection boxes. Now, instead of having how many shared centimorgans, it has how many generations removed, and it’s been calculated.

And so 1 is obviously very closely related, parent-child; 1.2 is usually siblings. And then as you get further and further out, you’re gonna be more and more distantly related. So you can see real quickly, hey, this actually has a nice little box right here, a cluster. It’s basically a cluster. So this is another way to help start to form some of these clusters and see how they might be related. Here’s another really big cluster of people right down here that are fourth or fifth generation, but they all share DNA in common. So these first two tools, the autosomal and the generations matrix comparisons, are ways of looking at really the same thing just with a slight change. Now, the XDNA comparison is going to be very similar, but this is just your XDNA matches. So if you’re using really big groups, then you’re not going to get a lot of data.

There’s gonna be a lot of blank boxes. I’m gonna switch over to another tag group in order to do the XDNA comparison. This matrix is of my grandfather’s XDNA matches. Now, because he’s male, he only received one X chromosome and that was from his mother. So all of these matches have to be on his maternal side. Now we can see just from this that some of these are actually pretty close. So for instance, these two people are probably a mother and son or a mother and daughter because of how much DNA on the X chromosome they share. But even these others are actually significant enough, 15 or more centimorgans, that they might be able to figure out the relationship depending on how distant the autosomal DNA shows that relationship is. So the XDNA matrix is another way that we can look at a certain subset of our matches.

Next, we get to the overlap matrix comparison. Now, the overlap matrix comparison I don’t find as useful, but if you want to see how different sets may match up with each other as far as their kits and which ones are gonna be the most likely to have good matches, then you might use the overlap matrix comparison. Let me show you. So what we can see on this matrix is that it’s showing how many SNPs, so in this case 406,000 SNPs, that these two kits share in common. Now, depending on where they tested and when they tested, you might see numbers as low as less than a hundred thousand, to as much as, here’s one that is 600,000. For those kits that have less overlap, if you wanna understand what that means for your matches, I have a video that I can leave a link to at the end of this video, but for the most part, a lot of the algorithms are going to find the matches that you want based on the number of SNPs that are already there.

These last two matrices, the fully identical SNPs and the fully identical regions, are used primarily if you are trying to determine which kits are parent-child, or which kits are the same people. And that’s because parent-child pairs and identical twins, or kits from the same person, are both reported at around 3,500 centimorgans. Let me show you how this can help you determine whether they are the same people or whether they’re a parent and child. So start with the fully identical SNP comparison. This is comparing all of the fully identical SNPs and what fraction of all of your SNPs are fully identical between any two people. Now, you would think that, oh, well, these two people are different. They don’t share a lot. Actually, most people probably have about half to two thirds of all of their SNPs in common with any other one person. So in this case, this group of people is my dad, myself, Phil (who we’ll find out is actually me), both of my dad’s parents, and his siblings. My dad’s parents, my grandparents, are not related, but we can see right here that a little more than half of all of their SNPs are fully identical.

Now, if I’m looking at myself and Phil, we see that it’s basically 0.9999. That is almost the exact same thing as saying 1.0. All of our SNPs are the same. On the other hand, if I look at my dad and me, I can see that hey, it’s about two thirds of the SNPs. If I look at my dad and his siblings, about 0.7 of all of the SNPs are identical. Let’s go back to an autosomal matrix and see where these 3,500s are and how they show up over here. So I can see here that hey, these first four people are 3,500 and Phil has a 3,500 right here with me. And then there are these 3,500s over here. This is my dad’s siblings, which means that these are his parents. Where do those show up on this other matrix? Well, we said here were the first four.

Notice that these are about two thirds. And who were all these people? Well, these are all parent-child relationships. There’s myself compared to my dad and my dad compared to his parents. The next one was Phil and me, who are the same person, and that one was almost a one. And then here are the last ones of the 3,500s, and this is my dad’s siblings to his parents, or their parents. And it’s also about the exact same amount as these other ones are for a parent-child relationship. So we can quickly see with this which matches are the same person, and we can eliminate those kits because we only need to compare with one of those kits, and which matches are parent-child relationships. Now if that matrix was showing too much information for you, then go down and use the fully identical cM matrix comparison.
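The same-person versus parent-child check walked through above can be written down as a small rule. This is a rough sketch under assumptions: the function name is made up, and the cutoffs (3,300 cM for "about 3,500" and 0.95 for "almost a one") are illustrative guesses chosen to separate the values seen in this example, where a parent-child pair shows about two thirds fully identical SNPs and the same person shows almost 1.0. They are not GEDmatch settings.

```python
def classify_full_match(shared_cm, fully_identical_fraction,
                        min_full_match_cm=3300.0, same_person_fraction=0.95):
    """Label a pair of kits that share roughly the whole autosomal genome.

    shared_cm: value from the autosomal matrix.
    fully_identical_fraction: value from the fully identical SNP matrix (0 to 1).
    Both cutoffs are illustrative assumptions, not GEDmatch defaults.
    """
    if shared_cm < min_full_match_cm:
        # Not in the ~3,500 cM range, so this check does not apply.
        return "not a ~3,500 cM match"
    if fully_identical_fraction >= same_person_fraction:
        # Nearly every SNP is fully identical: same person (or an identical twin).
        return "same person or identical twin"
    # About 3,500 cM shared but only around two thirds fully identical SNPs.
    return "parent-child"


# Values loosely based on the numbers mentioned in the video.
print(classify_full_match(3500, 0.9999))  # same person or identical twin
print(classify_full_match(3500, 0.67))    # parent-child
print(classify_full_match(2600, 0.70))    # not a ~3,500 cM match
```

Once a pair comes back as the same person, one of the two kits can be dropped from the group, since comparing against both adds nothing.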

The fully identical cM matrix is comparing those kits that have fully identical, not just half identical, but fully identical regions. So you can quickly see which ones are siblings and which ones are the same person. So for instance, before I was looking at me and Phil, and we share all of our DNA in common because we are the same person. On the other hand, I can see these siblings of my dad and how much DNA they share with him, as well as how much DNA they share with each other. What you don’t see on here is the parent-child relationships. So depending on what you want to examine, you may use the fully identical cM matrix or you may use the fully identical SNP matrix. Matrices are a great way to start analyzing multiple matches at the same time. And this tool, the Multiple Kit Analysis tool, is worth having a subscription to GEDmatch. Now, if you want to learn more about overlap, I have a video up here, and if you wanna see some more GEDmatch videos, there’s a playlist right here. But be sure to subscribe to our channel and make sure you like it and leave a comment down in the section below.