Tuesday 31 December 2019

New Year

Happy New Year, and happy birthday to lovely Adhara and her littermates, 6 years old today!

Sunday 15 December 2019

'Embark' dog DNA analysis

People have been asking me for a while about a DNA testing lab that offers a test called 'Embark' that seems to have started up a couple of years ago. A few months ago I tried this test on one of my dogs and had an email exchange with some of the people involved about the test and the results. The staff were generally helpful, but I found my queries got passed around a lot and a lot of them ended up not being answered. I was asked to write a review of it, which I did, and someone contacted me to ask if I could explain the points I made in the review a bit more, which I did as best as I could by email. I have been meaning to write a more detailed response on my blog about the thoughts I had on it, so this is me getting round to it.

'Embark' seems to be set up as a direct competitor to Genoscoper's product 'mydogdna', which I have been using for some years, and is a DNA analysis that provides health tests relevant to a dog's breed as well as other genetic information, such as coat colour, and some information the lab has collected for the breed as a whole, to give an insight on things like heterozygosity and how your dog compares to its breed. Unfortunately my main conclusion was that Genoscoper's website and tools at their current stage of development are far better designed than Embark's, perhaps as the inevitable result of the head start it's had.

One of the things that complicates this as a resource for breeders so much is that 'Embark' tries to be something for everyone, and one of the things it tries to be is a toy so that people who have a mixed-breed dog from a shelter can find out what ancestry their dog might have. No doubt this is a really enjoyable feature for the audience it's intended for, and I am absolutely not averse to science as cool toys, particularly if funding from it is (hopefully) being used to bankroll research into more useful things. But unfortunately the software for breeders, which ought to be serious and not a toy, seems to be built on the same architecture as the fun test, which means a lot of it is irrelevant, uninteresting, or just outright bizarre:


I know this dog is a poodle because that's what I wrote on the form when I sent it in. I want hard data on what this analysis has found about the breed and I'm not interested in all this fluff. Furthermore, the 'software' itself is a website that is difficult to navigate and seems to be intended for use on mobile phones. The information that is relevant is sparse and spread over many different pages and difficult to access.


This is something that potentially is interesting -- it's identification of the dog's mitochondria and y-chromosome. This relates to the matriline and patriline of the pedigree, something I've written about before. And the image above is actually all the worthwhile information on two pages, with the rest being similar fluff of no relevance to breeders. It's good that they at least identify the markers, but there is no breed-specific information given -- how many different ones have been found in this breed, and at what occurrence? Which are common, and which are unusual in this breed?


This is completely useless. It concerns the dog leukocyte antigen, which has been the subject of various studies in the past. There is no explanation given of what 'low diversity' means (possibly that the dog is homozygous?). There is nothing to identify the DLA haplotypes, so they can't be compared with other dogs if you were involved with the other studies. The pie chart manages to convey no information about anything.


These are the colour test results for the dog. This is one thing it does do well (and for once has managed to fit all the data onto one page) so no complaints with that.


This is the health test component. In order to see all the results on one page I had to download a PDF. I am not really sure why the two bottom tests are included as being relevant to standard poodles, as I don't recall seeing any research on them or hearing of them occurring in the breed. The blue icon on the right I believe is OFA (a registry in the USA that records test results on American dogs) so presumably the OFA doesn't recognise them. The first condition, so far as I can find out, isn't a disease, but a probably harmless trait found in a spaniel breed that can confuse diagnosis of other issues that occur specifically in this breed. The other tests are widely accepted to be relevant to standard poodles, and this is one of the main benefits of using this test. I have written before about caution in interpreting the results of tests that may not be relevant to a dog's breed.


This is very similar to something Genoscoper's test includes. Unfortunately there is not a lot of information given. 'Genetic COI' is a bizarre term (COI or F is generally considered to be a probability-based estimate of homozygosity calculated from a pedigree). The research papers I've seen by the lab that runs 'Embark' use a term FROH where ROH are runs of homozygosity used to estimate the same thing. I can only assume that FROH is what's on the x-axis in this graph, and that it's calibrated to be directly comparable to F, the inbreeding coefficient calculated from a pedigree. Genoscoper allows people to set their dogs to public view so you can actually compare different dogs, and on their test the heterozygosity tended to come out at the same sort of magnitude as what an accurate 15-generation COI predicted, with slight variation as to be expected (siblings from a line-bred pedigree tested as 33.8 and 34.4 homozygous). The dog analysed here has a 15-generation COI of around 4%. It's not possible to compare that with other dogs as there don't seem to be any public data accessible on the site, or if there are, I wasn't able to find them.
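As a rough illustration of what a figure like FROH represents: it is generally defined as the proportion of the genome that lies inside runs of homozygosity. The sketch below assumes that definition and nothing more. The function name, the segment coordinates and the genome length are all made up for the example, and I have no details of how Embark actually calculates or calibrates its number.

```python
def f_roh(roh_segments, genome_length_bp):
    """Proportion of the genome covered by runs of homozygosity.

    roh_segments: iterable of (start_bp, end_bp) pairs, assumed not to overlap.
    genome_length_bp: total length of genome covered by the markers (assumed known).
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    covered = sum(end - start for start, end in roh_segments)
    return covered / genome_length_bp


# Hypothetical dog: two homozygous runs totalling 3 Mb on a 100 Mb stretch.
example_segments = [(0, 2_000_000), (5_000_000, 6_000_000)]
print(f_roh(example_segments, 100_000_000))
```

A value worked out like this is a direct measurement of the dog's own DNA, whereas a pedigree COI is a prediction, which is why the two would be expected to be similar but not identical.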


This is the most interesting part, and unfortunately it's buried right at the bottom of the results and difficult to find. It's actually a map of the dog's chromosomes showing which regions are homozygous as ascertained by the markers sampled in the analysis (really it should be called homozygosity by chromosome, not 'inbreeding'). This sort of map has huge potential and it's not really being used yet. In the mixed-breed dogs that are on the website, although they are rather difficult to find, there were maps of paired chromosomes that showed six-way-plus mutts and chromosomes recombining in real time, which is absolutely fascinating.

A unique 5-way proper salt-of-the-Earth mutt! It's possible to tell from the proportions and distributions that one of this dog's parents was descended from Shih Tzus and Chihuahuas and the other from the other heritage.

How do they do this? I use analogies a lot and people seem to find it helps. So I am going to say, chromosomes are like books. Books tend to fall apart in particular ways, as do chromosomes when they undergo meiosis to produce sperm and ova. When gametes are formed, each pair of chromosomes in the parent, contributed from each grandparent, breaks into segments and recombines randomly to make a new single chromosome that will be packed into a sperm or egg ready to make a new organism. A roughly-handled or well-used old book might crack at the spine and the pages might come out in four or five chunks, and this is what happens in the first generation. But if you continue to use the book, you will damage it further, separating individual pages, and you can even rip up pages. But even on a ripped page, certain phrases can be identified as coming from a particular book, for example: 'It was the best of times, it was the worst of times' is instantly identifiable. By using large 'runs' of genetic data (according to the literature on the test, they are 50 kb, i.e. fifty thousand base pairs) the test presumably identifies genetic 'phrases' found in each breed.

A funny book, in the early stages of meiosis, possibly a victim of its own success.

If a dog's pair of chromosomes both have a long paragraph that is completely identical, this indicates those areas are identical by descent, and probably came about from recent inbreeding. If the areas are fundamentally the same with some small differences, such as American instead of British spelling, this suggests there was a more distant relationship between the parents. If the areas are different, this suggests relative unrelatedness.

"The soul becomes dyed with the colour of its thoughts" : "The soul becomes dyed with the colour of its thoughts"
(homozygous)

"The soul becomes dyed with the colour of its thoughts" : "The soul becomes dyed with the color of its thoughts"
(one mutation, possibly indicating more distant inbreeding)

"The soul becomes dyed with the colour of its thoughts" : "The soul is dyed with the color of its thought"
(similar with obvious changes, implies very distant ancestral relationship)

"The soul becomes dyed with the colour of its thoughts" : "The pigment of thought stains the soul"
(same meaning, same language, but different -- heterozygous and relatively unrelated origins)
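For anyone who prefers code to books, here is a very simplified sketch of how identical 'paragraphs' might be picked out of one dog's results. It assumes the markers on a chromosome are given in order as a position and two alleles, and it simply looks for unbroken stretches where both alleles match. Real tools that do this tolerate the odd mismatch or missing marker, so treat this as a toy. The function name, the minimum length and the example markers are all invented.

```python
def find_roh(markers, min_length_bp=50_000):
    """Return (start_bp, end_bp) for each unbroken run of homozygous markers.

    markers: list of (position_bp, allele_1, allele_2), sorted by position,
             all on the same chromosome.
    min_length_bp: hypothetical cut-off; shorter runs are ignored.
    """
    runs = []
    run_start = None
    run_end = None
    for position, allele_1, allele_2 in markers:
        if allele_1 == allele_2:
            if run_start is None:
                run_start = position  # a new run begins here
            run_end = position
        else:
            # A heterozygous marker ends the current run, if there is one.
            if run_start is not None and run_end - run_start >= min_length_bp:
                runs.append((run_start, run_end))
            run_start = None
    # Close off a run that reaches the last marker.
    if run_start is not None and run_end - run_start >= min_length_bp:
        runs.append((run_start, run_end))
    return runs


# Invented example: one long homozygous stretch, then a short one.
example_markers = [
    (10_000, "A", "A"),
    (40_000, "G", "G"),
    (90_000, "C", "C"),
    (120_000, "A", "G"),
    (150_000, "T", "T"),
    (160_000, "C", "C"),
]
print(find_roh(example_markers))
```

With the default cut-off only the first stretch is reported, because the second is too short to count as a 'long paragraph'.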

From this kind of data, it should be possible to find out more about a gene pool by looking at homozygosity and comparing dogs to each other. Two dogs of separate breeds of very different origin, such as poodles and Pharaoh hounds, would be likely when compared to each other to be almost entirely the fourth example with perhaps some of the third. A breed or species that has already entered the extinction vortex due to depletion of its gene pool, when individuals were compared to each other, would be mainly the first example with some of the second. A breed or species with a viable and properly maintained gene pool is likely to be evenly distributed with perhaps a slight skew to the second and third examples, and it should be possible to find individuals unrelated to each other within the population relatively easily. Unfortunately 'Embark' doesn't provide information on any of this in its present state.
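To give an idea of the sort of comparison I mean, the sketch below scores how alike two dogs are by counting, marker by marker, how many alleles they have in common (this is usually called identity by state). It is only an illustration of the principle: the marker names and genotypes are invented, and a proper analysis would use many thousands of markers and take account of how common each allele is in the breed.

```python
def shared_alleles(genotype_a, genotype_b):
    """Count alleles two genotypes have in common (0, 1 or 2).

    Genotypes are two-letter strings such as "AG"; allele order is ignored.
    """
    remaining = list(genotype_b)
    shared = 0
    for allele in genotype_a:
        if allele in remaining:
            remaining.remove(allele)  # each allele can only be matched once
            shared += 1
    return shared


def ibs_similarity(dog_a, dog_b):
    """Average proportion of alleles shared over markers typed in both dogs.

    dog_a, dog_b: dicts mapping a marker name to a two-letter genotype.
    """
    common = dog_a.keys() & dog_b.keys()
    if not common:
        raise ValueError("the two dogs have no markers in common")
    total = sum(shared_alleles(dog_a[m], dog_b[m]) for m in common)
    return total / (2 * len(common))


# Invented genotypes for two dogs at three markers.
dog_a = {"m1": "AA", "m2": "AG", "m3": "CC"}
dog_b = {"m1": "AA", "m2": "GG", "m3": "TT"}
print(ibs_similarity(dog_a, dog_b))
```

Scores close to 1 across most pairs of dogs in a breed would correspond to the first and second examples above, and low scores to the fourth.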

The accuracy of identifying the chromosome fragmentation depends on how close together the ROHs are and what 'blind spots' there are, which I don't know, but from the mixed-breed examples, it seems to be quite high resolution. Unfortunately it's not surprising considering the rest of the results that more effort has been spent developing this into a toy for owners of mixed-breed dogs than a tool for breed conservationists. This has huge potential and the chromosomes could be coloured to show if areas are over- or under-represented in the breed, or you could analyse a dog's parents and grandparents or farther and the test could show you which areas likely came from which ancestor.

The grey vertical lines on the chromosomes are shown by the 'STR tracks' toggle at the bottom of the image. If you click 'learn more' it does have some information on this, in that these are markers that have been used in some studies of breed populations in the past, and if you mouse over the lines it gives some identifying information, but infuriatingly again it doesn't give you any actual results so that you can compare your dog with any dogs that might have been in any of the studies it mentions.

One thing 'Embark' does allow that Genoscoper's version doesn't, is for you to download your dog's raw data. In theory this could be used to compare dogs using third-party software, but I'm not aware of any software available at present. The data itself is a spreadsheet containing a huge list of base pairs, so if someone does write software for it, it's likely to require hard work and someone with serious programming skills.
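Purely as a sketch of where someone might start, the example below reads two small tab-separated tables and lists the markers at which the dogs differ. I must stress that the column names (marker_id, chromosome, position, genotype) and the data are invented for the example and are not the layout of Embark's actual download, which would need checking before anything like this could be used on it.

```python
import csv
import io


def load_genotypes(handle):
    """Read a tab-separated table into a dict of marker name -> genotype.

    Assumes hypothetical columns 'marker_id' and 'genotype'.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["marker_id"]: row["genotype"] for row in reader}


def differing_markers(dog_a, dog_b):
    """Sorted list of markers typed in both dogs where the genotypes differ.

    Allele order is ignored, so "AG" and "GA" count as the same.
    """
    common = dog_a.keys() & dog_b.keys()
    return sorted(m for m in common if sorted(dog_a[m]) != sorted(dog_b[m]))


# Invented raw data standing in for two downloaded files.
header = "marker_id\tchromosome\tposition\tgenotype\n"
raw_a = header + "m1\t1\t10000\tAA\nm2\t1\t40000\tAG\nm3\t2\t5000\tCT\n"
raw_b = header + "m1\t1\t10000\tAA\nm2\t1\t40000\tGG\nm3\t2\t5000\tTC\n"

first_dog = load_genotypes(io.StringIO(raw_a))
second_dog = load_genotypes(io.StringIO(raw_b))
print(differing_markers(first_dog, second_dog))
```

With real files, the io.StringIO objects would be replaced by files opened with open().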

At the moment, this product shows promise, but it's unfortunately let down by seriously poor tools that fail to allow people to get meaningful information for their results. I understand from the people responsible for it that this is in part because of wanting to thoroughly test any tools they do offer before making them available, which is a valid point. But at the current juncture, it only allows you to look at very limited information on the heterozygosity of your dog and doesn't allow any comparisons with other dogs in the breed. Genoscoper's test, perhaps showing evidence of being around longer and having had more time to develop, is currently cheaper and lets you compare dogs of the same breed and different breeds, using interactive 3-D plots, and is probably the better option, at least for now.

I should add that while genetic analysis tools like these are fascinating new areas that provide worthwhile information on the gene pool of breeds as a whole, and can be really useful if pedigree information is not available or suspected of being incorrect, I do not think they are a substitute for accurate deep pedigree information for breeding decisions concerning individual dogs, i.e. COI, at their current level of development. That may change as more evidence becomes available, but at the current level of understanding I feel results like this should be used as supplemental information.