WIKIMEDIA, MAGGIE BARTLETT, NHGRI

As many a researcher can attest, not all mice are created equal. Different strains’ genetic backgrounds can greatly influence the outcomes of experiments. To take an example well-known to mouse geneticists, when the gene Apc (adenomatous polyposis coli) is mutated in C57BL/6J mice, the mice develop colon polyps; but the same genetic change, called the “Min mutation,” has almost no ill effects when introduced in another strain, AKR/J. The same experiment produced different results merely because of the strain of lab mice, explains University of Wisconsin geneticist Amy Moser, who worked on the Min mutation.

Now, scientists can better resolve which genes are responsible for such strain-related discrepancies. Reporting today (August 7) in PNAS, Ghent University researchers have published a database of proteins predicted to be nonfunctional in the 36 most popular and important inbred strains of laboratory mice. The database, says Moser, who was not involved in the work, may help researchers better understand their mice and, hopefully, design better experiments as a result.

Study authors Steven Timmermans, Marc Van Montagu, and Claude Libert compared the coding sequences of the go-to lab mouse C57BL/6J (black 6) to those of 36 other laboratory strains and, based on these data, determined which of the strains’ proteins are likely to be defective.

Scientists began inbreeding mice, through brother-sister mating at every generation, about a century ago. Once a strain is fully inbred, it is homozygous for every position in the genome and, therefore, the offspring are genetically identical to the parents.

Using inbred mice, researchers can be confident that the only differences among the mice in their experiments are those that they introduce, not variation in the genetic background. C57BL/6J was the first mouse genome to be sequenced, but in recent years, scientists at the Wellcome Trust Sanger Institute sequenced the genomes of 36 of the other most commonly used strains of inbred mice. Libert and his colleagues used these sequence data, along with data from other resources, to create their new database.

See also “Mouse Genomes Catalogued”

The researchers started with the black 6 reference sequence and data on insertions, deletions, and single nucleotide polymorphisms (SNPs) found in the other 36 strains relative to black 6. For each gene, they took the coding regions of the black 6 sequence and computationally mutated them according to the insertion, deletion, and SNP data for each of the 36 other strains. Then, also in silico, they translated the resulting DNA codons into their corresponding amino acid sequences to predict how the mutations in each gene would affect the resulting protein.
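To make the "mutate, then translate" idea concrete, here is a minimal Python sketch of that kind of procedure. It is not the authors' pipeline: the `Variant` record, the function names, the 0-based coordinates, and the toy 15-base sequence are all assumptions made up for illustration, and a real analysis would also have to deal with introns, strand, and transcript isoforms.

```python
from dataclasses import dataclass
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record: 0-based position in the coding sequence,
    the reference bases at that position, and the bases that replace them."""
    pos: int
    ref: str
    alt: str


def apply_variants(ref_cds: str, variants: list[Variant]) -> str:
    """Return the reference coding sequence with all variants applied.

    Variants are applied from right to left so that an insertion or
    deletion does not shift the coordinates of variants still to come.
    Assumes the variants do not overlap.
    """
    seq = ref_cds
    for v in sorted(variants, key=lambda v: v.pos, reverse=True):
        if seq[v.pos:v.pos + len(v.ref)] != v.ref:
            raise ValueError(f"reference mismatch at position {v.pos}")
        seq = seq[:v.pos] + v.alt + seq[v.pos + len(v.ref):]
    return seq


def translate(cds: str) -> str:
    """Translate codon by codon, stopping at the first stop codon.
    Trailing bases that do not fill a codon are ignored."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy reference: ATG GCT GAA TGG TAA
REFERENCE = "ATGGCTGAATGGTAA"
# A SNP that turns the codon TGG into the stop codon TGA.
NONSENSE = [Variant(pos=11, ref="G", alt="A")]
# A one-base deletion that shifts the reading frame.
FRAMESHIFT = [Variant(pos=4, ref="CT", alt="C")]

if __name__ == "__main__":
    print(translate(REFERENCE))
    print(translate(apply_variants(REFERENCE, NONSENSE)))
    print(translate(apply_variants(REFERENCE, FRAMESHIFT)))
```

Comparing each translated sequence with the reference protein then shows whether a strain's variants leave the protein intact, truncate it, or scramble it through a frameshift. Deciding which of those changes actually make a protein nonfunctional is a further step that this sketch does not attempt.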

The data included in this database are publicly available, but not all in one place, notes Klaus Schughart, who heads the department of infection genetics at the Helmholtz Centre for Infection Research in Germany and was not involved in the work, in an email to The Scientist. Processing those data from multiple locations requires “quite a bit of expertise. . . . The resource which they generated makes it much easier for a larger research community to find important associations between genetic variation and phenotypic consequences.”

The database could help researchers learn more about strains they’re already using to study disease. Or if they are interested in a particular gene, they can potentially find an inbred strain in which that gene is mutated and use that strain to study the gene, Schughart says.

Another application of the database is to help sort through candidate genes that may control a particular phenotype. For example, in the case of the Min mutation that causes colon tumors in black 6 mice but not in AKR/J mice, Moser and others found that the region of the genome that seemed to compensate for the Min mutation—the modifier locus—was within a region of chromosome 4. Subsequently, other researchers found that a gene within that region was mutated in black 6 mice, while it was not mutated in the AKR/J mice, and when they tested the gene, they found that it was indeed the modifier.

But that process of testing candidate genes, of which there may be hundreds, is not easy, Moser says. “If we used the tool that they describe in the paper, we could say, ‘Okay, are there any genes on chromosome 4 where these strains differ from B6?’ And we would have pulled out that mutation.”
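The kind of lookup Moser describes amounts to a simple filter over per-strain, per-gene records. The Python sketch below shows the idea on made-up data. The record layout, the field names, and the placeholder genes are all hypothetical and do not reflect the actual database's format or contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical database row: one gene in one strain."""
    gene: str
    chromosome: str
    strain: str
    differs_from_b6: bool  # predicted protein differs from black 6


# Invented placeholder rows, for illustration only.
TOY_RECORDS = [
    Record("GeneA", "4", "AKR/J", True),
    Record("GeneB", "4", "AKR/J", False),
    Record("GeneC", "7", "AKR/J", True),
    Record("GeneA", "4", "FVB/NJ", False),
]


def candidate_genes(records, chromosome, strain):
    """List genes on one chromosome whose predicted protein in the
    given strain differs from the black 6 reference."""
    return sorted({
        r.gene
        for r in records
        if r.chromosome == chromosome
        and r.strain == strain
        and r.differs_from_b6
    })


if __name__ == "__main__":
    print(candidate_genes(TOY_RECORDS, "4", "AKR/J"))
```

A filter like this only narrows the list of candidates. Each remaining gene would still need to be tested experimentally, as the modifier gene was.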

This database could potentially help researchers improve their experimental design, Moser says. For example, the strain FVB/NJ carries a gene variant that causes retinal degeneration. “That’s a strain that’s used for making a lot of transgenic mice because, as they mention in the paper, they have big [eggs]—the eggs are easy to inject . . . But you wouldn’t want to use that strain if you wanted to look at the retina. So you should know that, but a lot of people didn’t when they first started making transgenic mice.”

A limitation of the database, says Timmermans, a PhD student in the Libert lab, is that it does not contain information about mutations in noncoding regions of the genome.

“We are looking into noncoding sequences but we are not really sure whether this will be successful because it’s already known that noncoding sequences, even if they are important, they are far less conserved between species than coding sequences,” Timmermans says. “So this experiment is something that’s a work in progress.”

S. Timmermans et al., “Complete overview of protein-inactivating sequence variations in 36 sequenced mouse inbred strains,” PNAS, doi:10.1073/pnas.1706168114, 2017.