Avocado genes - Homologous, Orthologous or Paralogous?

by Laurie Meadows

Arabidopsis thaliana, not much more than a ‘steekin little weed’, has proved to be a towering giant in revealing gene structure, function and location in plants.

Genes tend to be conserved over time, even as speciation takes place. Experience has shown it is reasonable to suppose that if a gene found in Arabidopsis has a certain observable effect, then the same effect observed in a different species may be caused by a gene very similar to the one already described in Arabidopsis.

Genes that are common to two (or more) species, and assumed to descend in a continuous lineage (up each fork) from the branching point at the common ancestor of the two species, were/are termed ‘orthologs’ of each other. Whether the gene functioned differently in the two species (influenced by, for example, the ‘epistatic’ effects of other mutated genes in one or both species) wasn’t/isn’t of primary importance.

Homologous genes that arose by duplication of a gene, rather than by a speciation event, were/are called ‘paralogs’. (In other words, the copies trace back to a gene duplicated ‘anew’ within a genome, and not to the split between the two species.)

When the genomes of Arabidopsis, Vitis, Populus, Solanum lycopersicum and now Persea are compared, you can see the same or related genes across genera and species. Are they retained from a common ancestor in the distant past and thus orthologs? Arisen anew by duplication and thus paralogs? Who knows?

Does it really matter? What do the genes do? How does the expression of these genes differ (if it does)? What other genes affect (or even effect) the expression of the gene in the taxon of interest (in this case, Persea)? This is what is important and interesting.

So I am going with Roy Jensen’s concept (outlined below) and will use the term ‘homolog’ with the appropriate adjective (if determinable), unless referring to speciation events and whole genome duplication events in deep time.

“Genomic biology needs to get beyond semantic issues. It needs to focus on defining those sequence-structure-function relationships that are necessary for understanding both the structural origins of biological function and the molecular bases for the divergence of biological function. So, those of us who study the relationships among sequence, structure, and function should discontinue the use of ‘ortholog’ and ‘paralog’, unless we want to focus on the speciation and gene duplication events that produced functional diversity in homologs.

But, unlike Petsko [2], we believe that genomic biologists need to describe, compare, and contrast sequence-structure-function relationships not only for a complete group of homologs but also for subsets of homologs that share particular attributes. Based on our experiences, genomic biologists need words to describe ‘homologs encoded by different genomes’ and ‘homologs that have different functions’.

To accomplish these needs, we suggest the following adjectives to describe homologs: ‘Isofunctional’ homologs exhibit the same function(s); ‘heterofunctional’ homologs exhibit different functions; ‘isospecic’ homologs are found in the same species; and ‘heterospecic’ homologs are in different species.”

Jensen RA. Orthologs and paralogs – we need to get it right. Genome Biol. 2001;2(8):INTERACTIONS1002. doi:10.1186/gb-2001-2-8-interactions1002
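
The four adjectives in the quotation can be read as a simple two-way labelling of any pair of genes already judged to be homologs: one label for species, one for function. The sketch below is only an illustration of that reading. The `Gene` record, the `describe_homologs` function and the placeholder gene names and function labels are all hypothetical, and comparing function annotations as plain text is a simplifying assumption. Where the function of either gene is unknown, no functional adjective is given, in keeping with ‘if determinable’.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical record for one gene; none of these fields come from a real database."""
    name: str                 # placeholder gene label
    species: str              # species whose genome encodes the gene
    function: Optional[str]   # function annotation, or None if not yet determined


def describe_homologs(a: Gene, b: Gene) -> Tuple[str, Optional[str]]:
    """Return (species adjective, function adjective) for two homologous genes.

    Assumes the two genes have already been judged homologous by other means.
    The function adjective is None when either function is undetermined.
    """
    species_adj = "isospecic" if a.species == b.species else "heterospecic"

    if a.function is None or b.function is None:
        function_adj = None
    elif a.function == b.function:
        function_adj = "isofunctional"
    else:
        function_adj = "heterofunctional"

    return species_adj, function_adj


# Placeholder genes for illustration only (not real annotations).
arabidopsis_gene = Gene("gene_A", "Arabidopsis thaliana", "function 1")
persea_gene = Gene("gene_B", "Persea americana", "function 1")
persea_copy = Gene("gene_C", "Persea americana", None)

print(describe_homologs(arabidopsis_gene, persea_gene))
print(describe_homologs(persea_gene, persea_copy))
```

Note that nothing here asks whether the pair arose by speciation or by duplication, which is the point of the proposal: the labels describe what is observed, not the inferred history.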