How Bionemo was built
We started building Bionemo by obtaining all the pathways and reactions from the University of Minnesota Biodegradation and Biotransformation Database (UMBBD; July 2005). For each reaction we tried to associate one or more proteins by manually searching the sequence databases EMBL and GenBank, using as many query terms as possible (including EC number, reaction name, enzyme name or the original publication as stated in the UMBBD). The results were screened manually and the sequences were associated with a reaction. We associated a gene with a reaction only if we could find a journal article describing that particular enzymatic activity for that gene, if the name of the gene was the same as the one displayed in the sequence database, and if it referred to the same organism, including the strain name.
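The three acceptance conditions above can be written down as a simple predicate. The following is only an illustrative sketch: the curation itself was done by hand, and the record type, field names and the whitespace/case normalisation are assumptions made for this example, not part of Bionemo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateLink:
    """Hypothetical record for one candidate gene-to-reaction association."""
    reaction_id: str            # reaction identifier as listed in the UMBBD
    gene_name_in_article: str   # gene name used in the journal article
    gene_name_in_database: str  # gene name shown in the EMBL/GenBank entry
    organism_in_article: str    # organism, including strain, in the article
    organism_in_database: str   # organism, including strain, in the entry
    has_activity_article: bool  # an article describes this activity for this gene


def normalise(text: str) -> str:
    """Collapse whitespace and ignore case (an assumption of this sketch)."""
    return " ".join(text.split()).casefold()


def accept(link: CandidateLink) -> bool:
    """Return True only if all three conditions from the text hold."""
    same_gene = normalise(link.gene_name_in_article) == normalise(link.gene_name_in_database)
    same_organism = normalise(link.organism_in_article) == normalise(link.organism_in_database)
    return link.has_activity_article and same_gene and same_organism


# Placeholder values, not real database content.
matching = CandidateLink("r0001", "abcA", "abcA",
                         "Genus species X1", "Genus species X1", True)
wrong_strain = CandidateLink("r0001", "abcA", "abcA",
                             "Genus species X1", "Genus species Y2", True)
print(accept(matching), accept(wrong_strain))
```

Because the strain is part of the organism string, two entries for the same species but different strains are rejected, which mirrors the strictness of the manual rule.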
It is important to stress that all connections between complexes and biochemical reactions are established manually and based on the literature: no sequence similarity information or other computational function prediction methods are used, and articles that rely on those methods are not accepted as evidence. Finally, using the information given in the DNA entries, we associate the different proteins with enzymatic complexes.
A similar process is followed to retrieve genes and transcriptional units. In the case of regulatory proteins, both their binding sites and the regulatory action they exert on their regulated promoters are also obtained manually from the literature. We started collecting information from the reviews by Tropel & van der Meer (pubmed) and Díaz & Prieto (pubmed), and followed the references therein.