I have been doing this amazing course on Stepik about bioinformatics for the past couple of weeks, and I wanted to share a program that I made and a little bit about how it works.
The problem is that we want to find the origin of replication in a bacterial genome, but finding it directly is difficult. However, one property of the bacterial replication process is a tendency to turn a certain nucleotide into another in the 3' to 5' direction, with a different nucleotide shift going the other direction.
By looking at a stretch of the genome that has been duplicated but not yet fixed or edited by the bacterium, we can follow the mutations up and down the genome to find where replication started.
For instance, if the genome shifts heavily in one direction and then later shifts in the other, the turning point between the two shows us the area where the genome began duplicating!
This simple code increases the mutation count by one each time it sees a C nucleotide and decreases it by one each time it sees a G nucleotide. The position where the count reaches its extreme is where we can locate the origin of replication!
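Here is a minimal sketch of that counting idea in Python. It is not the code from the repository: the function names `running_count` and `peak_positions` are made up for illustration, and it assumes the genome is a plain string of A, C, G and T. Because C counts up and G counts down here, the sketch looks for the highest point of the count; if you flip the signs, you would look for the lowest point instead.

```python
def running_count(genome):
    """Return the running count after each prefix of the genome.

    C adds one, G subtracts one, and every other character leaves the
    count unchanged. The list starts at 0 (the empty prefix), so it has
    len(genome) + 1 entries.
    """
    counts = [0]
    for nucleotide in genome.upper():
        if nucleotide == "C":
            counts.append(counts[-1] + 1)
        elif nucleotide == "G":
            counts.append(counts[-1] - 1)
        else:
            counts.append(counts[-1])
    return counts


def peak_positions(counts):
    """Return every index at which the running count is at its highest."""
    peak = max(counts)
    return [i for i, value in enumerate(counts) if value == peak]


if __name__ == "__main__":
    counts = running_count("CCGG")
    print(counts)                  # [0, 1, 2, 1, 0]
    print(peak_positions(counts))  # [2]
```

On a real genome the count is noisy, so several positions can tie for the peak. That is why the sketch returns a list of candidate positions rather than a single answer.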
Github Link: here