Saturday, 28 February 2015

Basic Python programs in bioinformatics

Writing programs for bioinformatics in Python is simple. I am sharing a few programs that solve bioinformatics-related problems, and I hope they will help you.
Program to print on a new line and to insert a tab

S = 'a\nb\tc'
print(S)
Output
a
b          c
Here \n starts a new line and \t inserts horizontal space by adding a tab character.
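To make it clearer that \n and \t are each stored as a single character, a small sketch can compare the string with its raw form. The variable names below are only illustrative.

```python
# Each escape sequence is stored as one character in the string.
s = 'a\nb\tc'

print(len(s))     # 5 characters: a, newline, b, tab, c
print(repr(s))    # shows the escapes instead of interpreting them

# A raw string (r prefix) keeps the backslashes as literal characters.
raw = r'a\nb\tc'
print(len(raw))   # 7 characters: a, \, n, b, \, t, c
print(raw)        # printed exactly as typed
```

Using repr() is a handy way to check for hidden whitespace in a sequence read from a file.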
**************************************************************
Code to remove whitespace characters and ambiguous characters (those not belonging to DNA) from the string, where a whitespace character could be a space, \t, \n or \r

dna = """
    aaattcctga gccctgggtg caaagtctca gttctctgaa atcctgacct aattcacaag
    ggttactgaa gatttttctt xtttccagga cctctacagt ggattaattg gccccctgat
    tgtotgtcga agaccttact tgaaagtatt caatcccaga aggaagctgg aatttgccct
    tctgtttcta gtttttgatg agaatgaatc ttggtactta watgacaaca tcaaaacata
    ctctgatcac cccgagaaag taaacaaag3 tgatgaggaa ttcatagaaa gcaataaaat
    gcatggtatg tcacattatt ctaaaacaa         """
e=dna.replace(" ", "").replace('\t', "")   # removes spaces and tabs
print(e)
f=e.replace('\n', "")   # removes newlines
print(f)
t=f.replace('\r', "")   # removes carriage returns
print(t)
#replacing ambiguous characters (anything other than a, c, g, t)
h=t.replace('x',"").replace('3',"").replace('w',"")
print(h)
i=h.replace('o', "")
print(i)
output (last line printed):
aaattcctgagccctgggtgcaaagtctcagttctctgaaatcctgacctaattcacaagggttactgaagatttttctttttccaggacctctacagtggattaattggccccctgattgttgtcgaagaccttacttgaaagtattcaatcccagaaggaagctggaatttgcccttctgtttctagtttttgatgagaatgaatcttggtacttaatgacaacatcaaaacatactctgatcaccccgagaaagtaaacaaagtgatgaggaattcatagaaagcaataaaatgcatggtatgtcacattattctaaaacaa
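The chained replace() calls need every unwanted character to be known in advance. A possible alternative, shown here as a sketch and not as the method used in this post, is to keep only the characters a, c, g and t and drop everything else. The function name clean_dna and its allowed parameter are assumptions made for this example.

```python
def clean_dna(raw, allowed="acgt"):
    """Return raw in lower case with every character outside `allowed` dropped."""
    allowed_set = set(allowed)
    return "".join(ch for ch in raw.lower() if ch in allowed_set)

# A short messy sequence with spaces, a tab, newlines and stray characters.
messy = "aaatt cctga\n\txtttcc 3agg\r\nwatgo"
print(clean_dna(messy))   # aaattcctgatttccaggatg
```

This way whitespace and ambiguous characters are removed in one pass, including ones that were not anticipated.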
*********************************************************************
Program that reverses a sequence, the first step towards writing the opposite strand of a DNA sequence

S = "a\nb\tc"
# removing whitespace, then removing ambiguous characters that are not part of DNA
p=S.replace('\n', "").replace('\t', "").replace('b', "")
def rep():  # defining a function
   t=p[::-1]  # reversing the string with a slice
   print(t)
print(p)
rep()
output:
ac
ca
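The function above only reverses the string. For a real DNA sequence, the opposite strand is usually written as the reverse complement: each base is swapped for its partner (a with t, c with g) and the result is read backwards. A minimal sketch, assuming a cleaned lower-case sequence that contains only a, c, g and t; the names COMPLEMENT and reverse_complement are illustrative.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]

print(reverse_complement("aaattcctga"))   # tcaggaattt
```

Characters that are not in the table are left unchanged by translate(), so it is safer to clean the sequence first.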
*******************************************************************
Extract the start and end positions of the exons from the following string. Print the total number of exons and the length of each exon.
"CDS  join(100..221,345..600,771..908,4787..5452)"


dna= "CDS  join(100..221,345..600,771..908,4787..5452)"
print(dna)
abc=dna.replace('CDS', "").replace(" ", "").replace('join', "").replace('(', "").replace(')', "")
print(abc)
exon=abc.split(",")
exon11=len(exon)
msg="****Total number of Exons is given below******"
print(msg)
print(exon11)
#now finding the length of each exon
#start and end are both inclusive, so length = end - start + 1
exon1=exon[0] #taking first exon
exonn=exon1.split("..")
print(exonn)
exonl=int(exonn[1])-int(exonn[0])+1
msg1="length of first exon"
print(msg1)
print(exonl)
exon2=exon[1] #taking second exon
exone=exon2.split("..")
print(exone)
exonl2=int(exone[1])-int(exone[0])+1
msg2="length of second exon"
print(msg2)
print(exonl2)
exon3=exon[2] #taking third exon
exony=exon3.split("..")
print(exony)
exonl3=int(exony[1])-int(exony[0])+1
msg3="length of third exon"
print(msg3)
print(exonl3)
exon4=exon[3] #taking fourth exon
Exon=exon4.split("..")
print(Exon)
exonl4=int(Exon[1])-int(Exon[0])+1
msg4="length of fourth exon"
print(msg4)
print(exonl4)
output:
CDS  join(100..221,345..600,771..908,4787..5452)
100..221,345..600,771..908,4787..5452
****Total number of Exons is given below******
4
['100', '221']
length of first exon
122
['345', '600']
length of second exon
256
['771', '908']
length of third exon
138
['4787', '5452']
length of fourth exon
666
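The same task can be generalised to any number of exons. The following is a sketch and not the approach used above: it uses a regular expression to pick out every start..end pair and counts both positions as part of the exon, so the length is end - start + 1. The function name exon_lengths is made up for this example, and it assumes a simple location string without a complement(...) wrapper or partial-position markers such as < and >.

```python
import re

def exon_lengths(location):
    """Return a list of (start, end, length) for every 'start..end' pair."""
    exons = []
    for start, end in re.findall(r"(\d+)\.\.(\d+)", location):
        start, end = int(start), int(end)
        # Both positions are inclusive, hence the + 1.
        exons.append((start, end, end - start + 1))
    return exons

cds = "CDS  join(100..221,345..600,771..908,4787..5452)"
result = exon_lengths(cds)
print("Total number of exons:", len(result))
for number, (start, end, length) in enumerate(result, start=1):
    print("Exon", number, "from", start, "to", end, "has length", length)
```

Because the positions are converted to integers, the lengths can be summed or compared directly, for example sum(length for _, _, length in result).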