Tuesday, 10 March 2015

Program in Python to find the total number of lines in a file containing a nucleotide sequence in FASTA format

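First, save the sequence of any gene in FASTA format to a text file. The first line of a FASTA record is the description line, which starts with '>'; the remaining lines hold the nucleotide sequence.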

fr = open('C:\\Users\\Administrator\\Desktop\\interleukin.txt', 'r')   # open the FASTA file for reading
a1 = fr.read()   # read the whole file into one string
line = a1.split('\n')   # split on \n so that each line of the file becomes one element of the list
lines = [l for l in line[1:] if l.strip()]   # skip index 0 (the description line) and drop empty lines, such as the one left by a trailing newline
print("total lines in the sequence is", len(lines))
print("length of last line is", len(lines[-1]))
fr.seek(0)   # go back to the start of the file, because read() above already consumed it
a = fr.read(7)   # to get the first seven characters
print('first seven: ', a)
fr.close()
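
The same counting can also be wrapped in a small function that works on text instead of a fixed file path. This is only a sketch: the function name count_sequence_lines and the sample record below are made up for illustration, and it assumes the file holds a single FASTA record whose description line starts with '>'.

```python
def count_sequence_lines(fasta_text):
    """Return (number of sequence lines, length of the last sequence line)."""
    seq_lines = []
    for ln in fasta_text.splitlines():
        ln = ln.strip()
        # Skip empty lines and the description line, which starts with '>'.
        if ln and not ln.startswith('>'):
            seq_lines.append(ln)
    last_len = len(seq_lines[-1]) if seq_lines else 0
    return len(seq_lines), last_len


# Made-up record for illustration only, not a real interleukin sequence.
sample = ">example_gene hypothetical description\nATGCATGCAT\nGGCCTTAAGG\nATGC\n"
total, last = count_sequence_lines(sample)
print("total lines in the sequence is", total)
print("length of last line is", last)
```

To use it on a real file, read the file into a string first (for example inside a "with open(path) as fr:" block) and pass that string to the function.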