Please provide the output in a Word document, and include comments within the program.
The following two questions are closely tied together: you need to finish
question #1 before you can do question #2.
Please write the data processing methods, such as codon() and codon2aa(),
in a separate Java class, and the input and output in another class.
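Question 1 below reads its DNA from a FASTA file, so the input class needs a small reader. The following is only a rough sketch of that input side. It assumes the usual FASTA layout (a header line starting with ">" followed by one or more sequence lines) and assumes that only the first record in the file is wanted. The names FastaInput, parseFasta, and readFasta are made up for illustration and are not part of the assignment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical input helper: turns FASTA text into one uppercase DNA string. */
class FastaInput {

    /** Joins the sequence lines of the first record, skipping the '>' header. */
    static String parseFasta(List<String> lines) {
        StringBuilder dna = new StringBuilder();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            if (trimmed.startsWith(">")) {
                if (dna.length() > 0) {
                    break; // assumption: stop when a second record begins
                }
                continue; // header line of the first record
            }
            dna.append(trimmed.toUpperCase());
        }
        return dna.toString();
    }

    /** Reads the named file and returns its first sequence. */
    static String readFasta(String fileName) throws IOException {
        return parseFasta(Files.readAllLines(Paths.get(fileName)));
    }
}
```

Keeping the parsing in parseFasta, separate from the file access in readFasta, makes it possible to check the parsing with a plain list of lines and no file on disk.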
1. The genetic code of any organism is read in groups of three
nucleotides, known as codons. Many gene-finding programs rely on
translating a piece of DNA sequence in all possible reading frames and looking
for the longest uninterrupted region of translation. Please develop a Java
program that reads a DNA sequence from a FASTA format sequence file (hwk3.seq) and
then prints out all the codons in the three forward reading frames. Design a method
called codon() that can be used to find all the codons in each of the three
reading frames. The method takes one argument, the reading frame (1, 2, or 3),
and returns an array or ArrayList with all the codons. Every codon must have
three nucleotides; discard the last one if it is incomplete.
Below is a sample output of the program:
Please enter the file name contains the DNA sequence:
hwk3.seq
The DNA sequence is:
TCAGCGAGATGGGAAAGATCACCTTCTTCGAGGACCGAGGCTTCCAGGGC
Reading frame #1 codons are:
TCA GCG AGA TGG GAA AGA TCA CCT TCT TCG AGG ACC GAG GCT TCC AGG
Reading frame #2 codons are:
CAG CGA GAT GGG AAA GAT CAC CTT CTT CGA GGA CCG AGG CTT CCA GGG
Reading frame #3 codons are:
AGC GAG ATG GGA AAG ATC ACC TTC TTC GAG GAC CGA GGC TTC CAG GGC
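The frame logic in question 1 could be sketched roughly as follows. This is one possible shape, not a required design: the class name SequenceProcessor and the choice to pass the sequence through the constructor are assumptions, made so that codon() takes only the reading frame as the question describes. Frame n starts at index n - 1, and the loop stops as soon as fewer than three nucleotides remain, which drops any incomplete trailing codon.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data processing class holding one DNA sequence. */
class SequenceProcessor {
    private final String dna;

    SequenceProcessor(String dna) {
        this.dna = dna.toUpperCase();
    }

    /** Returns the complete codons of reading frame 1, 2, or 3. */
    List<String> codon(int frame) {
        if (frame < 1 || frame > 3) {
            throw new IllegalArgumentException("frame must be 1, 2, or 3");
        }
        List<String> codons = new ArrayList<>();
        // Frame n starts at index n - 1; i + 3 <= length keeps only full codons.
        for (int i = frame - 1; i + 3 <= dna.length(); i += 3) {
            codons.add(dna.substring(i, i + 3));
        }
        return codons;
    }
}
```

In the input/output class, a line of the sample output can then be produced with something like String.join(" ", processor.codon(1)).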
2. Please add another method called codon2aa() to the program from
question #1 so that it prints the corresponding amino acid beneath each codon.
Use the single-letter representation of each amino acid, and a * for a stop
codon. See below for a sample output.
Please enter the file name of DNA sequence:
hwk3.seq
The DNA sequence is:
TCAGCGAGATGGGAAAGATCACCTTCTTCGAGGACCGAGGCTTCCAGGGC
Reading frame #1 codons and amino acids are:
TCA GCG AGA TGG GAA AGA TCA CCT TCT TCG AGG ACC GAG GCT TCC AGG
S   A   R   W   E   R   S   P   S   S   R   T   E   A   S   R
Reading frame #2 codons and amino acids are:
CAG CGA GAT GGG AAA GAT CAC CTT CTT CGA GGA CCG AGG CTT CCA GGG
Q   R   D   G   K   D   H   L   L   R   G   P   R   L   P   G
Reading frame #3 codons and amino acids are:
AGC GAG ATG GGA AAG ATC ACC TTC TTC GAG GAC CGA GGC TTC CAG GGC
S   E   M   G   K   I   T   F   F   E   D   R   G   F   Q   G
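A compact way to write the translation step in question 2 is sketched below. It assumes the standard genetic code, stored as a 64-letter string ordered T, C, A, G at each codon position, so a codon's index is 16 * first + 4 * second + third. The class name CodonTranslator, the helper aminoAcidLine, and the choice to return 'X' for a codon containing anything other than A, C, G, or T are all assumptions for illustration; the assignment only names codon2aa().

```java
import java.util.List;

/** Hypothetical translator using the standard genetic code. */
class CodonTranslator {
    // Base order used to index the table below.
    private static final String BASES = "TCAG";
    // Amino acids for TTT, TTC, TTA, TTG, TCT, ... GGG; '*' marks a stop codon.
    private static final String AMINO_ACIDS =
            "FFLLSSSSYY**CC*W" + "LLLLPPPPHHQQRRRR"
          + "IIIMTTTTNNKKSSRR" + "VVVVAAAADDEEGGGG";

    /** Returns the single-letter amino acid for a codon, or '*' for a stop. */
    static char codon2aa(String codon) {
        if (codon.length() != 3) {
            return 'X'; // assumption: 'X' flags an incomplete codon
        }
        int index = 0;
        for (int i = 0; i < 3; i++) {
            int base = BASES.indexOf(Character.toUpperCase(codon.charAt(i)));
            if (base < 0) {
                return 'X'; // assumption: 'X' flags an unrecognized nucleotide
            }
            index = index * 4 + base;
        }
        return AMINO_ACIDS.charAt(index);
    }

    /** Builds the amino acid line, one letter per four columns, to sit under the codons. */
    static String aminoAcidLine(List<String> codons) {
        StringBuilder line = new StringBuilder();
        for (String codon : codons) {
            line.append(codon2aa(codon)).append("   ");
        }
        return line.toString().trim();
    }
}
```

Because each codon plus its separating space occupies four columns, printing the codon line and then aminoAcidLine(codons) places every letter under the first nucleotide of its codon. A HashMap from codon to amino acid would work equally well and may be easier to read, at the cost of 64 explicit entries.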

