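I have a project for my university bioinformatics course, and one part of it is gene prediction.
My question today is how to get all the indexes of several specific words in a string. In my case, I want to find every occurrence of the start codon ("AUG")
and the stop codons ("UAA", "UAG", "UGA")
and use them to predict genes by scanning for open reading frames (ORFs).
Here is my initial code: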
private void jButton3ActionPerformed(java.awt.event.ActionEvent evt) {
    // TODO add your handling code here:
    // textArea1.setText(null);
    String str = jTextField1.getText(), y = "", gene = "", dnax = "", text = "";
    SymbolList dna = null;
    int start_codon_index = -1, stop_codon_index = -1;
    if ("".equals(str)) {
        jTextArea1.setText("No DNA strand entered.. ");
    } else {
        if (checksum(str) == 100) {
            // Parse the input as DNA, then transcribe it to RNA (BioJava).
            try {
                dna = DNATools.createDNA(str);
            } catch (IllegalSymbolException ex) {
                Logger.getLogger(m.class.getName()).log(Level.SEVERE, null, ex);
            }
            try {
                dna = DNATools.toRNA(dna);
            } catch (IllegalAlphabetException ex) {
                Logger.getLogger(m.class.getName()).log(Level.SEVERE, null, ex);
            }
            dnax = dna.seqString().toUpperCase();
            // Pad with '-' until the length is a multiple of 3.
            if (dnax.length() % 3 != 0) {
                if (dnax.length() % 3 == 1) {
                    dnax += "-";
                }
                if (dnax.length() % 3 == 2) {
                    dnax += "-";
                }
            }
            // System.out.println(dnax);
            // Walk the sequence codon by codon in frame 0 only.
            // Each match overwrites the previous one, so only the last
            // start codon and the last stop codon are remembered.
            for (int g = 0; g < dnax.length(); g += 3) {
                y = dnax.substring(g, g + 3);
                if ("AUG".equals(y)) {
                    start_codon_index = g;
                } else if (start_codon_index != -1 && ("UGA".equals(y) || "UAG".equals(y) || "UAA".equals(y))) {
                    stop_codon_index = g + 3;
                }
            }
            if (stop_codon_index != -1 && start_codon_index != -1) {
                String k = "";
                int a = 0;
                // Copy the characters between the start and stop indexes.
                for (a = start_codon_index; a < stop_codon_index; a++) {
                    gene += dnax.charAt(a);
                }
                text += "\nGene start position: " + start_codon_index + "\nGene end position: " + a + "\n Gene: " + gene;
                jTextArea1.setText(text);
            } else {
                jTextArea1.setText("No genes found in Seq: " + dnax);
            }
        } else {
            jTextArea1.setText("Text entered is not a DNA strand..");
        }
    }
}
Here is the checksum() method:
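private static int checksum(String x) {
    int i = 0, checks = 0, count = 0;
    char c;
    x = x.toUpperCase();
    // Count the characters that are valid DNA symbols (or the gap '-').
    while (i < x.length()) {
        c = x.charAt(i);
        if (c == 'A' || c == 'T' || c == 'G' || c == 'C' || c == '-') {
            count++;
        }
        i++;
    }
    try {
        // Multiply before dividing: (count / x.length()) * 100 uses integer
        // division and yields only 0 or 100, never a real percentage.
        checks = (count * 100) / x.length();
    } catch (Exception e) {
        // An empty string causes a division by zero here.
        e.printStackTrace();
    }
    return checks;
}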
I have tried other solutions, but nothing has worked for me. Any help or suggestions are welcome.
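
To make the goal concrete, here is a minimal sketch of one possible way to collect every index of a codon and to pair each start codon with the first in-frame stop codon. It assumes the sequence is already an uppercase RNA string, as `dnax` is in the code above. The class name `OrfFinder` and the methods `allIndexesOf` and `findOrfs` are hypothetical names chosen for illustration and are not part of BioJava. It relies only on `String.indexOf(String, int)` from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OrfFinder {
    // Stop codons listed in the question.
    private static final List<String> STOP_CODONS = Arrays.asList("UAA", "UAG", "UGA");

    /** Returns every index, in any reading frame, at which codon occurs in seq. */
    public static List<Integer> allIndexesOf(String seq, String codon) {
        List<Integer> hits = new ArrayList<>();
        // Restart the search one character after each hit so nothing is skipped.
        for (int i = seq.indexOf(codon); i >= 0; i = seq.indexOf(codon, i + 1)) {
            hits.add(i);
        }
        return hits;
    }

    /**
     * For each AUG, reads codon by codon in that AUG's frame until the first
     * stop codon. Returns {start, endExclusive} pairs; an AUG with no
     * in-frame stop codon is skipped.
     */
    public static List<int[]> findOrfs(String rna) {
        List<int[]> orfs = new ArrayList<>();
        for (int start : allIndexesOf(rna, "AUG")) {
            for (int i = start + 3; i + 3 <= rna.length(); i += 3) {
                if (STOP_CODONS.contains(rna.substring(i, i + 3))) {
                    orfs.add(new int[] {start, i + 3});
                    break; // first in-frame stop ends this ORF
                }
            }
        }
        return orfs;
    }

    public static void main(String[] args) {
        String rna = "CCAUGAAAUAGGGAUGCCCUGA"; // made-up sample sequence
        for (int[] orf : findOrfs(rna)) {
            System.out.println("Gene start position: " + orf[0]
                    + ", end position: " + orf[1]
                    + ", gene: " + rna.substring(orf[0], orf[1]));
        }
    }
}
```

For the made-up sample sequence, `allIndexesOf(rna, "UGA")` also reports the out-of-frame `UGA` at index 3, which is why `findOrfs` steps three characters at a time from each `AUG` instead of simply pairing the two index lists. Whether nested or overlapping ORFs should be reported separately is a design choice this sketch does not settle.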