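I am probably missing something obvious, but I am trying to query a database with a statement that is fairly complex (for me, at least), and I get more rows back than I expect. Does anyone know how to "solve" this?
The tables I am querying were created as follows: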
CREATE TABLE `glycoPeptide` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`protein` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1;
CREATE TABLE `run` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`glycoPeptide` int(11) NOT NULL,
`run` enum('spectrum','chromatogram') NOT NULL,
`glycoType` enum('N','O') DEFAULT NULL,
`glycoSite` int(11) DEFAULT NULL,
`pepMass` varchar(5) DEFAULT NULL,
`pepSeq` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `glycoPeptide` (`glycoPeptide`),
CONSTRAINT `run_ibfk_1` FOREIGN KEY (`glycoPeptide`) REFERENCES `glycoPeptide` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;
CREATE TABLE `spectrum` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`run` int(11) NOT NULL,
`glycoform` varchar(255) DEFAULT NULL,
`spectrum` enum('m/z','intensity') NOT NULL,
PRIMARY KEY (`id`),
KEY `run` (`run`),
CONSTRAINT `spectrum_ibfk_1` FOREIGN KEY (`run`) REFERENCES `run` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=29 DEFAULT CHARSET=latin1;
CREATE TABLE `precursor` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`run` int(11) NOT NULL,
`retentionTime` time DEFAULT NULL,
`mzValue` float DEFAULT NULL,
`chargeState` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `run` (`run`),
CONSTRAINT `precursor_ibfk_1` FOREIGN KEY (`run`) REFERENCES `run` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=latin1;
CREATE TABLE `binaryDataArray` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`spectrum` int(11) NOT NULL,
`arrayLength` int(11) NOT NULL,
`EncodedLength` int(11) NOT NULL,
`arrayData` text,
PRIMARY KEY (`id`),
KEY `spectrum` (`spectrum`),
CONSTRAINT `binaryDataArray_ibfk_1` FOREIGN KEY (`spectrum`) REFERENCES `spectrum` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=29 DEFAULT CHARSET=latin1;
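I have some test data for 2 proteins (IgG and IgE). IgG has only 1 glycosite, hence only 1 run, which contains only 1 "set" of binaryDataArrays. IgE has 3 glycosites, hence 3 runs, and each run can contain multiple spectra (each spectrum being a set of 2 binaryDataArrays).
I use the following query (I know it would be prettier with JOINs):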
select
precursor.mzValue,
glycoPeptide.protein,
binaryDataArray.arrayLength,
binaryDataArray.encodedLength,
precursor.chargeState,
run.pepMass,
run.PepSeq
from
precursor,
glycoPeptide,
binaryDataArray,
spectrum,
run
where
run.glycoPeptide = glycoPeptide.id AND
spectrum.run = run.id AND
precursor.run = run.id AND
binaryDataArray.spectrum = spectrum.id AND
spectrum.spectrum like 'm/z' AND
precursor.mzValue like '1196.79' AND
glycoPeptide.protein like 'IgE' AND
run.glycoSite like '252' AND
run.glycoType like 'N';
For IgG (with the corresponding IgG values in the where clause), this gives the result I expect:
+---------+---------+-------------+---------------+-------------+---------+-----------+
| mzValue | protein | arrayLength | encodedLength | chargeState | pepMass | PepSeq |
+---------+---------+-------------+---------------+-------------+---------+-----------+
| 933.4 | IgG | 10301 | 22912 | 3 | 1189. | EEQYNSTYR |
+---------+---------+-------------+---------------+-------------+---------+-----------+
1 row in set (0.00 sec)
For IgE (using the statement above), I get the following result:
+---------+---------+-------------+---------------+-------------+---------+-----------+
| mzValue | protein | arrayLength | encodedLength | chargeState | pepMass | PepSeq |
+---------+---------+-------------+---------------+-------------+---------+-----------+
| 1196.79 | IgE | 10301 | 109880 | 3 | 1033. | GTVNLTWSR |
| 1196.79 | IgE | 10301 | 54940 | 3 | 1033. | GTVNLTWSR |
| 1196.79 | IgE | 10301 | 54940 | 3 | 1033. | GTVNLTWSR |
+---------+---------+-------------+---------------+-------------+---------+-----------+
3 rows in set (0.00 sec)
I expected only 1 row here, and I cannot work out why I get 3.
Any help would be greatly appreciated.
-- Edit 1 --
As far as I know, the way I wrote the where clause should behave exactly like a join, so that should not be the problem...
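For comparison, here is a sketch of the same query in explicit JOIN notation. It only moves the key comparisons from the where clause into ON clauses and leaves the filters as they were, so it should return the same rows as the comma-separated version; the join order chosen here is just one readable arrangement, not something taken from an answer.

```sql
-- Same tables, same conditions; only the notation changes.
SELECT
    precursor.mzValue,
    glycoPeptide.protein,
    binaryDataArray.arrayLength,
    binaryDataArray.encodedLength,
    precursor.chargeState,
    run.pepMass,
    run.pepSeq
FROM glycoPeptide
JOIN run             ON run.glycoPeptide = glycoPeptide.id
JOIN precursor       ON precursor.run = run.id
JOIN spectrum        ON spectrum.run = run.id
JOIN binaryDataArray ON binaryDataArray.spectrum = spectrum.id
-- Filters copied unchanged from the original where clause.
WHERE spectrum.spectrum LIKE 'm/z'
  AND precursor.mzValue LIKE '1196.79'
  AND glycoPeptide.protein LIKE 'IgE'
  AND run.glycoSite LIKE '252'
  AND run.glycoType LIKE 'N';
```

Written this way it is easier to see that precursor and spectrum are each joined to run, but never to each other.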
-- Edit 2 --
Sample data:
select * from glycoPeptide;
+----+---------+
| id | protein |
+----+---------+
| 1 | IgG |
| 2 | IgE |
+----+---------+
2 rows in set (0.00 sec)
select * from run;
+----+--------------+----------+-----------+-----------+---------+-----------------+
| id | glycoPeptide | run | glycoType | glycoSite | pepMass | pepSeq |
+----+--------------+----------+-----------+-----------+---------+-----------------+
| 1 | 1 | spectrum | N | 297 | 1189. | EEQYNSTYR |
| 2 | 2 | spectrum | N | 275 | 1516. | NGTLTVTSTLPVGTR |
| 3 | 2 | spectrum | N | 252 | 1033. | GTVNLTWSR |
| 4 | 2 | spectrum | N | 99 | 1556. | VAHTPSSTDWVDNK |
+----+--------------+----------+-----------+-----------+---------+-----------------+
4 rows in set (0.00 sec)
select * from precursor;
+----+-----+---------------+---------+-------------+
| id | run | retentionTime | mzValue | chargeState |
+----+-----+---------------+---------+-------------+
| 1 | 1 | 00:13:32 | 933.4 | 3 |
| 2 | 2 | 00:00:00 | 965.55 | 2 |
| 3 | 2 | 00:00:00 | 912.036 | 2 |
| 4 | 2 | 00:00:00 | 1127.06 | 3 |
| 5 | 3 | 00:00:00 | 1099.97 | 2 |
| 6 | 3 | 00:00:00 | 1153.9 | 3 |
| 7 | 3 | 00:00:00 | 1196.79 | 3 |
| 8 | 4 | 00:00:00 | 1109.5 | 2 |
| 9 | 4 | 00:00:00 | 1157.66 | 2 |
| 10 | 4 | 00:00:00 | 1225.66 | 2 |
| 11 | 4 | 00:00:00 | 1206.47 | 3 |
| 12 | 4 | 00:00:00 | 1328.31 | 3 |
| 13 | 4 | 00:00:00 | 1304.09 | 3 |
| 14 | 4 | 00:00:00 | 1165.04 | 2 |
+----+-----+---------------+---------+-------------+
14 rows in set (0.00 sec)
select * from spectrum;
+----+-----+-----------+-----------+
| id | run | glycoform | spectrum |
+----+-----+-----------+-----------+
| 1 | 1 | G1F | m/z |
| 2 | 1 | G1F | intensity |
| 3 | 2 | NULL | m/z |
| 4 | 2 | NULL | intensity |
| 5 | 2 | NULL | m/z |
| 6 | 2 | NULL | intensity |
| 7 | 2 | NULL | m/z |
| 8 | 2 | NULL | intensity |
| 9 | 3 | NULL | m/z |
| 10 | 3 | NULL | intensity |
| 11 | 3 | NULL | m/z |
| 12 | 3 | NULL | intensity |
| 13 | 3 | NULL | m/z |
| 14 | 3 | NULL | intensity |
| 15 | 4 | NULL | m/z |
| 16 | 4 | NULL | intensity |
| 17 | 4 | NULL | m/z |
| 18 | 4 | NULL | intensity |
| 19 | 4 | NULL | m/z |
| 20 | 4 | NULL | intensity |
| 21 | 4 | NULL | m/z |
| 22 | 4 | NULL | intensity |
| 23 | 4 | NULL | m/z |
| 24 | 4 | NULL | intensity |
| 25 | 4 | NULL | m/z |
| 26 | 4 | NULL | intensity |
| 27 | 4 | NULL | m/z |
| 28 | 4 | NULL | intensity |
+----+-----+-----------+-----------+
28 rows in set (0.00 sec)
select id, spectrum, arrayLength, encodedLength from binaryDataArray;
+----+----------+-------------+---------------+
| id | spectrum | arrayLength | encodedLength |
+----+----------+-------------+---------------+
| 1 | 1 | 10301 | 22912 |
| 2 | 2 | 10301 | 3092 |
| 3 | 3 | 10301 | 54940 |
| 4 | 4 | 10301 | 109880 |
| 5 | 5 | 10301 | 54940 |
| 6 | 6 | 10301 | 109880 |
| 7 | 7 | 10301 | 102408 |
| 8 | 8 | 10301 | 109880 |
| 9 | 9 | 10301 | 109880 |
| 10 | 10 | 10301 | 54940 |
| 11 | 11 | 10301 | 54940 |
| 12 | 12 | 10301 | 109880 |
| 13 | 13 | 10301 | 54940 |
| 14 | 14 | 10301 | 109880 |
| 15 | 15 | 10301 | 109880 |
| 16 | 16 | 10301 | 54940 |
| 17 | 17 | 10301 | 54940 |
| 18 | 18 | 10301 | 109880 |
| 19 | 19 | 10301 | 109880 |
| 20 | 20 | 10301 | 54940 |
| 21 | 21 | 10301 | 109880 |
| 22 | 22 | 10301 | 54940 |
| 23 | 23 | 10301 | 54940 |
| 24 | 24 | 10301 | 109880 |
| 25 | 25 | 10301 | 54940 |
| 26 | 26 | 10301 | 109880 |
| 27 | 27 | 10301 | 109880 |
| 28 | 28 | 10301 | 54940 |
+----+----------+-------------+---------------+
28 rows in set (0.00 sec)
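One way to see where the extra rows come from is to count, per run, how many precursors and how many 'm/z' spectra hang off it. The query below is only a diagnostic sketch; the aliases r, p and s and the output column names are made up for this example.

```sql
-- Per run: number of precursors and number of 'm/z' spectra.
-- Every precursor of a run is paired with every spectrum of that run,
-- because nothing in the schema ties a spectrum to a single precursor.
SELECT
    r.id AS run_id,
    (SELECT COUNT(*) FROM precursor p WHERE p.run = r.id) AS precursors,
    (SELECT COUNT(*) FROM spectrum s
      WHERE s.run = r.id AND s.spectrum = 'm/z') AS mz_spectra
FROM run r
ORDER BY r.id;
```

With the sample data above, run 3 (IgE, glycosite 252) has 3 precursors and 3 'm/z' spectra (ids 9, 11 and 13). Filtering on mzValue leaves 1 precursor, which is then paired with all 3 spectra, giving the 3 rows. Run 1 (IgG) has 1 of each, which is why that query returns a single row.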
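-- Edit 3 --
The data I want cannot currently be retrieved from the database, because one relationship is missing: I need to be able to link a spectrum to a precursor. My thanks to Mr. Radical and Jack for helping to uncover this flaw. I have accepted Jack's answer because the join notation in his query is easier to read than what I wrote.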
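For illustration only, one possible way to add the missing relationship is a foreign key from spectrum to precursor. The column name `precursor` and the constraint name `spectrum_ibfk_2` below are assumptions made for this sketch, not the change that was actually applied to the schema.

```sql
-- Hypothetical schema change: let each spectrum point at its precursor.
ALTER TABLE `spectrum`
    ADD COLUMN `precursor` int(11) DEFAULT NULL,
    ADD KEY `precursor` (`precursor`),
    ADD CONSTRAINT `spectrum_ibfk_2` FOREIGN KEY (`precursor`)
        REFERENCES `precursor` (`id`) ON DELETE CASCADE;

-- With that column populated, the original query would need one more
-- condition in its where clause:
--   spectrum.precursor = precursor.id
```

Existing rows would have NULL in the new column until each spectrum is assigned to its precursor, so the extra condition only helps once that data has been filled in.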