4

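I'm trying to get a table of sequential numbers from 1 to 20 million (or 0 to 20 million).

I'm honestly shocked at how hard it is to find a MySQL-compatible solution to this common problem.

Similar to this: Creating a "numbers table" in MySQL

But that answer only goes up to 1 million, and I don't really understand the bit-shift calculations.

I've seen many SQL answers, but most of them target databases other than MySQL, and I don't know MySQL or the other databases well enough to adapt that code.

Some references:

SQL, Auxiliary table of numbers

What is the best way to create and populate a numbers table?


Please make sure the code you post is MySQL-compatible and semicolon-delimited so I can run it in phpMyAdmin. I'd appreciate the table being named numbers, with a column named i.

I will benchmark each solution so the results are archived and will hopefully show up the next time someone searches for this problem.


Benchmarks so far:

Times are in seconds.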

+---------------+------------------+---------+-----------+------------+
|    Author     |      Method      | 10,000  | 1,000,000 | 20,000,000 |
+---------------+------------------+---------+-----------+------------+
| Devon Bernard | PHP Many Queries | 0.38847 | 39.32716  | ~ 786.54   |
| Bhare         | PHP Few Queries  | 0.00831 | 0.94738   | 19.58823   |
| psadac,Bhare  | LOAD DATA        | 0.00549 | 0.43855   | 10.55236   |
| kjtl          | Bitwise          | 1.36076 | 1.48300   | 4.79226    |
+---------------+------------------+---------+-----------+------------+
4

8 Answers

3
-- To use the bitwise solution you need a view of 2 to the power 25.
-- the following solution is derived from http://stackoverflow.com/questions/9751318/creating-a-numbers-table-in-mysql
-- the following solution ran in 43.8 seconds with the primary key, without it 4.56 seconds.

-- create a view that has 2 to the power 25 minus 1

-- 2 ^ 1
CREATE or replace VIEW `two_to_the_power_01_minus_1` AS select 0 AS `n` union all select 1 AS `n`;

-- 2 ^ 2
CREATE or replace VIEW `two_to_the_power_02_minus_1` 
AS select
   ((`hi`.`n` << 1) | `lo`.`n`) AS `n`
from (`two_to_the_power_01_minus_1` `lo` join `two_to_the_power_01_minus_1` `hi`) ;

-- 2 ^ 4
CREATE or replace VIEW `two_to_the_power_04_minus_1` 
AS select
   ((`hi`.`n` << 2 ) | `lo`.`n`) AS `n`
from (`two_to_the_power_02_minus_1` `lo` join `two_to_the_power_02_minus_1` `hi`) ;

-- 2 ^ 8
CREATE or replace VIEW `two_to_the_power_08_minus_1` 
AS select
   ((`hi`.`n` << 4 ) | `lo`.`n`) AS `n`
from (`two_to_the_power_04_minus_1` `lo` join `two_to_the_power_04_minus_1` `hi`) ;

-- 2 ^ 12
CREATE or replace VIEW `two_to_the_power_12_minus_1` 
AS select
   ((`hi`.`n` << 8 ) | `lo`.`n`) AS `n`
from (`two_to_the_power_08_minus_1` `lo` join `two_to_the_power_04_minus_1` `hi`) ;

-- 2 ^ 13
CREATE or replace VIEW `two_to_the_power_13_minus_1`
AS select
   ((`hi`.`n` << 1) | `lo`.`n`) AS `n`
from (`two_to_the_power_01_minus_1` `lo` join `two_to_the_power_12_minus_1` `hi`);



-- create a table to store the interim results for speed of retrieval
drop table if exists numbers_2_to_the_power_13_minus_1;

create table `numbers_2_to_the_power_13_minus_1` (
  `i` int(11) unsigned
) ENGINE=myisam DEFAULT CHARSET=latin1 ;

-- faster 2 ^ 13
insert into numbers_2_to_the_power_13_minus_1( i )
select n from `two_to_the_power_13_minus_1` ;

-- faster 2 ^ 12
CREATE or replace view `numbers_2_to_the_power_12_minus_1`
AS select
   `numbers_2_to_the_power_13_minus_1`.`i` AS `i`
from `numbers_2_to_the_power_13_minus_1`
where (`numbers_2_to_the_power_13_minus_1`.`i` < (1 << 12));

-- faster 2 ^ 25
CREATE or replace VIEW `numbers_2_to_the_power_25_minus_1`
AS select
   ((`hi`.`i` << 12) | `lo`.`i`) AS `i`
from (`numbers_2_to_the_power_12_minus_1` `lo` join `numbers_2_to_the_power_13_minus_1` `hi`);

-- create table for results

drop table if exists numbers ;

create table `numbers` (
  `i` int(11) signed 
  , primary key(`i`)
) ENGINE=myisam DEFAULT CHARSET=latin1;

-- insert the numbers
insert into numbers(i)
select i from numbers_2_to_the_power_25_minus_1
where i <= 20000000 ;

drop view if exists numbers_2_to_the_power_25_minus_1 ;
drop view if exists numbers_2_to_the_power_12_minus_1 ;
drop table if exists numbers_2_to_the_power_13_minus_1 ;
drop view if exists two_to_the_power_13_minus_1 ;
drop view if exists two_to_the_power_12_minus_1 ;
drop view if exists two_to_the_power_08_minus_1 ;
drop view if exists two_to_the_power_04_minus_1 ;
drop view if exists two_to_the_power_02_minus_1 ;
drop view if exists two_to_the_power_01_minus_1 ;
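
-- For readers unsure about the bit-shift step, here is a minimal Python sketch (not part of the SQL above) of what each view computes. The function name combine and the variable names are hypothetical; the idea is only that joining a "hi" set with a "lo" set and evaluating (hi << lo_bits) | lo yields every integer with the combined number of bits exactly once.

```python
from itertools import product


def combine(hi_values, lo_values, lo_bits):
    """Mimic one view: each (hi, lo) pair of the join becomes (hi << lo_bits) | lo."""
    return [(hi << lo_bits) | lo for hi, lo in product(hi_values, lo_values)]


one_bit = [0, 1]                                # like two_to_the_power_01_minus_1
two_bits = combine(one_bit, one_bit, 1)         # values 0..3
four_bits = combine(two_bits, two_bits, 2)      # values 0..15
eight_bits = combine(four_bits, four_bits, 4)   # values 0..255

print(sorted(two_bits))
print(len(eight_bits), min(eight_bits), max(eight_bits))
```

-- The SQL continues the same pattern up to 25 bits (12 low bits joined with 13 high bits), and the final insert keeps only the values up to 20,000,000.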
answered 2013-01-13T01:51:24.400
2

If speed is a concern, you should use LOAD DATA INFILE, which is faster than INSERT according to the MySQL docs:

http://dev.mysql.com/doc/refman/5.5/en/insert-speed.html

When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times
faster than using INSERT statements. See Section 13.2.6, “LOAD DATA INFILE Syntax”. 

Basically, you generate the 20 million lines in your favorite language (PHP?) and then load the file with LOAD DATA INFILE.

http://dev.mysql.com/doc/refman/5.5/en/load-data.html
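
As a rough illustration of the "generate a file, then load it" idea, here is a minimal Python sketch. The helper names write_numbers_file and load_statement, the demo file name, and the chunk size are all assumptions made for the example. It only builds the statement text and does not connect to MySQL, and the demo writes 10 lines rather than 20 million.

```python
import os
import tempfile


def write_numbers_file(path, upper, chunk_size=1_000_000):
    """Write 1..upper, one integer per line, in chunks to limit write calls."""
    with open(path, "w", newline="\n") as fh:
        for start in range(1, upper + 1, chunk_size):
            stop = min(start + chunk_size, upper + 1)
            fh.write("".join(f"{i}\n" for i in range(start, stop)))


def load_statement(path, table="numbers"):
    """Build the LOAD DATA INFILE statement text (it is not executed here)."""
    # Escape backslashes and quotes so Windows paths survive the SQL string literal.
    escaped = path.replace("\\", "\\\\").replace("'", "\\'")
    return f"LOAD DATA INFILE '{escaped}' INTO TABLE {table} LINES TERMINATED BY '\\n';"


demo_path = os.path.join(tempfile.gettempdir(), "data_num_demo.txt")
write_numbers_file(demo_path, 10, chunk_size=4)
print(load_statement(demo_path))
```

For the real table you would call write_numbers_file with 20000000 and run the printed statement from a MySQL client that is allowed to read the file.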

answered 2013-01-12T22:15:28.453
2

My typical way of creating such a table is to start with:

select 0 as num union all select 1 union all select 2 union all
select 3 union all select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9

Now, in most databases, you can use a with statement and do the following:

with digits as (above query)
select d1.num+10*d2.num+100*d3.num+1000*d4.num+10000*d5.num+100000*d6.num+1000000*d7.num+10000000*d8.num as num
from   digits d1 cross join
       digits d2 cross join
       digits d3 cross join
       digits d4 cross join
       digits d5 cross join
       digits d6 cross join
       digits d7 cross join
       (select 0 as num union all select 1) d8

Unfortunately, in MySQL, you either need to create a temporary table or repeat the union all statement:

select d1.num+10*d2.num+100*d3.num+1000*d4.num+10000*d5.num+100000*d6.num+1000000*d7.num+10000000*d8.num as num
from (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
     ) d1 cross join
     (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
     ) d2 cross join
     (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
     ) d3 cross join
     (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
     ) d4 cross join
     (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
     ) d5 cross join
     (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
     ) d6 cross join
     (select 0 as num union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
     ) d7 cross join
     (select 0 as num union all select 1) d8

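In MySQL, if you want to put this into a table, you can just put create table numbers as before the select. However, different databases have different syntax for dumping the results of a select into a table.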
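
To see why the cross join of digit tables produces every number exactly once, here is a small Python sketch of the same place-value arithmetic. The function name cross_join_numbers is hypothetical, and the demo uses only two decimal places plus the 0/1 leading digit so it stays tiny. The full query uses seven digit tables plus d8, which gives 2 * 10^7 = 20,000,000 rows covering 0 to 19,999,999.

```python
from itertools import product

digits = range(10)   # stands in for the ten-row "digits" subquery
top = range(2)       # stands in for d8, which only holds 0 and 1


def cross_join_numbers(places, top_digit):
    """Yield each number built from a leading digit and `places` decimal digits."""
    for combo in product(top_digit, *([digits] * places)):
        value = 0
        for d in combo:              # most significant digit first
            value = value * 10 + d
        yield value


small = sorted(cross_join_numbers(2, top))   # 2 * 10 * 10 = 200 rows
print(small[0], small[-1], len(small))
```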

answered 2013-01-12T22:45:06.697
0

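I'm not sure whether you are trying to make one call that creates 20 million rows or to make a single-row call 20 million times. An example of the second case would be: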

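<?php
// Note: the mysql_* functions used here were removed in PHP 7.
$i = 0;
while ($i <= 20000000) {
    // One INSERT round trip per row, which is why this approach is slow.
    $sql = mysql_query("INSERT INTO table_name VALUES ('$i')");
    $i += 1;
}
?>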

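If you are looking for a SQL solution, couldn't you try adapting the following? (It is written in SQL Server syntax, so it will not run in MySQL unchanged.)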

DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
SELECT TOP 20000000 IDENTITY(int,1,1) AS Number
    INTO NumbersTest
    FROM sys.objects s1
    CROSS JOIN sys.objects s2
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds'
SELECT COUNT(*) FROM NumbersTest

Taken from this post, where it is reported to generate 10,000 rows in an average of 56.3 milliseconds.

answered 2013-01-12T21:39:35.453
0

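Apologies if this answer has already been referenced. This took 18.79 seconds on my machine (a Dell laptop, if that matters)...

It's adapted from an old solution at http://datacharmer.blogspot.co.uk/2006/06/filling-test-tables-quickly.html but, crucially, it won't work with the default "vanilla" InnoDB engine, and it will be much slower if you try to establish the PK at the outset.

On the plus side, you get an extra 13.5 million rows for free!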

drop table if exists numbers;
create table numbers ( id int not null) engine = myisam;

delimiter $$

drop procedure if exists fill_numbers $$
create procedure fill_numbers()
deterministic
begin
  declare counter int default 1;
  insert into numbers values (1);
  while counter < 20000000
  do
      insert into numbers (id)
          select id + counter
          from numbers;
      select count(*) into counter from numbers;
      select counter;
  end while;
end $$
delimiter ;

call fill_numbers();
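
-- Where the extra rows come from: each pass of the loop doubles the table, so the row count only stops at the first power of two that is at least 20,000,000. The Python sketch below just simulates the counter and is not part of the procedure. The function name doubling_row_counts is hypothetical.

```python
def doubling_row_counts(target):
    """Return the row count after each pass of the procedure's while loop."""
    count = 1                    # the initial "insert into numbers values (1)"
    history = []
    while count < target:        # mirrors "while counter < 20000000"
        count += count           # "insert ... select id + counter" doubles the table
        history.append(count)
    return history


counts = doubling_row_counts(20_000_000)
print(len(counts), counts[-1], counts[-1] - 20_000_000)
```

-- The loop ends after 25 passes with 2^25 = 33,554,432 rows, which is 13,554,432 more than requested.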
answered 2013-01-12T23:52:19.853
0

Adopting psadac's answer of using LOAD DATA INFILE, and applying the idea of BULK inserts to fwrite:

$fh = fopen("data_num.txt", 'a') or die("can't open file");
$num_string = ""; // initialize the buffer before appending to it
$i = 1;
while($i <= 20000000) {
    $num_string .= "$i\n";
    if($i % 1000000 == 0) {
        fwrite($fh, $num_string);
        $num_string = "";
    }
    $i +=1;
}
fclose($fh);

$dbh->beginTransaction();
$query = "LOAD DATA INFILE '" . addslashes(realpath("data_num.txt")) . "' INTO TABLE numbers LINES TERMINATED BY '\n';";
    $sth = $dbh->prepare($query);
    $sth->execute();
$dbh->commit();
unlink("data_num.txt");

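Since I am on a Windows environment, I had to use addslashes.

Interestingly, the BULK technique of writing to the file only 20 times for the 20 million numbers takes ~10 seconds, compared with ~75 seconds when writing 20 million times. Using string concatenation instead of pushing values into an array and imploding was almost twice as fast.

answered 2013-01-13T00:04:03.157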
0

A simpler and faster solution

(Original code here)

CREATE TABLE `numbers` (
  `i` INT(11) SIGNED 
  , PRIMARY KEY(`i`)
) ENGINE=myisam DEFAULT CHARSET=latin1;

INSERT INTO numbers(i) SELECT @row := @row + 1 FROM 
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t2, 
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t3, 
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t4, 
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t5, 
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t6, 
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t7, 
(SELECT 0 UNION ALL SELECT 1) t8, 
(SELECT @row:=0) ti;

Comparing the accepted answer with this one on my laptop, which has MySQL 5.5.29 installed:

+-----------------+-------+---------------+
| Method          | Rows  | Time consumed |
+-----------------+-------+---------------+
| Accepted answer | 20M+1 |         42.4s |
+-----------------+-------+---------------+
| This one        | 20M   |         35.9s |
+-----------------+-------+---------------+

That is about 15% less time, with no intermediate views or tables, and it is easier to read.

answered 2016-08-09T02:09:27.663
-1

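In response to Devon Bernard's answer, I decided to approach it with PDO MySQL in PHP, using the idea of just a few queries. At first I tried to do it with a single big query, but PHP ran out of memory with the default settings, so I settled on running one query per 100,000 rows. Even with enough memory allocated to hold everything, there was no significant improvement.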

$i = 1;
$inserts = array();
while($i <= 20000000) {
    $inserts[] = "($i)";

    if($i % 100000 == 0) {
        $dbh->beginTransaction();
        $query = "INSERT INTO numbers(i) VALUES " . implode(',', $inserts) . ";";
            $sth = $dbh->prepare($query);
            $sth->execute();
        $dbh->commit();
        $inserts = array();
    }
    $i +=1;
}
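
The batching idea is independent of PHP. As a minimal sketch in Python, assuming a hypothetical generator named batched_inserts, the statements can be built like this. It only produces SQL text and does not talk to a database. Unlike the loop above, it also emits a final partial batch when the upper bound is not a multiple of the batch size.

```python
def batched_inserts(upper, batch_size, table="numbers", column="i"):
    """Yield one multi-row INSERT statement per batch of consecutive integers."""
    for start in range(1, upper + 1, batch_size):
        stop = min(start + batch_size, upper + 1)
        values = ",".join(f"({i})" for i in range(start, stop))
        yield f"INSERT INTO {table}({column}) VALUES {values};"


statements = list(batched_inserts(upper=10, batch_size=4))
for sql in statements:
    print(sql)
```

With upper=20000000 and batch_size=100000 this yields 200 statements, matching the batch size used in the PHP version.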
answered 2013-01-12T22:47:05.407