following some research, I found a query suitable for my needs: it randomly returns IDs from the table. The ID field is an auto-increment, so there are no holes.
SELECT `mydb`.`myTable`.id
FROM (SELECT Floor (Rand() * (SELECT Count(*)
FROM `mydb`.`myTable`)) num,
@num := @num + 1
FROM (SELECT @num := 0) a,
`mydb`.`myTable`
LIMIT 2000000) b,
`mydb`.`myTable`
WHERE b.num = `mydb`.`myTable`.id
The issue I experience is that the target table (myTable) contains 30-400M records, depending on the situation. In the LIMIT, I want to retrieve 2M randomly selected IDs, however I get a lot of duplicates (which is expected).
Is it possible to de-duplicate the query and yet receive 2M records? I thought to create a table and let it manage the UNIQUE values, but again I will get less than expected.
Any thoughts? Many thanks!