
I have a service like Backupify, which downloads data from different social media platforms. Currently I have about 2,500 active users. For each user, a script runs that fetches data from Facebook and stores it on Amazon S3. My server is an EC2 instance on AWS.

I have entries in a database table, about 900 of them for Facebook users. A PHP script runs, gets a user from the table, backs up that user's data from Facebook, and then picks the next user.

Everything was fine when I had fewer than 1,000 users, but now that I have more than 2,500, the PHP script halts, or runs for the first 100 users and then halts, times out, etc. I am running the script from the command line with php -q myscript.php.

The other problem is that processing a single user takes about 65 seconds, so reaching the last user in the table may take days. What is the best way to run the work in parallel over the database table?

Please suggest the best way to back up a large amount of data for a large number of users. I should also be able to monitor the cron jobs, something like a manager.


1 Answer


If I'm reading this right, you have a single cron job for all users, running at some frequency and trying to process every user's data in one go.

  1. Have you tried issuing set_time_limit(0); at the beginning of your code?
  2. Also, since the task is resource-intensive, have you considered creating a separate cron job for every N users (essentially mimicking multithreaded behaviour, and thus making use of the server's multiple CPU cores)? See the first sketch after this list.
  3. Would it be feasible for you to write your data to some kind of cache instead of the database, and have a separate job commit the cache contents to the database?
  4. Do you have the option of using in-memory data tables (they are very fast)? You would need to persist the database contents to disk every now and then, but at that price you gain fast database access.
  5. Could you outsource the task by splitting it across separate servers as a distributed service, and write the cron script as a load balancer for them?
  6. Optimizing your code may also help. For example (if you are not doing so already), you could buffer the collected data and commit it in a single transaction at the end of the script, so the execution flow is not scattered by repeated blocking DB I/O. See the second sketch after this list.
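
For items 1 and 2, here is a minimal sketch of a worker script that processes one slice of the user table, so several copies can run in parallel from cron. The table name (users), its columns (id, fb_id), the connection credentials, and the backup_user() helper are all assumptions standing in for your existing Facebook-to-S3 logic:

    <?php
    // worker.php -- processes one slice of the user table.
    // Cron starts several copies in parallel, e.g.:
    //   0 2 * * * php -q /path/worker.php 0    500
    //   0 2 * * * php -q /path/worker.php 500  500
    //   0 2 * * * php -q /path/worker.php 1000 500
    //   0 2 * * * php -q /path/worker.php 1500 500
    //   0 2 * * * php -q /path/worker.php 2000 500

    set_time_limit(0); // item 1: lift the execution time limit

    $offset = isset($argv[1]) ? (int) $argv[1] : 0;   // first row of this slice
    $count  = isset($argv[2]) ? (int) $argv[2] : 500; // users per slice

    $db = new PDO('mysql:host=localhost;dbname=backups', 'user', 'pass');
    $stmt = $db->prepare('SELECT id, fb_id FROM users ORDER BY id LIMIT :o, :c');
    $stmt->bindValue(':o', $offset, PDO::PARAM_INT);
    $stmt->bindValue(':c', $count, PDO::PARAM_INT);
    $stmt->execute();

    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $user) {
        backup_user($user); // your existing fetch-from-Facebook + upload-to-S3 code
    }

With 2,500 users at roughly 65 seconds each, five such slices cut the total wall-clock time from about 45 hours to about 9.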
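
And a sketch of item 6: buffer the collected data in memory, then commit it in a single transaction instead of issuing one blocking INSERT per item. The backups table and its columns are again illustrative:

    <?php
    // Collect results in memory first...
    $buffer = array();
    foreach ($collected as $item) {   // $collected = data fetched from Facebook
        $buffer[] = array($item['user_id'], $item['s3_key']);
    }

    // ...then write everything in one transaction: one commit instead of
    // thousands of small, individually flushed writes.
    $db = new PDO('mysql:host=localhost;dbname=backups', 'user', 'pass');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $db->beginTransaction();
    try {
        $stmt = $db->prepare('INSERT INTO backups (user_id, s3_key) VALUES (?, ?)');
        foreach ($buffer as $row) {
            $stmt->execute($row);
        }
        $db->commit();
    } catch (Exception $e) {
        $db->rollBack(); // nothing half-written on failure
        throw $e;
    }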
Answered 2013-03-18T11:22:38.580