1

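In a MySQL environment with one dedicated MySQL server and one dedicated application server, which is better:

a. Running infinite Java code on the application server that connects to the database server, fetches some records based on a join, and then inserts them into the database

-or-

b. Running an infinite stored procedure on the database that performs the inserts based on a join (SELECT)

Need answer in terms of execution time, db load, memory requirement and ability of the db to continue processing other inserts/updates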

4

4 Answers

3

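I'm not sure about execution time, database load and memory requirements, but in my experience it is best to do all the logic work in the business layer rather than in the database. Also, stored procedures are less scalable and harder to maintain in large projects. So my choice is option a.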

answered 2012-07-21T05:20:35.960
2

Some info is missing, so I'm guessing at the following:

The rows are most obviously not coming in at an infinite rate.

You are most probably polling for them. That is, you are doing some sort of sleep() between cycles.

If you're not, then you should know you could be putting a heavy load on the database server in either case.

So, assuming there'll be some sort of sleep (let's say 1 second, for simplicity), it turns out there's not much difference between your Java code and the stored routine code. Why is that?

  • The sleeps are idle anyway. No locks will be held during sleep time.
  • Any query you issue in Java code must also be issued from routine code, and vice versa.
  • There is not much computational complexity (if any) to your code. You're most probably checking some MAX(id) on the target table, then running INSERT INTO ... SELECT ... FROM ... WHERE id > max_id_as_just_calculated, or something similar.

Execution time may actually be somewhat in favor of routine code, since you do not need to ship result sets back and forth between MySQL and Java. Moreover, you can just INSERT INTO ... SELECT FROM in one query, instead of translating the result set into Java objects/primitives, then preparing a new INSERT query and translating back to MySQL data.

In terms of DB load I see no real difference, again with a slight advantage on the routine side due to network delivery time (time during which locks may still be held).
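As a rough illustration of the pattern described above (read the target's MAX(id), then let MySQL do the INSERT INTO ... SELECT itself), here is a minimal JDBC sketch. The table and column names (target_table, source_a, source_b, payload, a_id) and the class name are hypothetical; nothing in the question specifies the schema, and a real job would also have to decide how to handle rows that arrive between the two statements.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class IncrementalCopy {
    // Hypothetical schema: target_table(id, payload) is filled from a join
    // of source_a and source_b. None of these names come from the question.
    static final String MAX_ID_SQL =
            "SELECT COALESCE(MAX(id), 0) FROM target_table";

    static final String COPY_SQL =
            "INSERT INTO target_table (id, payload) "
          + "SELECT a.id, b.payload "
          + "FROM source_a a JOIN source_b b ON b.a_id = a.id "
          + "WHERE a.id > ?";

    /** Copies rows newer than the current high-water mark; returns rows inserted. */
    static int copyNewRows(Connection conn) throws SQLException {
        long maxId = 0;
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(MAX_ID_SQL)) {
            if (rs.next()) {
                maxId = rs.getLong(1);
            }
        }
        // The join result never travels to the application: only the single
        // max id comes back, and the INSERT ... SELECT runs inside MySQL.
        try (PreparedStatement ps = conn.prepareStatement(COPY_SQL)) {
            ps.setLong(1, maxId);
            return ps.executeUpdate();
        }
    }
}
```

Written this way, the Java variant issues the same statements a stored routine would, which is why the difference between the two options comes down mostly to one extra network round trip per cycle.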

Considerations:

How would you invoke this procedure from Java? It would run for an indefinite amount of time. So would you dedicate a thread to it?

Suppose it crashes (an error of some sort) -- you need to be able to re-execute it (not a big deal, just an issue to consider).

You could execute it via the event scheduler -- that would solve many of the above issues: instead of looping via the routine, let the scheduler invoke it every X seconds. But then - consider locks again.

My own preference: I would probably use Java code, or I would use the event scheduler if I'm comfortable adding this logic to the RDBMS.
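If the Java route is taken, the threading and restart considerations above could be handled roughly as follows. This is only a sketch: PollingRunner is a hypothetical class name, the job is any Runnable (for example one cycle of the copy query), and it uses the standard ScheduledExecutorService to dedicate a single thread and sleep between cycles.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PollingRunner {
    // One dedicated thread runs every cycle of the job.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger failures = new AtomicInteger();

    /** Runs the job repeatedly, waiting delayMillis after each run finishes. */
    void start(Runnable job, long delayMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs,
                // so record it and let the next cycle retry.
                failures.incrementAndGet();
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Number of cycles that ended with an exception so far. */
    int failureCount() {
        return failures.get();
    }

    /** Stops scheduling and interrupts a cycle that is in progress. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

Because the delay is measured from the end of one run to the start of the next, cycles never overlap, and no connection or lock needs to be held while the thread is idle.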

answered 2012-07-21T05:28:42.177
1

I'm not sure there is such a thing as an "infinitely running query". Perhaps you mean a query that is run repeatedly.

Anyway, as a general rule, you will get better throughput if you can avoid the overhead of transferring large amounts of data back and forth between the database and an application. On the other hand, if the "thing" you are trying to do is computationally intensive (rather than data intensive), then doing the computation in the application (running on a different machine from the DB) is going to reduce DB load.

Need answer in terms of execution time, db load, memory requirement and ability of the db to continue processing other inserts/updates

It is not possible to quantify these things in the general case, but there are obvious trade-offs:

  • Avoiding transferring lots of data reduces network load and CPU load (in database drivers).
  • But doing "everything" on the database increases the load on the database.

How it will work out in practice will depend critically on the details of the actual use-case.

answered 2012-07-21T05:24:05.120
1

For some databases I would opt for the stored procedure. Why shift data about? Besides, the database has knowledge about that data.

But, and it is a bit of a failing (IMHO), in MySQL you cannot have commit or rollback inside a stored procedure. So I would think that an infinite stored procedure in the MySQL context will not work as expected.

answered 2012-07-21T05:29:25.927