0

I'm looking to optimize my cursor based update or actually replacing it...

Situation

We're carrying out promotional campaigns and I'd like to track user activity per campaign.

Logic

Each campaign is pushed to a specific batch - Segment of our customer base

CREATE TABLE `segments` (
  `campaign_id` int(6) DEFAULT NULL,
  `customer_id` varchar(20) DEFAULT NULL,
  `tracking_start_date` date DEFAULT NULL,
  `tracking_end_date` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

The tracking_start_date is the date of the campaign while tracking_end_date is the date tracking should end.

Each campaign has it's own "Call to Action (cta)" which is the transaction type we're pushing and hoping customers will start using after the campaign.

CREATE TABLE `cta` (
  `campaign_id` int(11) DEFAULT NULL,
  `Date` date DEFAULT NULL,
  `segment` varchar(100) DEFAULT NULL,
  `message` varchar(320) DEFAULT NULL,
  `Size` int(11) DEFAULT NULL,
  `cta` varchar(100) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

By default the tracking_end_date in the segments table is set to last day of the month but I've created a procedure to check and update this field. (Campaign_id's are issued sequentially according to the campaign date and so the earliest campaign has the least campaign_id value and vise versa) Tracking is done on a calender month on month basis.

Update scenario

For each record in the segments table check whether the same customer_id appears in a future campaign and if that campaign with a greater tracking_start_date has the same CTA.

If TRUE: change the tracking_end_date for that record to a day before the new campaign.

If FALSE: retain last day of tracking_start_date month as the tracking_end_date.

If the update is not done, then we would be double/tripple... counting transactions for customers who appear in several campaigns and had the same CTA.

Below is the procedure I'm currently using but the problem is that it's too slow.

These procedure is housed in another one that loops through the campaign_id's for the month then calls this procedure while supplying the relevant campaign_id

    CREATE DEFINER=`root`@`localhost` PROCEDURE `set_campaign_end_date_child`(IN var_campaign_id INT)
BEGIN
 DECLARE done INT DEFAULT 0;
 DECLARE var_customer_id VARCHAR(20);
 DECLARE var_tracking_start_date DATE;
 DECLARE cur1 CURSOR FOR SELECT DISTINCT customer_id FROM segments WHERE campaign_id =var_campaign_id; 
 DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
 -- perform cursur update loop now
OPEN cur1;
read_loop: LOOP
 IF done THEN
 LEAVE read_loop;
 END IF;
 FETCH cur1 INTO var_customer_id;
 SELECT DISTINCT DATE INTO var_tracking_start_date FROM cta WHERE campaign_id = var_campaign_id;
 UPDATE segments SET tracking_end_date = 
 (SELECT  IFNULL(DATE_SUB(MIN(tracking_start_date),INTERVAL 1 DAY),LAST_DAY(var_tracking_start_date)) FROM segments_temp 
 WHERE customer_id = var_customer_id 
 AND campaign_id 
 IN(SELECT campaign_id FROM cta WHERE cta IN (SELECT cta FROM cta WHERE campaign_id = var_campaign_id)
 AND campaign_id > var_campaign_id))
 WHERE customer_id = var_customer_id AND campaign_id =var_campaign_id ;
 END LOOP read_loop;
CLOSE cur1;
 END$$

DELIMITER ;

PS: Before initiating the procedure, I make a copy of the segments table in another one called segments_temp and do the comparison from there (this is because MySQL cannot do updates from a self referencing query)

Hope I'm clear & thanks in advance for your ideas

4

1 回答 1

1

使用 self-JOIN 允许您引用该segments表两次。如果我正确理解您的代码,我认为这是等效的更新:

UPDATE segments AS s1
LEFT JOIN (SELECT customer_id, DATE_SUB(MIN(tracking_start_date),INTERVAL 1 DAY) AS new_tracking_start_date
           FROM segments AS s2
           WHERE campaign_id IN (
                SELECT campaign_id 
                FROM cta 
                WHERE cta IN (
                    SELECT cta 
                    FROM cta 
                    WHERE campaign_id = var_campaign_id)
                AND campaign_id > var_campaign_id)
           GROUP BY customer_id) AS s3 
ON s1.customer_id = s3.customer_id
SET tracking_end_date = IFNULL(new_tracking_start_date, LAST_DAY(tracking_start_date))
WHERE campaign_id = var_campaign_id
于 2015-03-27T11:47:36.157 回答