
I'm trying to store all Instagram images for a specific tag in my database. My code calls the API and loads the first batch of images:

$json =  file_get_contents('https://api.instagram.com/v1/tags/hahanotfunny/media/recent?client_id='. $client_id .'&max_tag_id='. $max_tag_id)  ;

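Once it has gone through the first page of the response, it checks whether there is another page by grabbing the link in the pagination section of the response. It also checks whether "max_tag_id" has a value and updates the "max" value in my database. The reason I keep a max value is that when I get a "real-time" notification that new images exist, I start downloading from the last max value. However, my code has a problem. On the last page of the response (no further pagination link) there is no "max_tag_id" variable, so the database is not updated. The next time my scraper runs, it therefore starts from the last known "max_tag_id", and the images on the last page are recorded in my database again as duplicates.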

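So my question is: when I receive another "real-time" alert saying new images are available for a particular tag, how do I fetch them starting from the last record stored in my database?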

$dbConnection = new PDO('mysql:dbname=XXXXXXX;host=127.0.0.1;charset=utf8', 'XXXXXXXXX', 'XXXXX');
$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

function getMax() {
global $dbConnection;

$tag = 'hahanotfunny';
$max = null; // stays null if no row exists for this tag
$selectTotals = $dbConnection->prepare("SELECT * FROM instagram_time WHERE tag = :tag LIMIT 1");
$selectTotals->execute(array(':tag' => $tag));

foreach ($selectTotals as $time) {
    $max = $time['max'];    
}

return $max;
}

function updateMax($data) {
global $dbConnection;

$tag = 'hahanotfunny';
$selectTotals = $dbConnection->prepare("UPDATE instagram_time SET max = :maxid WHERE tag = :tag LIMIT 1");
$selectTotals->execute(array(':maxid' => $data, ':tag' => $tag));
}

function fetchData() {
global $dbConnection, $client_id;

$max_tag_id = getMax();
$json =  file_get_contents('https://api.instagram.com/v1/tags/hahanotfunny/media/recent?client_id='. $client_id .'&max_tag_id='. $max_tag_id)  ;
$data = json_decode($json);

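// next_max_tag_id is absent on the last page, so guard the lookup.
$next_max = isset($data->pagination->next_max_tag_id) ? $data->pagination->next_max_tag_id : null;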

foreach ($data->data as $insta) {
    echo '<br/><img  src="'.$insta->images->low_resolution->url.'"/>';

}

foreach ($data as $object) {
    if ( is_array( $object ) ) {
        foreach ( $object as $media ) {
            $url = $media->images->standard_resolution->url;
            $m_id = $media->id;
            $c_time = $media->created_time;
            $user = $media->user->username;
            $filter = $media->filter;
            $comments = $media->comments->count;
            $caption = $media->caption->text;
            $link = $media->link;
            $low_res=$media->images->low_resolution->url;
            $thumb=$media->images->thumbnail->url;
            $lat = $media->location->latitude;
            $long = $media->location->longitude;
            $loc_id = $media->location->id;

            // Use a separate variable so the decoded response in $data is not overwritten.
            $row = array(
                'media_id' => $m_id,
                // $next_min_id is never assigned in this function, so NULL is stored here.
                'min_id' => isset($next_min_id) ? $next_min_id : null,
                'url' => $url,
                'c_time' => $c_time,
                'user' => $user,
                'filter' => $filter,
                'comment_count' => $comments,
                'caption' => $caption,
                'link' => $link,
                'low_res' => $low_res,
                'thumb' => $thumb,
                'lat' => $lat,
                'long' => $long,
                'loc_id' => $loc_id,
            );

            $selectTotals = $dbConnection->prepare("INSERT INTO instagram_mg (media_id, min_id, url, c_time, user, filter, comment_count, caption, link, low_res_link, thumb, latitude, longitude, loc_id) VALUES (:mediaid, :minid, :url, :ctime, :user, :filter, :commentcount, :caption, :link, :lowreslink, :thumb, :latitude, :longitude, :locid)");

            $selectTotals->execute(array(
                ':mediaid' => $row['media_id'],
                ':minid' => $row['min_id'],
                ':url' => $row['url'],
                ':ctime' => $row['c_time'],
                ':user' => $row['user'],
                ':filter' => $row['filter'],
                ':commentcount' => $row['comment_count'],
                ':caption' => $row['caption'],
                ':link' => $row['link'],
                ':lowreslink' => $row['low_res'],
                ':thumb' => $row['thumb'],
                ':latitude' => $row['lat'],
                ':longitude' => $row['long'],
                ':locid' => $row['loc_id'],
            ));


        }
    }
}


if (isset($next_max)) {
    echo $next_max . "</br>";
    updateMax($next_max);
    fetchData();
} else {
    //$current_time = time();
    //updateMax($current_time); // I tried using the current time as the "max_tag_id", but it didn't work.

}


} //fetchData()


fetchData();

1 Answer


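Personally, I would use a database trigger and an accompanying function to check for duplicates. From mysql.com:

A trigger is defined to activate when an INSERT, DELETE, or UPDATE statement executes for the associated table. A trigger can be set to activate either before or after the triggering statement. For example, you can have a trigger activate before each row that is inserted into a table or after each row that is updated.

For example: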

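DELIMITER //
CREATE TRIGGER insert_check BEFORE INSERT ON instagram_mg
FOR EACH ROW
BEGIN
    -- Sketch only; check it against the MySQL docs before relying on it.
    -- A trigger cannot "return false"; raising an error rejects the row.
    -- SIGNAL needs MySQL 5.5 or later.
    IF EXISTS (SELECT 1 FROM instagram_mg WHERE media_id = NEW.media_id) THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'media_id already stored';
    END IF;
    -- Otherwise the row is inserted as normal.
END//
DELIMITER ;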
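
However the duplicate check ends up being enforced in the database, the PHP side then has to tolerate a rejected insert instead of aborting the whole run. Below is a minimal sketch of that idea, not a tested drop-in. `insertMediaOnce()` is a hypothetical helper, and it assumes the database rejects a duplicate `media_id` with an error: SQLSTATE 23000 from a unique index, or 45000 if a trigger raises a generic user-defined error. The demo uses an in-memory SQLite table with a unique `media_id` and only two columns purely so it runs on its own; with MySQL you would pass your existing `$dbConnection` and the full column list.

```php
<?php
// Hypothetical helper: insert one media row, skipping it if the database
// rejects it as a duplicate. Returns true if inserted, false if skipped.
function insertMediaOnce(PDO $db, $mediaId, $url)
{
    // Assumption: these SQLSTATE codes mean "duplicate" in this setup.
    // 23000 = integrity constraint violation (e.g. a unique index),
    // 45000 = generic user-defined error that a trigger could raise.
    $duplicateStates = array('23000', '45000');

    $stmt = $db->prepare(
        'INSERT INTO instagram_mg (media_id, url) VALUES (:mediaid, :url)'
    );

    try {
        $stmt->execute(array(':mediaid' => $mediaId, ':url' => $url));
        return true;
    } catch (PDOException $e) {
        if (in_array((string) $e->getCode(), $duplicateStates, true)) {
            return false; // already stored, carry on with the next item
        }
        throw $e; // anything else is a real error
    }
}

// Self-contained demo against SQLite (stand-in for the MySQL table).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE instagram_mg (media_id TEXT UNIQUE, url TEXT)');

$first  = insertMediaOnce($db, '123_456', 'http://example.com/a.jpg');
$second = insertMediaOnce($db, '123_456', 'http://example.com/a.jpg');
$count  = (int) $db->query('SELECT COUNT(*) FROM instagram_mg')->fetchColumn();

var_dump($first, $second, $count);
```

With this in place, re-reading the last page on the next run would only produce skipped rows rather than duplicate records. The helper relies on `PDO::ERRMODE_EXCEPTION` being set, as it already is in the question's connection setup.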

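Sorry I can't be more specific. I mostly work with PostgreSQL functions, and I know the syntax differs slightly. Here is a link to the MySQL trigger syntax: http://dev.mysql.com/doc/refman/5.0/en/trigger-syntax.html

I hope this helps. Cheers!

Answered 2013-06-20T02:59:19.697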