180

我正在尝试将用户最近的 Instagram 媒体放在侧边栏上。我正在尝试使用 Instagram API 来获取媒体。

http://instagram.com/developer/endpoints/users/

文档说 GET https://api.instagram.com/v1/users/<user-id>/media/recent/,但它说要传递 OAuth 访问令牌。访问令牌代表代表用户行事的授权。我不希望用户登录 Instagram 来在侧边栏看到这个。他们甚至不需要拥有 Instagram 帐户。

例如,我可以在不登录 Instagram的情况下访问http://instagram.com/thebrainscoop并查看照片。我想通过 API 做到这一点。

在 Instagram API 中,未经用户身份验证的请求通过 aclient_id而不是access_token. 但是,如果我尝试这样做,我会得到:

{
  "meta":{
    "error_type":"OAuthParameterException",
    "code":400,
    "error_message":"\"access_token\" URL parameter missing. This OAuth request requires an \"access_token\" URL parameter."
  }
}

那么,这不可能吗?在不要求用户首先通过 OAuth 登录 Instagram 帐户的情况下,是否无法获取用户的最新(公共)媒体?

4

20 回答 20

342

var name = "smena8m";
$.get("https://images"+~~(Math.random()*3333)+"-focus-opensocial.googleusercontent.com/gadgets/proxy?container=none&url=https://www.instagram.com/" + name + "/", function(html) {
if (html) {
    var regex = /_sharedData = ({.*);<\/script>/m,
        json = JSON.parse(regex.exec(html)[1]),
        edges = json.entry_data.ProfilePage[0].graphql.user.edge_owner_to_timeline_media.edges;

      $.each(edges, function(n, edge) {
          var node = edge.node;
          $('body').append(
              $('<a/>', {
              href: 'https://instagr.am/p/'+node.shortcode,
              target: '_blank'
          }).css({
              backgroundImage: 'url(' + node.thumbnail_src + ')'
          }));
      });
    }
});
html, body {
  font-size: 0;
  line-height: 0;
}

a {
  display: inline-block;
  width: 25%;
  height: 0;
  padding-bottom: 25%;
  background: #eee 50% 50% no-repeat;
  background-size: cover;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

?__a=1您可以使用登录页面地址旁边的 JSON 格式下载任何 Instagram 用户照片供稿,如下所示。无需获取用户 ID 或注册应用程序,无需令牌,无需 oAuth。

min_idmax_id变量可以用于分页,这里是例子

YQL在截断的 iframe 中可能无法在此处工作,因此您可以随时在YQL 控制台中手动检查它

CORS Access-Control-Allow-Headers2018 年 4 月更新:在最新的 instagram 更新后,您无法在客户端 (javascript) 上执行此操作,因为由于限制,无法使用 javascript 设置签名请求的自定义标头。仍然可以通过php基于rhx_gis,csrf_token和请求参数的具有适当签名的任何其他服务器端方法来执行此操作。你可以在这里阅读更多关于它的信息。

2019 年 1 月更新:YQL 已退休,因此,请查看我使用 Google Image Proxy 作为CORSInstagram 页面代理的最新更新!然后只有负面时刻 - 此方法无法使用分页。

PHP解决方案:

    $html = file_get_contents('https://instagram.com/apple/');
    preg_match('/_sharedData = ({.*);<\/script>/', $html, $matches);
    $profile_data = json_decode($matches[1])->entry_data->ProfilePage[0]->graphql->user;
于 2015-11-18T15:20:30.660 回答
127

这已经很晚了,但如果它对某人有帮助的话是值得的,因为我在 Instagram 的文档中没有看到它。

要在(撰写本文时)上执行 GET,https://api.instagram.com/v1/users/<user-id>/media/recent/您实际上不需要 OAuth 访问令牌。

你可以执行https://api.instagram.com/v1/users/[USER ID]/media/recent/?client_id=[CLIENT ID]

[CLIENT ID] 将是通过管理客户端在应用中注册的有效客户端 ID(与用户无关)。您可以通过执行 GET users 搜索请求从用户名中获取 [USER ID]: https://api.instagram.com/v1/users/search?q=[USERNAME]&client_id=[CLIENT ID]

于 2013-12-11T21:03:15.247 回答
53

11.11.2017
由于 Instagram 改变了他们提供这些数据的方式,现在上述方法都不起作用。这是获取用户媒体的新方法:
GET https://instagram.com/graphql/query/?query_id=17888483320059182&variables={"id":"1951415043","first":20,"after":null}
Where:
query_id- 永久值:17888483320059182(请注意,将来可能会更改)。
id- 用户的 ID。它可能带有用户列表。要获取用户列表,您可以使用以下请求:GET https://www.instagram.com/web/search/topsearch/?context=blended&query=YOUR_QUERY
first- 要获取的项目数量。
after- 如果您想从该 id 获取项目,则为最后一个项目的 id。

于 2017-11-11T22:23:36.463 回答
52

我能够使用以下 API 获得用户的最新媒体,无需身份验证(包括描述、喜欢、评论计数)。

https://www.instagram.com/apple/?__a=1

例如

https://www.instagram.com/{username}/?__a=1
于 2017-11-30T10:38:36.097 回答
15

截至上周,Instagram 禁用/media/了 url,我实施了一种解决方法,目前效果很好。

为了解决这个线程中大家的问题,我写了这个:https ://github.com/whizzzkid/instagram-reverse-proxy

它使用以下端点提供所有 instagram 的公共数据:

获取用户媒体:

https://igapi.ga/<username>/media
e.g.: https://igapi.ga/whizzzkid/media 

获取具有限制计数的用户媒体:

https://igapi.ga/<username>/media?count=N // 1 < N < 20
e.g.: https://igapi.ga/whizzzkid/media?count=5

使用 JSONP:

https://igapi.ga/<username>/media?callback=foo
e.g.: https://igapi.ga/whizzzkid/media?callback=bar

代理 API 还会将下一页和上一页 URL 附加到响应中,因此您无需在最后进行计算。

希望你们喜欢!

感谢@350D 发现这个:)

于 2017-03-31T23:11:11.883 回答
14

Instagram API 需要通过 OAuth 进行用户身份验证才能访问用户最近的媒体端点。目前似乎没有任何其他方式可以为用户获取所有媒体。

于 2013-06-29T19:34:10.350 回答
9

由于 Instagram 不断变化(并且设计非常糟糕)的 API 架构,上述大部分内容自 2018 年 4 月起将不再有效。

如果您直接使用该https://www.instagram.com/username/?__a=1方法查询他们的 API,这是访问单个帖子数据的最新路径。

假设您返回的JSON数据是$data您可以使用以下路径示例遍历每个结果:

foreach ($data->graphql->user->edge_owner_to_timeline_media->edges as $item) {

    $content_id = $item->node->id; 
    $date_posted = $item-node->taken_at_timestamp;
    $comments = $item->node->edge_media_to_comment->count;
    $likes = $item->node->edge_liked_by->count;
    $image = $item->node->display_url;
    $content = $item->node->edge_media_to_caption->edges[0]->node->text;
    // etc etc ....
}

最近的变化中的主要内容是graphqledge_owner_to_timeline_media

看起来他们将在2018 年12 月为非“商业”客户取消此 API 访问权限,因此请尽可能充分利用它。

希望它可以帮助某人;)

于 2018-04-03T03:28:22.033 回答
9

这是一个rails解决方案。这是一种后门,实际上是前门。

# create a headless browser
b = Watir::Browser.new :phantomjs
uri = 'https://www.instagram.com/explore/tags/' + query
uri = 'https://www.instagram.com/' + query if type == 'user'

b.goto uri

# all data are stored on this page-level object.
o = b.execute_script( 'return window._sharedData;')

b.close

您返回的对象取决于它是用户搜索还是标签搜索。我得到这样的数据:

if type == 'user'
  data = o[ 'entry_data' ][ 'ProfilePage' ][ 0 ][ 'user' ][ 'media' ][ 'nodes' ]
  page_info = o[ 'entry_data' ][ 'ProfilePage' ][ 0 ][ 'user' ][ 'media' ][ 'page_info' ]
  max_id = page_info[ 'end_cursor' ]
  has_next_page = page_info[ 'has_next_page' ]
else
  data = o[ 'entry_data' ][ 'TagPage' ][ 0 ][ 'tag' ][ 'media' ][ 'nodes' ]
  page_info = o[ 'entry_data' ][ 'TagPage' ][ 0 ][ 'tag' ][ 'media' ][ 'page_info' ]
  max_id = page_info[ 'end_cursor' ]
  has_next_page = page_info[ 'has_next_page' ]
end

然后我通过以下方式构造一个 url 来获得另一页结果:

  uri = 'https://www.instagram.com/explore/tags/' + query_string.to_s\
    + '?&max_id=' + max_id.to_s
  uri = 'https://www.instagram.com/' + query_string.to_s + '?&max_id='\
    + max_id.to_s if type === 'user'
于 2016-07-25T16:21:52.540 回答
8

如果您正在寻找一种方法来生成用于单个帐户的访问令牌,您可以试试这个 -> https://coderwall.com/p/cfgneq

我需要一种方法来使用 instagram api 来获取特定帐户的所有最新媒体。

于 2013-07-17T15:29:12.667 回答
7

只想添加到@350D 答案,因为我很难理解。

接下来是我的代码逻辑:

第一次调用 API 时,我只调用https://www.instagram.com/_vull_ /media/. 当我收到响应时,我会检查more_available. 如果是真的,我会从数组中获取最后一张照片,获取它的 id,然后再次调用 Instagram API,但这次是 https://www.instagram.com/_vull_/media/?max_id=1400286183132701451_1642962433.

这里要知道的重要一点,这个 Id 是数组中最后一张图片的 Id。因此,当用数组中图片的最后一个 id 请求 maxId 时,您将获得接下来的 20 张图片,依此类推。

希望这能澄清事情。

于 2017-01-17T15:07:21.067 回答
7

JSFiddle

Javascript:

$(document).ready(function(){

    var username = "leomessi";
    var max_num_items = 5;

    var jqxhr = $.ajax( "https://www.instagram.com/"+username+"/?__a=1" ).done(function() {
        //alert( "success" );
    }).fail(function() {
        //alert( "error" );
    }).always(function(data) {
        //alert( "complete" )
        items = data.graphql.user.edge_owner_to_timeline_media.edges;
        $.each(items, function(n, item) {
            if( (n+1) <= max_num_items )
            {
                var data_li = "<li><a target='_blank' href='https://www.instagram.com/p/"+item.node.shortcode+"'><img src='" + item.node.thumbnail_src + "'/></a></li>";
                $("ul.instagram").append(data_li);
            }
        });

    });

});

HTML:

<ul class="instagram">
</ul>

CSS:

ul.instagram {
    list-style: none;
}

ul.instagram li {
  float: left;
}

ul.instagram li img {
    height: 100px;
}
于 2018-02-07T00:37:41.723 回答
6

One more trick, search photos by hashtags:

GET https://www.instagram.com/graphql/query/?query_hash=3e7706b09c6184d5eafd8b032dbcf487&variables={"tag_name":"nature","first":25,"after":""}

Where:

query_hash - permanent value(i belive its hash of 17888483320059182, can be changed in future)

tag_name - the title speaks for itself

first - amount of items to get (I do not know why, but this value does not work as expected. The actual number of returned photos is slightly larger than the value multiplied by 4.5 (about 110 for the value 25, and about 460 for the value 100))

after - id of the last item if you want to get items from that id. Value of end_cursor from JSON response can be used here.

于 2018-01-26T12:18:15.043 回答
4

如果您绕过 Oauth,您可能不知道他们是哪个 Instagram 用户。话虽如此,有几种方法可以在没有身份验证的情况下获取 Instagram 图像。

  1. Instagram 的 API 允许您在不进行身份验证的情况下查看用户最受欢迎的图像。使用以下端点:这是链接

  2. Instagram 在此提供标签的 RSS 提要。

  3. Instagram 用户页面是公开的,因此您可以使用 PHP 和 CURL 来获取他们的页面,并使用 DOM 解析器在 html 中搜索您想要的图像标签。

于 2013-10-11T14:09:07.317 回答
4

如果您想在没有 clientID 和访问令牌的情况下搜索用户:

1:如果你想搜索所有你的名字与你的搜索词相似的用户:

将 SeachName 替换为您要搜索的文本:

https://www.instagram.com/web/search/topsearch/?query=SearchName

2:如果你想搜索完全相同的名字用户:

将 UserName 替换为您想要的搜索名称:

https://www.instagram.com/UserName/?__a=1

于 2019-12-23T09:08:47.230 回答
3

您可以使用此 API 来检索 instagram 用户的公共信息:

https://api.lityapp.com/instagrams/thebrainscoop?limit=2(编辑:2021 年 2 月的损坏/恶意软件链接)

如果不设置limit参数,默认发帖数限制在12个

如您在代码中所见,此 api 是在 SpringBoot 中使用 HtmlUnit 制作的:

public JSONObject getPublicInstagramByUserName(String userName, Integer limit) {
    String html;
    WebClient webClient = new WebClient();

    try {
        webClient.getOptions().setCssEnabled(false);
        webClient.getOptions().setJavaScriptEnabled(false);
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.getCookieManager().setCookiesEnabled(true);

        Page page = webClient.getPage("https://www.instagram.com/" + userName);
        WebResponse response = page.getWebResponse();

        html = response.getContentAsString();
    } catch (Exception ex) {
        ex.printStackTrace();

        throw new RuntimeException("Ocorreu um erro no Instagram");
    }

    String prefix = "static/bundles/es6/ProfilePageContainer.js";
    String suffix = "\"";
    String script = html.substring(html.indexOf(prefix));

    script = script.substring(0, script.indexOf(suffix));

    try {
        Page page = webClient.getPage("https://www.instagram.com/" + script);
        WebResponse response = page.getWebResponse();

        script = response.getContentAsString();
    } catch (Exception ex) {
        ex.printStackTrace();

        throw new RuntimeException("Ocorreu um erro no Instagram");
    }

    prefix = "l.pagination},queryId:\"";

    String queryHash = script.substring(script.indexOf(prefix) + prefix.length());

    queryHash = queryHash.substring(0, queryHash.indexOf(suffix));
    prefix = "<script type=\"text/javascript\">window._sharedData = ";
    suffix = ";</script>";
    html = html.substring(html.indexOf(prefix) + prefix.length());
    html = html.substring(0, html.indexOf(suffix));

    JSONObject json = new JSONObject(html);
    JSONObject entryData = json.getJSONObject("entry_data");
    JSONObject profilePage = (JSONObject) entryData.getJSONArray("ProfilePage").get(0);
    JSONObject graphql = profilePage.getJSONObject("graphql");
    JSONObject user = graphql.getJSONObject("user");
    JSONObject response = new JSONObject();

    response.put("id", user.getString("id"));
    response.put("username", user.getString("username"));
    response.put("fullName", user.getString("full_name"));
    response.put("followedBy", user.getJSONObject("edge_followed_by").getLong("count"));
    response.put("following", user.getJSONObject("edge_follow").getLong("count"));
    response.put("isBusinessAccount", user.getBoolean("is_business_account"));
    response.put("photoUrl", user.getString("profile_pic_url"));
    response.put("photoUrlHD", user.getString("profile_pic_url_hd"));

    JSONObject edgeOwnerToTimelineMedia = user.getJSONObject("edge_owner_to_timeline_media");
    JSONArray posts = new JSONArray();

    try {
        loadPublicInstagramPosts(webClient, queryHash, user.getString("id"), posts, edgeOwnerToTimelineMedia, limit == null ? 12 : limit);
    } catch (Exception ex) {
        ex.printStackTrace();

        throw new RuntimeException("Você fez muitas chamadas, tente mais tarde");
    }

    response.put("posts", posts);

    return response;
}

private void loadPublicInstagramPosts(WebClient webClient, String queryHash, String userId, JSONArray posts, JSONObject edgeOwnerToTimelineMedia, Integer limit) throws IOException {
    JSONArray edges = edgeOwnerToTimelineMedia.getJSONArray("edges");

    for (Object elem : edges) {
        if (limit != null && posts.length() == limit) {
            return;
        }

        JSONObject node = ((JSONObject) elem).getJSONObject("node");

        if (node.getBoolean("is_video")) {
            continue;
        }

        JSONObject post = new JSONObject();

        post.put("id", node.getString("id"));
        post.put("shortcode", node.getString("shortcode"));

        JSONArray captionEdges = node.getJSONObject("edge_media_to_caption").getJSONArray("edges");

        if (captionEdges.length() > 0) {
            JSONObject captionNode = ((JSONObject) captionEdges.get(0)).getJSONObject("node");

            post.put("caption", captionNode.getString("text"));
        } else {
            post.put("caption", (Object) null);
        }

        post.put("photoUrl", node.getString("display_url"));

        JSONObject dimensions = node.getJSONObject("dimensions");

        post.put("photoWidth", dimensions.getLong("width"));
        post.put("photoHeight", dimensions.getLong("height"));

        JSONArray thumbnailResources = node.getJSONArray("thumbnail_resources");
        JSONArray thumbnails = new JSONArray();

        for (Object elem2 : thumbnailResources) {
            JSONObject obj = (JSONObject) elem2;
            JSONObject thumbnail = new JSONObject();

            thumbnail.put("photoUrl", obj.getString("src"));
            thumbnail.put("photoWidth", obj.getLong("config_width"));
            thumbnail.put("photoHeight", obj.getLong("config_height"));
            thumbnails.put(thumbnail);
        }

        post.put("thumbnails", thumbnails);
        posts.put(post);
    }

    JSONObject pageInfo = edgeOwnerToTimelineMedia.getJSONObject("page_info");

    if (!pageInfo.getBoolean("has_next_page")) {
        return;
    }

    String endCursor = pageInfo.getString("end_cursor");
    String variables = "{\"id\":\"" + userId + "\",\"first\":12,\"after\":\"" + endCursor + "\"}";

    String url = "https://www.instagram.com/graphql/query/?query_hash=" + queryHash + "&variables=" + URLEncoder.encode(variables, "UTF-8");
    Page page = webClient.getPage(url);
    WebResponse response = page.getWebResponse();
    String content = response.getContentAsString();
    JSONObject json = new JSONObject(content);

    loadPublicInstagramPosts(webClient, queryHash, userId, posts, json.getJSONObject("data").getJSONObject("user").getJSONObject("edge_owner_to_timeline_media"), limit);
}

这是一个响应示例:
{
  "id": "290482318",
  "username": "thebrainscoop",
  "fullName": "Official Fan Page",
  "followedBy": 1023,
  "following": 6,
  "isBusinessAccount": false,
  "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/447ffd0262082f373acf3d467435f130/5C709C77/t51.2885-19/11351770_612904665516559_678168252_a.jpg",
  "photoUrlHD": "https://scontent-gru2-1.cdninstagram.com/vp/447ffd0262082f373acf3d467435f130/5C709C77/t51.2885-19/11351770_612904665516559_678168252_a.jpg",
  "posts": [
    {
      "id": "1430331382090378714",
      "shortcode": "BPZjtBUly3a",
      "caption": "If I have any active followers anymore; hello! I'm Brianna, and I created this account when I was just 12 years old to show my love for The Brain Scoop. I'm now nearly finished high school, and just rediscovered it. I just wanted to see if anyone is still active on here, and also correct some of my past mistakes - being a child at the time, I didn't realise I had to credit artists for their work, so I'm going to try to correct that post haste. Also; the font in my bio is horrendous. Why'd I think that was a good idea? Anyway, this is a beautiful artwork of the long-tailed pangolin by @chelsealinaeve . Check her out!",
      "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/ab823331376ca46136457f4654bf2880/5CAD48E4/t51.2885-15/e35/16110915_400942200241213_3503127351280009216_n.jpg",
      "photoWidth": 640,
      "photoHeight": 457,
      "thumbnails": [
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/43b195566d0ef2ad5f4663ff76d62d23/5C76D756/t51.2885-15/e35/c91.0.457.457/s150x150/16110915_400942200241213_3503127351280009216_n.jpg",
          "photoWidth": 150,
          "photoHeight": 150
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/ae39043a7ac050c56d741d8b4355c185/5C93971C/t51.2885-15/e35/c91.0.457.457/s240x240/16110915_400942200241213_3503127351280009216_n.jpg",
          "photoWidth": 240,
          "photoHeight": 240
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/ae7a22d09e3ef98d0a6bbf31d621a3b7/5CACBBA6/t51.2885-15/e35/c91.0.457.457/s320x320/16110915_400942200241213_3503127351280009216_n.jpg",
          "photoWidth": 320,
          "photoHeight": 320
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/1439dc72b70e7c0c0a3afcc30970bb13/5C8E2923/t51.2885-15/e35/c91.0.457.457/16110915_400942200241213_3503127351280009216_n.jpg",
          "photoWidth": 480,
          "photoHeight": 480
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/1439dc72b70e7c0c0a3afcc30970bb13/5C8E2923/t51.2885-15/e35/c91.0.457.457/16110915_400942200241213_3503127351280009216_n.jpg",
          "photoWidth": 640,
          "photoHeight": 640
        }
      ]
    },
    {
      "id": "442527661838057235",
      "shortcode": "YkLJBXJD8T",
      "caption": null,
      "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/dc94b38da679826b9ac94ccd2bcc4928/5C7CDF93/t51.2885-15/e15/11327349_860747310663863_2105199307_n.jpg",
      "photoWidth": 612,
      "photoHeight": 612,
      "thumbnails": [
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/c1153c6513c44a6463d897e14b2d8f06/5CB13ADD/t51.2885-15/e15/s150x150/11327349_860747310663863_2105199307_n.jpg",
          "photoWidth": 150,
          "photoHeight": 150
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/47e60ec8bca5a1382cd9ac562439d48c/5CAE6A82/t51.2885-15/e15/s240x240/11327349_860747310663863_2105199307_n.jpg",
          "photoWidth": 240,
          "photoHeight": 240
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/da0ee5b666ab40e4adc1119e2edca014/5CADCB59/t51.2885-15/e15/s320x320/11327349_860747310663863_2105199307_n.jpg",
          "photoWidth": 320,
          "photoHeight": 320
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/02ee23571322ea8d0992e81e72f80ef2/5C741048/t51.2885-15/e15/s480x480/11327349_860747310663863_2105199307_n.jpg",
          "photoWidth": 480,
          "photoHeight": 480
        },
        {
          "photoUrl": "https://scontent-gru2-1.cdninstagram.com/vp/dc94b38da679826b9ac94ccd2bcc4928/5C7CDF93/t51.2885-15/e15/11327349_860747310663863_2105199307_n.jpg",
          "photoWidth": 640,
          "photoHeight": 640
        }
      ]
    }
  ]
}
于 2018-11-22T02:52:55.823 回答
3

好吧,由于/?__a=1现在停止工作,最好使用 curl 并解析 instagram 页面,如以下答案所述:Generate access token Instagram API, without having to log in?

于 2018-04-16T13:37:25.810 回答
2

我真的需要这个功能,但对于 Wordpress。我很合身,而且效果很好

<script>
    jQuery(function($){
        var name = "caririceara.comcariri";
        $.get("https://images"+~~(Math.random()*33)+"-focus-opensocial.googleusercontent.com/gadgets/proxy?container=none&url=https://www.instagram.com/" + name + "/", function(html) {
            if (html) {
                var regex = /_sharedData = ({.*);<\/script>/m,
                  json = JSON.parse(regex.exec(html)[1]),
                  edges = json.entry_data.ProfilePage[0].graphql.user.edge_owner_to_timeline_media.edges;
              $.each(edges, function(n, edge) {
                   if (n <= 7){
                     var node = edge.node;
                    $('.img_ins').append('<a href="https://instagr.am/p/'+node.shortcode+'" target="_blank"><img src="'+node.thumbnail_src+'" width="150"></a>');
                   }
              });
            }
        });
    }); 
    </script>
于 2019-07-08T09:13:09.933 回答
1

下面的 nodejs 代码从 Instagram 页面中抓取热门图片。“ScrapeInstagramPage”功能负责后期老化效果。

var request = require('parse5');
var request = require('request');
var rp      = require('request-promise');
var $       = require('cheerio'); // Basically jQuery for node.js 
const jsdom = require("jsdom");    
const { JSDOM } = jsdom;


function ScrapeInstagramPage (args) {
    dout("ScrapeInstagramPage for username -> " + args.username);
    var query_url = 'https://www.instagram.com/' + args.username + '/';

    var cookieString = '';

    var options = {
        url: query_url,
        method: 'GET',
        headers: {
            'x-requested-with' : 'XMLHttpRequest',
            'accept-language'  : 'en-US,en;q=0.8,pt;q=0.6,hi;q=0.4', 
            'User-Agent'       : 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36',
            'referer'          : 'https://www.instagram.com/dress_blouse_designer/',
            'Cookie'           : cookieString,
            'Accept'           : '*/*',
            'Connection'       : 'keep-alive',
            'authority'        : 'www.instagram.com' 
        }
    };


    function dout (msg) {
        if (args.debug) {
            console.log(msg);
        }
    }

    function autoParse(body, response, resolveWithFullResponse) {
        // FIXME: The content type string could contain additional values like the charset. 
        // Consider using the `content-type` library for a robust comparison. 
        if (response.headers['content-type'] === 'application/json') {
            return JSON.parse(body);
        } else if (response.headers['content-type'] === 'text/html') {
            return $.load(body);
        } else {
            return body;
        }
    }

    options.transform = autoParse;


    rp(options)
        .then(function (autoParsedBody) {
            if (args.debug) {
                console.log("Responce of 'Get first user page': ");
                console.log(autoParsedBody);
                console.log("Creating JSDOM from above Responce...");
            }

            const dom = new JSDOM(autoParsedBody.html(), { runScripts: "dangerously" });
            if (args.debug) console.log(dom.window._sharedData); // full data doc form instagram for a page

            var user = dom.window._sharedData.entry_data.ProfilePage[0].user;
            if (args.debug) {
                console.log(user); // page user
                console.log(user.id); // user ID
                console.log(user.full_name); // user full_name
                console.log(user.username); // user username
                console.log(user.followed_by.count); // user followed_by
                console.log(user.profile_pic_url_hd); // user profile pic
                console.log(autoParsedBody.html());
            }

            if (user.is_private) {
                dout ("User account is PRIVATE");
            } else {
                dout ("User account is public");
                GetPostsFromUser(user.id, 5000, undefined);
            }
        })
        .catch(function (err) {
            console.log( "ERROR: " + err );
        });  

    var pop_posts = [];
    function GetPostsFromUser (user_id, first, end_cursor) {
        var end_cursor_str = "";
        if (end_cursor != undefined) {
            end_cursor_str = '&after=' + end_cursor;
        }

        options.url = 'https://www.instagram.com/graphql/query/?query_id=17880160963012870&id=' 
                        + user_id + '&first=' + first + end_cursor_str;

        rp(options)
            .then(function (autoParsedBody) {
                if (autoParsedBody.status === "ok") {
                    if (args.debug) console.log(autoParsedBody.data);
                    var posts = autoParsedBody.data.user.edge_owner_to_timeline_media;

                    // POSTS processing
                    if (posts.edges.length > 0) {
                        //console.log(posts.edges);
                        pop_posts = pop_posts.concat
                        (posts.edges.map(function(e) {
                            var d = new Date();
                            var now_seconds = d.getTime() / 1000;

                            var seconds_since_post = now_seconds - e.node.taken_at_timestamp;
                            //console.log("seconds_since_post: " + seconds_since_post);

                            var ageing = 10; // valuses (1-10]; big value means no ageing
                            var days_since_post = Math.floor(seconds_since_post/(24*60*60));
                            var df = (Math.log(ageing+days_since_post) / (Math.log(ageing)));
                            var likes_per_day = (e.node.edge_liked_by.count / df);
                            // console.log("likes: " + e.node.edge_liked_by.count);
                            //console.log("df: " + df);
                            //console.log("likes_per_day: " + likes_per_day);
                            //return (likes_per_day > 10 * 1000);
                            var obj = {};
                            obj.url = e.node.display_url;
                            obj.likes_per_day = likes_per_day;
                            obj.days_since_post = days_since_post;
                            obj.total_likes = e.node.edge_liked_by.count;
                            return obj;
                        }
                        ));

                        pop_posts.sort(function (b,a) {
                          if (a.likes_per_day < b.likes_per_day)
                            return -1;
                          if (a.likes_per_day > b.likes_per_day)
                            return 1;
                          return 0;
                        });

                        //console.log(pop_posts);

                        pop_posts.forEach(function (obj) {
                            console.log(obj.url);
                        });
                    }

                    if (posts.page_info.has_next_page) {
                        GetPostsFromUser(user_id, first, posts.page_info.end_cursor);
                    }
                } else {
                    console.log( "ERROR: Posts AJAX call not returned good..." );
                }
            })
            .catch(function (err) {
                console.log( "ERROR: " + err );
            }); 
    }
}


ScrapeInstagramPage ({username : "dress_blouse_designer", debug : false});

在这里试试

示例:对于给定的 URL ' https://www.instagram.com/dress_blouse_designer/ ',可以调用函数

ScrapeInstagramPage ({username : "dress_blouse_designer", debug : false});
于 2017-06-17T12:17:31.800 回答
-1

这使用简单的 ajax 调用和迭代图像路径来工作。

        var name = "nasa";
        $.get("https://www.instagram.com/" + name + "/?__a=1", function (data, status) {
            console.log('IG_NODES', data.user.media.nodes);
            $.each(data.user.media.nodes, function (n, item) {
                console.log('ITEMS', item.display_src);
                $('body').append(
                    "<div class='col-md-4'><img class='img-fluid d-block' src='" + item.display_src + "'></div>"
                );
            });
        })
于 2017-12-19T20:59:10.680 回答
-3

这是一个 php 脚本,它下载图像并创建一个带有图像链接的 html 文件。php 版本的信用 350D,这只是详细说明..我建议把它放在一个 cron 工作中,然后根据你的需要经常开火。截至 2019 年 5 月已验证工作

<?
$user = 'smena8m';
$igdata = file_get_contents('https://instagram.com/'.$user.'/');
preg_match('/_sharedData = ({.*);<\/script>/',$igdata,$matches);
$profile_data = json_decode($matches[1])->entry_data->ProfilePage[0]->graphql->user;
$html = '<div class="instagramBox" style="display:inline-grid;grid-template-columns:auto auto auto;">';
$i = 0;
$max = 9;
while($i<$max){
    $imglink = $profile_data->edge_owner_to_timeline_media->edges[$i]->node->shortcode;
    $img = $profile_data->edge_owner_to_timeline_media->edges[$i]->node->thumbnail_resources[0]->src;
    file_put_contents('ig'.$i.'.jpg',file_get_contents($img));
    $html .= '<a href="https://www.instagram.com/p/'.$imglink.'/" target="_blank"><img src="ig'.$i.'.jpg" /></a>';
    $i++;
}
$html .= '</div>';
$instagram = fopen('instagram.html','w');
fwrite($instagram,$html);
fclose($instagram);
?>
于 2019-05-27T07:42:19.770 回答