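I'm trying to scrape my application keys (I've replaced them with "....." here) and write them to a file for storage, but Nokogiri can't seem to get the values of client_id and client_secret from the HTML.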

Here's the HTML I'm trying to scrape:

<html class=" logged-in" webdriver="true">

    <head></head>
    <body id="client_applications" class="full-width edit">

    <div id="flash-notice" style="left: 354px; display: none; bottom: 30px; opacity: 0;"></div>
    <div id="header"></div>
    <div id="main-wrapper">
        <div id="main-wrapper-inner">
            <h1></h1>
            <form id="edit_client_application_155914" class="edit_client_application throbberform" method="post" enctype="multipart/form-data" action="/you/apps/bakersdoezun756487" accept-charset="utf-8">

    <div style="margin:0;padding:0;display:inline"></div>
    <input id="client_application_id" type="hidden" value="155914" name="client_application[id]"></input>

<input id="redirect_to" type="hidden" value="https://soundcloud.com/you/apps/pusherqueen4348/edit" name="redirect_to"></input>

<div class="field-label-and-required form-group"></div>

<div class="form-group"></div>

<div class="form-group">

    <div class="width_1_3"></div>
    <div class="width_2_3 last">
        <input id="client_id" class="auto-select" type="text" value="....." readonly="readonly" name="client_id"></input>

</div>
<div class="width_1_3"></div>

<div class="width_2_3 last">

    <input id="client_secret" class="auto-select" type="text" value="....." readonly="readonly" name="client_secret"></input>

                </div>
                ::after
            </div>
            <div class="form-group"></div>
            <div class="form-group"></div>
            <div class="form-group"></div>
            <div class="form-buttons-big"></div>
        </form>
        ::after
    </div>
    ::after

</div>

Here's my Ruby code:

url = Curl.get("https://soundcloud.com/you/apps/#{username}/edit")
html = Nokogiri::HTML(url.body)


html.css("client_applications").each do |node|
  mainwrapper_html = Nokogiri::HTML(node.inner_html)

  mainwrapper_html.css("main-wrapper").each do |node|
    main_wrapper_inner_html = Nokogiri::HTML(node.inner_html)

    main_wrapper_inner_html.css("main-wrapper-inner").each do |node|
      client_app_html = Nokogiri::HTML(node.inner_html)

      client_app_html.css("edit_client_application throbberform").each do |node|
        form_html = Nokogiri::HTML(node.inner_html)

        form_html.css("form-group").each do |node|
          width_2_3_html = Nokogiri::HTML(node.inner_html)

          width_2_3_html.css("width_2_3 last").each do |node|
            client_id = node.css("client_id").value.to_s
            client_secret = node.css("client_secret").value.to_s 
            file1 = File.new("client_keys.txt","a")
            file1.puts "#{client_id},#{client_secret}"
          end
        end
      end
    end   
  end
end

1 Answer

Since both <input> elements have unique IDs (client_id and client_secret), you can select them directly. There's no need to iterate over the whole structure:

client_id     = html.at('#client_id')['value']
client_secret = html.at('#client_secret')['value']

File.open('client_keys.txt', 'a') do |f|
  f.puts "#{client_id},#{client_secret}"
end
answered 2015-04-13T10:51:46.420