I've been reading here for years and love it. Time to ask a small question of my own.

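We used to use wget over SSH to retrieve product information (prices, availability) from our supplier's back end. Unfortunately, they changed how it works, which broke my data feed :/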

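I have fixed some of it, but I can't fetch all the data because my cookie is no longer stored properly. I have read every Google result and tried everything, but I lack the knowledge. I'll stick to the cookie part here; I bet that once it is fixed, I can retrieve all the files properly again.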

Here is what I use. I don't have reputation points, so I can't add more than one link; I'll use "url-here" as a stand-in for https://www.supplier.com

/usr/local/bin/wget -O /dev/null --cookies=on --keep-session-cookies --no-check-certificate --save-cookies=cookies.txt --post-data='login=123654&password=123654&from=%2F' 'url-here/auth/login.php'

The cookie file contains this:

# HTTP cookie file.
# Generated by Wget on 2013-07-25 06:03:44.
# Edit at your own risk.

www.supplier.com    FALSE   /   FALSE   0   PHPSESSID   DUMMY

The cookie in the browser shows a real session value, but that part no longer seems to get written here :/
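To compare what wget saved against what the browser shows, it can help to print just the cookie names and values. The following is a minimal sketch, assuming the file is in the Netscape cookie format that wget writes (seven whitespace-separated fields, with name and value last); the helper name `list_cookies` is made up for illustration.

```shell
# Print name=value for every cookie in a Netscape-format cookie file.
# Usage: list_cookies cookies.txt
list_cookies() {
    # Skip comment lines and lines with fewer than 7 fields (including blanks);
    # fields 6 and 7 hold the cookie name and value.
    awk '!/^#/ && NF >= 7 { print $6 "=" $7 }' "$1"
}
```

Run against the cookie file shown above, `list_cookies cookies.txt` prints `PHPSESSID=DUMMY`, which makes the mismatch with the browser's session value easy to spot.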

I also tried this, with the same result:

/usr/local/bin/wget -O /dev/null --cookies=on --keep-session-cookies --save-cookies=cookiepp.txt --referer='url-here/auth/login.php' --post-data='login=123654&password=123654' 'url-here/auth/login.php'

I also tried fetching the files directly with a manually saved cookie:

wget --no-check-certificate --span-hosts --header "Cookie:PHPSESSID=h771tqr0spe8ufbq6fash2msf6" "url-here/sortment/s/?group="$i"&excel=1"

Finally, the SSH command-line output:

--2013-07-25 06:03:43--  https://www.supplier.com/auth/login.php
Resolving www.supplier.com (www.supplier.com)... 00.000.000.0
Connecting to www.supplier.com (www.supplier.com)|00.000.000.0|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://www.supplier.com/auth/login.php [following]
--2013-07-25 06:03:44--  https://www.supplier.com/auth/login.php
Reusing existing connection to www.supplier.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 6884 (6.7K) [text/html]
Saving to: '/dev/null'

And last, the debug output:

Setting --output-document (outputdocument) to /dev/null
Setting --cookies (cookies) to on
Setting --keep-session-cookies (keepsessioncookies) to 1
Setting --save-cookies (savecookies) to cookiepp.txt
Setting --referer (referer) to https://www.supplier.com/auth/login.php
Setting --post-data (postdata) to login=123654&password=123654
DEBUG output created by Wget 1.14 on freebsd9.1.

URI encoding = 'US-ASCII'
--2013-07-25 07:01:30--  https://www.supplier.com/auth/login.php
Resolving www.supplier.com (www.supplier.com)... 89.105.214.133
Caching www.supplier.com => 89.105.214.133
Connecting to www.supplier.com (www.supplier.com)|89.105.214.133|:443... connected.
Created socket 5.
Releasing 0x00000008021baa40 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 5 to SSL handle 0x000000080204c400
certificate:
  subject: /C=NL/postalCode=7271LB/ST=Gelderland/L=Borculo/street=Hesselinks Es 11/O=Varuvo bv/OU= /OU=Hosted by WideXS/OU=InstantSSL/CN=www.supplier.com
  issuer:  /C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO High-Assurance Secure Server CA
X509 certificate successfully verified and matches host www.supplier.com

---request begin---
POST /auth/login.php HTTP/1.1
Referer: https://www.supplier.com/auth/login.php
User-Agent: Wget/1.14 (freebsd9.1)
Accept: */*
Host: www.supplier.com
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 32

---request end---
[POST data: login=123654&password=123654]
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 302 Found
Date: Thu, 25 Jul 2013 14:01:32 GMT
Server: Apache
P3P: CP="NOI DSP COR NID CURa OUR NOR STA"
Set-Cookie: PHPSESSID=DUMMY; path=/
Status: 302 Found
Location: https://www.supplier.com/auth/login.php
Vary: Accept-Encoding
Content-Length: 0
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=Windows-1252

---response end---
302 Found

Stored cookie www.supplier.com -1 (ANY) / <session> <insecure> [expiry none] PHPSESSID DUMMY
Registered socket 5 for persistent reuse.
URI content encoding = 'Windows-1252'
Location: https://www.supplier.com/auth/login.php [following]
] done.
URI content encoding = None
--2013-07-25 07:01:31--  https://www.supplier.com/auth/login.php
Reusing existing connection to www.supplier.com:443.
Reusing fd 5.

---request begin---
GET /auth/login.php HTTP/1.1
Referer: https://www.supplier.com/auth/login.php
User-Agent: Wget/1.14 (freebsd9.1)
Accept: */*
Host: www.supplier.com
Connection: Keep-Alive
Cookie: PHPSESSID=DUMMY

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Date: Thu, 25 Jul 2013 14:01:32 GMT
Server: Apache
P3P: CP="NOI DSP COR NID CURa OUR NOR STA"
Set-Cookie: PHPSESSID=DUMMY; path=/
Vary: Accept-Encoding
Content-Length: 6884
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Content-Type: text/html; charset=Windows-1252

---response end---
200 OK
Deleted old cookie (to be replaced.)

Stored cookie www.supplier.com -1 (ANY) / <session> <insecure> [expiry none] PHPSESSID DUMMY
URI content encoding = 'Windows-1252'
Length: 6884 (6.7K) [text/html]
Saving to: '/dev/null'

100%[===================================================================================================================>] 6,884       43.1KB/s   in 0.2s

2013-07-25 07:01:32 (43.1 KB/s) - '/dev/null' saved [6884/6884]

Saving cookies to cookiepp.txt.
Done saving cookies.

I'm really lost, and at this point I'd appreciate help from others.

1 Answer

Fixed it using cURL:

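# Log in and save the session cookie to cookie.txt (-k skips certificate verification).
/usr/local/bin/curl -k -c cookie.txt -d "login_alias=464" -d "password=6446" -d "from=%2F" https://www.supplier.com/auth/login.php
# Pair each group code with its output name, then fetch each export with the saved cookie.
paste theCodesOnePerLine theNamesOnePerLine |
while read -r i j
do
    /usr/local/bin/curl -k -L -b cookie.txt -o "/usr/home/$j.xls" "https://www.supplier.com/?group=$i&excel=1"
done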
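Note that the working request posts `login_alias`, whereas the earlier wget attempts posted `login`. That may be what made the difference, although nothing here confirms it. One way to check which field names a login form expects is to list the `name` attributes of its inputs. This is only a rough sketch: it assumes the login page was first saved to a local file (the name `login.html` is hypothetical) and that each `<input>` tag sits on one line with a double-quoted `name`.

```shell
# List the name attribute of each <input> tag in a saved HTML file.
# Usage: form_fields login.html
form_fields() {
    # grep -o emits one <input ...> tag per line; sed keeps only the name value.
    grep -o '<input[^>]*>' "$1" | sed -n 's/.*name="\([^"]*\)".*/\1/p'
}
```

For example, after saving the page with `curl -k -o login.html https://www.supplier.com/auth/login.php`, calling `form_fields login.html` lists the field names to use in the `-d` options.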

Thanks for the page space, though :p

Answered 2013-07-26T12:57:08.833