
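Hello, I am trying to scrape the table at https://ecomiwiki.com/marketplace/floors and convert it into a pandas DataFrame. The table is displayed in Google Chrome, but it does not appear in the response when I fetch the page with requests.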

import pandas as pd
import numpy as np
import sklearn
import requests
from bs4 import BeautifulSoup

url= "https://ecomiwiki.com/marketplace/floors"

name = 'veve'
res = requests.get(url)
soup = BeautifulSoup(res.content,'lxml')
table = soup.find_all('table')

soup


#df = pd.read_html(str(table), header=0)[0]

soup returns the following, but it contains no table element, so table = soup.find_all('table') returns an empty list.

<!DOCTYPE html> <html class="antialiased js-focus-visible" lang="en"><head><meta charset="utf-8"/><link href="/apple-touch-icon.png" rel="apple-touch-icon" sizes="180x180"/><link href="/favicon-32x32.png" rel="icon" sizes="32x32" type="image/png"/><link href="/favicon-16x16.png" rel="icon" sizes="16x16" type="image/png"/><link href="/site.webmanifest" rel="manifest"/><link color="#5bbad5" href="/safari-pinned-tab.svg" rel="mask-icon"/><meta content="#da532c" name="msapplication-TileColor"/><meta content="#ffffff" name="theme-color"/><link href="https://cdnjs.cloudflare.com/ajax/libs/nprogress/0.2.0/nprogress.css" rel="stylesheet"/><meta charset="utf-8"/><meta content="minimum-scale=1, initial-scale=1, width=device-width" name="viewport"/><title>floors.title</title><meta content="floors.updateByHour" name="description"/><meta content="4" name="next-head-count"/><link as="style" href="/_next/static/css/d80c689fe857ac3ac369.css" rel="preload"/><link data-n-g="" href="/_next/static/css/d80c689fe857ac3ac369.css" rel="stylesheet"/><noscript data-n-css=""></noscript><script defer="" nomodule="" src="/_next/static/chunks/polyfills-e7a279300235e161e32a.js"></script><script defer="" src="/_next/static/chunks/webpack-8bfa344c32589116c052.js"></script><script defer="" src="/_next/static/chunks/framework-3af989d3dbeb77832f99.js"></script><script defer="" src="/_next/static/chunks/main-12d082cd9c513dcd9e16.js"></script><script defer="" src="/_next/static/chunks/pages/_app-4c2892829fd3a9a6139a.js"></script><script defer="" src="/_next/static/chunks/75fc9c18-84e7ab66c7989b7a8b6f.js"></script><script defer="" src="/_next/static/chunks/780-5c125aa217c0b02da99d.js"></script><script defer="" src="/_next/static/chunks/857-235054e87e5035a8ae18.js"></script><script defer="" src="/_next/static/chunks/603-9f53fd9fe561a24cc973.js"></script><script defer="" src="/_next/static/chunks/pages/marketplace/floors-b6ed8935c1ad58745308.js"></script><script defer="" 
src="/_next/static/bDtOEfQATrAWM9BDU1v9o/_buildManifest.js"></script><script defer="" src="/_next/static/bDtOEfQATrAWM9BDU1v9o/_ssgManifest.js"></script></head><body><div id="__next"><div class="relative overflow-x-hidden"><header class="px-5 py-2 absolute w-full z-10 shadow"><div class="flex justify-between items-center"><div><a href="/"><h1 class="flex align-center items-center"><div style="display:inline-block;max-width:100%;overflow:hidden;position:relative;box-sizing:border-box;margin:0"><div style="box-sizing:border-box;display:block;max-width:100%"><img alt="" aria-hidden="true" role="presentation" src="data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMzAiIGhlaWdodD0iMzAiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgdmVyc2lvbj0iMS4xIi8+" style="max-width:100%;display:block;margin:0;border:none;padding:0"/></div><noscript><img alt="Ecomi Wiki logo" class="inline-block mr-2" decoding="async" src="/_next/image?url=%2Fassets%2Fimages%2Fecomi-rings-white.svg&amp;w=64&amp;q=75" srcset="/_next/image?url=%2Fassets%2Fimages%2Fecomi-rings-white.svg&amp;w=32&amp;q=75 1x, /_next/image?url=%2Fassets%2Fimages%2Fecomi-rings-white.svg&amp;w=64&amp;q=75 2x" style="position:absolute;top:0;left:0;bottom:0;right:0;box-sizing:border-box;padding:0;border:none;margin:auto;display:block;width:0;height:0;min-width:100%;max-width:100%;min-height:100%;max-height:100%"/></noscript><img alt="Ecomi Wiki logo" class="inline-block mr-2" decoding="async" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" style="position:absolute;top:0;left:0;bottom:0;right:0;box-sizing:border-box;padding:0;border:none;margin:auto;display:block;width:0;height:0;min-width:100%;max-width:100%;min-height:100%;max-height:100%"/></div><span class="font-bold text-base text-white inline-block ml-2">ECOMI</span><span class="text-gray-300 text-base font-medium inline-block ml-1">WIKI</span></h1></a></div><div><nav><ul class="text-right"><li class="inline-block mr-2"><a 
href="/collectibles"><span class="border border-gray-500 rounded-full mr-3 inline-block text-center" data-effect="solid" data-event="mouseenter mouseleave" data-tip="header.collectibles"></span></a></li><li class="inline-block mr-2"><a href="/marketplace/floors"><span class="border border-gray-500 rounded-full mr-3 inline-block text-center" data-effect="solid" data-event="mouseenter mouseleave" data-tip="header.marketplaceFloors"></span></a></li></ul></nav></div><div><ul><li class="inline-block"><button><span class="p-1 rounded-full mr-3 inline-block text-center h-9 w-9 bg-gray-700"><span class="text-white font-medium text-sm">EN</span></span></button></li><li class="inline-block"><button><span class="p-1 rounded-full mr-3 inline-block text-center h-9 w-9 bg-gray-700"></span></button></li></ul></div></div></header><main class="pt-16"><span class="radial-bg"></span><div class="null"><div class="text-white px-5 mt-20"><p class="font-semibold text-2xl leading-relaxed">floors.autoUpdate</p><p class="block text-base text-gray-300">floors.floorPrices</p><p class="block text-base text-gray-300 mt-5"><a class="text-pink-500" href="/user/vault/valuation">floors.valutValuation</a>floors.basedOff</p></div><nav class="px-5 mt-10"><ul><li class="inline-block mr-3"><button class="bg-pink-500 border border-pink-500 hover:bg-pink-700 text-white font-base py-2 px-4 rounded-full font-semibold text-xs">floors.collectibles</button></li><li class="inline-block mr-3"><button class="bg-transparent border border-white text-white font-base py-2 px-4 rounded-full font-semibold text-xs">floors.comics</button></li></ul></nav><div class="grid grid-cols-1 mt-10 text-white px-5"><div class="mx-auto my-10"><svg class="animate-spin -ml-1 mr-3 h-10 w-10 text-white" fill="none" viewbox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle><path class="opacity-75" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 
12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042
1.135 5.824 3 7.938l3-2.647z" fill="currentColor"></path></svg></div></div></div></main><footer class="px-10 text-white bg-gray-900 py-20 border-t border-black mt-20"><div class="container grid grid-cols-1 md:grid-cols-4 gap-10 relative md:place-items-center"><div class="relative -mt-36 md:mt-0"><div style="display:inline-block;max-width:100%;overflow:hidden;position:relative;box-sizing:border-box;margin:0"><div style="box-sizing:border-box;display:block;max-width:100%"><img alt="" aria-hidden="true" role="presentation" src="data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iNDAwIiBoZWlnaHQ9IjMyNiIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIiB2ZXJzaW9uPSIxLjEiLz4=" style="max-width:100%;display:block;margin:0;border:none;padding:0"/></div><noscript><img alt="Ecomi Investors logo" class="absolute top-0" decoding="async" src="/_next/image?url=%2Fassets%2Fimages%2Fecomi-art.png&amp;w=828&amp;q=75" srcset="/_next/image?url=%2Fassets%2Fimages%2Fecomi-art.png&amp;w=640&amp;q=75 1x, /_next/image?url=%2Fassets%2Fimages%2Fecomi-art.png&amp;w=828&amp;q=75 2x" style="position:absolute;top:0;left:0;bottom:0;right:0;box-sizing:border-box;padding:0;border:none;margin:auto;display:block;width:0;height:0;min-width:100%;max-width:100%;min-height:100%;max-height:100%"/></noscript><img alt="Ecomi Investors logo" class="absolute top-0" decoding="async" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" style="position:absolute;top:0;left:0;bottom:0;right:0;box-sizing:border-box;padding:0;border:none;margin:auto;display:block;width:0;height:0;min-width:100%;max-width:100%;min-height:100%;max-height:100%"/></div></div><div class="col-span-2"><span class="ecomiFont flex align-center items-center"><div style="display:inline-block;max-width:100%;overflow:hidden;position:relative;box-sizing:border-box;margin:0"><div style="box-sizing:border-box;display:block;max-width:100%"><img alt="" aria-hidden="true" role="presentation" 
src="data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iNTAiIGhlaWdodD0iNTAiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgdmVyc2lvbj0iMS4xIi8+" style="max-width:100%;display:block;margin:0;border:none;padding:0"/></div><noscript><img alt="Ecomi Investors logo" class="inline-block mr-2" decoding="async" src="/_next/image?url=%2Fassets%2Fimages%2Fecomi-rings-white.svg&amp;w=128&amp;q=75" srcset="/_next/image?url=%2Fassets%2Fimages%2Fecomi-rings-white.svg&amp;w=64&amp;q=75 1x, /_next/image?url=%2Fassets%2Fimages%2Fecomi-rings-white.svg&amp;w=128&amp;q=75 2x" style="position:absolute;top:0;left:0;bottom:0;right:0;box-sizing:border-box;padding:0;border:none;margin:auto;display:block;width:0;height:0;min-width:100%;max-width:100%;min-height:100%;max-height:100%"/></noscript><img alt="Ecomi Investors logo" class="inline-block mr-2" decoding="async" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" style="position:absolute;top:0;left:0;bottom:0;right:0;box-sizing:border-box;padding:0;border:none;margin:auto;display:block;width:0;height:0;min-width:100%;max-width:100%;min-height:100%;max-height:100%"/></div><span class="font-semibold text-2xl text-white inline-block ml-2">ECOMI</span><span class="text-gray-300 text-2xl font-medium inline-block ml-1">WIKI</span></span><small class="text-sm mt-5 text-gray-400 block">footer.ecomiWiki<strong>footer.unofficial</strong>footer.notAffiliated</small><small class="text-sm mt-5 text-gray-400 block">footer.noFinancialAdvice</small><a href="/donate"></a></div><div><h6 class="mb-3 font-medium text-gray-300">footer.quickLink</h6><ul class="text-sm"><li class="mb-2"><a class="py-2" href="/marketplace/floors"><span class="leading-6 ml-2 text-pink-600">footer.floors</span></a></li><li class="mb-2"><a class="py-2" href="https://drive.google.com/file/d/1UNE-EvjuMIaWJUfvF3qQiTe0OKLFAJXV/view"><span class="leading-6 ml-2 text-pink-600">footer.whitepaper</span></a></li><li><a class="py-2" 
href="https://github.com/alienbuild/ecomi_frontend"><span class="leading-6 ml-2 text-pink-600">GitHub</span></a></li></ul></div></div></footer></div></div><script id="__NEXT_DATA__" type="application/json">{"props":{"pageProps":{}},"page":"/marketplace/floors","query":{},"buildId":"bDtOEfQATrAWM9BDU1v9o","runtimeConfig":{"APP_NAME":"ECOMI","API_DEVELOPMENT":"http://localhost:8000/api","API_PRODUCTION":"https://ecomiwiki.com/api","PRODUCTION":true,"DOMAIN_DEVELOPMENT":"http://localhost:3000","DOMAIN_PRODUCTION":"https://ecomiwiki.com"},"nextExport":true,"autoExport":true,"isFallback":false,"scriptLoader":[]}</script><script async="" src="https://www.googletagmanager.com/gtag/js?id=UA-103571215-1"></script><script>
                  window.dataLayer = window.dataLayer || [];
                  function gtag(){dataLayer.push(arguments);}
                  gtag('js', new Date());
                  gtag('config', 'UA-103571215-1');
                </script></body></html>

Why can I see the table on the web page, yet table = soup.find_all('table') returns an empty list?


1 Answer


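The HTML that requests receives is only the page shell: the markup in the question contains a loading spinner and a __NEXT_DATA__ script with empty pageProps, so the table is built by JavaScript in the browser after the page loads. The page also uses lazy loading (more rows load as you scroll). Both can be handled with Selenium, which drives a real browser.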
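To confirm that the missing table is a rendering issue rather than a parsing one, the raw response can be inspected offline. The following is a minimal sketch using only the standard library; ShellInspector, inspect_shell, and sample_html are hypothetical names, and sample_html is a trimmed stand-in for the response shown in the question, not the full page.

```python
import json
from html.parser import HTMLParser


class ShellInspector(HTMLParser):
    """Counts <table> tags and captures the JSON inside <script id="__NEXT_DATA__">."""

    def __init__(self):
        super().__init__()
        self.table_count = 0
        self.next_data = None
        self._chunks = None  # collects script text while inside __NEXT_DATA__

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_count += 1
        elif tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._chunks = []

    def handle_data(self, data):
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._chunks is not None:
            self.next_data = json.loads("".join(self._chunks))
            self._chunks = None


def inspect_shell(html):
    """Return (number of <table> tags, parsed __NEXT_DATA__ or None)."""
    parser = ShellInspector()
    parser.feed(html)
    parser.close()
    return parser.table_count, parser.next_data


# Trimmed stand-in for the HTML in the question (an assumption, not the full page).
sample_html = (
    '<html><body><div id="__next"><svg class="animate-spin"></svg></div>'
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props":{"pageProps":{}},"page":"/marketplace/floors"}'
    "</script></body></html>"
)

tables, next_data = inspect_shell(sample_html)
print(tables)                           # number of <table> tags in the raw HTML
print(next_data["props"]["pageProps"])  # an empty dict means no data was pre-rendered
```

Zero tables together with empty pageProps suggests the data arrives later through JavaScript, which is why a browser-driven approach is needed here.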

Install Selenium: pip install selenium

Download the driver that matches your Chrome version from the ChromeDriver website.

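import time

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# path to chromedriver.exe (Selenium 4 takes the driver path through a Service object)
driver = webdriver.Chrome(service=Service("D:/chromedriver/94/chromedriver.exe"))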
driver.get("https://ecomiwiki.com/marketplace/floors")

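def scrolldown(times, waitfor):
    """Scroll to the bottom of the page `times` times, pausing `waitfor` seconds after each scroll."""
    for _ in range(times):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(waitfor)

# wait up to 60 seconds for the JavaScript-rendered table body to appear
tbody = WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.XPATH, '//tbody[@class="flex-1 sm:flex-none divide-y divide-gray-700"]')))

# scroll so that the lazily loaded rows are added to the table
scrolldown(10, 2)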

# getting column headings
thead = driver.find_element(By.XPATH, '//thead[@class="border border-gray-700 hidden sm:table-header-group"]')
all_th = thead.find_elements(By.TAG_NAME, 'th')
headers = [th.text for th in all_th]

# find all rows in tbody (after scrolling, so lazily loaded rows are included)
rows = tbody.find_elements(By.TAG_NAME, "tr")
print(f"No. of rows found: {len(rows)}")

content = []
for row in rows:
    cols = row.find_elements(By.TAG_NAME, "td")
    tmp = []
    for col in cols:
        tmp.append(col.text)
    content.append(tmp)

df = pd.DataFrame(data=content, columns=headers)
print(df)

# close the browser once the data has been collected
driver.quit()
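
The call scrolldown(10, 2) assumes that ten scrolls are enough to load every row. A possible alternative, sketched below, keeps scrolling until the row count stops growing. scroll_until_stable and FakePage are hypothetical names introduced for illustration; the two callables are supplied by the caller, so the logic can be tried without a browser.

```python
import time


def scroll_until_stable(count_rows, scroll_once, max_rounds=30, wait=2.0, sleep=time.sleep):
    """Scroll until the row count stops growing or max_rounds is reached.

    count_rows and scroll_once are callables supplied by the caller.
    Returns the last observed row count.
    """
    previous = count_rows()
    for _ in range(max_rounds):
        scroll_once()
        sleep(wait)
        current = count_rows()
        if current == previous:
            break  # nothing new was loaded by the last scroll
        previous = current
    return previous


class FakePage:
    """Stand-in for a lazily loading page: each scroll reveals `step` more rows."""

    def __init__(self, total=50, step=20):
        self.total = total
        self.step = step
        self.loaded = step
        self.scrolls = 0

    def scroll(self):
        self.scrolls += 1
        self.loaded = min(self.total, self.loaded + self.step)


page = FakePage()
found = scroll_until_stable(lambda: page.loaded, page.scroll, wait=0, sleep=lambda s: None)
print(found, page.scrolls)
```

With the driver from the answer, count_rows could be lambda: len(tbody.find_elements(By.TAG_NAME, "tr")) and scroll_once could be lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);"). A single unchanged count may stop too early on a slow connection, so a longer wait may be needed.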
Answered 2021-09-30T05:52:30.880