How to extract data from a dynamically updated web page - javascript

I want to scrape reviews from the Sephora website. The reviews are loaded dynamically.

On inspecting the page, I found that each review sits in HTML like this:

<div class="css-eq4i08 " data-comp="Ellipsis Box">Honestly I never write 
reviews but this is a must if you have frizzy after even after straightening 
it! It smells fantastic and it works wonders definitely will be restocking once 
I’m done this one !!</div>

I want to write Python Selenium code to read the reviews.

Here is the code I wrote:

from selenium import webdriver
chrome_path = (r"C:/Users/Connectm/Downloads/chromedriver.exe")

driver = webdriver.Chrome(chrome_path)
driver.implicitly_wait(20) 
driver.get("https://www.sephora.com/product/crybaby-coconut-oil-shine-serum-P439093?skuId=2122083&icid2=just%20arrived:p439093")
reviews = driver.find_element_by_xpath('//*[@id="ratings-reviews"]/div[4]/div[2]/div[2]/div[1]/div[3][@data-comp()='Elipsis Box'])
print(reviews.text)

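If I use find_element_by_class instead, I get an empty result.

What is the best option?

I am trying to use an XPath with an attribute test, but the code does not work.
Can someone please tell me what the best solution is?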
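
A note on the XPath syntax itself: an attribute test is written [@data-comp='Ellipsis Box'], with no parentheses after the attribute name, and the value must match the page's spelling ("Ellipsis", not "Elipsis"). The following is a minimal offline sketch of that predicate, using Python's standard-library xml.etree.ElementTree as a stand-in for the browser DOM. ElementTree supports only a small subset of XPath, and the SAMPLE markup and the extract_reviews helper are assumptions made for illustration, not the real Sephora page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The real page is far larger and is
# rendered by JavaScript, so this only illustrates the predicate syntax.
SAMPLE = """
<div id="ratings-reviews">
  <div data-comp="GridCell Box">
    <div class="css-eq4i08 " data-comp="Ellipsis Box">It smells fantastic and it works wonders</div>
  </div>
  <div data-comp="GridCell Box">
    <div class="css-eq4i08 " data-comp="Ellipsis Box">A tiny bit goes a long way.</div>
  </div>
</div>
"""


def extract_reviews(markup):
    """Return the text of every div whose data-comp is 'Ellipsis Box'."""
    root = ET.fromstring(markup.strip())
    # Attribute predicate: [@name='value'], no parentheses after the name.
    nodes = root.findall(".//div[@data-comp='Ellipsis Box']")
    return [node.text.strip() for node in nodes]


if __name__ == "__main__":
    for text in extract_reviews(SAMPLE):
        print(text)
```

With Selenium the same predicate would go into the locator string. It can only match once the browser has actually rendered the reviews.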

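Answer

To scrape the reviews from the Sephora website, you have to induce WebDriverWait until the elements are visible. You can use the following solution:

Code block: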

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
# Selenium 3 style constructor; point executable_path at your own chromedriver.
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://www.sephora.com/product/crybaby-coconut-oil-shine-serum-P439093?skuId=2122083&icid2=just%20arrived:p439093")
# Wait for this heading to become visible, then scroll it into view before looking for the reviews.
details_heading = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@id='tabpanel0']/div//b[contains(., 'What Else You Need to Know')]")))
driver.execute_script("arguments[0].scrollIntoView(true);", details_heading)
# Wait up to 20 seconds for all review text elements to become visible.
reviews = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@data-comp='GridCell Box']//div[@data-comp='Ellipsis Box']")))
for review in reviews:
    print(review.get_attribute("innerHTML"))

Console output:

Honestly I never write reviews but this is a must if you have frizzy after even after straightening it! It smells fantastic and it works wonders definitely will be restocking once I’m done this one !!
I really like this product. I was looking for something to tame frizz and fly aways during the winter and this does the job. At first I was nervous it might give a greasy look but it makes my hair smooth and soft. Scent is actually a little subtle for me, but still nice.
This oil-serum is perfect for the right level of hydration without the feel of oil residue. Great for all hair types and my new go-to product.
I LOVE how weightless this oil feels in my hair.. takes away all of my flyaways without looking of feeling greasy.. the packaging is COOL (travel-friendly) and it smells wonderful!!
I tried this when it first dropped on their website. I’ve been using it for about 3 weeks now. And I have to say its just OKAY. Nothing super special about it. I haven’t noticed super smooth hair that isn’t given with other products that cost less. It’s just like any other smoothing serum. I also can’t figure out what the smell is. It doesn’t really smell as pleasant as their other products.
in love!! A tiny bit goes a long way. No more fly aways. No more frizz from touch or environment.
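
To make the waiting step less opaque: WebDriverWait(driver, 20).until(condition) roughly amounts to calling the condition over and over until it returns something truthy or 20 seconds have passed. The sketch below shows that polling idea in plain Python. The names wait_until and WaitTimeout are hypothetical and not part of Selenium, and the real class also ignores certain exceptions between polls, which this sketch leaves out.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Stand-in for a page whose reviews only show up on the third poll.
    attempts = {"count": 0}

    def reviews_loaded():
        attempts["count"] += 1
        return ["first review"] if attempts["count"] >= 3 else []

    print(wait_until(reviews_loaded, timeout=5.0, poll_interval=0.1))
```

An empty list is falsy, so a condition that returns the elements found so far keeps the loop going until at least one review exists. That is why a plain find call issued right after driver.get() can come back empty on a dynamically updated page.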
