Use WebDriver, Python, and BeautifulSoup to retrieve a dynamic website



By : williamelworthy
Date : November 22 2020, 10:56 AM
You are missing two key Selenium-specific things:
  • Do not use time.sleep(); use explicit Waits instead.
  • Use the Select class when dealing with select/option elements; it provides a very nice abstraction.
code :
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.select import Select

driver = webdriver.Chrome()
driver.get('http://www.homedepot.com/p/Husky-41-in-16-Drawer-Tool-Chest-and-Cabinet-Set-HOTC4016B1QES/205080371')

# waiting until reviews are loaded
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, 'BVRRDisplayContentSelectBVFrameID'))
)

select = Select(element)
select.select_by_visible_text('Newest')
reviews = []
# find_element(s)(By.XPATH, ...) replaces the find_element(s)_by_xpath helpers removed in recent Selenium 4 releases
for review in driver.find_elements(By.XPATH, '//span[@itemprop="review"]'):
    name = review.find_element(By.XPATH, './/span[@itemprop="name"]').text.strip()
    stars = review.find_element(By.XPATH, './/span[@itemprop="ratingValue"]').text.strip()
    description = review.find_element(By.XPATH, './/div[@itemprop="description"]').text.strip()

    reviews.append({
        'name': name,
        'stars': stars,
        'description': description
    })

print(reviews)
[
    {'description': u'Very durable product. Worth the money. My husband loves it',
     'name': u'Excellent product',
     'stars': u'5.0'},

    {'description': u'I now have all my tools in one well organized box instead of several boxes and have a handy charging station for cordless tools on the top . Money well spent. Solid box!',
     'name': u'Great!',
     'stars': u'5.0'},

    ...
]
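
To illustrate why an explicit wait is preferable to time.sleep(), here is a rough, standard-library-only sketch of the polling idea behind WebDriverWait. It is not Selenium's implementation: wait_until and reviews_loaded are hypothetical names made up for this example, and the condition is a stand-in for "the reviews widget has finished loading". The point is that the wait returns as soon as the condition holds instead of always sleeping for a fixed time.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page element that appears on the third check.
attempts = {"count": 0}


def reviews_loaded():
    attempts["count"] += 1
    return "loaded" if attempts["count"] >= 3 else None


print(wait_until(reviews_loaded, timeout=2.0, poll_interval=0.01))  # prints "loaded"
```

With a fixed time.sleep(10) the script would always pause for the full ten seconds and could still fail if the page took longer; a polling wait stops early on success and fails loudly on timeout.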


retrieve dynamic frame names w/ Selenium WebDriver and Python?



By : Wai Liux
Date : March 29 2020, 07:55 AM
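find_element_by_xpath returns a single WebElement (find_elements_by_xpath returns a list of them). Use get_attribute to read the value of an attribute from the WebElement. Recent Selenium 4 releases removed the find_element_by_* helpers, so the lookup below uses find_element(By.XPATH, ...), which also works in Selenium 3.
code :
from selenium.webdriver.common.by import By

framename = driver.find_element(By.XPATH, "//frame[contains(@name, 'results_frame')]").get_attribute("name")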
Scraping website only retrieve 1st item on list Beautifulsoup 3.2.1. Python 2.7



By : John M
Date : March 29 2020, 07:55 AM
I had to use lxml because html.parser gave errors with this URL. To get every item rather than only the first, use find_all() instead of find():
code :
import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.usa.gov/federal-agencies/a')  # download the page listing agencies under "A"
soup = BeautifulSoup(response.content, 'lxml')  # parse the page with the lxml parser
fed_list_a = soup.find(class_="one_column_bullet")  # container element holding the agency list

# print(fed_list_a.prettify())

url_list = fed_list_a.find_all('a', class_="url")  # find_all() returns every matching link, not just the first
for link in url_list:
    print(link.get_text())
AbilityOne Commission
Access Board
Administration for Children and Families (ACF)
Administration for Community Living
Administration for Native Americans
Administration on Aging
Administration on Intellectual and Developmental Disabilities
Administrative Conference of the United States
Administrative Office of the U.S. Courts
Advisory Council on Historic Preservation
African Development Foundation
....
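
The difference between find() and find_all() is easier to see on a small inline document. The sketch below is only an illustration: the markup in SAMPLE_HTML is made up, agency_names is a hypothetical helper, and only the class names mirror the ones used above. It uses the built-in html.parser, which is sufficient for this tiny snippet.

```python
def agency_names(html):
    """Return (first_match_text, all_match_texts) for <a class="url"> links."""
    # Third-party dependency: pip install beautifulsoup4
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    container = soup.find(class_="one_column_bullet")
    first = container.find("a", class_="url")      # a single Tag: first match only
    every = container.find_all("a", class_="url")  # a list with every matching Tag
    return first.get_text(), [a.get_text() for a in every]


# Made-up sample markup, not copied from the real page.
SAMPLE_HTML = """
<ul class="one_column_bullet">
  <li><a class="url" href="#">AbilityOne Commission</a></li>
  <li><a class="url" href="#">Access Board</a></li>
</ul>
"""
```

Calling agency_names(SAMPLE_HTML) gives the first link's text together with the list of all link texts, which is why a loop over the result of find() only ever sees one item.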
How to Retrieve all links from a dynamic website with selenium python



By : ashish sharma
Date : March 29 2020, 07:55 AM
Selenium webdriver python finding dynamic text from website



By : user2508458
Date : March 29 2020, 07:55 AM
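Match the element by its visible text with an XPath contains(text(), ...) expression. find_elements returns every match as a list; find_element returns the first match, which you can then click.
code :
from selenium.webdriver.common.by import By

driver.find_elements(By.XPATH, "//*[contains(text(), 'Artik')]")
driver.find_element(By.XPATH, "//*[contains(text(), 'Artik')]").click()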
How to scrape a dynamic JavaScript-based website using Python requests and BeautifulSoup?



By : Bhargav Kansara
Date : March 29 2020, 07:55 AM
You don't need to know what the JavaScript does. Just click the link and use your browser's inspector to observe the network request.
In your specific case, the JavaScript sends a POST request to "/nationalCategoryList/NationalCategoryList/loadMoreCourses/". You can send the same request yourself and you will get back a new HTML string, which you can parse with BeautifulSoup to get the data you need.
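
A minimal sketch of replaying that request might look like the following. Only the endpoint path comes from the answer above; the host in BASE_URL, the "page" form field, and the choice to collect link texts are assumptions for illustration, so copy the real host and form data from the request you see in the browser's network tab.

```python
from urllib.parse import urljoin

BASE_URL = "https://example.com"  # hypothetical host: replace with the real site
LOAD_MORE_PATH = "/nationalCategoryList/NationalCategoryList/loadMoreCourses/"


def build_load_more_request(page, base_url=BASE_URL):
    """Describe the POST request the page's JavaScript is assumed to send."""
    return {
        "url": urljoin(base_url, LOAD_MORE_PATH),
        "data": {"page": page},  # hypothetical form field
    }


def fetch_more_courses(page):
    """Send the POST request and return the text of every link in the reply."""
    # Third-party dependencies: pip install requests beautifulsoup4
    import requests
    from bs4 import BeautifulSoup

    request = build_load_more_request(page)
    response = requests.post(request["url"], data=request["data"], timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    return [a.get_text(strip=True) for a in soup.find_all("a")]
```

Keeping the request description separate from the network call makes it easy to compare the URL and form data against what the browser inspector shows before sending anything.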