Selene¶
A framework for efficient, consistent, and maintainable object-oriented web scraping. It wraps and combines Selenium WebDriver and BeautifulSoup.
Principles¶
Websites are best represented using an Object-Oriented Programming approach.
All websites are made out of pages.
All pages can be represented as Page objects.
All Page objects can ultimately inherit from a general Page object.
All webpages are made out of elements.
All elements can be represented as Element objects.
All Element objects can ultimately inherit from a general Element object.
Features¶
General Page and Element objects.
Selenium-based Page and Element objects with methods that wrap Selenium WebDriver.
Soup-based Page and Element objects with methods that wrap BeautifulSoup.
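To illustrate what "methods that wrap BeautifulSoup" could look like, here is a minimal sketch. The class names `SoupPage` and `SoupElement` and their methods are hypothetical stand-ins written for this example; they are not Selene's actual classes or API. Only the BeautifulSoup calls (`BeautifulSoup(...)`, `select_one`, `select`, `get_text`, `get`) are real.

```python
from __future__ import annotations

from bs4 import BeautifulSoup
from bs4.element import Tag


class SoupElement:
    """Hypothetical Soup-based Element (a stand-in, not Selene's class)."""

    def __init__(self, tag: Tag) -> None:
        self._tag = tag

    def text(self) -> str:
        # Delegate to BeautifulSoup and trim surrounding whitespace.
        return self._tag.get_text(strip=True)

    def attribute(self, name: str):
        # Returns None when the attribute is absent.
        return self._tag.get(name)


class SoupPage:
    """Hypothetical Soup-based Page that parses HTML once and hands out Elements."""

    def __init__(self, html: str) -> None:
        self._soup = BeautifulSoup(html, "html.parser")

    def find(self, css_selector: str) -> SoupElement | None:
        tag = self._soup.select_one(css_selector)
        return SoupElement(tag) if tag is not None else None

    def find_all(self, css_selector: str) -> list[SoupElement]:
        return [SoupElement(tag) for tag in self._soup.select(css_selector)]


HTML = (
    '<html><body><h1 id="title"> Hello </h1>'
    '<a href="/a">A</a><a href="/b">B</a></body></html>'
)
page = SoupPage(HTML)
heading = page.find("#title")
links = [element.attribute("href") for element in page.find_all("a")]
```

Keeping the BeautifulSoup calls behind small methods like these means scraping code talks to Page and Element objects rather than to the parser directly.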
How to use it¶
Selene is a framework. It provides base classes that can be subclassed to create website-specific Page and Element objects. Please refer to the documentation for an example of how this is done.
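As a rough illustration of the subclassing pattern, the sketch below defines its own stand-in `Page` and `Element` base classes and then derives a website-specific page from them. Everything here (the class names, the `path` and `elements` attributes, the `example.com` URL, and the selectors) is assumed for the example and does not reflect Selene's real base classes; the documentation remains the authoritative reference.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Stand-in general Element: a named locator on a page."""

    name: str
    selector: str


class Page:
    """Stand-in general Page: a path plus the elements it declares."""

    path: str = "/"
    elements: tuple = ()

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    @property
    def url(self) -> str:
        return f"{self.base_url}{self.path}"

    def element(self, name: str) -> Element:
        # Look up a declared element by name.
        for candidate in self.elements:
            if candidate.name == name:
                return candidate
        raise KeyError(name)


class LoginPage(Page):
    """Website-specific page: only the site details are declared here."""

    path = "/login"
    elements = (
        Element("username", "#username"),
        Element("password", "#password"),
        Element("submit", "button[type=submit]"),
    )


login = LoginPage("https://example.com/")
print(login.url)
print(login.element("submit").selector)
```

The website-specific class only states what differs from the base (its path and its elements), while shared behaviour stays in the base classes.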
Installation¶
To install from scratch:
Install Chrome and chromedriver:
sudo apt-get update
sudo apt install -y chromium-chromedriver chromium-browser
Install Selene from the repository root:
pip install .
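After installing, it can be useful to confirm that the browser and driver are reachable on `PATH`. The following is a small, optional check using only the Python standard library. The function name `find_browser_tools` is made up for this example, and the binary names are assumptions that may differ on your distribution.

```python
from __future__ import annotations

import shutil


def find_browser_tools(
    candidates: tuple = ("chromedriver", "chromium-browser", "chromium"),
) -> dict:
    """Map each candidate binary name to its resolved path, or None if not found."""
    return {name: shutil.which(name) for name in candidates}


for name, path in find_browser_tools().items():
    status = path if path is not None else "not found on PATH"
    print(f"{name}: {status}")
```

A `not found on PATH` entry for `chromedriver` suggests the driver was not installed or lives in a directory that is not on `PATH`.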
Changelog¶
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased¶
v1.0.2 - 2024-01-31¶
Fixed¶
Removed reference to desired capabilities from webdriver functionality
v1.0.1 - 2023-01-12¶
Fixed¶
Bug in logger for bool_url_does_not_contain function
Bug in scrolling functionality within an element
v1.0.0 - 2022-06-08¶
Added¶
Additional test coverage
Documentation
New features for virtual display driver option and mouse move
v0.1.0 - 2022-03-07¶
Ported from the previous repository, with some additional testing.