Selene

A framework for efficient, consistent and maintainable object-oriented webscraping. Wraps and combines both Selenium Webdriver and BeautifulSoup.

Selene documentation.

Principles

  1. Websites are best represented using an Object-Oriented Programming approach.

  2. All websites are made out of pages.

  3. All pages can be represented as Page objects

  4. The base of all Page objects can ultimately derive from inheriting a general Page object.

  5. All webpages are made out of elements.

  6. All elements can be represented as Element objects

  7. The base of all Element objects can ultimately derive from inheriting a general Element object.

Features

  • General Page and Element objects.

  • Selenium-based Page and Element objects with methods that wrap Selenium Webdriver.

  • Soup-based Page and Element objects with methods that wrap BeautifulSoup.

How to use it

Selene is a framework. It provides base Classes that can be used to create website-specific Page and Element objects by sub-classing the base Classes. Please refer to the documentation for an example of how this is done.

Installation

To install from scratch

  • Install Chrome and chromedriver:

sudo apt-get update

apt install -y chromium-chromedriver chromium-browser

  • Install:

pip install .

Changelog

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

v1.0.2 - 2024-01-31

Fixed
  • Removed reference to desired capabilities from webdriver functionality

v1.0.1 - 2023-01-12

Fixed
  • Bug in logger for bool_url_does_not_contain function

  • Bug in scrolling functionality within an element

v1.0.0 - 2022-06-08

Added
  • Additional test coverage

  • Documentation

  • New features for virtual display driver option and mouse move

v0.1.0 - 2022-03-07

Port from previous repository with some additional testing etc.