Selene soup-based functionality
Submodules
selene.core.soup.element module
- class selene.core.soup.element.ElementSoup(element, logger=None)
Bases: Element
An element class that wraps Beautiful Soup functionality for finding elements and returning attributes from soup objects.
- find(*args, **kwargs)
Find and return a specific element within the HTML.
- Parameters:
element (str) – the type of html element searched for e.g. ‘div’
attributes (dict) – attributes of the searched element e.g. {“class”: “text-1”}
- Returns:
el
- Return type:
- find_all(*args, **kwargs)
Find and return all elements within the html that meet the given criteria
- Parameters:
element (str) – the type of html element searched for e.g. ‘div’
attributes (dict) – attributes of the searched element e.g. {“class”: “text-1”}
- Returns:
els – all ElementSoup objects that meet the criteria
- Return type:
list
- classmethod from_selene(element_selene, logger=None)
Initialise an ElementSoup instance from an ElementSelene object. This allows interchangeability between selenium-based and soup-based elements.
- Parameters:
element_selene (selene.core.selenium.ElementSelene) –
logger (logging.Logger) – a logger instance (see core.logger.py)
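The source does not say how from_selene performs the conversion. One common way to move from a Selenium element to a soup object is to read the element's outerHTML and re-parse it with Beautiful Soup; the sketch below illustrates that idea only and should not be read as selene's implementation. FakeSeleniumElement and soup_from_selenium are hypothetical names; the stand-in class mimics the get_attribute method of a real Selenium WebElement so the example runs without a browser.

```python
from bs4 import BeautifulSoup


class FakeSeleniumElement:
    """Hypothetical stand-in exposing the one WebElement method this sketch needs."""

    def __init__(self, outer_html):
        self._outer_html = outer_html

    def get_attribute(self, name):
        # A real WebElement would query the browser; here we return a stored string.
        return self._outer_html if name == "outerHTML" else None


def soup_from_selenium(element):
    # Assumption: serialise the live DOM node to a string, then re-parse it offline.
    return BeautifulSoup(element.get_attribute("outerHTML"), "html.parser")


live = FakeSeleniumElement('<ul id="menu"><li>Home</li><li>About</li></ul>')
snapshot = soup_from_selenium(live)
items = [li.get_text() for li in snapshot.find_all("li")]
```

The resulting soup is a snapshot: later changes in the browser are not reflected in it, which may be why a separate soup-based element class is useful for read-only parsing.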
- class selene.core.soup.element.ElementSoupBlank
Bases: ElementSoup
A class for blank soup objects. Used in cases where another method has returned nothing.
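A blank element of this kind resembles the null object pattern: instead of returning None when nothing is found, a lookup returns an object with the same interface that simply yields empty results. The sketch below shows the pattern with hypothetical classes (SketchElement, SketchBlank) that are not part of selene; how ElementSoupBlank actually behaves is not documented here.

```python
class SketchElement:
    """Hypothetical stand-in for an element wrapper; not selene's ElementSoup."""

    def __init__(self, tag, text="", children=None):
        self.tag = tag
        self.text = text
        self.children = children or []

    def find(self, tag):
        # Return the first matching child, or a blank element instead of None.
        for child in self.children:
            if child.tag == tag:
                return child
        return SketchBlank()

    def find_all(self, tag):
        return [child for child in self.children if child.tag == tag]

    def __bool__(self):
        return True


class SketchBlank(SketchElement):
    """Blank element: same interface, but every lookup comes back empty."""

    def __init__(self):
        super().__init__(tag=None)

    def __bool__(self):
        # Falsy, so callers can still test `if element:` when they care.
        return False


row = SketchElement("tr", children=[SketchElement("td", text="42")])
present = row.find("td")
missing = row.find("th")
# Chaining on a missing element is safe: the blank object still has .find and .text.
nested_text = missing.find("span").text
```

With this design, calling code can chain lookups without checking for None at every step, and test truthiness only where a missing element matters.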
selene.core.soup.page module
- class selene.core.soup.page.PageSoup(url, soup, logger=None)
Bases: Page
A page class to assist any workflow which requires BeautifulSoup.
This is a way to make Selenium WebDriver and BeautifulSoup more interchangeable, insofar as you can instantiate either a PageSoup or a PageSelene object and the .find and .find_all methods work in similar ways.
Inherits selene.core.page.Page
- find(*args, **kwargs)
Find and return a specific element within the page HTML.
- Parameters:
element (str) – the type of html element searched for e.g. ‘div’
attributes (dict) – attributes of the searched element e.g. {“class”: “text-1”}
- Returns:
el
- Return type:
- find_all(*args, **kwargs)
Find and return all elements within the page html that meet the given criteria
- Parameters:
element (str) – the type of html element searched for e.g. ‘div’
attributes (dict) – attributes of the searched element e.g. {“class”: “text-1”}
- Returns:
els – all ElementSoup objects that meet the criteria
- Return type:
list
- classmethod from_html(url, html, logger=None)
Initialise a PageSoup instance from existing html source code.
- Parameters:
url (str) – the url of the page
html (str) – the html code to parse
logger (logging.Logger) – a logger instance (see core.logger.py)
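The element and attributes parameters described above correspond closely to the name and attrs arguments of BeautifulSoup's own find and find_all. The following is a minimal sketch of a page wrapper built that way, assuming the bs4 package is installed. SketchPage is a hypothetical class written for illustration, not selene's PageSoup, and it returns raw BeautifulSoup tags rather than ElementSoup objects.

```python
from bs4 import BeautifulSoup


class SketchPage:
    """Hypothetical stand-in for a soup-backed page; not selene's PageSoup."""

    def __init__(self, url, soup):
        self.url = url
        self.soup = soup

    @classmethod
    def from_html(cls, url, html):
        # Parse the raw HTML once, then hand the parsed tree to the normal constructor.
        return cls(url, BeautifulSoup(html, "html.parser"))

    def find(self, element, attributes=None):
        # First match, or None if nothing matches (plain BeautifulSoup behaviour).
        return self.soup.find(element, attrs=attributes or {})

    def find_all(self, element, attributes=None):
        # Every match, in document order, as a plain list.
        return list(self.soup.find_all(element, attrs=attributes or {}))


html = (
    '<div class="text-1">first</div>'
    '<div class="text-2">other</div>'
    '<div class="text-1">second</div>'
)
page = SketchPage.from_html("https://example.com", html)
first = page.find("div", {"class": "text-1"})
matches = page.find_all("div", {"class": "text-1"})
```

Using a classmethod as an alternate constructor keeps the main constructor simple (it takes an already parsed soup) while still letting callers start from a raw HTML string.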