Selene selenium-based functionality
Submodules
selene.core.selenium.conditions module
- selene.core.selenium.conditions.bool_clickable(driver, by, identifier, wait=10, logger=None)
- Wait a specified number of seconds until either:
- A found element is clickable 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – True if the element is clickable, False otherwise 
- Return type:
- bool 
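
The wait-then-report pattern shared by the bool_* functions in this module can be sketched without a browser. This is a minimal illustration only: wait_until, predicate and poll are hypothetical names, and the actual module presumably builds on Selenium's own explicit waits, not on a hand-written loop like this one.

```python
import time


def wait_until(predicate, wait=10, poll=0.5):
    """Poll `predicate` until it is truthy or `wait` seconds have passed.

    Returns True on success and False on timeout, so the caller gets a
    bool instead of having to handle a timeout exception.
    """
    deadline = time.monotonic() + wait
    while True:
        try:
            if predicate():
                return True
        except Exception:
            # Assumption: a failing lookup (e.g. element not present yet)
            # simply means "not ready", so keep polling.
            pass
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(poll, remaining))
```

With a real driver, the predicate would wrap a check such as "the element located by (by, identifier) is displayed and enabled".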
 
- selene.core.selenium.conditions.bool_correct_handle(driver, handle, wait, logger, message='Incorrect handle.')
- Wait a specified number of seconds until either:
- The active handle (i.e. tab) is the expected one
- A TimeoutException is raised
 
 - This is useful when navigating between different tabs.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- handle (str) – the expected handle 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- message (str) – log message (default: “Incorrect handle.”) 
 
- Returns:
- output – True if the active handle is the same as the expected handle, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_element_class_contains(driver, element, wait, logger, string, message='Element class does not contain')
- Wait a specified number of seconds until either:
- An element’s class contains a specified string
- A TimeoutException is raised
 
 - This is useful for cases where, for example, a dropdown element’s class contains “expanded” only if and when the dropdown has been expanded.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- element (core.selenium.element.Element) – the instance of the Element class representing the web element 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- string (str) – the string to be found 
- message (str) – log message (default: “Element class does not contain {string}”) 
 
- Returns:
- output – True if the element’s class contains the string, False otherwise 
- Return type:
- bool 
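
The class check itself is a substring test on the element's class attribute. A minimal sketch, assuming the wrapped element exposes Selenium's get_attribute method; FakeElement and class_contains are hypothetical names used only so the example runs without a browser.

```python
class FakeElement:
    """Hypothetical stand-in for a web element; only get_attribute is modelled."""

    def __init__(self, attributes):
        self._attributes = attributes

    def get_attribute(self, name):
        # Like Selenium, return None when the attribute is absent.
        return self._attributes.get(name)


def class_contains(element, string):
    """Return True if the element's class attribute contains `string`."""
    classes = element.get_attribute("class") or ""
    return string in classes


dropdown = FakeElement({"class": "dropdown expanded"})
print(class_contains(dropdown, "expanded"))
```

A waiting version would call class_contains repeatedly until it returns True or the wait expires.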
 
- selene.core.selenium.conditions.bool_element_class_does_not_contain(driver, element, wait, logger, string, message='Element class contains')
- Wait a specified number of seconds until either:
- An element’s class DOES NOT contain a specified string
- A TimeoutException is raised
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance
- element (core.selenium.element.Element) – the instance of the Element class representing the web element
- wait (int) – a number of seconds to wait before raising a TimeoutException
- logger (logging.Logger) – a logger instance (see core.logger.py)
- string (str) – the string to be found
- message (str) – log message (default: “Element class contains {string}.”)
 
- Returns:
- output – True if the element’s class does not contain the string, False otherwise
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_element_text_contains(driver, element, wait, logger, string, message='Element text does not contain')
- Wait a specified number of seconds until either:
- An element’s text contains a specified string 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- element (core.selenium.element.Element) – the instance of the Element class representing the web element 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- string (str) – the string to be found 
- message (str) – log message (default: “Element text does not contain {string}.”) 
 
- Returns:
- output – True if the element’s text contains the string, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_element_text_does_not_contain(driver, element, wait, logger, string, message='Element text contains')
- Wait a specified number of seconds until either:
- An element’s text DOES NOT contain a specified string 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- element (core.selenium.element.Element) – the instance of the Element class representing the web element 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- string (str) – the string to be found 
- message (str) – log message (default: “Element text contains {string}.”) 
 
- Returns:
- output – True if the element’s text does not contain the string, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_invisible(driver, by, identifier, wait=10, logger=None)
- Wait a specified number of seconds until either:
- A found element is NOT visible 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – True if the element is invisible, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_new_handle(driver, n_handles_old, wait, logger, message='No new handles found.')
- Wait a specified number of seconds until either:
- The number of window handles (i.e. the number of tabs open) has increased by one
- A TimeoutException is raised
 
 - This is useful when navigating between different tabs.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- n_handles_old (int) – the previous number of existing window handles 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- message (str) – log message (default: “No new handles found.”) 
 
- Returns:
- output – True if the number of handles has increased by one, False otherwise 
- Return type:
- bool 
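
The handle-count check can be illustrated with a stand-in driver. This is a sketch under stated assumptions: FakeDriver and new_handle_appeared are hypothetical, and the only real Selenium detail relied on is that a driver exposes a window_handles list.

```python
import time


class FakeDriver:
    """Hypothetical stand-in exposing window_handles like a Selenium driver."""

    def __init__(self):
        self.window_handles = ["tab-1"]


def new_handle_appeared(driver, n_handles_old, wait=2, poll=0.05):
    """Return True once exactly one more handle exists than before."""
    deadline = time.monotonic() + wait
    while time.monotonic() < deadline:
        if len(driver.window_handles) == n_handles_old + 1:
            return True
        time.sleep(poll)
    # One final check at the deadline before reporting failure.
    return len(driver.window_handles) == n_handles_old + 1
```

The caller records len(driver.window_handles) before clicking a link that opens a new tab, then passes that count as n_handles_old.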
 
- selene.core.selenium.conditions.bool_scroll_height_changed(driver, wait, logger, height, element=None, message='Scroll height did not change.')
- Wait a specified number of seconds until either:
- The page OR an element’s scroll height changes. This is what changes if the height of the page or element increases due to dynamically-generated content. 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- height (float) – the original scroll height value 
- element (EITHER selenium.webdriver.remote.webelement.WebElement OR core.selenium.element.Element OR None) – the scrollable element. If None, then the page itself is the element. 
- message (str) – log message (default: “Scroll height did not change.”) 
 
- Returns:
- output – True if the scroll height has changed, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_scroll_position_changed(driver, element, wait, logger, position, message='Scroll position did not change.')
- Wait a specified number of seconds until either:
- An element’s scroll position changes. This is what changes as you scroll down a scrollable element. 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- element (EITHER selenium.webdriver.remote.webelement.WebElement OR core.selenium.element.Element) – the scrollable element 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- position (float) – the original scroll position value 
- message (str) – log message (default: “Scroll position did not change.”) 
 
- Returns:
- output – True if the element’s scroll position has changed, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_url_changed(driver, wait, logger, url, message='URL has not changed.')
- Wait a specified number of seconds until either:
- The browser’s url changes. 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- url (str) – the original url 
- message (str) – log message (default: “URL has not changed.”) 
 
- Returns:
- output – True if the url changes, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_url_contains(driver, wait, logger, string, message='URL does not contain the specified string.')
- Wait a specified number of seconds until either:
- The browser’s url contains a specified string. 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py)
- string (str) – the string to be found
- message (str) – log message (default: “URL does not contain the specified string.”) 
 
- Returns:
- output – True if the url contains the specified string, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_url_does_not_contain(driver, wait, logger, string, message='URL contains the specified string.')
- Wait a specified number of seconds until either:
- The browser’s url DOES NOT contain a specified string. 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py)
- string (str) – the string that must not be found
- message (str) – log message (default: “URL contains the specified string.”) 
 
- Returns:
- output – True if the url does not contain the specified string, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_url_expected(driver, wait, logger, url, message='URL is not the expected URL.')
- Wait a specified number of seconds until either:
- The browser’s url matches the expected url. 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- url (str) – the expected url 
- message (str) – log message (default: “URL is not the expected URL.”) 
 
- Returns:
- output – True if the url is the expected url, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_url_unexpected(driver, wait, logger, url, message='URL is the unexpected URL.')
- Wait a specified number of seconds until either:
- The browser’s url no longer matches the UNexpected url.
- A TimeoutException is raised
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance
- wait (int) – a number of seconds to wait before raising a TimeoutException
- logger (logging.Logger) – a logger instance (see core.logger.py)
- url (str) – the unexpected url
- message (str) – log message (default: “URL is the unexpected URL.”)
 
- Returns:
- output – True if the url is not the unexpected url, False otherwise
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_visible(driver, by, identifier, wait=10, logger=None)
- Wait a specified number of seconds until either:
- A found element is visible 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – True if the element is visible, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.conditions.bool_yoffset_changed(driver, wait, logger, yoffset, message='Y-offset did not change.')
- Wait a specified number of seconds until either:
- The y-offset changes. This is what changes as you scroll down a page 
- A TimeoutException is raised 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
- yoffset (float) – the original y-offset value 
- message (str) – log message (default: “Y-offset did not change.”) 
 
- Returns:
- output – True if the y-offset has changed, False otherwise 
- Return type:
- bool 
 
selene.core.selenium.crawler module
selene.core.selenium.driver module
- selene.core.selenium.driver.get_driver(width=2560, height=1440, user_agent='default', incognito=False, disable_gpu=False, use_display=False)
- Get an instance of selenium.webdriver and start the browser
 - Parameters:
- width (int) – the width of the browser 
- height (int) – the height of the browser 
- user_agent – If False, then no user agent is used. If ‘default’, then a default user agent is used. If ‘random’, then a random user agent is selected. Otherwise, the specified user agent is used. 
- incognito (bool) – whether or not to start the browser in incognito mode 
- disable_gpu (bool) – whether or not to disable GPU 
- use_display (bool) – whether or not to use a virtual display 
 
- Returns:
- driver – selenium.webdriver instance 
- Return type:
- selenium.webdriver 
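
The four user_agent cases described above can be sketched as a small resolver. This is illustrative only: resolve_user_agent is a hypothetical helper, the USER_AGENTS values are placeholders for the real list in core.config.USER_AGENTS, and treating the first entry as the default is an assumption.

```python
import random

# Placeholder values; the real list lives in core.config.USER_AGENTS.
USER_AGENTS = [
    "ExampleAgent/1.0",
    "ExampleAgent/2.0",
]


def resolve_user_agent(user_agent="default", agents=USER_AGENTS):
    """Map the user_agent argument to a user agent string, or None."""
    if user_agent is False:
        return None  # no user agent override at all
    if user_agent == "default":
        return agents[0]  # assumption: the first entry is the default
    if user_agent == "random":
        return random.choice(agents)
    return user_agent  # any other value is used verbatim
```

For example, resolve_user_agent("MyBot/0.1") returns the given string unchanged, while resolve_user_agent(False) returns None.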
 
- selene.core.selenium.driver.get_user_agent(i)
- Get a specific user agent string from core.config.USER_AGENTS
 - Parameters:
- i (int) – the list index 
- Returns:
- user_agent – The selected user agent 
- Return type:
- str 
 
- selene.core.selenium.driver.get_user_agent_random()
- Get a random user agent string from core.config.USER_AGENTS
- Returns:
- user_agent – The selected user agent 
- Return type:
- str 
 
- selene.core.selenium.driver.restart_driver(driver, wait=30)
- Stop and close the selenium.webdriver instance, wait for a specified number of seconds, then start a new instance
 - Parameters:
- driver (selenium.webdriver) – the selenium webdriver instance to stop
- wait (int) – the number of seconds to wait before starting the new instance
- Returns:
- driver – The new selenium.webdriver instance 
- Return type:
- selenium.webdriver 
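
The stop, wait, start sequence can be sketched as follows. restart_with_factory and make_driver are hypothetical names; the only Selenium call assumed is driver.quit(), which ends the session and closes the browser. How the real function builds the new driver (presumably through get_driver) is not shown in this reference.

```python
import time


def restart_with_factory(driver, make_driver, wait=30):
    """Quit `driver`, pause for `wait` seconds, then return a fresh driver."""
    driver.quit()          # stop the session and close the browser
    time.sleep(wait)       # give the remote site or local resources a rest
    return make_driver()   # hypothetical factory, e.g. a get_driver wrapper
```

The caller must rebind its variable to the returned instance, since the old driver is no longer usable after quit().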
 
selene.core.selenium.element module
- class selene.core.selenium.element.ElementSelene(element, logger=None)
- Bases: Element
- An element class to wrap a selenium.webdriver.remote.webelement.WebElement object, in order to:
- provide extra functionality
- make it easier for crawlers to switch between handling Selenium workflows and BeautifulSoup workflows.
 
 - Inherits selene.core.element.Element
 - click(driver)
- Click the element.
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 - find(by, identifier, wait=10, log=True)
- This:
- wraps core.selenium.tasks.task_find 
- finds only elements which are within this element. 
 
 - Parameters:
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – returns the element if an element is found, None otherwise 
- Return type:
 
 - find_all(by, identifier, wait=10, log=True)
- This:
- wraps core.selenium.tasks.task_find_all 
- finds only elements which are within this element. 
 
 - Parameters:
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – returns the elements if one or more element is found, an empty list otherwise 
- Return type:
- list 
 
 - get_attribute(*args, **kwargs)
- Gets an attribute from the element. E.g. self.get_attribute(‘href’) would return the hyperlink.
- Returns:
- output – the attribute 
- Return type:
- str 
 
 - get_parent(driver)
- Get the element’s parent
- Returns:
- out 
- Return type:
- ElementSelene object wrapping the parent element 
 
 - get_text()
- Get the element’s text
- TODO this is redundant, but removing it might break some things
- Returns:
- text 
- Return type:
- str 
 
 - has_attribute(*args, **kwargs)
- Check whether the element contains a specified attribute.
- Returns:
- output – True if the element has the attribute, False otherwise
- Return type:
- bool 
 
 - scroll_down(driver, wait=10)
- Scroll down the element IF the element has a scrollbar.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 - scroll_to(driver, position_new, wait=10)
- Scroll to a new position on the element IF the element has a scrollbar.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance
- position_new (int) – the new scroll position in pixels
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 - scroll_to_bottom(driver, wait=10)
- Scroll to the bottom of the element IF the element has a scrollbar.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 
selene.core.selenium.page module
- class selene.core.selenium.page.PageSelene(driver, url, logger=None, *args, **kwargs)
- Bases: Page
- A page class to assist any workflow which requires Selenium WebDriver.
- A website is made out of pages.
- Dynamically-generated pages require Selenium WebDriver.
- Each page will need general functionality (e.g. finding an element, scrolling etc.).
- Inheriting this class provides that general functionality.
 
 - NOTE 1: Generally, the way to use this object is to initialise it using the from_url() method, as this will attach the url to the page AND navigate to the url.
 - NOTE 2: Any PageSelene object will also contain a PageSoup object (see core.soup.page). This is an attempt to allow the use of both Selenium (for dynamic elements) and BeautifulSoup (for static elements) when scraping.
 - Inherits selene.core.page.Page
 - click(driver, by, identifier, wait=10)
- Find and click an element on the page.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 - static close_all_tabs_except_specified_tab(driver, handle_keep, attempts=3)
- Closes all open tabs EXCEPT for the tab given by the specified handle.
- Useful for cleanup of any open tabs.
- It has an attempts variable, in case it doesn’t work the first time.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- handle_keep (str) – the tab/handle to not close. 
- attempts (int) – a number of attempts before returning False 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
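
A sketch of the tab cleanup loop, run against a stand-in driver so it works without a browser. FakeDriver, FakeSwitchTo and close_all_tabs_except are hypothetical; the stand-in mimics the Selenium members window_handles, switch_to.window(handle) and close(), and the retry structure is an assumption about how attempts is used.

```python
class FakeSwitchTo:
    """Hypothetical stand-in for driver.switch_to."""

    def __init__(self, driver):
        self._driver = driver

    def window(self, handle):
        self._driver.current_window_handle = handle


class FakeDriver:
    """Hypothetical stand-in modelling only tab bookkeeping."""

    def __init__(self, handles):
        self.window_handles = list(handles)
        self.current_window_handle = self.window_handles[0]
        self.switch_to = FakeSwitchTo(self)

    def close(self):
        # Closing a tab removes its handle from the list.
        self.window_handles.remove(self.current_window_handle)


def close_all_tabs_except(driver, handle_keep, attempts=3):
    """Close every tab except `handle_keep`; retry up to `attempts` times."""
    for _ in range(attempts):
        # Iterate over a copy, because closing tabs mutates the list.
        for handle in list(driver.window_handles):
            if handle != handle_keep:
                driver.switch_to.window(handle)
                driver.close()
        driver.switch_to.window(handle_keep)
        if driver.window_handles == [handle_keep]:
            return True
    return False
```

After a successful call the driver is focused on the kept tab, which is usually the handle recorded before any new tabs were opened.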
 
 - expand_scroll_height(driver, wait=1)
- Keep scrolling to the bottom of the page, as the page dynamically expands due to the continued scrolling.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
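
The scroll-until-stable loop behind this method can be sketched with a simulated page. FakePage, expand_until_stable and max_rounds are hypothetical; in the real method the "did the height change" check would presumably be a timed wait such as bool_scroll_height_changed, not an immediate comparison.

```python
class FakePage:
    """Hypothetical page whose height grows a few times when scrolled down."""

    def __init__(self, height=1000, growths=3):
        self.height = height
        self._growths = growths

    def scroll_to_bottom(self):
        # Simulate lazily loaded content being appended on each scroll.
        if self._growths > 0:
            self._growths -= 1
            self.height += 500


def expand_until_stable(page, max_rounds=50):
    """Scroll to the bottom until the height stops growing.

    Returns the number of scrolls that actually expanded the page.
    """
    rounds = 0
    while rounds < max_rounds:
        before = page.height
        page.scroll_to_bottom()
        if page.height == before:
            break  # nothing new was loaded, so the page is fully expanded
        rounds += 1
    return rounds
```

The max_rounds cap is a safeguard added for the sketch, since an infinite feed would otherwise never stop growing.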
 
 - find(driver, by, identifier, wait=10, log=True)
- This:
- wraps core.selenium.tasks.task_find
- returns the result not as a selenium.webdriver.remote.webelement.WebElement object, but as a core.selenium.element.ElementSelene wrapper object, which gives added functionality.
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – returns the element if an element is found, None otherwise 
- Return type:
- ElementSelene or None
 
 - find_all(driver, by, identifier, wait=10, log=True)
- This:
- wraps core.selenium.tasks.task_find_all
- returns the result not as a list of selenium.webdriver.remote.webelement.WebElement objects, but as a list of core.selenium.element.ElementSelene wrapper objects, which gives added functionality.
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – returns the elements if one or more element is found, an empty list otherwise 
- Return type:
- list 
 
 - find_all_soup(*args, **kwargs)
- Each PageSelene object contains a PageSoup object. This wraps the core.soup.page.PageSoup.find_all function, so it can use BeautifulSoup to find elements.
- Returns:
- output – the list of ElementSoup instances relating to the found webelements 
- Return type:
- list 
 
 - find_soup(*args, **kwargs)
- Each PageSelene object contains a PageSoup object. This wraps the core.soup.page.PageSoup.find function, so it can use BeautifulSoup to find elements.
- Returns:
- output – the ElementSoup instance relating to the found webelement 
- Return type:
- ElementSoup
 
 - classmethod from_url(driver, url, string='', logger=None, *args, **kwargs)
- Initialise a PageSelene instance and navigate to the instance’s specified url.
- Checking the correct url can be done in 2 ways:
- Checking for an exact match 
- Checking whether the url contains a specified string. 
 
 - Parameters:
- driver (selenium.webdriver) – the initialised webdriver instance 
- url (str) – the url of the page 
- string (str) – a specified string for the new url to contain 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
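
The two ways of checking the url can be sketched as a single predicate. url_is_correct is a hypothetical helper, and the rule that a non-empty string switches from exact matching to a containment check is an assumption based on the parameter descriptions above.

```python
def url_is_correct(current_url, url, string=""):
    """Check the browser's url against the expectation.

    If `string` is given, the current url only has to contain it;
    otherwise the current url must match `url` exactly.
    """
    if string:
        return string in current_url
    return current_url == url
```

The containment check is the more forgiving option when a site appends tracking parameters or redirects to a localised path.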
 
 
 - get_page_soup(driver)
- Get a PageSoup object (see core.soup.page) with the current source html code as found by the webdriver instance.
 - Parameters:
- driver (selenium.webdriver) – the initialised webdriver instance 
- Returns:
- output – PageSoup object initialised using the page’s source html code. 
- Return type:
- PageSoup
 
- This wraps core.selenium.tasks.task_navigate_to_url
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- url (str) – the url to navigate to 
- string (str) – a specified string for the new url to contain 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 - classmethod new_tab(driver, url, string='', logger=None)
- Initialise a PageSelene instance and navigate to the instance’s specified url in a new tab.
- Checking the correct url can be done in 2 ways:
- Checking for an exact match 
- Checking whether the url contains a specified string. 
 
 - Parameters:
- driver (selenium.webdriver) – the initialised webdriver instance 
- url (str) – the url of the page 
- string (str) – a specified string for the new url to contain 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
 
 - refresh(driver, wait=0)
- Refresh the page by refreshing the driver and re-initialising the PageSelene object.
 - Parameters:
- driver (selenium.webdriver) – the initialised webdriver instance 
- wait (int) – a number of seconds to wait before re-initialising 
 
- Returns:
- output – re-initialised PageSelene object 
- Return type:
- PageSelene
 
 - refresh_until_true(driver, func, message, attempts, *args, **kwargs)
- This wraps other functions such as self.find.
- If the wrapped function returns anything other than False or None, then this function returns True.
- If the wrapped function returns False or None, then this function calls self.refresh. It does so for a number of attempts. If all attempts fail, then this function returns False.
- This is useful if a web page did not load properly and therefore needs to be refreshed.
 - Parameters:
- driver (selenium.webdriver) – the initialised webdriver instance 
- func (function) – the function to be wrapped 
- message (str) – the error message to print to the logs 
- attempts (int) – the number of attempts before returning False 
 
- Returns:
- output – False if the function fails a specified number of times; True if it succeeds 
- Return type:
- bool 
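
The retry logic described above can be sketched independently of any page object. retry_until_true and the refresh callable are hypothetical stand-ins for the method and self.refresh; logging of message is left out.

```python
def retry_until_true(func, refresh, attempts, *args, **kwargs):
    """Call `func` up to `attempts` times, refreshing after each failure.

    A result counts as a failure only if it is False or None, so other
    falsy values such as an empty list still count as success.
    """
    for _ in range(attempts):
        result = func(*args, **kwargs)
        if result is not False and result is not None:
            return True
        refresh()  # reload the page before the next attempt
    return False
```

Note that the wrapper returns only True or False, so a caller that needs the found element has to call the wrapped function again afterwards.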
 
 - static screenshot_to_local(driver, dirpath, filestem, logger=None)
- This wraps core.selenium.tasks.screenshot_to_local
- Save a browser screenshot to a local directory
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- dirpath (str) – directory to save file 
- filestem (str) – a string to add to a datetime to create the filename 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
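
One way to combine a datetime with filestem into a filename is sketched below. The timestamp format, the ordering of the two parts and the .png extension are all assumptions, and screenshot_path is a hypothetical helper; the library's actual naming scheme is not documented here.

```python
from datetime import datetime
from pathlib import Path


def screenshot_path(dirpath, filestem, now=None):
    """Build a timestamped screenshot path inside `dirpath`."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # assumed format
    return Path(dirpath) / f"{stamp}_{filestem}.png"


# With a real Selenium driver the file could then be written with:
#     driver.save_screenshot(str(screenshot_path("shots", "home")))
```

Putting the timestamp first keeps files sorted chronologically in a directory listing.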
 
 
 - static screenshot_to_notebook(driver, width=600, height=400, logger=None)
- This wraps core.selenium.tasks.task_screenshot_to_notebook
- Display a browser screenshot in a Jupyter notebook.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- width (int) – the width of the image 
- height (int) – the height of the image 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
 
 - scroll_down(driver, wait=10)
- Scroll down the page.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 - scroll_to(driver, position_new, wait=10)
- Scroll to a new position on the page.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- position_new (int) – y position in pixels 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 - scroll_to_bottom(driver, wait=10)
- Scroll to the bottom of the page.
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
 
selene.core.selenium.scripts module
- selene.core.selenium.scripts.script_click_element(driver, element)
- Execute JavaScript to click an element
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance
- element (selenium.webdriver.remote.webelement.WebElement) – the element to click
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.scripts.script_expand_all_by_class_name(driver, identifier, attribute, indicator, clickable=None)
- WARNING: EXPERIMENTAL
- Execute JavaScript to expand a list of dropdown menus.
- Steps:
- Find dropdowns by finding all elements with a class name specified with identifier. 
- For each dropdown found:
- Check if the dropdown is expanded or not. This can be done by:
- Does the attribute ‘class’ contain an indicator (e.g. ‘expanded’)? 
- Does the attribute ‘text’ contain an indicator (e.g. ‘Show More’)? 
- Is there an attribute called ‘exists’?
 
 
- Find the element to click to expand the dropdown. Sometimes the clickable element is not the dropdown itself, but is a button inside the dropdown.
- Click the clickable element. 
 
 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- identifier (str) – the class name to search for 
- attribute (str) – the attribute of the element to check whether it is expanded or not 
- indicator (str) – the indicator within the attribute, which will indicate whether it is expanded or not 
- clickable (str) – the class name of the element within the dropdown which you have to click to expand the dropdown
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.scripts.script_get_parent(driver, element)
- Execute JavaScript to get the parent of an element
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance
- element (selenium.webdriver.remote.webelement.WebElement) – the element whose parent is to be found
 
- Returns:
- output – the parent WebElement 
- Return type:
- selenium.webdriver.remote.webelement.WebElement 
 
- selene.core.selenium.scripts.script_get_scroll_height(driver, element=None)
- Execute JavaScript to get the scroll height of either:
- the page 
- an element with a scroll bar. 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- element (selenium.webdriver.remote.webelement.WebElement) – the element from which to get the scroll height (if None, then the scroll height of the page is found). 
 
- Returns:
- output – the scroll height in pixels 
- Return type:
- int 
 
- selene.core.selenium.scripts.script_get_scroll_position(driver, element=None)
- Execute JavaScript to get the scroll position of either:
- the page 
- an element with a scroll bar. 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- element (selenium.webdriver.remote.webelement.WebElement) – the element from which to get the scroll position (if None, then the scroll position of the page is found). 
 
- Returns:
- output – the scroll position in pixels 
- Return type:
- int 
 
- selene.core.selenium.scripts.script_scroll_to(driver, position, element=None)
- Execute JavaScript to scroll to a position on either:
- the page 
- an element with a scroll bar. 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- position (int) – the y position (in pixels) to scroll to 
- element (selenium.webdriver.remote.webelement.WebElement) – the element to scroll (if None, then the page is scrolled). 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
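The boolean return convention used here (True on success, False otherwise) can be sketched as a thin wrapper around `execute_script`. The function name `scroll_to_sketch`, the JavaScript strings, and the two stand-in driver classes are assumptions made for illustration; the real implementation may differ.

```python
class FakeDriver:
    """Hypothetical stand-in for a webdriver that accepts any script."""

    def __init__(self):
        self.calls = []

    def execute_script(self, script, *args):
        self.calls.append((script, args))


class BrokenDriver:
    """Hypothetical driver whose script execution always fails."""

    def execute_script(self, script, *args):
        raise RuntimeError("browser session lost")


def scroll_to_sketch(driver, position, element=None):
    """Scroll the page (or `element`) to `position`; report success as a bool."""
    try:
        if element is None:
            driver.execute_script("window.scrollTo(0, arguments[0]);", position)
        else:
            driver.execute_script(
                "arguments[0].scrollTop = arguments[1];", element, position
            )
        return True
    except Exception:
        # Any failure is reported to the caller as False rather than raised.
        return False


good_driver = FakeDriver()
ok_page = scroll_to_sketch(good_driver, 300)
ok_element = scroll_to_sketch(good_driver, 120, element=object())
failed = scroll_to_sketch(BrokenDriver(), 300)
```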
 
selene.core.selenium.tasks module¶
- selene.core.selenium.tasks.mouse_move(driver, max_mouse_moves=10)[source]¶
- Perform mouse movements to help with bot mitigation; partially ported from OpenWPM. 
- selene.core.selenium.tasks.task_click(driver, by, identifier, wait=10, logger=None)[source]¶
- Click an element using a By. selector and an identifier. - For more info, see: https://selenium-python.readthedocs.io/locating-elements.html - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.tasks.task_close_tab_return_to_url_and_handle(driver, url, handle, string='', wait=10, logger=None)[source]¶
- Close the current tab and check that the driver is back at the expected handle and url. - Checking the correct url can be done in 2 ways:
- Checking for an exact match 
- Checking whether the url contains a specified string. 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- handle (str) – the handle to navigate to 
- url (str) – the url the driver is expected to be at after the tab is closed 
- string (str) – a specified string for the new url to contain 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
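The two ways of checking the url described above can be sketched as a small pure function. The name `url_matches` is hypothetical, and the rule that a non-empty `string` switches from exact matching to substring matching is an assumption based on the `string=''` default in the signature.

```python
def url_matches(current_url, expected_url, string=""):
    """Check a url either by substring (if `string` is given) or exact match."""
    if string:
        # Substring mode: useful when the final url carries query parameters.
        return string in current_url
    # Exact mode: the current url must equal the expected url.
    return current_url == expected_url


exact_ok = url_matches("https://example.com/home", "https://example.com/home")
exact_bad = url_matches("https://example.com/home?ref=1", "https://example.com/home")
contains_ok = url_matches(
    "https://example.com/home?ref=1", "https://example.com/home", string="/home"
)
```

Substring matching is the more forgiving choice when redirects or tracking parameters change the final url.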
 
- selene.core.selenium.tasks.task_find(parent, by, identifier, wait=10, logger=None)[source]¶
- Find an element using a By. selector and an identifier. - For more info, see: https://selenium-python.readthedocs.io/locating-elements.html - If the operation is to find an element on the whole page, then the parent variable is a Selenium Webdriver instance (usually named driver). - If the operation is to find an element within another element, then the parent variable is a Selenium WebElement instance (NOT a core.selenium.element.Element instance). - Parameters:
- parent (EITHER selenium.webdriver OR selenium.webdriver.remote.webelement.WebElement) – where to search for the element 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – returns the webelement if it is found; None otherwise 
- Return type:
- [None, selenium.webdriver.remote.webelement.WebElement] 
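The find-or-None behaviour can be sketched as a polling loop that works with any parent exposing `find_element(by, identifier)`, which both a webdriver and a WebElement do in Selenium 4. This is not the selene implementation: `find_sketch`, its `poll` parameter, and `FakeParent` are hypothetical, and the broad `except Exception` stands in for catching Selenium's `NoSuchElementException`.

```python
import time


class FakeParent:
    """Hypothetical parent that fails a fixed number of times before succeeding."""

    def __init__(self, fail_times):
        self.fail_times = fail_times

    def find_element(self, by, identifier):
        if self.fail_times > 0:
            self.fail_times -= 1
            raise LookupError("element not present yet")
        return f"<element {by}={identifier}>"


def find_sketch(parent, by, identifier, wait=10, poll=0.05):
    """Poll `parent` until the element appears; return None once `wait` elapses."""
    deadline = time.monotonic() + wait
    while True:
        try:
            return parent.find_element(by, identifier)
        except Exception:
            if time.monotonic() >= deadline:
                return None
            time.sleep(poll)


found = find_sketch(FakeParent(fail_times=2), "id", "submit", wait=1)
missing = find_sketch(FakeParent(fail_times=10**6), "id", "nope", wait=0.2)
```

Returning None instead of raising lets the caller branch on the result, which matches the documented return type.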
 
- selene.core.selenium.tasks.task_find_all(parent, by, identifier, wait=10, logger=None)[source]¶
- Find a list of elements using a By. selector and an identifier. - For more info, see: https://selenium-python.readthedocs.io/locating-elements.html - If the operation is to find elements on the whole page, then the parent variable is a Selenium Webdriver instance (usually named driver). - If the operation is to find elements within another element, then the parent variable is a Selenium WebElement instance (NOT a core.selenium.element.Element instance). - Parameters:
- parent (EITHER selenium.webdriver OR selenium.webdriver.remote.webelement.WebElement) – where to search for the elements 
- by (selenium.webdriver.common.by.By) – see https://selenium-python.readthedocs.io/locating-elements.html 
- identifier (str) – see https://selenium-python.readthedocs.io/locating-elements.html 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – returns a list of webelements if one or more are found; an empty list otherwise 
- Return type:
- list 
 
- Navigate to a new url and check that the url is correct. - Checking the correct url can be done in 2 ways:
- Checking for an exact match 
- Checking whether the url contains a specified string. 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- url (str) – the url to navigate to 
- string (str) – a specified string for the new url to contain 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
- Navigate to a new url in a new tab, and check that the url is correct. - Checking the correct url can be done in 2 ways:
- Checking for an exact match 
- Checking whether the url contains a specified string. 
 
 - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- url (str) – the url to navigate to 
- string (str) – a specified string for the new url to contain 
- wait (int) – a number of seconds to wait before raising a TimeoutException 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
 
- Returns:
- output – True if the operation was successful, False otherwise 
- Return type:
- bool 
 
- selene.core.selenium.tasks.task_screenshot_to_local(driver, dirpath, filestem, logger)[source]¶
- Save a browser screenshot to a local directory - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- dirpath (str) – the directory path 
- filestem (str) – a string that is combined with a datetime to create the filename 
- logger (logging.Logger) – a logger instance (see core.logger.py) 
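How `dirpath` and `filestem` might combine with a datetime into a file path can be sketched as follows. The timestamp format, the ordering of timestamp and stem, the `.png` extension, and the name `screenshot_path_sketch` are all assumptions; the documentation above does not specify them.

```python
from datetime import datetime
from pathlib import Path


def screenshot_path_sketch(dirpath, filestem, now=None):
    """Build a screenshot path from a directory, a datetime stamp, and a stem."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # assumed format, sortable by name
    return Path(dirpath) / f"{stamp}_{filestem}.png"


path = screenshot_path_sketch(
    "screenshots", "login", now=datetime(2024, 1, 2, 3, 4, 5)
)
```

With a live browser, the resulting path could then be passed to Selenium's `driver.save_screenshot(str(path))`.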
 
 
- selene.core.selenium.tasks.task_screenshot_to_notebook(driver, width, height, logger)[source]¶
- Display a browser screenshot in a Jupyter notebook. - Parameters:
- driver (selenium.webdriver) – a selenium webdriver instance 
- width (int) – the width of the image 
- height (int) – the height of the image 
- logger (logging.Logger) – a logger instance (see core.logger.py)