Selene core functionality and parent objects

Submodules

selene.core.crawler module

class selene.core.crawler.Crawler(id_crawler='Crawler', debug=True)[source]

Bases: object

A parent crawler class to assist any workflow.

log(message, level='DEBUG')[source]

Output a log message, with the appropriate loglevel (default=DEBUG).

Parameters:
  • message (str) – the message to log

  • level (str) – the loglevel of the message
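
As a rough illustration of how a string log level such as 'DEBUG' can be turned into a log call, here is a minimal sketch built only on the standard logging module. SketchCrawler and resolve_level are hypothetical stand-ins, not selene's actual implementation, and the fallback to DEBUG for unknown names is an assumption.

```python
import logging


def resolve_level(name, default=logging.DEBUG):
    """Translate a level name such as 'INFO' into its numeric value."""
    value = getattr(logging, str(name).upper(), default)
    # Names like 'EXCEPTION' resolve to a function, not a number,
    # so anything that is not an int falls back to the default.
    return value if isinstance(value, int) else default


class SketchCrawler:
    """Hypothetical stand-in that mirrors Crawler(id_crawler, debug)."""

    def __init__(self, id_crawler="Crawler", debug=True):
        self.id_crawler = id_crawler
        self.debug = debug
        self.logger = logging.getLogger(id_crawler)

    def log(self, message, level="DEBUG"):
        # Emit the message at the numeric level matching the given name.
        self.logger.log(resolve_level(level), "[%s] %s", self.id_crawler, message)


crawler = SketchCrawler(id_crawler="demo")
crawler.log("starting crawl", level="INFO")
```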

screenshot_to_notebook(driver, debug=None)[source]

Display a thumbnail-sized screenshot in a Jupyter notebook, but only if the crawler is in debug mode.

Parameters:
  • driver (selenium.webdriver) – a selenium webdriver instance

  • debug (bool) – whether or not the crawler is in debug mode
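
The behaviour described above could be approximated as follows. This is a sketch under assumptions, not the library's code: the function name, the width parameter and its 320 pixel default are hypothetical, and the debug flag is passed explicitly rather than read from a crawler instance. It relies on Selenium's get_screenshot_as_png() and on IPython.display, which is imported lazily so the function can be defined outside a notebook.

```python
def screenshot_to_notebook_sketch(driver, debug=False, width=320):
    """Show a small screenshot inline when debug is true, else do nothing."""
    if not debug:
        return None  # outside debug mode the driver is never touched

    # Lazy import: only needed when a screenshot is actually displayed.
    from IPython.display import Image, display

    png_bytes = driver.get_screenshot_as_png()  # raw PNG data from Selenium
    image = Image(data=png_bytes, width=width)  # width shrinks it to a thumbnail
    display(image)
    return image
```

Inside a notebook, calling screenshot_to_notebook_sketch(driver, debug=True) after driver.get(...) would render the current page as a thumbnail in the output cell.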

selene.core.element module

class selene.core.element.Element(element, logger)[source]

Bases: object

A parent Element class. Both ElementSelene and ElementSoup inherit this class.

log(message, level='DEBUG')[source]

Output a log message, with the appropriate loglevel (default=DEBUG).

Parameters:
  • message (str) – the message to log

  • level (str) – the loglevel of the message

selene.core.logger module

selene.core.logger.get_logger(name='log', level='INFO', to_console=True, to_file=False, overwrite=False, dirpath='/notebooks/selene_logger', filename='log.log')[source]

Initialise a logger instance that prints to the console/notebook, to a file, or to both.

Parameters:
  • name (str) – the name of the logger

  • level (str) – the loglevel of the log. Can be DEBUG, INFO, WARNING or EXCEPTION (default INFO)

  • to_console (bool) – whether or not to print the log to console/notebook

  • to_file (bool) – whether or not to print the log to a file

  • overwrite (bool) – whether or not to overwrite an existing file

  • dirpath (str) – the path to a directory to save the log (if to_file is True)

  • filename (str) – the name of the log file inside dirpath (if to_file is True)

Returns:

logger – the logger instance

Return type:

logging.Logger
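
To make the parameters above concrete, here is a minimal sketch of a function with the same signature, written against the standard logging module. It is an illustration only: get_logger_sketch, the message format, the default dirpath of "." and the mapping of EXCEPTION to logging.ERROR are assumptions, not selene's actual behaviour.

```python
import logging
import os

# Assumed mapping; EXCEPTION is treated as ERROR because
# Logger.exception() itself logs at ERROR level.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "EXCEPTION": logging.ERROR,
}


def get_logger_sketch(name="log", level="INFO", to_console=True,
                      to_file=False, overwrite=False,
                      dirpath=".", filename="log.log"):
    logger = logging.getLogger(name)
    logger.setLevel(_LEVELS.get(level.upper(), logging.INFO))
    logger.handlers.clear()  # avoid duplicate lines when called twice
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    if to_console:
        console = logging.StreamHandler()
        console.setFormatter(formatter)
        logger.addHandler(console)

    if to_file:
        os.makedirs(dirpath, exist_ok=True)
        mode = "w" if overwrite else "a"  # overwrite truncates an existing file
        file_handler = logging.FileHandler(os.path.join(dirpath, filename), mode=mode)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger


logger = get_logger_sketch(name="demo", level="DEBUG")
logger.debug("logger ready")
```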

selene.core.page module

class selene.core.page.Page(url, logger, id_page=0)[source]

Bases: object

A parent Page class. Both PageSelene and PageSoup inherit this class.

log(message, level='DEBUG')[source]

Output a log message, with the appropriate loglevel (default=DEBUG).

Parameters:
  • message (str) – the message to log

  • level (str) – the loglevel of the message
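
The parent/child arrangement described above, where a shared logger is handed to the parent and subclasses inherit log(), might look roughly like this. PageSketch and PageSoupSketch are hypothetical stand-ins, and dispatching to the logger method named after the level is an assumption about how such a log() could be written.

```python
import logging

_KNOWN_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "EXCEPTION"}


class PageSketch:
    """Hypothetical parent mirroring Page(url, logger, id_page=0)."""

    def __init__(self, url, logger, id_page=0):
        self.url = url
        self.logger = logger
        self.id_page = id_page

    def log(self, message, level="DEBUG"):
        # Pick logger.debug / logger.info / ... by name; unknown names use debug.
        method_name = level.lower() if level.upper() in _KNOWN_LEVELS else "debug"
        getattr(self.logger, method_name)("page %s: %s", self.id_page, message)


class PageSoupSketch(PageSketch):
    """Hypothetical child; it inherits log() unchanged."""

    def html_length(self, html):
        self.log("measuring html", level="INFO")
        return len(html)


shared_logger = logging.getLogger("pages")
page = PageSoupSketch("https://example.com", shared_logger, id_page=1)
page.html_length("<html></html>")
```

Passing the logger in, rather than creating one per object, keeps every page writing to the same handlers.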

selene.core.utils module

Module contents