wcraas_discovery package

Submodules

wcraas_discovery.cli module

Console script for wcraas_discovery.

wcraas_discovery.wcraas_discovery module

The WCraaS Discovery module is responsible for providing discovery services for the platform.

class wcraas_discovery.wcraas_discovery.DiscoveryWorker(amqp: wcraas_common.config.AMQPConfig, loglevel: int, *args, **kwargs)

Bases: wcraas_common.wcraas_common.WcraasWorker

Discovery Worker for the WCraaS platform. Provides the discover RPC function.

>>> from wcraas_discovery.config import Config
>>> from wcraas_discovery.wcraas_discovery import DiscoveryWorker
>>> cn = DiscoveryWorker(*Config.fromenv())

discover(url: str) → Dict[str, Dict[str, List[str]]]

Faktory entrypoint for the discovery process.

Parameters: url (string) – The URL to scrape.

static extract(html_body: str, origin_url: str) → Dict[str, List[str]]

Given an HTML body and its origin URL, extract the URLs it contains and categorize them as inbound or outbound.

Parameters:
  • html_body (string) – HTML content of the crawled endpoint
  • origin_url (string) – The URL (endpoint) from which the above body originates
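
The categorization described for extract can be illustrated with a minimal, standard-library sketch. This is not the package's implementation: the name extract_sketch, the helper class, the "inbound" and "outbound" dictionary keys, and the rule that a link is inbound when its host matches the origin's host are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from typing import Dict, List
from urllib.parse import urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href value of every <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_sketch(html_body: str, origin_url: str) -> Dict[str, List[str]]:
    """Split the links found in html_body into inbound and outbound lists.

    Assumption: "inbound" means the link's host equals the origin's host.
    """
    collector = _LinkCollector()
    collector.feed(html_body)

    origin_host = urlparse(origin_url).netloc
    result: Dict[str, List[str]] = {"inbound": [], "outbound": []}

    for href in collector.hrefs:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(origin_url, href)
        parsed = urlparse(absolute)
        # Skip non-web schemes such as mailto: or javascript:.
        if parsed.scheme not in ("http", "https"):
            continue
        key = "inbound" if parsed.netloc == origin_host else "outbound"
        result[key].append(absolute)

    return result
```

Under these assumptions, a relative link such as /about found on https://example.com/index.html would be resolved to https://example.com/about and counted as inbound, while a link to a different host would be counted as outbound.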

start() → None

Asynchronous runtime for the worker, responsible for managing the async context and keeping it open.

Module contents

Top-level package for wcraas_discovery.