cpalgnodes.graph module

Base classes for setting up the underlying graph structure

These classes describe the basic graph structure:

BaseGraph

The basic graph class used in the representation.

NodeBase

The base class for all nodes

GraphBuilder

The class responsible for constructing the data dependency graph and working out the correct execution order

class cpalgnodes.graph.BaseGraph(incoming_graph_data=None, **attr)

Bases: DiGraph

Base for custom graph classes

The edges in this graph represent data dependencies between the nodes. The individual data items are then added to these edges as attributes. Additional strong dependencies not due to a data dependency can also be added. These are marked with an ‘AdditionalStrongDependency’ edge attribute.

property data_dependency_graph

Return a graph only containing the data dependencies

define_edge_attribute(key: str, default_value: Any)

Define a new attribute

Parameters:
  • key (str) – The name of the attribute

  • default_value – The default value to set for the attribute. It can also be a callable, taking the in node, out node and edge as arguments and returning the new value to set.

classmethod edge_attr_dict_factory()

Define attributes to appear on every edge

filter_edges(condition: Callable[[BaseNode, BaseNode, Dict], bool])

Produces a subgraph containing only edges passing the condition

Parameters:

condition (Callable[[BaseNode, BaseNode, Dict], bool]) – The returned subgraph will only contain edges which pass this condition. The arguments are the input and output nodes of the edge and its dictionary of attributes.

property strong_dependency_graph

Return a graph containing the strong dependencies (including) the data dependencies

class cpalgnodes.graph.BaseNode

Bases: ABC

Abstract base class for nodes in the dependency graph

A node is the base unit of the graph. Each node reports the data that it produces and that it consumes. The dependency mechanism then uses this information to calculate the order in which they should be executed.

Each node should have a unique name so that it can be retrieved from the graph.

The extra priority property is used to break ties between nodes in the graph when deciding on the order. Nodes with higher priority will try to be placed before nodes with lower priority, though this will never cause any strong dependency to be violated.

abstract property name: str

The name of this node (should be unique within the graph)

property priority: int

Nodes with a higher priority will try to be ordered earlier

abstract property produces: Iterable

All data that this node produces

abstract property requires: Iterable

All data that this node requires

exception cpalgnodes.graph.CyclicGraphError(cycles: List[List[BaseNode]])

Bases: ValueError

Error raised when the produced graph would contain a cycle

class cpalgnodes.graph.GraphBuilder

Bases: Mapping

Base class for building the dependency graph

This class holds a list of the nodes in a graph. The full graph is then created by the build_graph method. A full linear ordering obeying any weak dependencies and node priorities can be extracted using the order_graph method.

This class also acts as a mapping from node names to the nodes themselves.

The main points for base classes to customise this behaviour are:
  • _graph_cls: (class data member) determines which graph class to setup

  • get_producer: Get the node which produces a particular piece of data

  • _populate_graph: Fill the graph with nodes and edges between them

  • _get_weak_deps: Get any weak dependencies (used to order the graph)

  • _get_strong_deps: Get any extra strong dependencies

  • _make_node_key_func: Create functions to order nodes in the graph

add_node(node: BaseNode)

Add a new node into the node list

If the node already exists this is a no-op

Raises:
add_strong_dependency(before: str | BaseNode, after: str | BaseNode)

Add a strong dependency to the graph

A strong dependency enforces the order onto the graph and will cause an error if it introduces a cycle.

Parameters:
  • before (str, BaseNode) – The node which will come first

  • after (str, BaseNode) – The node which will come second

Raises:

KeyError – One of the nodes is not in the graph

add_weak_dependency(before: str | BaseNode, after: str | BaseNode)

Add a weak dependency to the graph

A weak dependency is an ordering hint to the scheduler and will be ignored if it would cause a cycle

Parameters:
  • before (str, BaseNode) – The node which should come first

  • after (str, BaseNode) – The node which should come second

Raises:

KeyError – One of the nodes is not in the graph

classmethod annotate_order(graph: BaseGraph, order: List[BaseNode], key: str = 'TotalOrder')

Annotate edges in the graph with a boolean saying whether or not it is part of the order

Parameters:
  • graph (BaseGraph) – The graph to annotate

  • order (List[BaseNode]) – The ordering of the nodes to annotate. Should be returned by the order_graph method

  • key (str) – The name of the edge attribute under which to store the ordering

  • otherwise. (The attribute value will be True for an edge in the order and False) –

build_graph(ignore_unmet=True, extra_producers={}, extra_requirements={}) BaseGraph

Create the dependency graph

Parameters:

ignore_unmet (bool) – If False, raise a KeyError if a node depends on data which has no known producer

Return type:

The produced dependency graph

Raises:
  • KeyError – A node depends on data with no known producer and ignore_unmet is False

  • CyclicGraphError – The graph contains cycles

classmethod check_cycles(graph: BaseGraph)

Check for cycles in a graph

Parameters:

graph (BaseGraph) – The graph to search

Raises:

CyclicGraphError – Any cycle is found

get_producer(data, exact: bool = False) BaseNode

Get the node which produces a particular data item

Parameters:
  • data – The data being queried

  • exact (bool) – Some implementations of the builder allow for some data items to alias others. If exact is True it bypasses this mechanism, otherwise it does not. By default False

Raises:

KeyError – No node produces this data:

order_graph(graph: BaseGraph) List[BaseNode]

Calculate the execution order for the graph

The graph will be ordered using the topological sort algorithm. Ties are broken using the function returned by _make_node_key_func

Parameters:

graph (BaseGraph) – The graph to order (should be returned by the build_graph method)

Return type:

A list of the nodes in the chosen execution order

remove_node(node: str | BaseNode)

Remove a node from the graph

Parameters:

node (Union[str, BaseNode]) – The node to remove, can be the node object or its name

Raises:

KeyError – The node does not exist in the graph builder

exception cpalgnodes.graph.NodeAlreadyExistsError(new: BaseNode, existing: BaseNode)

Bases: KeyError

Error raised when attempting to define a node that already exists in the graph

class cpalgnodes.graph.NodeWeakDependencyOrdering(node: BaseNode, depends_on_this: Iterable[BaseNode], this_depends_on: Iterable[BaseNode])

Bases: object

Helper class for doing weak dependency orderings

classmethod from_weak_dep_list(node: BaseNode, weak_deps: Iterable[Tuple[BaseNode, BaseNode]])

Create the proxy from a weak dependency list

Parameters:
  • node (BaseNode) – The node whose weak dependencies are being determined

  • weak_deps (Iterable[Tuple[BaseNode, BaseNode]]) – A list of (source, target) pairs describing the weak dependency

exception cpalgnodes.graph.OutputAlreadyProducedError(new: BaseNode, existing: BaseNode, data)

Bases: ValueError

Error raised when attempting to add a node which produces an output which is already in the graph

exception cpalgnodes.graph.UnmetDataDependencyError(node: BaseNode, data: Any)

Bases: Exception

Error raised when the graph builder cannot satisfy a node’s data requirement