parkour.graph documentation

combine

(combine node cls)
(combine node var & args)

Add combine task to job node `node`, as implemented by `Reducer` class `cls`,
or Clojure var `var` and optional `args`.

config

(config node & steps)

Add arbitrary configuration steps to `node`, which may be either a single job
node or a vector of job nodes.

execute

(execute graph conf jname)

Execute Hadoop jobs for the job graph `graph`, which should be a job graph
leaf node or vector of leaf nodes.  Jobs are configured starting with base
configuration `conf` and named based on the string `jname`.  Returns a vector of
the distributed sequences produced by the job graph leaves.
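
The functions in this namespace are meant to be threaded together into a job graph and then handed to `execute`. The following is a minimal word-count sketch of that flow, not a definitive implementation. Beyond `parkour.graph` itself it assumes the `parkour.io.text/dseq` and `parkour.io.seqf/dsink` constructors, the `::mr/source-as` task metadata from `parkour.mapreduce`, and `parkour.conf/ig`; the namespace name, task function names, and paths are hypothetical.

```clojure
(ns example.word-count
  (:require [clojure.string :as str]
            [clojure.core.reducers :as r]
            [parkour.conf :as conf]
            [parkour.mapreduce :as mr]
            [parkour.graph :as pg]
            [parkour.io.text :as text]
            [parkour.io.seqf :as seqf])
  (:import [org.apache.hadoop.io Text LongWritable]))

(defn word-count-m
  "Map task: split each input line into words and emit [word 1] pairs."
  {::mr/source-as :vals}          ; assumption: consume input values only
  [lines]
  (->> lines
       (r/mapcat #(str/split % #"\s+"))
       (r/map (fn [word] [word 1]))))

(defn word-count-r
  "Combine/reduce task: sum the counts grouped under each word."
  {::mr/source-as :keyvalgroups}  ; assumption: consume [key values] groups
  [groups]
  (r/map (fn [[word counts]] [word (r/reduce + 0 counts)]) groups))

(defn word-count
  "Build and run the job graph; returns the single resulting dseq."
  [conf in-path out-path]
  (-> (pg/input (text/dseq in-path))
      (pg/map #'word-count-m)
      (pg/partition [Text LongWritable])
      (pg/combine #'word-count-r)
      (pg/reduce #'word-count-r)
      (pg/output (seqf/dsink [Text LongWritable] out-path))
      (pg/execute conf "word-count")
      first))

;; Hypothetical usage, requiring a working Hadoop environment:
;; (word-count (conf/ig) "input.txt" "word-count-output")
```

Because this graph has a single leaf, the trailing `(pg/execute conf "word-count")` and `first` pair could likely be replaced with `fexecute`.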

fexecute

(fexecute graph conf jname)

As per `execute`, but require and return only a single result dseq;
almost `(comp first execute)`.

input

(input node)

Return a fresh `:input`-stage job graph node consuming from the provided dseq
`node`.  If instead provided an `:input`-stage node or vector of such nodes,
acts as the identity function.

map

(map node cls)
(map node var & args)

Add map task to job node `node`, as implemented by `Mapper` class `cls`, or
Clojure var `var` and optional `args`.

node-fn

(node-fn node conf jname)

Return a function for executing the job defined by the job node `node`, using
base configuration `conf` and job name `jname`.

node-job

(node-job node conf jname)

Return the Hadoop `Job` for job node `node`, starting with base configuration
`conf` and named `jname`.

output

(output node dsink)
(output node & named-dsinks)

Add output task to job node `node` for writing either to the single `dsink` or
to named outputs given as name-dsink pairs `named-dsinks`.  Yields either a new
`:input`-stage node reading from the written output or a vector of such nodes.

partition

(partition node step)
(partition node step cls)
(partition node step var & args)

Add partition task to the provided job node `node`, as configured by `step`
and optionally implemented by either Partitioner class `cls` or Clojure var
`var` & optional `args`.  The `node` may be either a single job node or a vector
of job nodes to co-group.  The `step` may be either a configuration step or a
vector of the two map-output key & value classes.

reduce

(reduce node cls)
(reduce node var & args)

Add reduce task to job node `node`, as implemented by `Reducer` class `cls`,
or Clojure var `var` and optional `args`.

run-graph

(run-graph runner graph outputs)

Execute the data-flow graph described by `graph` using the function executor
`runner`.  Each key in `graph` identifies a particular entry.  Each value is a
tuple of `(inputs, f)`, where `inputs` is a sequence of other `graph` keys and
`f` is a function calculating that entry's result given `inputs`.  Returns a
vector of the result entries for the keys in the collection `outputs`.
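
The graph shape described above can be illustrated in plain Clojure. The sketch below builds a small hypothetical graph and evaluates it with a naive sequential evaluator named `eval-graph`; this is not Parkour's implementation and does not use a `runner`. It assumes each `f` receives the results of its `inputs` as positional arguments, which the description above does not confirm.

```clojure
;; Hypothetical data-flow graph: each key maps to a tuple of [inputs f].
(def example-graph
  {:a   [[]      (fn [] 2)]
   :b   [[]      (fn [] 3)]
   :sum [[:a :b] (fn [a b] (+ a b))]
   :sq  [[:sum]  (fn [s] (* s s))]})

(defn eval-graph
  "Naive sequential evaluator, for illustration only.  Computes each entry
  at most once, resolving its inputs first, and returns a vector of the
  results for the keys in `outputs`."
  [graph outputs]
  (let [cache (atom {})]
    (letfn [(result [k]
              (if (contains? @cache k)
                (get @cache k)
                (let [[inputs f] (get graph k)
                      ;; Assumption: `f` takes its input results positionally.
                      v (apply f (mapv result inputs))]
                  (swap! cache assoc k v)
                  v)))]
      (mapv result outputs))))

(eval-graph example-graph [:sum :sq])
;; => [5 25]
```

An executor-based version would presumably submit independent entries such as `:a` and `:b` concurrently, which is the role the `runner` argument plays in `run-graph`.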

run-job

(run-job job)

Run `job` and wait synchronously for it to complete.  Kills the job on
exceptions or JVM shutdown.  Unlike the `Job#waitForCompletion()` method, does
not swallow `InterruptedException`.

shuffle

(shuffle [ckey] [[ckey cval]])

Base shuffle configuration; sets map output key & value types to
the classes `ckey` and `cval` respectively.

sink

(sink node dsink)
(sink node & named-dsinks)

Deprecated alias for `output`.

source

(source dseq)

Deprecated alias for `input`.

Generated by Codox

Parkour 0.6.3 API documentation