Writing Quantity Computers
This page contains practical guidance for writing your own quantity computers from scratch.
Before we start…
Before we start, let us clearly state that “quantity computers” are merely a useful convention. Nothing prevents you from using ChemFit without them. That being said, you should probably use them.
A recommended first step is to check if you can use one of the built-in ways:
If you already have a pure python function implementing your computation, have a look at the to_quantity_computer() decorator, which also features in the Quickstart examples.
If you are using an external simulation tool, like LAMMPS for example, have a look at the FileBasedQuantityComputer and its corresponding doc page: File-Based Quantity Computers.
If you are using ASE, try the SinglePointASEComputer or MinimizationASEComputer described in ASE-Based Quantity Computers.
If none of the built-in computers are to your taste, think about sub-classing them.
Let’s cook
For a completely fresh QuantityComputer, derive from the QuantityComputer base class and implement the _compute() method. That’s it.
The _compute method should accept exactly two arguments: A dictionary of parameters of type dict[str,Any] and an EvaluateContext.
It should return the dictionary of quantities.
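As a rough illustration of this contract, the sketch below uses small stand-in classes instead of importing ChemFit, so it runs on its own. SketchQuantityComputer, SketchContext, and HarmonicEnergy are hypothetical names, and the way the stand-in base class forwards a call to _compute is an assumption rather than ChemFit's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchContext:
    """Hypothetical stand-in for EvaluateContext (only the meta field)."""

    meta: dict[str, Any] = field(default_factory=dict)


class SketchQuantityComputer:
    """Hypothetical stand-in for the QuantityComputer base class."""

    def __call__(self, params: dict[str, Any], ctx: SketchContext) -> dict[str, Any]:
        # Assumption: calling the computer simply forwards to _compute.
        return self._compute(params, ctx)

    def _compute(self, params, ctx):
        raise NotImplementedError


class HarmonicEnergy(SketchQuantityComputer):
    def _compute(self, params, ctx):
        # Only local variables are used; nothing is stored on self.
        k, x = params["k"], params["x"]
        energy = 0.5 * k * x**2
        return {"energy": energy}


quantities = HarmonicEnergy()(params={"k": 2.0, "x": 3.0}, ctx=SketchContext())
```

The subclass only has to provide _compute: it takes the parameter dictionary and the context, and returns the dictionary of quantities.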
This is probably a point at which we should familiarize ourselves with the…
- Golden Rule:
DO NOT MODIFY GLOBAL STATE FROM WITHIN THE COMPUTE METHOD. If you violate this rule, the result of evaluating your quantity computer in parallel may be undefined. It does not have to be, but for everyone’s sake let’s assume it will be.
Importantly, the golden rule applies to instance variables of the computer itself as well.
Let’s illustrate what not to do:
class GoldenRuleViolator(QuantityComputer):
    # ...
    def _compute(self, params, ctx):
        self.bad = params["x"]  # <-- bad mojo
        # ...
        return {"mojo": self.bad}
Now what happens if you call the same instance of GoldenRuleViolator in parallel? That’s right! Bad things. The reason is that the value of self.bad could be overwritten by another thread in the middle of the compute function, which would make your params and the returned quantities mismatched.
You might say: “Why would I ever do something so stupid?” Let me just say that you’d be surprised how easy it is to accidentally violate the Golden Rule. Even seemingly harmless patterns can break it, especially when storing intermediate results on self.
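To make the failure mode concrete, here is a self-contained sketch that forces the unlucky interleaving with threading.Event objects. It uses a stand-in class rather than the real QuantityComputer base class, and the two events exist only to make the race reproducible. In real code the interleaving is decided by the scheduler, so the bug shows up only some of the time.

```python
import threading


class GoldenRuleViolator:
    """Stand-in computer (no ChemFit import) that stores state on self."""

    def __init__(self):
        # Demo scaffolding only: used to force a specific thread interleaving.
        self.written_1 = threading.Event()
        self.written_2 = threading.Event()

    def _compute(self, params, ctx):
        if params["x"] == 2:
            self.written_1.wait(timeout=5)  # let the x=1 call write first
        self.bad = params["x"]  # shared by every call on this instance
        if params["x"] == 1:
            self.written_1.set()
            self.written_2.wait(timeout=5)  # "pause" in the middle of the computation
        else:
            self.written_2.set()
        return {"mojo": self.bad}


computer = GoldenRuleViolator()
results = {}


def run(x):
    results[x] = computer._compute({"x": x}, ctx=None)


threads = [threading.Thread(target=run, args=(x,)) for x in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads finish, the call that was made with x=1 has returned the value written by the x=2 call, so its parameters and its quantities no longer match.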
Therefore, if you need to communicate anything to the outside world, there are two options:
Put it in the quantities dict and return it
Write to ctx.meta
Let’s fix the GoldenRuleViolator:
class GoodCitizen(QuantityComputer):
    # ...
    def _compute(self, params, ctx):
        bad = params["x"]
        ctx.meta["bad"] = bad  # <-- no problemo
        # ...
        return {"mojo": bad}
Now there is no problem. All we ever do is write to bad, which is local to the current function evaluation, or to ctx.meta["bad"], which belongs to the current evaluation only and therefore cannot cause race conditions.
Configuring a computer
Besides the parameter dictionary passed on evaluation, a quantity computer may also need to be configured by external parameters.
Symbolically, we can imagine an external parameter \(f\) that influences the computation in some way.
The external parameter \(f\) may, for example, be:
A path to a file with atomic coordinates
A constant numeric prefactor
The charge of certain atoms
…
You name it! Anything that influences the quantities and is otherwise fixed.
If the quantity computer is a wrapped python function, it’s easy to bind external parameters. Check this out:
from chemfit.wrap_funcs import to_quantity_computer


@to_quantity_computer(pass_ctx=True)
def computer(params, ctx, f):
    ...


# Configure f=1
f1_computer = computer.bind(f=1)
# Configure f=2
f2_computer = computer.bind(f=2)
Note
If pass_ctx=True, all arguments except params and ctx must be bound.
If pass_ctx=False, all arguments except params must be bound.
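Conceptually, binding resembles partial application: the external parameter is fixed once, and only (params, ctx) stay open. The sketch below uses functools.partial from the standard library purely as an analogy. It does not use ChemFit and makes no claim about how bind() is implemented. The function body and the None context are placeholders.

```python
from functools import partial


def computer(params, ctx, f):
    # f is the external parameter; params changes on every evaluation.
    return {"scaled": f * params["x"]}


# Fix f ahead of time, leaving only (params, ctx) open.
f1_computer = partial(computer, f=1)
f2_computer = partial(computer, f=2)

q1 = f1_computer({"x": 3.0}, None)
q2 = f2_computer({"x": 3.0}, None)
```

The same function yields two differently configured computers, each of which depends only on params and ctx.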
If we forgo the to_quantity_computer() approach and need external parameters, they should be accepted in the constructor:
from chemfit.abstract_objective_function import QuantityComputer


class Computer(QuantityComputer):
    def __init__(self, f):
        self.f = f

    def _compute(self, params, ctx):
        # We can make use of self.f in here
        ...
Important
A quantity computer becomes fully specified once it depends only on (parameters, ctx). At that point, all external parameters have been fixed, either via bind() (for wrapped functions) or via the constructor (for class-based implementations).
Using the evaluation context
The EvaluateContext
provides a structured way to exchange information during evaluation
without relying on shared state.
It exposes three main fields:
ctx.meta: for results and diagnostics
ctx.config: for read-only configuration
ctx.shared: for controlled shared state
ctx.meta
ctx.meta is a dictionary used to record auxiliary information about
the current evaluation. It is safe to write to and is typically used for:
debugging information
intermediate values that are not part of the returned quantities
provenance (e.g. which configuration was used)
performance metrics
ctx.meta["n_iterations"] = n_iter
ctx.meta["converged"] = converged
ctx.meta["structure_id"] = structure_id
Each evaluation has its own meta dictionary, so there are no race
conditions.
In addition to values written during evaluation, quantity computers may
also define static meta data. This is meta data attached to the
computer itself (e.g. a tag or identifier), which is automatically merged
into ctx.meta when the computer is evaluated.
This is useful for recording information that is constant across all evaluations of a given computer, such as:
a label or tag identifying the term
the origin of the data
a fixed configuration identifier
- Important:
Quantities that are part of the computation should be returned from _compute, not written to ctx.meta.
- Rule of thumb:
If something is needed for the loss or further computation, return it. If it is only useful for inspection, debugging, or bookkeeping, store it in ctx.meta.
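A small sketch of this split, assuming a plain function in place of a real quantity computer and a SimpleNamespace as a stand-in for the EvaluateContext. The Newton iteration is a hypothetical example computation: the result is returned as a quantity, while the iteration count and convergence flag go to ctx.meta.

```python
from types import SimpleNamespace


def compute_sqrt(params, ctx):
    """Newton iteration for sqrt(a), assuming a > 0."""
    a = params["a"]
    x = a if a > 1 else 1.0
    converged = False
    n_iter = 0
    for n_iter in range(1, 51):
        x_new = 0.5 * (x + a / x)
        step = abs(x_new - x)
        x = x_new
        if step < 1e-12:
            converged = True
            break
    # Bookkeeping only: useful for inspection, not needed for the loss.
    ctx.meta["n_iterations"] = n_iter
    ctx.meta["converged"] = converged
    # The actual result of the computation is returned.
    return {"sqrt": x}


ctx = SimpleNamespace(meta={})  # stand-in for an EvaluateContext
quantities = compute_sqrt({"a": 9.0}, ctx)
```

Downstream code only needs quantities["sqrt"]; the diagnostics stay available in ctx.meta for anyone who wants to inspect the run.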
ctx.config
ctx.config provides configuration information to the computation.
It should be treated as read-only.
Typical use cases include:
passing runtime options
controlling execution modes
toggling optional behavior
if ctx.config.get("compute_forces", False):
    ...
The main purpose of ctx.config is to allow behavior to vary per
evaluation, without requiring reconstruction of the quantity computer
or objective function.
In particular, ctx.config is useful when different evaluations may
run in different execution environments. For example, in distributed or
parallel settings, different calls may:
run on different cluster nodes
use different numbers of cores or GPUs
access different scratch directories
use different execution backends
scratch_dir = ctx.config.get("scratch_dir", "/tmp")
n_cores = ctx.config.get("n_cores", 1)
- Rule of thumb:
Use ctx.config to influence how the computation is carried out, but never modify it inside _compute.
External parameters vs ctx.config
Both external parameters (passed via bind or the constructor) and
ctx.config can influence the behavior of a quantity computer, but
they serve different purposes.
External parameters define what is being computed. They are part of the identity of the quantity computer or objective term.
Typical examples include:
the system or structure being evaluated
a file path or dataset
physical constants or fixed model settings
These values should usually be fixed when constructing or specializing the quantity computer.
computer.bind(atoms_factory=my_structure)
Computer(f=2.0)
In contrast, ctx.config defines how a particular evaluation is
carried out.
Typical examples include:
enabling or disabling optional work
selecting approximate vs. exact evaluation modes
turning diagnostics on or off
passing execution-specific information (e.g. resources, paths)
The main reason to use ctx.config is that it can vary per
evaluation without requiring you to reconstruct the quantity computer
or objective term.
- Rule of thumb:
Use external parameters if changing the value creates a different objective term.
Use ctx.config if changing the value only affects how the same term is evaluated.
For example, changing the atomic structure of a system should be an
external parameter, while enabling additional diagnostics or selecting a
cheap evaluation mode should be handled through ctx.config.
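The distinction can be sketched with a stand-in class that does not derive from any ChemFit class. PairEnergy, the Lennard-Jones form, and the "diagnostics" config key are hypothetical choices made for illustration. The pair distances are an external parameter fixed in the constructor, while the diagnostics switch is read from ctx.config on each evaluation.

```python
from types import SimpleNamespace


class PairEnergy:
    """Hypothetical stand-in computer; not derived from ChemFit classes."""

    def __init__(self, distances):
        # External parameter: defines WHAT is computed (the structure).
        self.distances = tuple(distances)

    def _compute(self, params, ctx):
        eps, sigma = params["epsilon"], params["sigma"]
        pair_energies = [
            4.0 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)
            for r in self.distances
        ]
        # ctx.config: defines HOW this evaluation runs; it is only read.
        if ctx.config.get("diagnostics", False):
            ctx.meta["pair_energies"] = pair_energies
        return {"energy": sum(pair_energies)}


term = PairEnergy([1.0, 2.0])
params = {"epsilon": 1.0, "sigma": 1.0}

quiet = SimpleNamespace(config={}, meta={})
verbose = SimpleNamespace(config={"diagnostics": True}, meta={})

q_quiet = term._compute(params, quiet)
q_verbose = term._compute(params, verbose)
```

Both evaluations return the same energy because they belong to the same term. Only the amount of bookkeeping differs. Passing different distances, by contrast, would require constructing a new PairEnergy.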
Summary
ctx.meta: write freely, per-evaluation diagnostic data (plus static meta data from the computer)
ctx.config: read-only, per-evaluation control of execution
ctx.shared: shared state, use with care
Calling computers from within computers
Note
This section covers fairly advanced usage and is probably most relevant if you want to implement your own execution wrapper for the CombinedObjectiveFunction, in addition to the built-in MPI and executor wrappers.
If we want to make calls to other computers from our custom computer, the recommended approach is to make use of the child context system to supply fresh contexts to the inner computers.
Here is a simple demonstration of the idea: We have an outer computer, which accepts a parent EvaluateContext and later splits off two child contexts using the child_contexts() context manager.
class OuterComputer(QuantityComputer):
    def _compute(self, params, ctx):
        # ...
        with ctx.child_contexts(2) as child_contexts:
            q1 = inner_computer1(params, child_contexts[0])
            q2 = inner_computer2(params, child_contexts[1])
        # ...
The benefit of this approach is two-fold:
We get full meta-data provenance. All of the child meta data can be found in ctx.meta["children"].
Since the inner computers have their own context, they can also be evaluated in parallel, although the example above does not make use of this.
Note
For parallel evaluation with an executor, use the map_with_context() function.
Unlike the regular executor map function, it correctly handles the ctx fields even if execution happens in different processes.
See also Parallel Execution.
Child-parent relationships for the different context fields
When creating child contexts via child_contexts(), the different fields of the context behave differently.
Understanding this behavior is important when composing quantity computers.
ctx.meta
Each child context receives its own independent meta dictionary.
During evaluation, child computers write to their own ctx.meta.
After the child_contexts block exits, the parent context collects
all child meta data under:
ctx.meta["children"]
This is a list containing the meta data of each child evaluation, in order.
This ensures full provenance: all information produced by child computations is preserved and accessible from the parent.
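The collection step can be mimicked with a small stand-in context. SketchContext below is hypothetical and only imitates the behavior described on this page (fresh meta per child, config and shared handed on, child meta gathered on exit). It is not ChemFit's EvaluateContext, and the real implementation may differ in detail.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchContext:
    """Hypothetical stand-in for EvaluateContext."""

    meta: dict[str, Any] = field(default_factory=dict)
    config: dict[str, Any] = field(default_factory=dict)
    shared: dict[str, Any] = field(default_factory=dict)

    @contextmanager
    def child_contexts(self, n):
        # Each child gets a fresh meta dict; config and shared are handed on.
        children = [
            SketchContext(config=self.config, shared=self.shared) for _ in range(n)
        ]
        yield children
        # When the with-block exits, collect the child meta data in order.
        self.meta["children"] = [child.meta for child in children]


def inner_computer(params, ctx):
    ctx.meta["seen_x"] = params["x"]
    return {"double": 2 * params["x"]}


parent = SketchContext()
with parent.child_contexts(2) as child_contexts:
    q1 = inner_computer({"x": 1}, child_contexts[0])
    q2 = inner_computer({"x": 5}, child_contexts[1])
```

After the block, parent.meta["children"] holds one meta dictionary per child, in the order the children were created.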
ctx.config
The config dictionary is passed from parent to child contexts as-is.
All child contexts see the same configuration, allowing them to adapt their behavior consistently.
value = ctx.config.get("mode")
Child contexts should treat config as read-only.
ctx.shared
The shared dictionary is shared between parent and child contexts.
This allows child computations to communicate and reuse data, for example through caching.
cache = ctx.shared.setdefault("cache", {})
Because ctx.shared may be accessed concurrently, all access must be
thread-safe.
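One way to make such a cache thread-safe is to guard it with a lock. The sketch assumes that the caller places a threading.Lock in the shared dictionary before any evaluation starts. The "lock", "cache", and "n_misses" keys are hypothetical, and the context is a SimpleNamespace stand-in. A threading.Lock only protects threads within one process, so this pattern does not carry over to multi-process execution as written.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


def cached_square(params, ctx):
    key = params["x"]
    # Assumption: the caller stored the lock in ctx.shared up front.
    with ctx.shared["lock"]:
        cache = ctx.shared.setdefault("cache", {})
        if key not in cache:
            cache[key] = key * key  # stands in for an expensive step
            ctx.shared["n_misses"] = ctx.shared.get("n_misses", 0) + 1
        value = cache[key]
    return {"square": value}


shared = {"lock": threading.Lock()}
inputs = [2, 3, 2, 3, 2, 3, 4, 4]


def evaluate(x):
    # Every evaluation gets its own meta, but all share one shared dict.
    ctx = SimpleNamespace(meta={}, config={}, shared=shared)
    return cached_square({"x": x}, ctx)


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(evaluate, inputs))
```

Because the check and the write happen under the same lock, each distinct input is computed exactly once, no matter how the threads interleave. Holding the lock during the expensive step keeps the sketch short but serializes cache misses.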
Configuring child contexts
Besides the number of children, child_contexts() accepts an optional argument of type ChildContextConfigurator.
A child context configurator allows you to customize how child contexts are created.
This can be useful when:
distributing work across resources
assigning identifiers or indices to child evaluations
modifying configuration for individual children
implementing custom execution strategies
The configurator is called once per child context and can modify the child context before it is used.
Conceptually, it is a callable of the following form:
def configurator(idx_child_ctx, child_ctx, num_children, parent_ctx):
    ...
For example, you may want to assign each child a unique identifier:
def configurator(idx_child_ctx, child_ctx, num_children, parent_ctx):
    child_ctx.meta["child_index"] = idx_child_ctx
Or adjust configuration per child:
def configurator(idx_child_ctx, child_ctx, num_children, parent_ctx):
    child_ctx.config["worker_id"] = idx_child_ctx
This mechanism is particularly useful when writing execution wrappers (e.g. MPI or executor-based parallelization), where different children may correspond to different processes or resources.
- Rule of thumb:
Use a child context configurator when child evaluations need systematic differences in their context. Otherwise, the default behavior is sufficient.
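As a sketch of a systematic per-child difference, the configurator below splits a core budget evenly across the children. The "n_cores" key and the split_cores name are hypothetical. The contexts are SimpleNamespace stand-ins, and the configurator is called by hand here, whereas in ChemFit it would be passed to child_contexts() and invoked by the library. The sketch also assumes that each child owns a separate copy of the configuration. Check the actual behavior before relying on per-child config changes.

```python
from types import SimpleNamespace


def split_cores(idx_child_ctx, child_ctx, num_children, parent_ctx):
    total = parent_ctx.config.get("n_cores", 1)
    # Give each child an equal share of the cores, but at least one.
    child_ctx.config["n_cores"] = max(1, total // num_children)
    child_ctx.meta["child_index"] = idx_child_ctx


parent = SimpleNamespace(meta={}, config={"n_cores": 8})
# Assumption: every child owns a copy of the parent's config.
children = [SimpleNamespace(meta={}, config=dict(parent.config)) for _ in range(4)]

# Stand-in for what the library would do: one call per child context.
for idx, child in enumerate(children):
    split_cores(idx, child, len(children), parent)
```

Each child ends up with its own index in meta and its share of the cores in config, while the parent's configuration is left untouched.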