ASE-Based Quantity Computers

The chemfit.ase_objective_function module integrates the Atomic Simulation Environment (ASE) into the generic fitting framework defined in chemfit.abstract_objective_function.

These classes allow you to use ASE calculators directly as QuantityComputer instances, producing structured dictionaries of results (energy, forces, stress, etc.) that can be fed into loss functions and optimizers.

The design is modular — all steps of the computation pipeline (calculator creation, parameter updates, atom loading, and result processing) are fully configurable.

Overview

An ASE-based computer in this framework typically performs these steps:

  1. Create or load an ase.Atoms object.

  2. Optionally modify it (constraints, reorientation, scaling, etc.).

  3. Attach an ASE Calculator to the atoms.

  4. Apply a dictionary of parameters to the calculator.

  5. Run the calculation with atoms.calc.calculate(atoms).

  6. Collect quantities (e.g., energy, forces) into a result dictionary.

This design makes it easy to couple ASE with gradient-free optimizers, or to fit interatomic potentials, empirical force fields, and machine-learned energy models.
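
The six steps listed above can be sketched without ASE installed. The following is a minimal illustration only: ToyAtoms, ToyLJCalculator and run_pipeline are hypothetical stand-ins invented for this sketch (they are not part of chemfit or ASE), and the calculator is assumed to be a plain pair Lennard-Jones energy without cutoff.

```python
import math


class ToyAtoms:
    """Hypothetical stand-in for ase.Atoms: two atoms separated by r along x."""

    def __init__(self, r):
        self.positions = [(0.0, 0.0, 0.0), (r, 0.0, 0.0)]
        self.calc = None

    def __len__(self):
        return len(self.positions)


class ToyLJCalculator:
    """Hypothetical stand-in for an ASE calculator (pair LJ energy, no cutoff)."""

    def __init__(self):
        self.parameters = {"epsilon": 1.0, "sigma": 1.0}
        self.results = {}

    def calculate(self, atoms):
        r = math.dist(atoms.positions[0], atoms.positions[1])
        eps, sigma = self.parameters["epsilon"], self.parameters["sigma"]
        x = (sigma / r) ** 6
        self.results = {"energy": 4.0 * eps * (x * x - x)}


def run_pipeline(params, r):
    atoms = ToyAtoms(r)                    # 1. create the atoms object
    # 2. optional modification (constraints, scaling, ...) would go here
    atoms.calc = ToyLJCalculator()         # 3. attach a calculator
    atoms.calc.parameters.update(params)   # 4. apply the parameter dictionary
    atoms.calc.calculate(atoms)            # 5. run the calculation
    # 6. collect quantities into a result dictionary
    return {**atoms.calc.results, "n_atoms": len(atoms)}


quantities = run_pipeline({"epsilon": 1.0, "sigma": 1.0}, r=2 ** (1 / 6))
print(quantities)
```

With r set to the Lennard-Jones minimum, the energy in this toy model is close to -epsilon, which makes the sketch easy to sanity-check by hand.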

Protocols

Several lightweight protocols define the expected behavior of components you can plug into a computer.

Each protocol is just a callable interface that defines one clear responsibility:

  • CalculatorFactory - creates or attaches an ASE calculator to an ase.Atoms object. It must set atoms.calc in place.

  • ParameterApplier - applies a dictionary of fitting parameters (dict[str, Any]) to the calculator currently attached to atoms.calc.

  • AtomsFactory - creates a new ase.Atoms object, e.g. by reading a structure file or generating atoms programmatically.

  • AtomsPostProcessor - optionally modifies an ase.Atoms object after it is created and before the calculator is attached, for example to set constraints or adjust periodic boundary conditions.

  • QuantityProcessor - extracts data from the finished ASE calculation and returns a dictionary of computed quantities.

Each protocol is checked at runtime via check_protocol() to ensure objects passed into the computers conform to the expected interface.
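
As a rough illustration of what such callable interfaces can look like, the sketch below expresses the five roles with typing.Protocol. The signatures are inferred from the descriptions in this document, Any stands in for ase.Atoms and the calculator type, and ensure_conforms is a hypothetical helper written for this sketch; it is not the library's check_protocol(), whose implementation is not shown here.

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class CalculatorFactory(Protocol):
    def __call__(self, atoms: Any) -> None: ...


@runtime_checkable
class ParameterApplier(Protocol):
    def __call__(self, atoms: Any, params: Dict[str, Any]) -> None: ...


@runtime_checkable
class AtomsFactory(Protocol):
    def __call__(self) -> Any: ...


@runtime_checkable
class AtomsPostProcessor(Protocol):
    def __call__(self, atoms: Any) -> None: ...


@runtime_checkable
class QuantityProcessor(Protocol):
    def __call__(self, calc: Any, atoms: Any) -> Dict[str, Any]: ...


def ensure_conforms(obj, protocol):
    """Hypothetical runtime check: raise if obj does not look like protocol."""
    if not isinstance(obj, protocol):
        raise TypeError(f"{obj!r} does not implement {protocol.__name__}")
    return obj


# Any callable passes; a non-callable object such as an int is rejected.
atoms_factory = ensure_conforms(lambda: object(), AtomsFactory)
```

Note that isinstance() on a runtime-checkable protocol only verifies that the required method exists. It does not inspect argument names or types, so a callable with the wrong signature would still pass this particular check.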

Atoms Factory Example

A concrete helper, PathAtomsFactory, is provided to read atoms from a file:

from chemfit.ase_objective_function import PathAtomsFactory

atoms_factory = PathAtomsFactory("geometry.traj", index=0)
atoms = atoms_factory()   # returns an ase.Atoms object

SinglePointASEComputer

A single-point computer is the simplest kind of ASE-based quantity computer. It builds an Atoms object, attaches a calculator, applies parameters, and performs one calculation without geometry optimization.

Key arguments:

  • calc_factory - a function attaching a calculator to atoms.

  • param_applier - a function that applies a parameter dictionary.

  • atoms_factory - a factory producing an ase.Atoms object.

  • atoms_post_processor - optional modifier applied before calculation.

  • quantity_processors - list of callables that extract results.

  • tag - optional label for metadata.

The result of __call__(parameters) is a dictionary of quantities, typically including at least "energy" and possibly "forces" or "stress".

Internally, the base class calls all registered quantity processors to build the final result dictionary. The default processor simply returns all entries from calc.results plus the number of atoms.

Metadata

get_meta_data() returns a dictionary with:

  • tag - user-defined label.

  • n_atoms - number of atoms in the system.

  • type - the class name of the computer.

  • last - most recent computed quantities.

Example: Lennard-Jones Objective Term

The Lennard-Jones (LJ) unit test demonstrates how to build a full objective from ASE-based computers.

import functools
from chemfit.abstract_objective_function import QuantityComputerObjectiveFunction
from chemfit.ase_objective_function import SinglePointASEComputer
from chemfit.combined_objective_function import CombinedObjectiveFunction
from chemfit.fitter import Fitter

# Custom user-defined ASE adapters
# These helpers live in the test suite's conftest.py, not in chemfit itself
from conftest import LJAtomsFactory, apply_params_lj, construct_lj, e_lj

def loss_function(quants: dict, e_ref: float):
    return (quants["energy"] - e_ref) ** 2

def lj_ob_term(r: float, eps: float, sigma: float):
    computer = SinglePointASEComputer(
        calc_factory=construct_lj,
        param_applier=apply_params_lj,
        atoms_factory=LJAtomsFactory(r),
        tag=f"lj_{r}",
    )

    return QuantityComputerObjectiveFunction(
        loss_function=functools.partial(loss_function, e_ref=e_lj(r, eps, sigma)),
        quantity_computer=computer,
    )

# Combine many LJ distances into one global objective
r_list = [2.5, 3.0, 3.5]
objective = CombinedObjectiveFunction(
    objective_functions=[lj_ob_term(r, 1.0, 1.0) for r in r_list]
)

fitter = Fitter(objective, initial_params={"epsilon": 2.0, "sigma": 1.5})
optimized_params = fitter.fit_scipy()

print(optimized_params)
# {'epsilon': ~1.0, 'sigma': ~1.0}

This pattern generalizes to any ASE-compatible calculator.
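
To see why the fit above is expected to recover epsilon and sigma close to 1.0, the objective can be worked out by hand. The sketch below makes two assumptions that are not confirmed by this document: that e_lj is the analytic pair Lennard-Jones energy without cutoff, and that the combined objective is the plain sum of the individual squared errors. Both functions are re-implemented here for illustration only.

```python
def e_lj(r, eps, sigma):
    """Assumed reference: analytic pair Lennard-Jones energy at separation r."""
    x = (sigma / r) ** 6
    return 4.0 * eps * (x * x - x)


def total_loss(params, r_list, eps_ref=1.0, sigma_ref=1.0):
    """Sum of squared energy errors over all distances (assumed combination rule)."""
    return sum(
        (e_lj(r, params["epsilon"], params["sigma"]) - e_lj(r, eps_ref, sigma_ref)) ** 2
        for r in r_list
    )


r_list = [2.5, 3.0, 3.5]
loss_start = total_loss({"epsilon": 2.0, "sigma": 1.5}, r_list)  # initial guess
loss_true = total_loss({"epsilon": 1.0, "sigma": 1.0}, r_list)   # reference values
print(loss_start, loss_true)
```

Under these assumptions the loss vanishes exactly at the reference parameters and is strictly positive at the initial guess, so a working optimizer should move toward epsilon = sigma = 1.0.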

MinimizationASEComputer

A subclass of SinglePointASEComputer that performs a geometry relaxation to the nearest local minimum before running the final single-point calculation.

It uses ASE’s ase.optimize.BFGS optimizer internally.

Initialization parameters:

  • dt - timestep for the optimizer (default: 1e-2).

  • fmax - convergence threshold on maximum force (default: 1e-5).

  • max_steps - maximum number of relaxation steps (default: 2000).

All other arguments are the same as for SinglePointASEComputer.

Workflow

  1. The structure is reset to its reference positions.

  2. Velocities are zeroed.

  3. Calculator parameters are applied.

  4. A BFGS optimization runs until the maximum force drops below fmax or max_steps is reached.

  5. The relaxed structure is used for a single-point evaluation.

This class is useful for fitting potentials to equilibrium geometries, or for objectives that depend on relaxed energies rather than fixed configurations.
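
The workflow steps listed above can be illustrated with a one-dimensional toy model. This sketch is not the library's implementation: it replaces BFGS with plain steepest descent to stay dependency-free, treats the dimer separation r as the only coordinate, and uses the hypothetical names lj_energy, lj_force and relax_then_evaluate.

```python
def lj_energy(r, eps, sigma):
    x = (sigma / r) ** 6
    return 4.0 * eps * (x * x - x)


def lj_force(r, eps, sigma):
    """Force along r, i.e. -dE/dr of the pair Lennard-Jones energy."""
    x = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * x * x - x) / r


def relax_then_evaluate(params, r_reference, fmax=1e-5, max_steps=2000, step=1e-2):
    r = r_reference                                  # 1. reset to reference geometry
    # 2. velocities would be zeroed here; this toy model has none
    eps, sigma = params["epsilon"], params["sigma"]  # 3. apply parameters
    steps_taken = 0
    for steps_taken in range(max_steps):             # 4. relax (steepest descent)
        force = lj_force(r, eps, sigma)
        if abs(force) < fmax:
            break
        r += step * force
    # 5. single-point evaluation on the relaxed geometry
    return {"energy": lj_energy(r, eps, sigma), "distance": r, "steps": steps_taken}


relaxed = relax_then_evaluate({"epsilon": 1.0, "sigma": 1.0}, r_reference=1.3)
print(relaxed)
```

The relaxed distance depends on the parameters (here it approaches 2^(1/6) times sigma), which is what makes relaxed geometries usable as fitting targets.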

Quantity Processors

After the ASE calculation, one or more quantity processors are called. Each processor receives the calculator and atoms, and returns a dictionary of key-value pairs, which are merged into the final result.

The default processor is:

def default_quantity_processor(calc, atoms):
    return {**calc.results, "n_atoms": len(atoms)}

You can define additional processors to add, e.g., stress tensors, force norms, or derived physical quantities.
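
The merging step can be pictured as follows. This is a sketch under the assumption that the per-processor dictionaries are combined with dict.update in list order; collect_quantities and force_norm_processor are hypothetical names, and SimpleNamespace plus a plain list stand in for the calculator and the atoms.

```python
import math
from types import SimpleNamespace


def default_quantity_processor(calc, atoms):
    return {**calc.results, "n_atoms": len(atoms)}


def force_norm_processor(calc, atoms):
    # Frobenius norm of the force array, written for nested lists
    forces = calc.results["forces"]
    return {"force_norm": math.sqrt(sum(c * c for row in forces for c in row))}


def collect_quantities(processors, calc, atoms):
    """Call each processor in order and merge the returned dictionaries."""
    quantities = {}
    for processor in processors:
        quantities.update(processor(calc, atoms))
    return quantities


# Stand-ins: only calc.results and len(atoms) are used by the processors above
calc = SimpleNamespace(
    results={"energy": -1.0, "forces": [[3.0, 0.0, 0.0], [0.0, 4.0, 0.0]]}
)
atoms = ["Ar", "Ar"]

quantities = collect_quantities(
    [default_quantity_processor, force_norm_processor], calc, atoms
)
print(sorted(quantities))
```

In this sketch a later processor would overwrite a key returned by an earlier one, so giving custom quantities distinct names avoids surprises.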

The Default Processor

Every ASE-based computer automatically prepends the built-in default_quantity_processor to its list of quantity processors.

This ensures that the calculator’s raw results (e.g. energy, forces, and other keys in calc.results) are always included in the output dictionary, even if you supply your own custom processors.

Your processors are executed after the default one, allowing you to extend or post-process those quantities without needing to repeat the basic extraction logic.

from chemfit.ase_objective_function import SinglePointASEComputer

def my_processor(calc, atoms):
    # The default processor already copies calc.results into the output,
    # so this processor only returns the extra quantity.
    forces = calc.results["forces"]
    return {"force_norm": (forces ** 2).sum() ** 0.5}

# construct_calc, apply_params and MyAtomsFactory are placeholder adapters
computer = SinglePointASEComputer(
    calc_factory=construct_calc,
    param_applier=apply_params,
    atoms_factory=MyAtomsFactory(),
    quantity_processors=[my_processor],  # default comes first automatically
)

result = computer({"epsilon": 1.0, "sigma": 1.0})
# result contains energy, forces, and force_norm

Extending and Customizing

The ASE computers are designed to be composed, not subclassed.

Whenever possible, prefer composition — supplying your own factories, processors, and parameter appliers — rather than inheriting from the base classes. This keeps behavior explicit, reduces hidden state, and makes components easy to test and reuse across projects.

Recommended approach: compose behavior via constructor arguments.

For example, to add an extra computed property without subclassing:

import numpy as np
from chemfit.ase_objective_function import SinglePointASEComputer

def rms_force_processor(calc, atoms):
    f = calc.results.get("forces")
    if f is None:
        return {}
    return {"rms_force": np.sqrt((f**2).mean())}

computer = SinglePointASEComputer(
    calc_factory=construct_lj,
    param_applier=apply_params_lj,
    atoms_factory=LJAtomsFactory(2.5),
    quantity_processors=[rms_force_processor],
)

results = computer({"epsilon": 1.0, "sigma": 1.0})
print(results["energy"], results["rms_force"])

When to subclass

Subclass only when you need to extend lifecycle behavior that cannot be expressed through composition, for example adding a relaxation step (as in MinimizationASEComputer) or changing the structure of the metadata.

Typical extension points:

  • _compute() — to customize how results are produced.

  • create_atoms_object() — to alter how Atoms are built or validated.

  • get_meta_data() — to expose custom metadata or diagnostic info.

Rule of thumb: Start with composition. Reach for subclassing only if you truly need to change the flow of computation itself.
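
For the cases where subclassing is justified, the pattern might look like the sketch below. ToyComputer is a hypothetical stand-in for the real base class; only the hook names _compute() and get_meta_data() are taken from the list above, and the internals are invented for illustration.

```python
class ToyComputer:
    """Hypothetical stand-in base class exposing the hooks named in the text."""

    def __init__(self, tag=None):
        self.tag = tag
        self._last = None

    def _compute(self, parameters):
        # Placeholder computation so the sketch runs without ASE
        return {"energy": -parameters["epsilon"]}

    def __call__(self, parameters):
        self._last = self._compute(parameters)
        return self._last

    def get_meta_data(self):
        return {"tag": self.tag, "type": type(self).__name__, "last": self._last}


class CountingComputer(ToyComputer):
    """Extends the lifecycle: counts evaluations and reports them as metadata."""

    def __init__(self, tag=None):
        super().__init__(tag=tag)
        self.n_calls = 0

    def _compute(self, parameters):
        self.n_calls += 1
        return super()._compute(parameters)

    def get_meta_data(self):
        meta = super().get_meta_data()
        meta["n_calls"] = self.n_calls
        return meta


computer = CountingComputer(tag="demo")
computer({"epsilon": 1.0})
computer({"epsilon": 2.0})
print(computer.get_meta_data())
```

Calling super() in each overridden hook keeps the base behavior intact, so the subclass only adds what composition could not express.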

Case Studies: Custom Quantities via Processors

These examples highlight how to express flexible objectives by composing a QuantityComputer with lightweight quantity processors—no subclassing required.

Assumptions (pseudo-helpers)

For illustration, assume the following small adapters exist:

  • construct_calc(atoms) — attaches an ASE calculator to atoms.calc.

  • apply_params(atoms, params) — updates parameters on atoms.calc.

  • MyAtomsFactory(arg) — creates an ase.Atoms object for the given argument.

(You can think of these as the Lennard-Jones helpers used in the unit tests.)

Dimer Distance Target (with Relaxation)

A simple case is to relax a geometry and match an inter-fragment distance to a reference. The processor augments calc.results with a custom metric (dimer_distance), and the loss depends only on that quantity.

from chemfit.abstract_objective_function import QuantityComputerObjectiveFunction
from chemfit.ase_objective_function import MinimizationASEComputer, PathAtomsFactory
from chemfit.fitter import Fitter

REF_DISTANCE = 3.2

def compute_dimer_distance(calc, atoms):
    # Distance between the atoms with indices 0 and 3
    return {"dimer_distance": atoms.get_distance(0, 3)}

objective = QuantityComputerObjectiveFunction(
    loss_function=lambda q: (q["dimer_distance"] - REF_DISTANCE) ** 2,
    quantity_computer=MinimizationASEComputer(
        calc_factory=construct_calc,
        param_applier=apply_params,
        atoms_factory=PathAtomsFactory("ref.traj"),
        quantity_processors=[compute_dimer_distance],
        tag="dimer_distance",
    ),
)

fitter = Fitter(objective_function=objective, initial_params={"epsilon": 1.5, "sigma": 1.2})
optimal_params = fitter.fit_scipy(tol=1e-4, options={"maxiter": 50})

This pattern demonstrates how specialized geometric quantities can be integrated without modifying the computer class itself. The MinimizationASEComputer handles relaxation automatically before the measurement.
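
To make the quantity and its loss concrete, here is a small pure-Python sketch of the same arithmetic. The coordinates are invented for illustration, dimer_distance is a hypothetical helper, and the distance is a plain Euclidean one that ignores periodic images.

```python
import math

REF_DISTANCE = 3.2


def dimer_distance(positions, i=0, j=3):
    """Euclidean distance between atoms i and j (no periodic images)."""
    return math.dist(positions[i], positions[j])


def loss(quantities):
    return (quantities["dimer_distance"] - REF_DISTANCE) ** 2


# Two made-up three-atom fragments; atoms 0 and 3 are the ones being compared
positions = [
    (0.0, 0.0, 0.0), (0.8, 0.6, 0.0), (-0.8, 0.6, 0.0),
    (3.0, 0.0, 4.0), (3.8, 0.6, 4.0), (2.2, 0.6, 4.0),
]

quantities = {"dimer_distance": dimer_distance(positions)}
print(quantities["dimer_distance"], loss(quantities))
```

Here the distance is 5.0, so the loss is roughly 1.8 squared. During a fit, the relaxed geometry changes with the parameters and this value shrinks as the distance approaches the reference.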

Kabsch RMSD Objective

Another example aligns a relaxed structure to a reference configuration using the Kabsch algorithm and minimizes the resulting RMSD. A custom processor caches the reference positions and returns the rotation, translation, and RMSD as new quantities.

from chemfit.abstract_objective_function import QuantityComputerObjectiveFunction
from chemfit.ase_objective_function import MinimizationASEComputer, PathAtomsFactory, AtomsFactory
from chemfit.fitter import Fitter

import chemfit.kabsch as kb

class KabschDistance:
    def __init__(self, atoms_factory: AtomsFactory):
        self.atoms_factory = atoms_factory
        self._positions_ref = None

    def __call__(self, calc, atoms):
        if self._positions_ref is None:
            self._positions_ref = self.atoms_factory().positions

        R, t = kb.kabsch(atoms.positions, self._positions_ref)
        pos_aligned = kb.apply_transform(atoms.positions, R, t)
        rmsd = kb.rmsd(pos_aligned, self._positions_ref)

        return {"kabsch_r": R, "kabsch_t": t, "kabsch_rmsd": rmsd}

objective = QuantityComputerObjectiveFunction(
    loss_function=lambda q: q["kabsch_rmsd"],
    quantity_computer=MinimizationASEComputer(
        calc_factory=construct_calc,
        param_applier=apply_params,
        atoms_factory=PathAtomsFactory("ref.traj"),
        quantity_processors=[KabschDistance(PathAtomsFactory("ref.traj"))],
        tag="kabsch",
    ),
)

fitter = Fitter(objective_function=objective, initial_params={"epsilon": 1.5, "sigma": 1.2})
optimal_params = fitter.fit_scipy(tol=1e-4, options={"maxiter": 50})
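
The idea behind the alignment step can be shown in two dimensions, where the optimal rotation has a closed form and no SVD is needed. This is a simplified analogue written for illustration, not the chemfit.kabsch implementation; align_2d and centroid are hypothetical names.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def align_2d(mobile, reference):
    """Best rigid fit of mobile onto reference; returns (theta, aligned, rmsd)."""
    cm, cr = centroid(mobile), centroid(reference)
    P = [(x - cm[0], y - cm[1]) for x, y in mobile]      # centered mobile
    Q = [(x - cr[0], y - cr[1]) for x, y in reference]   # centered reference
    # Maximizing sum(q . R(theta) p) gives theta = atan2(sum(p x q), sum(p . q))
    dot = sum(px * qx + py * qy for (px, py), (qx, qy) in zip(P, Q))
    cross = sum(px * qy - py * qx for (px, py), (qx, qy) in zip(P, Q))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    aligned = [(c * px - s * py + cr[0], s * px + c * py + cr[1]) for px, py in P]
    squared = sum(
        (ax - rx) ** 2 + (ay - ry) ** 2
        for (ax, ay), (rx, ry) in zip(aligned, reference)
    )
    return theta, aligned, math.sqrt(squared / len(reference))


reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
angle = math.radians(30.0)
# Rotate the reference by 30 degrees and shift it to build a test structure
mobile = [
    (math.cos(angle) * x - math.sin(angle) * y + 5.0,
     math.sin(angle) * x + math.cos(angle) * y - 1.0)
    for x, y in reference
]

theta, aligned, rmsd = align_2d(mobile, reference)
print(math.degrees(theta), rmsd)
```

Because the test structure is an exact rigid copy of the reference, the recovered rotation undoes the 30 degree turn and the RMSD is zero up to rounding. A structure that differs in shape would leave a positive RMSD, which is the quantity the objective minimizes.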

Design Notes

  • Composable: all behavior is supplied via small protocol objects.

  • Transparent: metadata always includes the most recent quantities.

  • Reproducible: atoms are lazily created and cached per instance.

  • ASE-native: works directly with ASE calculators and optimizers.

  • Debug-friendly: loggers and metadata help inspect intermediate steps.

These abstractions allow the fitting layer (e.g. chemfit.fitter.Fitter) to remain independent of the simulation backend while still exposing all relevant physical data through the quantity dictionaries.