rootfinder

xitorch.optimize.rootfinder(fcn: Callable[[], torch.Tensor], y0: torch.Tensor, params: Sequence[Any] = [], bck_options: Mapping[str, Any] = {}, method: Optional[Union[str, Callable]] = None, **fwd_options) → torch.Tensor[source]

Solving the rootfinder equation of a given function,

\[\mathbf{0} = \mathbf{f}(\mathbf{y}, \theta)\]

where \(\mathbf{f}\) is a function that can be non-linear and produce output of the same shape of \(\mathbf{y}\), and \(\theta\) is other parameters required in the function. The output of this block is \(\mathbf{y}\) that produces the \(\mathbf{0}\) as the output.

Parameters
  • fcn (callable) – The function \(\mathbf{f}\) with output tensor (*ny)

  • y0 (torch.tensor) – Initial guess of the solution with shape (*ny)

  • params (list) – Sequence of any other parameters to be put in fcn

  • bck_options (dict) – Method-specific options for the backward solve (see xitorch.linalg.solve())

  • method (str or callable or None) – Rootfinder method. If None, it will choose "broyden1".

  • **fwd_options – Method-specific options (see method section)

Returns

The solution which satisfies \(\mathbf{0} = \mathbf{f}(\mathbf{y},\theta)\) with shape (*ny)

Return type

torch.tensor

Example

>>> def func1(y, A):  # example function
...     return torch.tanh(A @ y + 0.1) + y / 2.0
>>> A = torch.tensor([[1.1, 0.4], [0.3, 0.8]]).requires_grad_()
>>> y0 = torch.zeros((2,1))  # zeros as the initial guess
>>> yroot = rootfinder(func1, y0, params=(A,))
>>> print(yroot)
tensor([[-0.0459],
        [-0.0663]], grad_fn=<_RootFinderBackward>)
method="broyden1"
rootfinder(..., method="broyden1", *, alpha=None, uv0=None, max_rank=None, maxiter=None, f_tol=None, f_rtol=None, x_tol=None, x_rtol=None, line_search=True, verbose=False, custom_terminator=None)

Solve the root finder or linear equation using the first Broyden method 1. It can be used to solve minimization by finding the root of the function’s gradient.

References

1

B.A. van der Rotten, PhD thesis, “A limited memory Broyden method to solve high-dimensional systems of nonlinear equations”. Mathematisch Instituut, Universiteit Leiden, The Netherlands (2003). https://web.archive.org/web/20161022015821/http://www.math.leidenuniv.nl/scripties/Rotten.pdf

Keyword Arguments
  • alpha (float or None) – The initial guess of inverse Jacobian is - alpha * I + u v^T.

  • uv0 (tuple of tensors or str or None) – The initial guess of inverse Jacobian is - alpha * I + u v^T. If "svd", then it uses 1-rank svd to obtain u and v. If None, then u and v are zeros.

  • max_rank (int or None) – The maximum rank of inverse Jacobian approximation. If None, it is inf.

  • maxiter (int or None) – Maximum number of iterations, or inf if it is set to None.

  • f_tol (float or None) – The absolute tolerance of the norm of the output f.

  • f_rtol (float or None) – The relative tolerance of the norm of the output f.

  • x_tol (float or None) – The absolute tolerance of the norm of the input x.

  • x_rtol (float or None) – The relative tolerance of the norm of the input x.

  • line_search (bool or str) – Options to perform line search. If True, it is set to "armijo".

  • verbose (bool) – Options for verbosity

method="broyden2"
rootfinder(..., method="broyden2", *, alpha=None, uv0=None, max_rank=None, maxiter=None, f_tol=None, f_rtol=None, x_tol=None, x_rtol=None, line_search=True, verbose=False, custom_terminator=None)

Solve the root finder or linear equation using the second Broyden method 2. It can be used to solve minimization by finding the root of the function’s gradient.

References

2

B.A. van der Rotten, PhD thesis, “A limited memory Broyden method to solve high-dimensional systems of nonlinear equations”. Mathematisch Instituut, Universiteit Leiden, The Netherlands (2003). https://web.archive.org/web/20161022015821/http://www.math.leidenuniv.nl/scripties/Rotten.pdf

Keyword Arguments
  • alpha (float or None) – The initial guess of inverse Jacobian is - alpha * I + u v^T.

  • uv0 (tuple of tensors or str or None) – The initial guess of inverse Jacobian is - alpha * I + u v^T. If "svd", then it uses 1-rank svd to obtain u and v. If None, then u and v are zeros.

  • max_rank (int or None) – The maximum rank of inverse Jacobian approximation. If None, it is inf.

  • maxiter (int or None) – Maximum number of iterations, or inf if it is set to None.

  • f_tol (float or None) – The absolute tolerance of the norm of the output f.

  • f_rtol (float or None) – The relative tolerance of the norm of the output f.

  • x_tol (float or None) – The absolute tolerance of the norm of the input x.

  • x_rtol (float or None) – The relative tolerance of the norm of the input x.

  • line_search (bool or str) – Options to perform line search. If True, it is set to "armijo".

  • verbose (bool) – Options for verbosity

method="linearmixing"
rootfinder(..., method="linearmixing", *, alpha=None, maxiter=None, f_tol=None, f_rtol=None, x_tol=None, x_rtol=None, line_search=True, verbose=False)

Solve the root finding problem by approximating the inverse of Jacobian to be a constant scalar.

Keyword Arguments
  • alpha (float or None) – The initial guess of inverse Jacobian is -alpha * I.

  • maxiter (int or None) – Maximum number of iterations, or inf if it is set to None.

  • f_tol (float or None) – The absolute tolerance of the norm of the output f.

  • f_rtol (float or None) – The relative tolerance of the norm of the output f.

  • x_tol (float or None) – The absolute tolerance of the norm of the input x.

  • x_rtol (float or None) – The relative tolerance of the norm of the input x.

  • line_search (bool or str) – Options to perform line search. If True, it is set to "armijo".

  • verbose (bool) – Options for verbosity