Consider trying TSARKIMEX if the stiff part is strongly nonlinear.
Because this method performs only a single linear solve per stage, if you wish to lag the Jacobian or preconditioner computation you must also use -snes_lag_jacobian_persists true or -snes_lag_preconditioner_persists true, respectively.
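As a hedged illustration, a run that reuses the Jacobian across solves might be launched as follows. The executable name ./app is hypothetical and the lag value 2 is arbitrary; only the two SNES options are meant literally.

```
./app -snes_lag_jacobian 2 -snes_lag_jacobian_persists true
```

Here -snes_lag_jacobian 2 asks for the Jacobian to be rebuilt every second time it would otherwise be computed, and the persists flag keeps that count running from one solve to the next.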
Rosenbrock-W methods are typically specified for the autonomous ODE
udot = f(u)
by the stage equations
k_i = h f(u_0 + sum_j alpha_ij k_j) + h J sum_j gamma_ij k_j
and step completion formula
u_1 = u_0 + sum_j b_j k_j
with step size h and coefficients alpha_ij, gamma_ij, and b_i. Implementing the method in this form would require f(u) and the Jacobian J to be available, in addition to the shifted matrix I - h gamma_ii J. Following Hairer and Wanner, we define new variables for the stage equations
y_i = sum_j gamma_ij k_j
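The original stage equations and step completion formula (in the k_j variables, before any change of variables) can be sketched in NumPy as below. This is a minimal illustration rather than the library's implementation: the function name step_k_form, the dense solve, and the one-stage demo coefficients are all assumptions made for the example. Note how J appears both in the right-hand side and in the shifted matrix I - h gamma_ii J, which is the drawback the text points out.

```python
import numpy as np

def step_k_form(f, J, u0, h, alpha, gamma, b):
    """One step of the k-form stage equations with a fixed (possibly inexact) Jacobian J."""
    s = len(b)
    I = np.eye(u0.size)
    K = np.zeros((s, u0.size))  # row i holds stage k_i
    for i in range(s):
        # alpha is strictly lower triangular, so only earlier stages enter f.
        u_stage = u0 + alpha[i, :i] @ K[:i]
        # Known part of h J sum_j gamma_ij k_j, from the stages j < i.
        rhs = h * f(u_stage) + h * J @ (gamma[i, :i] @ K[:i])
        # The j = i term moves to the left: (I - h gamma_ii J) k_i = rhs.
        K[i] = np.linalg.solve(I - h * gamma[i, i] * J, rhs)
    # Step completion: u_1 = u_0 + sum_j b_j k_j.
    return u0 + b @ K

# Hypothetical one-stage demo (gamma_11 = 1, b_1 = 1) on udot = lam * u.
lam = -2.0
J = np.array([[lam]])
u0 = np.array([1.0])
h = 0.1
u1 = step_k_form(lambda u: lam * u, J, u0, h,
                 alpha=np.array([[0.0]]),
                 gamma=np.array([[1.0]]),
                 b=np.array([1.0]))
```

For this one-stage choice the step reduces to the linearly implicit Euler update u_1 = u_0 / (1 - h lam) on the linear test problem.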
The k_j can be recovered because Gamma is invertible. Let C = diag(1/gamma_ii) - Gamma^{-1}, that is, the negative of the strictly lower triangular part of Gamma^{-1}, and define
A = Alpha Gamma^{-1}, bt^T = b^T Gamma^{-1}
to rewrite the method as
[M/(h gamma_ii) - J] y_i = f(u_0 + sum_j a_ij y_j) + M sum_j (c_ij/h) y_j
u_1 = u_0 + sum_j bt_j y_j
where we have introduced the mass matrix M. Continue by defining
ydot_i = 1/(h gamma_ii) y_i - sum_j (c_ij/h) y_j
or, more compactly in tensor notation
Ydot = 1/h (Gamma^{-1} \otimes I) Y .
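The transformed method and the tensor form of Ydot can be checked numerically with a short NumPy sketch. Everything named here is an assumption for illustration: the helper names, the two-stage coefficients, the small nonlinear f, and in particular the sign convention C = diag(1/gamma_ii) - Gamma^{-1}, which is the choice that makes the transformed stage equation agree with the original k-form.

```python
import numpy as np

def transform_coefficients(alpha, gamma, b):
    """Return A = Alpha Gamma^{-1}, bt with bt^T = b^T Gamma^{-1}, and C."""
    gamma_inv = np.linalg.inv(gamma)
    A = alpha @ gamma_inv
    bt = b @ gamma_inv
    # Assumed convention: c_ij = -(Gamma^{-1})_ij for j < i, zero otherwise.
    C = np.diag(1.0 / np.diag(gamma)) - gamma_inv
    return A, bt, C

def step_y_form(f, J, M, u0, h, alpha, gamma, b):
    """One step in the transformed variables; returns u_1 and the stages (rows of Y)."""
    A, bt, C = transform_coefficients(alpha, gamma, b)
    s = len(b)
    Y = np.zeros((s, u0.size))
    for i in range(s):
        u_stage = u0 + A[i, :i] @ Y[:i]
        rhs = f(u_stage) + M @ (C[i, :i] @ Y[:i]) / h
        # [M/(h gamma_ii) - J] y_i = rhs
        Y[i] = np.linalg.solve(M / (h * gamma[i, i]) - J, rhs)
    return u0 + bt @ Y, Y

def ydot_kron(Y, h, gamma):
    """Ydot = 1/h (Gamma^{-1} kron I) Y, with the stages stacked into one vector."""
    s, n = Y.shape
    op = np.kron(np.linalg.inv(gamma), np.eye(n)) / h
    return (op @ Y.reshape(s * n)).reshape(s, n)

# Illustrative two-stage coefficients (assumed for the demo, not taken from the text).
g = 1.0 + 1.0 / np.sqrt(2.0)
alpha = np.array([[0.0, 0.0], [1.0, 0.0]])
gamma = np.array([[g, 0.0], [-2.0 * g, g]])
b = np.array([0.5, 0.5])

def f(u):
    return np.array([-u[0] + u[1] ** 2, -3.0 * u[1]])

u0 = np.array([1.0, 0.5])
J = np.array([[-1.0, 2.0 * u0[1]], [0.0, -3.0]])  # Jacobian of f at u0
M = np.eye(2)  # identity mass matrix, so the k-form applies directly
h = 0.1

u1, Y = step_y_form(f, J, M, u0, h, alpha, gamma, b)
K = np.linalg.solve(gamma, Y)  # recover k_j from y_i = sum_j gamma_ij k_j
Ydot = ydot_kron(Y, h, gamma)
```

With M equal to the identity, the recovered rows of K satisfy the original stage equations, and each row of Ydot matches the componentwise definition of ydot_i.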
Note that Gamma^{-1} is lower triangular. With this definition of Ydot in terms of known quantities and the current stage y_i, and with the problem written in the implicit form g(u, udot) = M udot - f(u) = 0, the stage equations reduce to performing one Newton step (typically with a lagged Jacobian) on the equation
g(u_0 + sum_j a_ij y_j + y_i, ydot_i) = 0
with initial guess y_i = 0.
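To see why a single Newton step suffices, the linearization can be written out. The derivation below is a sketch under the assumption that g(u, udot) = M udot - f(u) and that J stands in for the (possibly lagged or approximate) derivative of f; sums run over the earlier stages j < i.

```latex
\begin{aligned}
g(u,\dot u) &= M\dot u - f(u), \qquad
\dot y_i = \frac{1}{h\gamma_{ii}}\,y_i - \sum_{j<i}\frac{c_{ij}}{h}\,y_j, \\[4pt]
\frac{\partial}{\partial y_i}\,
g\Big(u_0 + \sum_{j<i} a_{ij}y_j + y_i,\ \dot y_i\Big)
&= \frac{M}{h\gamma_{ii}} - \frac{\partial f}{\partial u}
\;\approx\; \frac{M}{h\gamma_{ii}} - J, \\[4pt]
-\,g\Big|_{y_i=0}
&= f\Big(u_0 + \sum_{j<i} a_{ij}y_j\Big) + M\sum_{j<i}\frac{c_{ij}}{h}\,y_j, \\[4pt]
\Big[\frac{M}{h\gamma_{ii}} - J\Big]\,y_i
&= f\Big(u_0 + \sum_{j<i} a_{ij}y_j\Big) + M\sum_{j<i}\frac{c_{ij}}{h}\,y_j .
\end{aligned}
```

The last line is the Newton update from the initial guess y_i = 0, and it coincides with the transformed stage equation, so no further iterations are needed.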