atomate2 / OpenMM OPLS-AA Enhancements #1111

Status: Open. Wants to merge 28 commits into base: main.

Commits (28):
65f68a7
added opls_lj function; pre-commit fails for (1) eps._value call, per…
Jan 16, 2025
d50a597
eps14 is not the correct value, off by 0.5 and eps should be used ins…
Jan 17, 2025
925e117
initial commit to implementing opls_lj with error checking for proper…
Jan 17, 2025
9864f8d
the use of io.StringIO is seemingly a bug? works fine without needing…
Jan 18, 2025
05ae123
added try/except to handle io.StringIO failure
Jan 18, 2025
91a1c88
added 1-4 scaling conditional statement; fized opls nonbonded method …
Jan 20, 2025
b042cae
try/except failure should specify selenium or webdriver_manager packa…
Jan 20, 2025
5226c75
added generate_opls_xml function with placeholder for ligpargen commands
Jan 24, 2025
84baad8
fixed shutil.copy to shutil.move to avoid redundant files in tmpdir
Jan 24, 2025
9883fe7
fixed linter issues (namely, subprocess.run(...shell=True) was unsafe…
Jan 24, 2025
fc01e94
completed OPLS Docker documentation + except for shifter issues, gene…
Jan 24, 2025
ad4ff11
Merge branch 'materialsproject:main' into feature/OPLS-AA
shehan807 Jan 24, 2025
b45af33
completed boss->ligpargen->docker->podman-hpc->subprocess pipeline to…
Jan 25, 2025
36fd518
minor typo in docs
Jan 25, 2025
3ba99ee
updated tests for download_ and generate_opls_xml functions
Feb 6, 2025
86d1102
removed defusedxml dependency
Mar 6, 2025
979f3f0
removed textwrap dependency
Mar 6, 2025
ea70919
replaced tag-based opls flag to ff_kwargs for opls
Mar 6, 2025
e195cdc
added names_params documentation
Mar 6, 2025
0a7e920
test_create_ff_from_xml
Mar 6, 2025
40fbe0b
pre-commit
Mar 6, 2025
90d326c
ff_kwargs is not None
Mar 6, 2025
ede6f44
all new tests pass pytest
Mar 7, 2025
eedcb68
final pre-commit
Mar 7, 2025
e9cf207
added monkeypatching env variables
Mar 10, 2025
02b15b8
fixed docker and/or podman-hpc edge case
Mar 10, 2025
cd9b200
removed rogue file
Mar 10, 2025
1192345
Merge pull request #2 from materialsproject/main
shehan807 Mar 10, 2025
138 changes: 138 additions & 0 deletions docs/user/codes/openmm.md
@@ -446,6 +446,144 @@ run_locally(flows[rank], ensure_success=True)

</details>

### Varying Forcefields: OPLS

<details>
<summary>Learn to generate OPLS Forcefield Parameters</summary>

The OpenFF force fields provide a powerful starting point for simulating a variety of organic materials with general force fields such as [Parsley](https://doi.org/10.1021/acs.jctc.1c00571) and [Sage](https://pubs.acs.org/doi/10.1021/acs.jctc.3c00039). Just as with the OpenFF Toolkit and Interchange machinery, force field generation can be automated for custom force fields. For instance, LigParGen is an automatic OPLS-AA parameter generator for small organic molecules, available as both an [online server](https://traken.chem.yale.edu/ligpargen/) and an open-source [repository](https://github.com/Isra3l/ligpargen). For any custom parameter generation tool, you can create a container environment as a wrapper that plugs into the workflow described so far.

To do so, use the `generate_opls_xml(...)` function in `atomate2/openmm/utils`. This function runs a subprocess that calls a container image of the LigParGen repository (and all of its dependencies), so it requires a local installation of [Docker](https://docs.docker.com/get-started/get-docker/). Without Docker, `download_opls_xml` can be run against the LigParGen web server instead. Once Docker is installed locally, `generate_opls_xml(...)` can be set up in three steps:

#### 1. Create a Private LigParGen Image

You will need to install [BOSS](https://zarbi.chem.yale.edu/software.html). Once you receive the email, follow the instructions and LICENSE guidelines, then save the `boss` directory in the same directory as the following `Dockerfile`:

```dockerfile
FROM ubuntu:20.04

LABEL org.opencontainers.image.version="20.04"
LABEL org.opencontainers.image.ref.name="ubuntu"

ARG LAUNCHPAD_BUILD_ARCH
ARG RELEASE

RUN dpkg --add-architecture i386 && \
    apt-get update && \
    apt-get install -y \
        libc6:i386 \
        libncurses5:i386 \
        libstdc++6:i386 \
        zlib1g:i386 \
        gcc-multilib \
        g++-multilib \
        binutils \
        git \
        curl \
        libxrender1 \
        csh && \
    apt-get clean

RUN curl -L -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda && \
    rm Miniconda3-latest-Linux-x86_64.sh && \
    /opt/conda/bin/conda init

RUN /opt/conda/bin/conda create -n ligpargen -y python=3.7 && \
    /opt/conda/bin/conda install -n ligpargen -y -c rdkit rdkit && \
    /opt/conda/bin/conda install -n ligpargen -y -c conda-forge openbabel

ENV PATH="/opt/conda/envs/ligpargen/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

RUN git clone https://github.com/Isra3l/ligpargen.git /opt/ligpargen && \
    cd /opt/ligpargen && \
    /opt/conda/envs/ligpargen/bin/pip install -e .

COPY ./boss /opt/BOSSdir

RUN chmod +x /opt/BOSSdir/*

ENV BOSSdir="/opt/BOSSdir"

WORKDIR /opt/output

RUN echo "source activate ligpargen" > ~/.bashrc

SHELL ["/bin/bash", "-c"]

CMD ["/bin/bash"]
```

It helps to have an account with either [DockerHub](https://hub.docker.com/) or the [NERSC registry](https://registry.nersc.gov/account/sign-in?redirect_url=%2Fharbor%2Fprojects) so you can upload your image *privately*. Next, use the following Docker commands to upload the image to your registry of choice:

```bash
docker build -t USERNAME/ligpargen .
docker login
docker push USERNAME/ligpargen
```
Note: On DockerHub, check that the image `Visibility` is set to `Private`.

Now you can pull your image on whichever HPC cluster environment you choose:

```bash
docker pull USERNAME/ligpargen:latest
```

On NERSC, users have the option of using [Shifter](https://docs.nersc.gov/development/containers/shifter/how-to-use/) or [Podman](https://docs.nersc.gov/development/containers/podman-hpc/overview/). We recommend Podman in this case to circumvent additional user-level permission requirements. The following Podman commands will work:

```bash
podman-hpc login docker.io
Username: USERNAME
Password:

podman-hpc pull docker.io/USERNAME/ligpargen:latest
```

#### 2. Set environment variables

Assign the image name and container software (Docker, Shifter, Apptainer, etc.) to environment variables (consider adding these to your `~/.bashrc`):

```bash
export LPG_IMAGE_NAME="USERNAME/ligpargen:latest"
export CONTAINER_SOFTWARE="podman-hpc" # e.g.
```
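
As a quick sanity check before calling `generate_opls_xml`, a sketch like the following can confirm that both variables are set and that the container executable is on `PATH`. The helper names `read_container_settings` and `container_is_available` are hypothetical and not part of atomate2; how `generate_opls_xml` actually reads these variables may differ.

```python
import os
import shutil


def read_container_settings(env=None):
    """Return (image_name, container_software) from the environment.

    Hypothetical helper for illustration; not part of atomate2.
    """
    env = os.environ if env is None else env
    names = ("LPG_IMAGE_NAME", "CONTAINER_SOFTWARE")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Unset environment variables: {', '.join(missing)}")
    return env["LPG_IMAGE_NAME"], env["CONTAINER_SOFTWARE"]


def container_is_available(container_software):
    """Check whether the container executable can be found on PATH."""
    return shutil.which(container_software) is not None
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.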

#### 3. Run `generate_opls_xml`

A single function call creates the desired XML force field file (e.g., `EC.xml`):

```python
from atomate2.openmm.utils import generate_opls_xml

mols = {
    "EC": {
        "smiles": "C1COC(=O)O1",
        "charge": "0",  # default_value=0
        "checkopt": 3,  # default_value=3
        "charge_method": "CM1A",  # default_value="CM1A"
    },
}
generate_opls_xml(mols)
```

Functionally, this is equivalent to running the following LigParGen command:

```bash
ligpargen -n EC -p EC -r EC -c 0 -o 3 -cgen CM1A -s 'C1COC(=O)O1'
```
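
To make the mapping from a `mols` entry to the command line explicit, here is a minimal sketch. `build_ligpargen_args` is a hypothetical helper, not the atomate2 implementation; the flags are the ones shown in the command above, and the defaults follow the comments in the `mols` example. `shlex.join` is used only to display the command with the SMILES string quoted, since its parentheses would otherwise be interpreted by the shell.

```python
import shlex


def build_ligpargen_args(name, spec):
    """Build the LigParGen argument list for one molecule.

    Hypothetical helper; defaults mirror the comments in the `mols` example.
    """
    return [
        "ligpargen",
        "-n", name,
        "-p", name,
        "-r", name,
        "-c", str(spec.get("charge", 0)),
        "-o", str(spec.get("checkopt", 3)),
        "-cgen", spec.get("charge_method", "CM1A"),
        "-s", spec["smiles"],
    ]


args = build_ligpargen_args("EC", {"smiles": "C1COC(=O)O1"})
# Shell-safe rendering of the argument list:
print(shlex.join(args))
# ligpargen -n EC -p EC -r EC -c 0 -o 3 -cgen CM1A -s 'C1COC(=O)O1'
```

Passing such a list to `subprocess.run` without `shell=True` sidesteps shell quoting altogether, which is in line with the linter fix mentioned in the commit history.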

Now, just like before, you can create an `Interchange` object. Be sure to include `"opls"` in `ff_kwargs` so that the [geometric combination rules](https://traken.chem.yale.edu/ligpargen/openMM_tutorial.html) used by OPLS force fields are applied:

```python
elyte_interchange_job = generate_openmm_interchange(
    mol_specs_dicts, ff_xmls=["EC.xml"], ff_kwargs=["opls"]
)
```
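
For context, the difference between the two combining rules can be written out in a few lines. This is an illustrative sketch with made-up parameter values, not the `opls_lj` implementation: OPLS takes the geometric mean for both sigma and epsilon, whereas the Lorentz-Berthelot rule, which OpenMM's `NonbondedForce` applies by default, takes the arithmetic mean for sigma.

```python
import math


def geometric(sigma_i, sigma_j, eps_i, eps_j):
    """OPLS-style combining rule: geometric mean for sigma and epsilon."""
    return math.sqrt(sigma_i * sigma_j), math.sqrt(eps_i * eps_j)


def lorentz_berthelot(sigma_i, sigma_j, eps_i, eps_j):
    """Lorentz-Berthelot: arithmetic mean for sigma, geometric for epsilon."""
    return 0.5 * (sigma_i + sigma_j), math.sqrt(eps_i * eps_j)


# Illustrative values only (sigma in nm, epsilon in kJ/mol).
sigma_geo, eps_geo = geometric(0.25, 0.36, 0.4, 0.9)
sigma_lb, eps_lb = lorentz_berthelot(0.25, 0.36, 0.4, 0.9)

# sigma differs between the rules (about 0.300 vs 0.305); epsilon does not.
print(f"{sigma_geo:.3f} {sigma_lb:.3f} {eps_geo:.3f} {eps_lb:.3f}")
```

The two rules agree on epsilon and differ only in sigma, which is why an OPLS XML file loaded without the `"opls"` entry would silently use slightly different cross-interaction distances.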

This general process transfers to *any* parameter generation software, provided you (1) create an image, (2) set the image name as an environment variable, and (3) minimally modify `generate_opls_xml(...)` to fit your requirements. In future work, we'll improve this black-box functionality to support a wider range of parameter generation methods.

</details>

## Analysis with Emmet

For now, you'll need to make sure you have a particular emmet branch installed.
37 changes: 31 additions & 6 deletions src/atomate2/openmm/jobs/generate.py
@@ -17,13 +17,15 @@
from jobflow import Response
from openmm import Context, LangevinMiddleIntegrator, System, XmlSerializer
from openmm.app import PME, ForceField
+from openmm.app.forcefield import NonbondedGenerator
from openmm.app.pdbfile import PDBFile
from openmm.unit import kelvin, picoseconds
from pymatgen.core import Element
from pymatgen.io.openff import get_atom_map

from atomate2.openff.utils import create_mol_spec, merge_specs_by_name_and_smiles
from atomate2.openmm.jobs.base import openmm_job
+from atomate2.openmm.utils import opls_lj

 try:
     import openff.toolkit as tk
@@ -40,7 +42,10 @@ class XMLMoleculeFF:

     def __init__(self, xml_string: str) -> None:
         """Create an XMLMoleculeFF object from a string version of the XML file."""
-        self.tree = ET.parse(io.StringIO(xml_string))  # noqa: S314
+        try:
+            self.tree = ET.parse(io.StringIO(xml_string))  # noqa: S314
+        except ET.ParseError:
+            self.tree = ET.parse(xml_string)  # noqa: S314

         root = self.tree.getroot()
         canonical_order = {}
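
The fallback in this hunk relies on `ET.parse` accepting either a file object or a filename. A more explicit alternative, sketched here under the assumption that inline XML always begins with `<`, is to decide up front which kind of input was passed. `parse_xml_source` is a hypothetical name and is not part of the PR.

```python
import io
import xml.etree.ElementTree as ET


def parse_xml_source(source):
    """Parse XML given either as inline text or as a path to a file.

    Hypothetical helper; assumes inline XML text begins with "<".
    """
    text = str(source)
    if text.lstrip().startswith("<"):
        return ET.parse(io.StringIO(text))  # inline XML content
    return ET.parse(text)  # treat anything else as a filename


tree = parse_xml_source("<ForceField><AtomTypes/></ForceField>")
print(tree.getroot().tag)  # ForceField
```

Compared with catching `ET.ParseError`, this keeps genuinely malformed XML from being reinterpreted as a file path, at the cost of relying on the leading-character assumption.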
@@ -153,11 +158,10 @@ def from_file(cls, file: str | Path) -> XMLMoleculeFF:
         return cls(xml_str)


-def create_system_from_xml(
-    topology: tk.Topology,
+def create_ff_from_xml(
     xml_mols: list[XMLMoleculeFF],
 ) -> System:
-    """Create an OpenMM system from a list of molecule specifications and XML files."""
+    """Create OpenMM forcefield from a list of molecule specifications and XML files."""
     io_files = []
     for i, xml in enumerate(xml_mols):
         xml_copy = copy.deepcopy(xml)
@@ -168,7 +172,7 @@ def create_system_from_xml(
     for i, xml in enumerate(io_files[1:]):  # type: ignore[assignment]
         ff.loadFile(xml, resname_prefix=f"_{i + 1}")

-    return ff.createSystem(topology.to_openmm(), nonbondedMethod=PME)
+    return ff


 @openmm_job
@@ -179,6 +183,7 @@ def generate_openmm_interchange(
     xml_method_and_scaling: tuple[str, float] = None,
     pack_box_kwargs: dict = None,
     tags: list[str] = None,
+    ff_kwargs: list[str] = None,
 ) -> Response:
     """Generate an OpenMM Interchange object from a list of molecule specifications.

@@ -216,6 +221,8 @@ def generate_openmm_interchange(
         toolkit.interchange.components._packmol.pack_box. Default is an empty dict.
     tags : List[str], optional
         A list of tags to attach to the task document.
+    ff_kwargs : List[str], optional
+        A list of additional keyword arguments for force field specification.

     Returns
     -------
@@ -280,7 +287,25 @@ def generate_openmm_interchange(
         **pack_box_kwargs,
     )

-    system = create_system_from_xml(topology, xml_mols)
+    ff = create_ff_from_xml(xml_mols)
+
+    # obtain 14 scaling values from forcefield
+    generator = ff.getGenerators()
+    for gen in generator:
+        if isinstance(gen, NonbondedGenerator):
+            c14 = gen.coulomb14scale
+            lj14 = gen.lj14scale
+
+    system = ff.createSystem(topology.to_openmm(), nonbondedMethod=PME)
+    if (ff_kwargs is not None) and ("opls" in ff_kwargs):
+        if (c14 != 0.5) or (lj14 != 0.5):
+            raise ValueError(
+                "NonbondedForce class in XML, "
+                f"<NonbondedForce coulomb14scale='{c14}' lj14scale='{lj14}'>, "
+                "does not match OPLS convention, "
+                "<NonbondedForce coulomb14scale='0.5' lj14scale='0.5'>."
+            )
+        system = opls_lj(system)

     # these values don't actually matter because integrator is only
     # used to generate the state
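
The 1-4 scaling check in this hunk can also be illustrated without building an OpenMM `ForceField`, by reading the scaling attributes straight from the force field XML. This is a sketch, not the PR's code: `read_14_scales` and `follows_opls_convention` are hypothetical helpers. The sketch returns `None` when no `<NonbondedForce>` element is present, a case in which the loop in the hunk would appear to leave `c14` and `lj14` undefined.

```python
import xml.etree.ElementTree as ET


def read_14_scales(xml_text):
    """Return (coulomb14scale, lj14scale) from an OpenMM force field XML string.

    Hypothetical helper; returns None if there is no <NonbondedForce> element.
    """
    root = ET.fromstring(xml_text)
    nonbonded = root.find("NonbondedForce")
    if nonbonded is None:
        return None
    coulomb14 = float(nonbonded.get("coulomb14scale"))
    lj14 = float(nonbonded.get("lj14scale"))
    return coulomb14, lj14


def follows_opls_convention(scales):
    """OPLS-AA scales both 1-4 Coulomb and 1-4 Lennard-Jones terms by 0.5."""
    return scales == (0.5, 0.5)


xml_text = """
<ForceField>
  <NonbondedForce coulomb14scale="0.5" lj14scale="0.5"/>
</ForceField>
"""
print(follows_opls_convention(read_14_scales(xml_text)))  # True
```

Checking the XML before calling `createSystem` would let a mismatch be reported before the more expensive system construction, though whether that ordering matters in practice depends on system size.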