DeepBioisostere

Deep Learning-based Bioisosteric Replacements for Optimization of Multiple Molecular Properties

If you want to re-train the DeepBioisostere model without generating the data yourself, the training data are available from the following Zenodo repository under the Apache License 2.0: https://doi.org/10.5281/zenodo.17804556. After downloading the dataset, go to Training DeepBioisostere.
Alternatively, if you want to re-train the DeepBioisostere model on data you generate yourself by matched molecular pair (MMP) analysis, go to MMP Analysis.

MMP Analysis

All the necessary source code files are in:

./data

Training DeepBioisostere

After obtaining the training data, either from the Zenodo repository or by running MMPA yourself, you can train a new model with:

python ./train_main.py

The training arguments used to train the DeepBioisostere model in our paper can be found in jobscripts/submit_train.sh.

Then go to Optimize a molecule with DeepBioisostere.

Optimize a molecule with DeepBioisostere

An example of molecule optimization with DeepBioisostere can be found in ./example.py and ./example.ipynb. The process can be divided into three steps: 1) initializing the DeepBioisostere model, 2) initializing the Generator class, and 3) optimizing the molecule.

All the trained model parameters (.pt files) can be found at ./model_save.
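
The example script in this README builds the checkpoint path from the sorted property names joined by underscores. The helper below is a minimal sketch of that naming rule only. `trained_model_path` is a hypothetical name, not part of the repository, and it does not check that a checkpoint exists for a given property combination.

```python
from pathlib import Path


def trained_model_path(properties, model_dir="model_save"):
    """Return the checkpoint path implied by the naming rule in example.py.

    Hypothetical helper: the property names are sorted and joined with
    underscores, so the order in which they are given does not matter.
    """
    suffix = "_".join(sorted(properties))
    return Path(model_dir) / f"DeepBioisostere_{suffix}.pt"


# Both orderings resolve to the same file name.
path_a = trained_model_path(["mw", "logp"])
path_b = trained_model_path(["logp", "mw"])
print(path_a.as_posix())
```

Because the names are sorted first, `["mw", "logp"]` and `["logp", "mw"]` resolve to the same file, which is why the order of `properties_to_control` does not matter in the example.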

For the molecule optimization process, we provide two options for selecting the leaving fragment: 1) selection by the DeepBioisostere model and 2) manual selection. The overall process and both options are described in full below.

import os
import time
from rdkit import Chem
from scripts.conditioning import Conditioner
from scripts.generate import Generator
from scripts.model import DeepBioisostere
from scripts.property import calc_logP, calc_Mw, calc_QED, calc_SAscore


# Setting smiles to optimize
smi1 = "ClC(Cc1c(C(Nc2c(Br)cccc2)=O)cccc1)=O"
smi2 = "Cc1ccc2cnc(N(C)CCc3ccccn3)nc2c1"

# USER SETTINGS
device = "cpu"
num_cores = 4
batch_size = 512
num_sample_each_mol = 100
new_frag_type = "all"      # one of ["test", "train", "valid", "all"]
properties_to_control = ["mw", "logp"]  # You don't need to worry about the order!

# Set model and fragment library paths (based on this project directory)
properties = sorted(properties_to_control)
proj_dir = os.path.dirname(os.path.abspath(__file__))
model_path = f"{proj_dir}/model_save/DeepBioisostere_{'_'.join(properties)}.pt"
frag_lib_path = f"{proj_dir}/fragment_library/"

# Initialize model and generator
model = DeepBioisostere.from_trained_model(model_path, properties=properties)
conditioner = Conditioner(
    phase="generation",
    properties=properties,
)
generator = Generator(
    model=model,
    processed_frag_dir=frag_lib_path,
    conditioner=conditioner,
    device=device,
    num_cores=num_cores,
    batch_size=batch_size,
    new_frag_type=new_frag_type,
    num_sample_each_mol=num_sample_each_mol,
    properties=properties,
)

# Option 1. Generate with DeepBioisostere
print("Option 1. Generate with DeepBioisostere.")
start_time = time.time()
input_list = [
    (smi1, {"mw": 0, "logp": -1}),
    (smi2, {"mw": 0, "logp": -1}),
]
result_df = generator.generate(input_list)
result_df.to_csv("generation_result.csv", index=False)
print("Elapsed time: ", time.time() - start_time)

# Option 2. Generate with a specific leaving fragment
print("Option 2. Generate with a specific leaving fragment.")
start_time = time.time()
input_list = [
    (smi1, "[*]c1ccccc1[*]", 4, {"mw": 0, "logp": -1}),
    (smi2, "[*]c1ccccn1", 12, {"mw": 0, "logp": -1}),
]
result_df = generator.generate_with_leaving_frag(input_list)
result_df.to_csv("generation_result_with_leaving_frag.csv", index=False)
print("Elapsed time: ", time.time() - start_time)
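
When many molecules share the same property targets, the two input formats above can be assembled programmatically. The following is a minimal sketch using only plain Python. `build_inputs` and `build_inputs_with_leaving_frag` are hypothetical helper names, not part of the repository. The integer in each Option 2 tuple is passed through unchanged because this README does not document its meaning.

```python
def build_inputs(smiles_list, property_changes):
    """Pair each SMILES with its own copy of the property-change dict (Option 1)."""
    return [(smi, dict(property_changes)) for smi in smiles_list]


def build_inputs_with_leaving_frag(entries, property_changes):
    """Build Option 2 tuples from (smiles, leaving_fragment, integer) entries.

    The integer is forwarded as-is; see example.py for how it is chosen.
    """
    return [
        (smi, frag, idx, dict(property_changes))
        for smi, frag, idx in entries
    ]


targets = {"mw": 0, "logp": -1}
smi1 = "ClC(Cc1c(C(Nc2c(Br)cccc2)=O)cccc1)=O"
smi2 = "Cc1ccc2cnc(N(C)CCc3ccccn3)nc2c1"

option1_inputs = build_inputs([smi1, smi2], targets)
option2_inputs = build_inputs_with_leaving_frag(
    [(smi1, "[*]c1ccccc1[*]", 4), (smi2, "[*]c1ccccn1", 12)],
    targets,
)
```

The resulting lists have the same shape as the `input_list` values passed to `generator.generate` and `generator.generate_with_leaving_frag` in the example. Each tuple gets its own copy of the dict, so editing one entry does not change the others.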
