Commit 8b47d2a

Deepmind Team authored and diegolascasas committed
Explicitly replace "import tensorflow" with "tensorflow.compat.v1"
PiperOrigin-RevId: 287568660
1 parent 5b22f16 commit 8b47d2a

25 files changed: +34028 -9 lines changed

alphafold_casp13/README.md

+151
@@ -0,0 +1,151 @@
# AlphaFold

This package provides an implementation of the contact prediction network,
associated model weights and CASP13 dataset as published in Nature.

Any publication that discloses findings arising from using this source code must
cite *AlphaFold: Protein structure prediction using potentials from deep
learning* by Andrew W. Senior, Richard Evans, John Jumper, James Kirkpatrick,
Laurent Sifre, Tim Green, Chongli Qin, Augustin Žídek, Alexander W. R. Nelson,
Alex Bridgland, Hugo Penedones, Stig Petersen, Karen Simonyan, Steve Crossan,
Pushmeet Kohli, David T. Jones, David Silver, Koray Kavukcuoglu, Demis Hassabis.

## Setup

### Dependencies

*   Python 3.6+.
*   [Abseil 0.8.0+](https://github.com/abseil/abseil-py)
*   [Numpy 1.16+](https://numpy.org)
*   [Six 1.12+](https://pypi.org/project/six/)
*   [Sonnet 1.35+](https://github.com/deepmind/sonnet)
*   [TensorFlow 1.14](https://tensorflow.org). Not compatible with TensorFlow
    2.0+.
*   [TensorFlow Probability 0.7.0](https://www.tensorflow.org/probability)

You can set up a Python virtual environment with these dependencies inside the
forked `deepmind_research` repository using:

```shell
python3 -m venv alphafold_venv
source alphafold_venv/bin/activate
pip install -r alphafold_casp13/requirements.txt
```

### Input data

The dataset can be downloaded from
[Google Cloud Storage](https://console.cloud.google.com/storage/browser/alphafold_casp13_data).

Download it, for example with `wget`:

```shell
wget https://storage.googleapis.com/alphafold_casp13_data/casp13_data.zip
```

The zip file contains one directory for each CASP13 target and a `LICENSE.md`
file. Each target directory contains the following files:

1.  `TARGET.tfrec` file. This is a
    [TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) file
    with serialized `tf.train.Example` protocol buffers that contain the
    features needed to run the model.
1.  `contacts/TARGET.pickle` file(s) with the predicted distogram.
1.  `contacts/TARGET.rr` file(s) with the contact map derived from the predicted
    distogram. The RR format is described on the
    [CASP website](http://predictioncenter.org/casp13/index.cgi?page=format#RR).

Note that for **T0999** the target was manually split based on hits in HHSearch
into 5 sub-targets, hence there are 5 distograms
(`contacts/T0999s{1,2,3,4,5}.pickle`) and 5 RR files
(`contacts/T0999s{1,2,3,4,5}.rr`).

The `contacts/` folder is not needed to run the model. These files are included
only for convenience, so that you don't need to run inference on the CASP13
targets to get the contact map.

### Model checkpoints
68+
69+
The model checkpoints can be downloaded from
70+
[Google Cloud Storage](https://console.cloud.google.com/storage/browser/alphafold_casp13_data).
71+
72+
Download them e.g. using `wget`:
73+
74+
```shell
75+
wget https://storage.googleapis.com/alphafold_casp13_data/alphafold_casp13_weights.zip
76+
```
77+
78+
The zip file contains:
79+
80+
1. A directory `873731`. This contains the weights for the distogram model.
81+
1. A directory `916425`. This contains the weights for the background distogram
82+
model.
83+
1. A directory `941521`. This contains the weights for the torsion model.
84+
1. `LICENSE.md`. The model checkpoints have a non-commercial license which is
85+
defined in this file.
86+
87+
Each directory with model weights contains a number of different model
88+
configurations. Each model has a config file and associated weights. There is
89+
only one torsion model. Each model directory also contains a stats file that is
90+
used for feature normalization specific to that model.
91+
## Distogram prediction

### Running the system

You can use the `run_eval.sh` script to run the entire distogram prediction
system. Before running it, complete the following steps:

1.  Download the input data as described above. Unpack the data in the
    directory with the code.
1.  Download the model checkpoints as described above. Unpack the data.
1.  In `run_eval.sh` set the following:
    *   `DISTOGRAM_MODEL` to the path to the directory with the distogram model.
    *   `BACKGROUND_MODEL` to the path to the directory with the background
        model.
    *   `TORSION_MODEL` to the path to the directory with the torsion model.
    *   `TARGET` to the path to the directory with the target input data.

Then run `alphafold_casp13/run_eval.sh` from the `deepmind_research` parent
directory (you will get errors if you try running `run_eval.sh` directly from
the `alphafold_casp13` directory).

The contact prediction works in the following way:

1.  4 replicas (by *replica* we mean a configuration file describing the network
    architecture and a snapshot with the network weights), each with a slightly
    different model configuration, are launched to predict the distogram.
1.  4 replicas, each with a slightly different model configuration, are launched
    to predict the background distogram.
1.  1 replica is launched to predict the torsions.
1.  The predictions from the different replicas are averaged together using
    `ensemble_contact_maps.py`.
1.  The predictions for the 64 × 64 distogram crops are pasted together using
    `paste_contact_maps.py`.

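The last two steps above (averaging replicas, then pasting crops) can be sketched in plain Python. This is only an illustration under stated assumptions, not the logic of `ensemble_contact_maps.py` or `paste_contact_maps.py`: each map holds one scalar per residue pair instead of a full distance distribution, replicas are averaged with equal weight, and overlapping crops are averaged. The function names `average_maps` and `paste_crops` are hypothetical, and the demo uses 2 × 2 crops instead of 64 × 64 to stay readable.

```python
def average_maps(maps):
  """Element-wise mean of equally sized 2-D lists (one list per replica)."""
  n = len(maps)
  rows, cols = len(maps[0]), len(maps[0][0])
  return [[sum(m[r][c] for m in maps) / n for c in range(cols)]
          for r in range(rows)]


def paste_crops(crops, length):
  """Pastes crops into a length x length map, averaging where they overlap.

  Args:
    crops: list of (row_start, col_start, values) with values a 2-D list.
    length: number of residues in the full target.

  Returns:
    A 2-D list; pairs not covered by any crop are left at 0.0.
  """
  total = [[0.0] * length for _ in range(length)]
  count = [[0] * length for _ in range(length)]
  for row_start, col_start, values in crops:
    for dr, row in enumerate(values):
      for dc, value in enumerate(row):
        r, c = row_start + dr, col_start + dc
        if r < length and c < length:  # Ignore parts hanging off the edge.
          total[r][c] += value
          count[r][c] += 1
  return [[total[r][c] / count[r][c] if count[r][c] else 0.0
           for c in range(length)] for r in range(length)]


# Two replicas predicting the same 2 x 2 crop.
replica_a = [[0.0, 0.5], [0.5, 1.0]]
replica_b = [[0.5, 0.5], [0.0, 1.0]]
ensembled = average_maps([replica_a, replica_b])

# Two overlapping 2 x 2 crops pasted into a 3 x 3 map.
pasted = paste_crops(
    [(0, 0, [[1.0, 1.0], [1.0, 1.0]]),
     (1, 1, [[0.5, 0.5], [0.5, 0.5]])],
    length=3)
```

With real distograms each cell would be a vector of bin probabilities, and the same accumulate-and-divide pattern would apply per bin.
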
When running `run_eval.sh` the output has the following directory structure:

*   **distogram/**: Contains 4 subfolders, one for each replica. Each of these
    contains the predicted ASA, secondary structure and a pickle file with the
    distogram for each crop. It also contains an `ensemble` directory with the
    ensembled distograms.
*   **background_distogram/**: Contains 4 subfolders, one for each replica. Each
    of these contains a pickle file with the background distogram for each crop.
    It also contains an `ensemble` directory with the ensembled background
    distograms.
*   **torsion/**: Contains 1 subfolder as there was only a single replica. This
    folder contains the predicted ASA, secondary structure, backbone
    torsions and a pickle file with the distogram for each crop. It also
    contains an `ensemble` directory with the ensembled torsions.
*   **pasted/**: Contains distograms obtained from the ensembled distograms by
    pasting. An RR contact map file is computed from this pasted distogram.
    **This is the final distogram that was used in the subsequent AlphaFold
    folding pipeline in CASP13.**

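To make the step from a distogram to an RR contact map more concrete, the sketch below turns the distance distribution of one residue pair into a contact probability. It assumes equal-width distance bins between `min_range` and `max_range` and uses the usual CASP contact threshold of 8 Å. The bin layout, the function name `contact_probability` and the example numbers are assumptions for illustration; the actual conversion code in this package may differ.

```python
def contact_probability(bin_probs, min_range, max_range, threshold=8.0):
  """Sums the probability mass of bins that lie entirely below the threshold.

  Args:
    bin_probs: probabilities for equal-width distance bins (assumed layout).
    min_range: lower edge of the first bin, in Angstroms.
    max_range: upper edge of the last bin, in Angstroms.
    threshold: contact distance cutoff, in Angstroms.

  Returns:
    Probability that the pair is closer than the threshold.
  """
  width = (max_range - min_range) / len(bin_probs)
  total = 0.0
  for k, prob in enumerate(bin_probs):
    upper_edge = min_range + (k + 1) * width
    if upper_edge <= threshold:
      total += prob
  return total


# Four made-up bins covering 2-4, 4-6, 6-8 and 8-10 Angstroms.
example_probs = [0.125, 0.25, 0.125, 0.5]
p_contact = contact_probability(example_probs, min_range=2.0, max_range=10.0)
print(p_contact)
```

Applying this to every residue pair of the pasted distogram would give the per-pair probabilities that an RR file lists.
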
## Data splits

We used a version of [PDB](https://www.rcsb.org/) downloaded on 2018-03-15. The
train/test split can be found in the `train_domains.txt` and `test_domains.txt`
files.

Disclaimer: This is not an official Google product.

alphafold_casp13/asa_output.py

+36
@@ -0,0 +1,36 @@
# Lint as: python3.
# Copyright 2019 DeepMind Technologies Limited
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Class for predicting Accessible Surface Area."""

import tensorflow.compat.v1 as tf
from tensorflow.contrib import layers as contrib_layers


class ASAOutputLayer(object):
  """An output layer to predict Accessible Surface Area."""

  def __init__(self, name='asa'):
    self.name = name

  def compute_asa_output(self, activations):
    """Just compute the logits and outputs given activations."""
    asa_logits = contrib_layers.linear(
        activations,
        1,
        weights_initializer=tf.random_uniform_initializer(-0.01, 0.01),
        scope='ASALogits')
    self.asa_output = tf.nn.relu(asa_logits, name='ASA_output_relu')

    return asa_logits
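
For readers without a TensorFlow 1.x install, the arithmetic of `compute_asa_output` can be approximated in plain Python: a linear projection of each activation vector to a single logit, followed by a ReLU. This is a sketch only. The helper `asa_forward` and the fixed weights are hypothetical, and the real layer draws its weights uniformly from [-0.01, 0.01] and learns them during training.

```python
def asa_forward(activations, weights, bias=0.0):
  """Plain-Python stand-in for a 1-unit linear layer followed by ReLU.

  Args:
    activations: list of feature vectors, one per residue.
    weights: one weight per feature (hypothetical fixed values here).
    bias: scalar bias added to every logit.

  Returns:
    (logits, outputs), where outputs are the logits clipped at zero.
  """
  logits = [sum(a * w for a, w in zip(row, weights)) + bias
            for row in activations]
  outputs = [max(0.0, logit) for logit in logits]
  return logits, outputs


# Two residues with two features each; the weights are made up.
logits, outputs = asa_forward([[2.0, 0.5], [1.0, 2.0]], weights=[0.5, -1.0])
print(logits, outputs)
```

As in the class above, the raw logits and the rectified outputs are kept separate: the method returns the logits and stores the ReLU output on `self.asa_output`.
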

alphafold_casp13/config_dict.py

+63
@@ -0,0 +1,63 @@
# Lint as: python3.
# Copyright 2019 DeepMind Technologies Limited
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Utilities for storing configuration flags."""

import json


class ConfigDict(dict):
  """Configuration dictionary with convenient dot element access."""

  def __init__(self, *args, **kwargs):
    super(ConfigDict, self).__init__(*args, **kwargs)
    for arg in args:
      if isinstance(arg, dict):
        for key, value in arg.items():
          self._add(key, value)
    for key, value in kwargs.items():
      self._add(key, value)

  def _add(self, key, value):
    if isinstance(value, dict):
      self[key] = ConfigDict(value)
    else:
      self[key] = value

  def __getattr__(self, attr):
    try:
      return self[attr]
    except KeyError as e:
      raise AttributeError(e)

  def __setattr__(self, key, value):
    self.__setitem__(key, value)

  def __setitem__(self, key, value):
    super(ConfigDict, self).__setitem__(key, value)
    self.__dict__.update({key: value})

  def __delattr__(self, item):
    self.__delitem__(item)

  def __delitem__(self, key):
    super(ConfigDict, self).__delitem__(key)
    del self.__dict__[key]

  def to_json(self):
    return json.dumps(self)

  @classmethod
  def from_json(cls, json_string):
    return cls(json.loads(json_string))
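
A short usage sketch may help show how `ConfigDict` behaves. To keep the example runnable on its own, it repeats a condensed copy of the class (the delete methods are omitted); the keys `network`, `num_layers`, `learning_rate` and `extra` are made up for illustration.

```python
import json


class ConfigDict(dict):
  """Condensed copy of the class above, repeated so the example runs alone."""

  def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    for arg in args:
      if isinstance(arg, dict):
        for key, value in arg.items():
          self._add(key, value)
    for key, value in kwargs.items():
      self._add(key, value)

  def _add(self, key, value):
    # Nested dicts are wrapped so that dot access works at every level.
    self[key] = ConfigDict(value) if isinstance(value, dict) else value

  def __getattr__(self, attr):
    try:
      return self[attr]
    except KeyError as e:
      raise AttributeError(e)

  def __setattr__(self, key, value):
    self[key] = value

  def __setitem__(self, key, value):
    super().__setitem__(key, value)
    self.__dict__[key] = value

  def to_json(self):
    return json.dumps(self)

  @classmethod
  def from_json(cls, json_string):
    return cls(json.loads(json_string))


# Nested dicts passed to the constructor become ConfigDicts.
config = ConfigDict({'network': {'num_layers': 4}}, name='demo')
print(config.network.num_layers)

# Attribute assignment and item access stay in sync.
config.learning_rate = 0.5
print(config['learning_rate'])

# Round trip through JSON restores the nested dot access.
restored = ConfigDict.from_json(config.to_json())
print(restored.network.num_layers)

# A dict assigned after construction is stored as a plain dict.
config.extra = {'depth': 2}
print(type(config.extra).__name__)
```

The last case is worth noting: only the constructor wraps nested dictionaries, so `config.extra.depth` would raise `AttributeError` unless the value is wrapped explicitly with `ConfigDict({'depth': 2})`.
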
