Merge pull request #32 from MobleyLab/pKa_update

Updates for pKa challenge
samplchallenges · Jan 8, 2018 · 72fccf8
2 parents ba3908f + 15f8570
commit 72fccf8

Showing 8 changed files with 24 additions and 10 deletions.
diff --git a/README.md b/README.md
@@ -91,7 +91,7 @@ Three formats of pKa prediction results will be evaluated:
 Detailed instructions for the pKa challenge can be found here: [pKa_challenge_instructions.md](pKa_challenge_instructions.md)
 
 Challenge start date: Oct 25, 2017   
-Challenge submission due: Jan 10, 2018  
+Challenge submission due: Jan 19, 2018  
 
 #### logD prediction
 Distribution coefficients for about 25 fragment- and drug-like small molecules that resemble small molecule protein kinase inhibitors (or fragments thereof).

diff --git a/pKa_challenge_instructions.md b/pKa_challenge_instructions.md
@@ -1,6 +1,6 @@
 # SAMPL6 pKa Challenge Instructions
 
-Challenge timeframe: Oct 25, 2017 to Jan 10, 2018  
+Challenge timeframe: Oct 25, 2017 to Jan 19, 2018  
 
 This challenge consists of predicting microscopic and macroscopic acid dissociation constants (pKas) of 24 small organic molecules. 
 These fragment-like small molecules are selected for their similarity to kinase inhibitors and for experimental tractability. 
@@ -78,7 +78,7 @@ If multiple microscopic pKas have close pKa values and overlapping changes in UV
 
 ## Due Date
 
-Your predictions must be uploaded on the D3R SAMPL6 web-page by January 10, 2018. 
+Your predictions must be uploaded on the D3R SAMPL6 web-page by January 19, 2018. 
 The experimental results will be released immediately after the challenge closes. 
 You must use the provided templates to upload your predictions to the [SAMPL website](https://drugdesigndata.org/about/sampl6). Additional information on using these templates is provided below.
 
@@ -112,11 +112,25 @@ Predicting the fractional microstate populations between pH interval 2 to 12 in
 - If your predicted structure is not included in the list, contact us to make a request for new microstate. See more details in the section below ("A warning about enumerated microstates and requesting the missing microstates").
 - For each pH, report the *natural logarithm* of the fractional microstate populations in scientific notation with three decimals of precision (e.g., 1.02e-4).
 e.g. For a molecule with only two possible microstates A and B `ln(fractional microstate population) = ln(N_A/(N_A+N_B))` where `N_A` and `N_B` represent percentage of microstate populations of A and B.   
-At a pH where 90.0% of the molecules are in microstate B and 10.0% of molecules are in state A  `ln(fractional microstate A population) = ln(0.100/(0.100+0.900)) = -2.30e0`.  
+At a pH where 90.0% of the molecules are in microstate B and 10.0% of molecules are in state A  `ln(fractional microstate A population) = ln(0.100/(0.100+0.900)) = -2.30`.  
+This value must be reported as `-2.30e0` or `-2.30E+00` in your submission files.
+- Please follow the scientific notation used by the Python programming language, where e or E denotes multiplication by a power of ten (not the irrational number "e"). 
 - If your estimate of `fractional microstate population` is 0, thus `ln(fractional microstate population) = ln(0)`, report as `-infinity`, but note that attempting to resolve the log-population of low-population states is important for some of the evaluation metrics.
 - Do not report SEM in this submission type in the "Prediction" section of type II submission template. It is optional to report uncertainty estimates in "Methods" section, but we do not plan to analyze uncertainty estimates for this submission type.
 - For pH values or microstates which you don't have an estimate, leave that cell or line of the csv table empty.
 
+##### Warning for Prediction Type II template file
+Predictions section column headers:  
+`# Microstate ID,2.00,2.10,2.20,2.30,2.40,2.50,2.60,2.70,2.80,2.90,3.00,3.10,3.20,3.30,3.40,3.50,3.60,3.70,3.80,3.90,4.00,4.10,4.20,4.30,4.40,4.50,4.60,4.70,4.80,4.90,5.00,5.10,5.20,5.30,5.40,5.50,5.60,5.70,5.80,5.90,6.00,6.10,6.20,6.30,6.40,6.50,6.60,6.70,6.80,6.90,7.00,7.10,7.20,7.30,7.40,7.50,7.60,7.70,7.80,7.90,8.00,8.10,8.20,8.30,8.40,8.50,8.60,8.70,8.80,8.90,9.00,9.10,9.20,9.30,9.40,9.50,9.60,9.70,9.80,9.90,10.00,10.10,10.20,10.30,10.40,10.50,10.60,10.70,10.80,10.90,11.00,11.10,11.20,11.30,11.40,11.50,11.60,11.70,11.80,11.90,12.00`
+
+The commented-out line above, from the predictions section of the type II submission template, gives the column titles of that section. 
+From column 2 to the last column, the numerical values in the title line indicate the pH at which `ln(fractional microstate populations)` should be reported. 
+
+This line is not an example submission line that illustrates the expected numerical format for predicted values. 
+In the column title line, the values `2.00`, `2.10`, ... stand for `ln(fractional population at pH 2.00)`, `ln(fractional population at pH 2.10)`, etc.
+Please follow the scientific notation described above for type II submissions. 
+
+
 #### Prediction Type III - macroscopic pKas
 Predicting the value of  macroscopic pKas between 2 and 12.
 - Fill one `typeIII_macroscopic_pKas.csv` template file for all predicted molecules with one method. You may submit predictions from multiple methods, but you should fill a separate template file for each different method.
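
The Type II formatting rule added in the hunk above (natural logarithm of the fractional population, written in Python-style scientific notation, with `-infinity` for a zero population) can be sketched as follows. This is only an illustration: the function name `format_ln_population` is hypothetical and is not part of the SAMPL6 repository, and the choice of the `-2.30E+00` form over `-2.30e0` is an assumption based on the two forms the instructions list.

```python
import math


def format_ln_population(fraction):
    """Format ln(fraction) for a type II submission cell.

    `fraction` is a fractional microstate population between 0 and 1.
    A population of exactly 0 is reported as "-infinity", as the
    instructions require. The helper name is hypothetical.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fractional population must lie in [0, 1]")
    if fraction == 0.0:
        return "-infinity"
    # "E" here is the power-of-ten marker, not Euler's number.
    return f"{math.log(fraction):.2E}"


# Worked example from the instructions: 10.0% in state A, 90.0% in state B.
n_a, n_b = 0.100, 0.900
print(format_ln_population(n_a / (n_a + n_b)))  # -2.30E+00
```

Python's `float()` reads `-2.30E+00` and `-2.30e0` as the same value, so either spelling survives a round trip through a parser that follows Python's notation.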

diff --git a/physical_properties/pKa/example_submission_files/typeI-MehtapIsik-1.csv b/physical_properties/pKa/example_submission_files/typeI-MehtapIsik-1.csv
@@ -13,7 +13,7 @@
 #
 # The data in each prediction line should be structured as follows:
 # microstate ID of protonated state(HA), microstate ID of deprotonated state(A), microscopic pKa, microscopic pKa SEM
-# The list of predictions must begin with the "Prediction:" keyword, as illustrated here.
+# The list of predictions must begin with the "Predictions:" keyword, as illustrated here.
 Predictions:
 SM98_micro005,SM98_micro004,9.44,0.05
 SM99_micro002,SM99_micro001,3.30,0.04

diff --git a/physical_properties/pKa/example_submission_files/typeII-MehtapIsik-1.csv b/physical_properties/pKa/example_submission_files/typeII-MehtapIsik-1.csv
@@ -18,7 +18,7 @@
 # The ln(fractional microstate population) data in each prediction line should be structured as follows:
 # Microstate ID,2.00,2.10,2.20,2.30,2.40,2.50,2.60,2.70,2.80,2.90,3.00,3.10,3.20,3.30,3.40,3.50,3.60,3.70,3.80,3.90,4.00,4.10,4.20,4.30,4.40,4.50,4.60,4.70,4.80,4.90,5.00,5.10,5.20,5.30,5.40,5.50,5.60,5.70,5.80,5.90,6.00,6.10,6.20,6.30,6.40,6.50,6.60,6.70,6.80,6.90,7.00,7.10,7.20,7.30,7.40,7.50,7.60,7.70,7.80,7.90,8.00,8.10,8.20,8.30,8.40,8.50,8.60,8.70,8.80,8.90,9.00,9.10,9.20,9.30,9.40,9.50,9.60,9.70,9.80,9.90,10.00,10.10,10.20,10.30,10.40,10.50,10.60,10.70,10.80,10.90,11.00,11.10,11.20,11.30,11.40,11.50,11.60,11.70,11.80,11.90,12.00
 # 
-# The list of predictions must begin with the "Prediction:" keyword, as illustrated here.
+# The list of predictions must begin with the "Predictions:" keyword, as illustrated here.
 Predictions:
 SM98_micro001, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity
 SM98_micro002, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity, -infinity
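
The commented column-title line of the type II files lists pH values from 2.00 to 12.00 in steps of 0.10, which gives 101 data columns after the microstate ID. A minimal sketch for regenerating that title line and checking that a prediction row has the matching number of cells is shown below; the variable names are illustrative and not taken from the repository.

```python
# Build the pH grid from integers to avoid floating-point drift:
# 200, 210, ..., 1200 hundredths of a pH unit.
ph_values = [f"{hundredths / 100:.2f}" for hundredths in range(200, 1201, 10)]

# Commented column-title line as it appears in the type II template.
header = "# Microstate ID," + ",".join(ph_values)

# A row matching the example file: one ID followed by one cell per pH value.
row = "SM98_micro001, " + ", ".join(["-infinity"] * len(ph_values))
cells = [cell.strip() for cell in row.split(",")]

print(len(ph_values))                    # 101
print(len(cells) == len(ph_values) + 1)  # True
```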

diff --git a/physical_properties/pKa/example_submission_files/typeIII-MehtapIsik-1.csv b/physical_properties/pKa/example_submission_files/typeIII-MehtapIsik-1.csv
@@ -14,7 +14,7 @@
 #
 # The data in each prediction line should be structured as follows:
 # Molecule ID, macroscopic pKa, macroscopic pKa SEM
-# The list of predictions must begin with the "Prediction:" keyword, as illustrated here.
+# The list of predictions must begin with the "Predictions:" keyword, as illustrated here.
 Predictions:
 SM98,9.44,1.02
 SM99,7.39,0.50

diff --git a/physical_properties/pKa/submission_templates/typeIII_macroscopic_pKas.csv b/physical_properties/pKa/submission_templates/typeIII_macroscopic_pKas.csv
@@ -14,7 +14,7 @@
 #
 # The data in each prediction line should be structured as follows:
 # Molecule ID, macroscopic pKa, macroscopic pKa SEM
-# The list of predictions must begin with the "Prediction:" keyword, as illustrated here.
+# The list of predictions must begin with the "Predictions:" keyword, as illustrated here.
 Predictions:
 
 

diff --git a/physical_properties/pKa/submission_templates/typeII_microstate_fractional_populations.csv b/physical_properties/pKa/submission_templates/typeII_microstate_fractional_populations.csv
@@ -18,7 +18,7 @@
 # The ln(fractional microstate population) data in each prediction line should be structured as follows:
 # Microstate ID,2.00,2.10,2.20,2.30,2.40,2.50,2.60,2.70,2.80,2.90,3.00,3.10,3.20,3.30,3.40,3.50,3.60,3.70,3.80,3.90,4.00,4.10,4.20,4.30,4.40,4.50,4.60,4.70,4.80,4.90,5.00,5.10,5.20,5.30,5.40,5.50,5.60,5.70,5.80,5.90,6.00,6.10,6.20,6.30,6.40,6.50,6.60,6.70,6.80,6.90,7.00,7.10,7.20,7.30,7.40,7.50,7.60,7.70,7.80,7.90,8.00,8.10,8.20,8.30,8.40,8.50,8.60,8.70,8.80,8.90,9.00,9.10,9.20,9.30,9.40,9.50,9.60,9.70,9.80,9.90,10.00,10.10,10.20,10.30,10.40,10.50,10.60,10.70,10.80,10.90,11.00,11.10,11.20,11.30,11.40,11.50,11.60,11.70,11.80,11.90,12.00
 # 
-# The list of predictions must begin with the "Prediction:" keyword, as illustrated here.
+# The list of predictions must begin with the "Predictions:" keyword, as illustrated here.
 Predictions:
 
 

diff --git a/physical_properties/pKa/submission_templates/typeI_microscopic_pKas_and_microstates.csv b/physical_properties/pKa/submission_templates/typeI_microscopic_pKas_and_microstates.csv
@@ -13,7 +13,7 @@
 #
 # The data in each prediction line should be structured as follows:
 # microstate ID of protonated state(HA), microstate ID of deprotonated state(A), microscopic pKa, microscopic pKa SEM
-# The list of predictions must begin with the "Prediction:" keyword, as illustrated here.
+# The list of predictions must begin with the "Predictions:" keyword, as illustrated here.
 Predictions:
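
Every template touched by this commit now says that the list of predictions begins with the `Predictions:` keyword. A reader of these files might locate the prediction rows roughly as sketched below. The function `read_predictions` is hypothetical, and the rule for where the section ends (a later line that ends with a colon and has no commas) is an assumption made for this sketch, not something stated in the templates shown here.

```python
def read_predictions(text):
    """Return the comma-separated rows that follow the "Predictions:" keyword.

    Comment lines (starting with "#") and blank lines are skipped.
    Assumption: a later line ending in ":" with no commas starts another
    section, so reading stops there.
    """
    rows = []
    in_predictions = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not in_predictions:
            if line == "Predictions:":
                in_predictions = True
            continue
        if not line or line.startswith("#"):
            continue
        if line.endswith(":") and "," not in line:
            break
        rows.append([field.strip() for field in line.split(",")])
    return rows


example = """\
# Molecule ID, macroscopic pKa, macroscopic pKa SEM
# The list of predictions must begin with the "Predictions:" keyword, as illustrated here.
Predictions:
SM98,9.44,1.02
SM99,7.39,0.50
"""

for molecule_id, pka, sem in read_predictions(example):
    print(molecule_id, float(pka), float(sem))
```

Matching the keyword exactly is deliberate: with the old comment text, a file headed `Prediction:` would be silently treated as having no predictions, which is the mismatch this commit's comment fix guards against.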