Implementation of fetch_pdb() #4943
base: develop
Changes from 101 commits
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| 
          
            
          
           | 
    @@ -57,13 +57,15 @@ | |||||||
| * :class:`MDAnalysis.core.universe.Universe` | ||||||||
| 
     | 
||||||||
| 
     | 
||||||||
                
      
| Classes | ||||||||
                
      
| Classes and Functions | ||||||||
| --------------------- | ||||||||
                
      
| 
     | 
||||||||
| .. autoclass:: PDBParser | ||||||||
| :members: | ||||||||
| :inherited-members: | ||||||||
| 
     | 
||||||||
| .. autofunction:: fetch_pdb | ||||||||
| 
     | 
||||||||
                
      
| """ | ||||||||
| import numpy as np | ||||||||
| import warnings | ||||||||
| 
          
            
          
           | 
    @@ -95,6 +97,29 @@ | |||||||
| # Set up a logger for the PDBParser | ||||||||
| logger = logging.getLogger("MDAnalysis.topology.PDBParser") | ||||||||
| 
     | 
||||||||
| try: | ||||||||
| import pooch | ||||||||
| except ImportError: | ||||||||
| HAS_POOCH = False | ||||||||
| else: | ||||||||
| HAS_POOCH = True | ||||||||
| 
     | 
||||||||
                
      
| DEFAULT_CACHE_NAME_DOWNLOADER = "MDAnalysis_pdbs" | ||||||||
                
      
| 
     | 
||||||||
| # These file formats are listed at https://www.rcsb.org/docs/programmatic-access/file-download-services under "PDB entry files" | ||||||||
| SUPPORTED_FILE_FORMATS_DOWNLOADER = ( | ||||||||
| "cif", | ||||||||
| "cif.gz", | ||||||||
| "bcif", | ||||||||
| "bcif.gz", | ||||||||
| "xml", | ||||||||
| "xml.gz", | ||||||||
| "pdb", | ||||||||
| "pdb.gz", | ||||||||
| "pdb1", | ||||||||
| "pdb1.gz", | ||||||||
| ) | ||||||||
| 
     | 
||||||||
| 
     | 
||||||||
| def float_or_default(val, default): | ||||||||
| try: | ||||||||
| 
          
            
          
           | 
    @@ -515,3 +540,131 @@ def _parse_conect(conect): | |||||||
| bond_atoms = (int(conect[11 + i * 5: 16 + i * 5]) for i in | ||||||||
| range(n_bond_atoms)) | ||||||||
| return atom_id, bond_atoms | ||||||||
| 
     | 
||||||||
| 
     | 
||||||||
| def fetch_pdb( | ||||||||
| pdb_ids=None, | ||||||||
| cache_path=None, | ||||||||
| progressbar=False, | ||||||||
| file_format="pdb.gz", | ||||||||
| ): | ||||||||
| """ | ||||||||
| Download one or more PDB files from the RCSB Protein Data Bank and cache | ||||||||
| them locally. | ||||||||
| 
     | 
||||||||
| Given one or multiple PDB IDs, downloads the corresponding structure files in | ||||||||
| the requested format and stores them in a local cache directory. If files are | ||||||||
| already cached on disk, fetch_pdb() skips the download and uses the cached | ||||||||
| version instead. | ||||||||
| 
     | 
||||||||
| Returns the path(s) as a string to the downloaded file(s). | ||||||||
| 
     | 
||||||||
| Parameters | ||||||||
| ---------- | ||||||||
| pdb_ids : str or sequence of str | ||||||||
| A single PDB ID as a string, or a sequence of PDB IDs to fetch. | ||||||||
| cache_path : str or pathlib.Path | ||||||||
                
      
| Directory where downloaded file(s) will be cached. | ||||||||
                
      
| file_format : str | ||||||||
| The file extension/format to download (e.g., "cif", "pdb"). | ||||||||
| See the Notes section below for a list of all supported file formats. | ||||||||
| progressbar : bool, optional | ||||||||
| If True, display a progress bar during file downloads. Default is False. | ||||||||
| 
     | 
||||||||
| Returns | ||||||||
| ------- | ||||||||
| str or list of str | ||||||||
| The path(s) to the downloaded file(s). Returns a single string if | ||||||||
| one PDB ID is given, or a list of strings if multiple PDB IDs are | ||||||||
| provided. | ||||||||
| 
     | 
||||||||
| Raises | ||||||||
| ------ | ||||||||
| ValueError | ||||||||
| For an invalid file format. Supported file formats are under Notes. | ||||||||
| 
     | 
||||||||
| requests.exceptions.HTTPError | ||||||||
| If an invalid PDB code is specified. This is a pooch exception. | ||||||||
| 
     | 
||||||||
| Notes | ||||||||
| ----- | ||||||||
| This function uses the `RCSB File Download Services`_ for directly downloading | ||||||||
| structure files via https. | ||||||||
| 
     | 
||||||||
| .. _`RCSB File Download Services`: | ||||||||
| https://www.rcsb.org/docs/programmatic-access/file-download-services | ||||||||
| 
     | 
||||||||
| The RCSB currently provides data in the 'cif', 'cif.gz', 'bcif', 'bcif.gz', 'xml', | ||||||||
| 'xml.gz', 'pdb', 'pdb.gz', 'pdb1', and 'pdb1.gz' file formats, all of which can | ||||||||
| be downloaded. Not all of these formats can currently be read with MDAnalysis. | ||||||||
| 
     | 
||||||||
| Cache, controlled by the `cache_patch` parameter, is handled internally by pooch. | ||||||||
                
       | 
||||||||
| Cache, controlled by the `cache_patch` parameter, is handled internally by pooch. | |
| Caching, controlled by the `cache_path` parameter, is handled internally by :mod:`pooch`. | 
        
          
              
markup
| The default None arguments stores the data files in the platform dependent | |
| The default ``None`` arguments stores the data files in the platform dependent | 
        
          
              
The following could also be deleted if it is already explained in the parameter docs.
| `Pooch Default Cache Path`_ under the folder MDAnalysis_pdbs. To clear cache | |
| `Pooch Default Cache Path`_ under the folder :data:`DEFAULT_CACHE_NAME_DOWNLOADER`. To clear cache | 
        
          
              
markup and be explicit
| as specified by cache_path. | |
| as specified by `cache_path`; if you used the default then the path to the pooch cache | |
| is ``pooch.os_cache(MDAnalysis.topology.PDBParser.DEFAULT_CACHE_NAME_DOWNLOADER)``. | 
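The cache location described in the suggestion above can also be resolved and cleared from code. The following is a minimal sketch, not part of the PR: the helper names `resolve_cache_dir` and `clear_cache` are hypothetical, the constant is copied from the diff, and the default location is only known when pooch is installed.

```python
import shutil
from pathlib import Path

try:
    import pooch  # optional dependency, guarded in the same spirit as the diff
except ImportError:
    pooch = None

# Copied from the diff; the default cache folder name.
DEFAULT_CACHE_NAME_DOWNLOADER = "MDAnalysis_pdbs"


def resolve_cache_dir(cache_path=None):
    """Return the directory that would hold cached files (hypothetical helper).

    Returns None when no explicit path is given and pooch is unavailable,
    because the platform-dependent default is defined by pooch.
    """
    if cache_path is not None:
        return Path(cache_path)
    if pooch is None:
        return None
    return Path(pooch.os_cache(DEFAULT_CACHE_NAME_DOWNLOADER))


def clear_cache(cache_path=None):
    """Delete the cache directory; return True if something was removed."""
    cache_dir = resolve_cache_dir(cache_path)
    if cache_dir is None or not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

Calling `clear_cache()` without arguments would remove the default pooch cache folder, so passing an explicit path is the safer choice when experimenting.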
        
          
              
If you use intersphinx then you can replace the explicit link with
:func:`Pooch Default Cache Path<pooch.os_cache>`_ under the folder (actually... test it first)
        
          
              
converting -> convert
fix grammar
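The function body is not visible in this view, so the behaviour described by the docstring above (format validation, accepting one ID or a sequence, caching through pooch) is sketched here under stated assumptions. This is not the PR's implementation: `plan_downloads` and `fetch_sketch` are hypothetical names, `BASE_URL` is an assumption based on the RCSB File Download Services page, and `pooch.retrieve` is used with `known_hash=None` because no checksums are available.

```python
try:
    import pooch  # optional dependency, guarded as in the diff
except ImportError:
    HAS_POOCH = False
else:
    HAS_POOCH = True

# Constants copied from the diff.
DEFAULT_CACHE_NAME_DOWNLOADER = "MDAnalysis_pdbs"
SUPPORTED_FILE_FORMATS_DOWNLOADER = (
    "cif", "cif.gz", "bcif", "bcif.gz", "xml", "xml.gz",
    "pdb", "pdb.gz", "pdb1", "pdb1.gz",
)
# Assumption: download base URL as documented by the RCSB.
BASE_URL = "https://files.rcsb.org/download/"


def plan_downloads(pdb_ids, file_format="pdb.gz"):
    """Validate input and return (file_names, single) without network access."""
    if file_format not in SUPPORTED_FILE_FORMATS_DOWNLOADER:
        raise ValueError(
            f"Invalid file format {file_format!r}; supported formats are "
            f"{SUPPORTED_FILE_FORMATS_DOWNLOADER}"
        )
    # A bare string is one PDB ID, not a sequence of characters.
    single = isinstance(pdb_ids, str)
    ids = (pdb_ids,) if single else tuple(pdb_ids)
    return [f"{pdb_id}.{file_format}" for pdb_id in ids], single


def fetch_sketch(pdb_ids, cache_path=None, progressbar=False,
                 file_format="pdb.gz"):
    """Hypothetical stand-in for fetch_pdb(); downloads through pooch."""
    if not HAS_POOCH:
        raise ModuleNotFoundError("pooch is required to download files")
    file_names, single = plan_downloads(pdb_ids, file_format)
    if cache_path is None:
        cache_path = pooch.os_cache(DEFAULT_CACHE_NAME_DOWNLOADER)
    paths = [
        pooch.retrieve(
            url=BASE_URL + name,
            known_hash=None,  # no checksum known, so verification is skipped
            fname=name,
            path=cache_path,
            progressbar=progressbar,
        )
        for name in file_names
    ]
    # Mirror the documented return type: str for one ID, list otherwise.
    return paths[0] if single else paths
```

Keeping validation in a separate function makes the `ValueError` path testable offline, while the network-dependent part stays in one place.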
        
          
              
      | Original file line number | Diff line number | Diff line change | 
|---|---|---|
| 
          
            
          
           | 
    @@ -13,6 +13,7 @@ networkx | |
| numpy>=1.23.2 | ||
| packaging | ||
| parmed | ||
| pooch | ||
| pytest | ||
| scikit-learn | ||
| scipy | ||
| 
          
            
          
           | 
    ||
@IAlibay @BradyAJohnston are we sure that we want the import at the top level?
If we do more fetch_xxx() in the future then we may have to deprecate it again, e.g. in favor of a mda.fetch.pdb(...) or Universe.from_fetched.
I think it's ok to leave it here for now because we don't have anything else. If we get more before 3.0, we still have time to deprecate and remove in 3.0.
If it is left in then does it need to be documented at the top level, too?
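If the top-level name did have to be deprecated later, as the comment above anticipates, the old entry point could keep working while warning users. The following is a rough sketch using only the standard library; `deprecated_alias`, `old_fetch_pdb`, and the placeholder `fetch_pdb` body are hypothetical, and the dotted names in the warning text are illustrative rather than a decided API.

```python
import functools
import warnings


def fetch_pdb(pdb_ids=None, cache_path=None, progressbar=False,
              file_format="pdb.gz"):
    """Placeholder for the function at its (hypothetical) new location."""
    return f"{pdb_ids}.{file_format}"


def deprecated_alias(func, old_name, new_name):
    """Wrap func so calls through the old name emit a DeprecationWarning."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_name} instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return func(*args, **kwargs)

    return wrapper


# Hypothetical old top-level name forwarding to the new location.
old_fetch_pdb = deprecated_alias(
    fetch_pdb, "MDAnalysis.fetch_pdb", "MDAnalysis.fetch.pdb"
)
```

The wrapper forwards all arguments unchanged, so existing calls would keep returning the same result until the alias is removed.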