[RFC] Enabling Cython-based PDB parser backend for speed improvements #139
Comments
@a-r-j This is super cool. Btw, perhaps we don't need to worry about extra dependencies here, because NumPy already uses Cython (https://github.com/numpy/numpy/blob/main/build_requirements.txt), pandas is built on NumPy, and BioPandas is built on pandas :P
That's a good point! I was mostly concerned about the potential for build problems (mostly as |
One difference in the comparison is that your Cython implementation only reads |
@Ruibin-Liu Hmm, that's a really great point. I could add a |
Is this thread dead? I would really like to see this happen; we are looking for something fast to replace BioPython for PDB parsing. I also love BioPandas for its familiar interface, so this looks like killing two birds with one stone :)
@BartoszJanuszNA I don't have time to work on this right now, but I think it'd be straightforward if you want to pick it up.
Describe the workflow you want to enable
Currently, the pure-Python PDB parsing in BioPandas is quite slow, certainly too slow for high-throughput structural bioinformatics or ML.
Describe your proposed solution
I have written a Cython-based implementation (CPDB) which is considerably faster, and I would like to set it as the default parsing backend. As it stands, I believe this to be one of the fastest (if not the fastest) PDB parsers available for Python.
Performance comparison
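A comparison of this kind could be reproduced with a small timing harness along the lines below. This is only a sketch: `make_synthetic_atom_lines`, `parse_coords` and `time_parser` are illustrative names, not part of CPDB or BioPandas, and the input is synthetic ATOM records rather than a real structure. Any parser that accepts a list of lines can be passed to `time_parser` in place of the pure-Python stand-in.

```python
import timeit


def make_synthetic_atom_lines(n_atoms):
    """Build fixed-width PDB ATOM records (illustrative data only)."""
    lines = []
    for i in range(1, n_atoms + 1):
        x, y, z = i % 100 * 0.5, i % 50 * 1.0, i % 25 * 2.0
        res = (i - 1) % 9999 + 1
        # Columns follow the PDB layout: x, y, z occupy columns 31-54.
        lines.append(
            f"ATOM  {i:5d}  CA  ALA A{res:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}  1.00  0.00           C"
        )
    return lines


def parse_coords(lines):
    """Pure-Python stand-in parser: slice the coordinate columns."""
    return [
        (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        for line in lines
        if line.startswith("ATOM")
    ]


def time_parser(parser, lines, repeat=5):
    """Best wall-clock time in seconds for one call of parser(lines)."""
    return min(timeit.repeat(lambda: parser(lines), number=1, repeat=repeat))


if __name__ == "__main__":
    sample = make_synthetic_atom_lines(10_000)
    print(f"pure Python: {time_parser(parse_coords, sample):.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes; real benchmarks should also use real PDB files, since synthetic records skip HETATM, ANISOU and other record types.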
However, given BioPandas' widespread usage, I am unsure whether distributing it with a Cython component will lead to dependency problems for users.
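One common way to limit that risk, sketched here under assumptions, is to treat the compiled parser as optional and fall back to pure Python when it cannot be imported. The module name `cpdb_hypothetical` and its `parse_atoms` function are placeholders invented for this example; they are not the actual CPDB or BioPandas API.

```python
import importlib


def _parse_atoms_pure_python(lines):
    """Fallback parser: slice the fixed-width ATOM/HETATM columns."""
    atoms = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            atoms.append(
                {
                    "atom_name": line[12:16].strip(),
                    "residue_name": line[17:20].strip(),
                    "chain_id": line[21],
                    "residue_number": int(line[22:26]),
                    "x_coord": float(line[30:38]),
                    "y_coord": float(line[38:46]),
                    "z_coord": float(line[46:54]),
                }
            )
    return atoms


def get_atom_parser(module_name="cpdb_hypothetical"):
    """Return (backend_name, parse_function).

    `module_name` and its `parse_atoms` attribute are hypothetical; the
    compiled backend is used only if both are available.
    """
    try:
        module = importlib.import_module(module_name)
        return "cython", module.parse_atoms
    except (ImportError, AttributeError):
        return "python", _parse_atoms_pure_python


if __name__ == "__main__":
    backend, parse_atoms = get_atom_parser()
    print(f"Using {backend} backend")
```

With this pattern a failed build of the extension degrades performance rather than breaking installation, at the cost of maintaining two code paths that must return identical results.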
Describe alternatives you've considered, if relevant
Speeding up the passage of time
Additional context