Skip to content

Debug the trisurf code #550

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

choldgraf
Copy link
Contributor

see #487

@monfera
Copy link

monfera commented Aug 19, 2016

Just to have your comment here as well:

looks like a change in the trisurf code broke my brain plotting code. 
Out of curiosity why was show_colorbar added as a parameter, and 
not a keyword? Seems like it'd make more sense to just default to 
False rather than forcing it in the call to _trisurf.

@choldgraf
Copy link
Contributor Author

Actually hold up a sec on this one, looking through the code I think I see a few more bugs...I'll comment on them in this PR

@choldgraf choldgraf changed the title changing a parameter to a flag Debug the trisurf code Aug 19, 2016
@choldgraf
Copy link
Contributor Author

It looks like a bunch of bugs got introduced whenever somebody added the colorbar functionality. E.g., right now if you don't give a colormap, then it throws an error. Moreover, it looks like the output of the _make_colorscale function is not always compatible with the _convert_colorscale_to_rgb function...

@choldgraf
Copy link
Contributor Author

Also I'm not sure why but the current version re-converted facecolor to a list (around L3337)...IMO we should never use lists unless absolutely necessary. A numpy array is almost always a lot faster.

@choldgraf
Copy link
Contributor Author

maybe @Kully or @theengineear can comment on this one?

@choldgraf
Copy link
Contributor Author

(on the plus side, it does seem like the memory leak is not happening. It's a bit hard to tell because my original code isn't working due to the above changes, but seems promising)

@theengineear
Copy link
Contributor

Re #550 (comment)

This was done as a simple fix to make tests work between Python 2 and Python 3. That's really the only important rationale for it. I figured that it wouldn't be too much of a performance knock if it happened once at the end, which is why I suggested it to @Kully.

That said, I haven't looked at the actual implications and there's no other reason to have a list over a numpy array, so we can definitely change that back.

@choldgraf
Copy link
Contributor Author

Cool - I seem to remember arrays being a lot more efficient than lists in the JSON encoding, but I just tried it out and didn't see a difference:

image

Maybe there were optimizations made to the JSONEncoder class? Either way, that was why I was pushing for objects to be made into arrays before being put in the plotly dictionaries. Maybe the performance differences only come up when they're really large lists/arrays or something.

@theengineear
Copy link
Contributor

^^ Yah, that PlotlyJSONEncoder is actually pretty dim...

A couple notes:

  1. It is going to be transforming np.arrays into lists there anyhow.
  2. It does some stupid things to prevent extended JSON from being sent to Plotly https://github.com/plotly/plotly.py/blob/master/plotly/utils.py#L147-L163

There's likely a better way, right now this is basically, encoding, unencoding, and then encoding... It had to do with either Plotly backend or plotlyjs code not being able to work with the extended JSON that Python takes the liberty of using. At least that's what I remember the problem being. Conditions may have changed though, it may not even be a concern. This would be a bigger speedup than the np.array vs list in the output of the color functions, imo.

Feel like digging in 😛?

@monfera
Copy link

monfera commented Aug 20, 2016

Might not be relevant for this processing step, but one of the render speedup tests we've been evaluating (with positive results so far) is the use of JavaScript typed arrays for input. This can avoid expensive array copies, which matter a lot when a model is large. It usually requires arrays of different shape compared to the current API (e.g. [vert00, vert01, vert02, .., vertn0, vertn1, vertn2] for vertex indices, but it's possible to use e.g. scijs/ndarray by our @mikolalysenko to make it look like friendly multidimensional arrays) and we're still in the prototyping stage, initially with the upcoming 2D pointcloud plot.

As a result, we see that the largest single use of the initial render time is taken by the actual CPU->GPU memory copy (we want to cut down on it too e.g. by using gl.TRIANGLE_STRIP). All in all, we expect great speedup in JavaScript+WebGL, perhaps OOM 100ms for a brain mesh update and probably a small multiple of that for an initial render, if the arrays are passed on in the typed array structure.

A conclusion is that due to faster JavaScript processing, any array preprocessing in Python, and the (presumable) array copy, or serialization/parsing from Python to Javascript will likely become a significant chunk of the overall processing time, assuming larger meshes, from hundreds of thousands of triangles.

These details partly belong to the render speedup issue, @choldgraf please consider opening it in plotly.js. Details e.g. baseline benchmarks can be added later.

@monfera
Copy link

monfera commented Aug 20, 2016

There is some old thread on efficiently moving numpy arrays to JavaScript typed arrays, I don't see the specific link in the answer but perhaps interesting

@monfera
Copy link

monfera commented Aug 20, 2016

It looks like a fairly direct approach would be to use webSocketConnection.binaryType = "arraybuffer"; because it should yield an event.data instanceof ArrayBuffer which is directly viewable as a JavaScript typed array of the appropriate type (e.g. Float32Array) and requires no further copy or transformation. I think IPython notebooks already use WebSockets for binary communication with JavaScript so maybe the JSON serializing and parsing can be avoided, and with that, perhaps numerical accuracy is better than with JSON. Sorry if I misunderstood something.

@monfera
Copy link

monfera commented Aug 20, 2016

np.ndarray seems to directly support binary serialization of buffers, and a nice benefit is that Python users could use nice, 2D arrays for trimesh vertices and indices, and they would be directly usable on the other end of the WebSocket as a flat, 1D JS typed arrays, directly suitable for the WebGL gl.bufferData. (JS users could use the above mentioned scijs/ndarray for a similar 2D view effect.)

@theengineear
Copy link
Contributor

@monfera super interesting conversation. I agree that this serialization step (when operating in offline mode, or some sort of local mode) is just causing delays. I wonder if this is the appropriate thread to talk about it in though. I have a feeling that this requires a non-trivial refact to get Python/JS to talk in a different way, no?

Just to be clear, I don't see JSON serialization to save things to Plotly's db / AWS instances going away any time soon. But that said, I think you know a lot more than I do on this topic and I'd happily discuss further. Again, maybe best in another thread though?

Thanks for the thorough technical discussion! This is great!

@monfera
Copy link

monfera commented Aug 22, 2016

@theengineear you're right, I wasn't entirely ontopic, mentioning one significant element (typed JS arrays in plotly.js) of a future speedup in case it has relevance to the Python -> JS communication (one comment above by @choldgraf mentioned converting np.arrays to lists but it seems avoidable). Having been asked to work on initial render speedup in plotly.js it looks like a serial thing so even if the JS part is very fast, it can still be slow end-to-end.

The thought of replacing JSON for database storage didn't occur to me as I only had the Python -> JS direct communication in mind (I assume, perhaps wrongly, that in order to plot something from Python, there's no interim db storage of the data).

@choldgraf
Copy link
Contributor Author

hey all - I'm gonna close this because I'm pretty sure this PR is pretty stale :-/ if anybody wants to take it over then feel free

@choldgraf choldgraf closed this Sep 17, 2017
@monfera
Copy link

monfera commented Sep 18, 2017

@choldgraf thanks for your reminder. There are a variety of things here, which might be why it never got assigned (or simply, limited capacity). To take an inventory:

  1. addition of colorbar causes some error
  2. PR commit: turning the show_colorbar parameter into a flag but this PR conversation doesn't tell about it; the linked issue comment points to benefits, I'm not sure if it's clear to those folks working on the Python binding
  3. performance increase ideas for array transfer between Python and JS
  4. original memory leak seems to be addressed

Where we're at with each is, afaik:

  1. I'm just guessing, but from the description it looks like it might be a plotly.js problem, so an issue in the plotly.js repo with a minimal example would be fantastic, including stating the expectation. You seem to have detailed info as you tied the appearance of the issue to a feature addition, offending version/commit etc. helps triage
  2. I've no info on this, the suggestion was to make it into an item of its own (needs, benefits, critical or nice to have, what it involves)
  3. the separate plotly.js issue covers this topic and has some alternatives collected, but not implemented; looping in @chriddyp as well
  4. no action needed, except perhaps being on the outlook for leaks after item 1 is solved

Also, there's a linked issue referred above, the conversation ended here.

It helps triage and assignment if issues/PRs are isolated and granular, in the repo where it's likely to need the change, and with an issue/PR title that gives the gist of the contents. Whether it's one of the Plotly employed contributors or community contributors, it's easier to take and clear specific items with minimal offending tests and expectations.

@choldgraf
Copy link
Contributor Author

for sure - I just don't have time to work on it ATM and I'm cleaning out my old PR cruft. That's helpful info if somebody else wants to take it over though!

@jackparmer
Copy link
Contributor

@choldgraf JSON serialization slowness aside, we have something exciting in the works for brain volume visualization:

http://fhtr.org/gl-volume3d-demo/

image

Let us know if you have any suggestions for this demo on this based on your experience with pysurfer. This volume viz should land in an official plotly.js release early next year.

@choldgraf
Copy link
Contributor Author

very cool! I love the viz :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants