Sankey node positions overridden for some uneven flows, and rules for node.x and node.y manual positions are not clear

Main problem: Nodes appear in data frame order under some conditions (such as symmetric flows), but under others (some asymmetric flows, though not all) they appear out of order according to rules I can't identify. The rules for manual positioning with node.x and node.y are also unclear. I'm trying to work around the lack of a sorting feature but keep hitting snags.

Forgive me, I'm rather new to plotly and don't understand how plotly.R interacts with the Python or JavaScript versions of plotly. While trying to solve this problem, I found that [Issue #4373 for plotly.js](https://github.com/plotly/plotly.js/issues/4373) describes the lack of a sort feature and [Issue #3002 for plotly.py](https://github.com/plotly/plotly.py/issues/3002) states that node.x and node.y cannot be 0.

My use case is that I want to produce a large set of sankey graphs for flows between 5 specific nodes at Time1 and 5 specific nodes at Time2. For this reason, I would like my nodes to be drawn in the same order every time, no matter the size of the nodes or flows. I wrote a script to dynamically compute the node.y positions from each node's order and size. Even this workaround runs into problems, as noted in the code below.

At a minimum, I'm looking for more detailed documentation about node.x and node.y than [what is currently in the reference page](https://plotly.com/r/reference/#sankey-node-x).

More broadly, why is the data frame order of the nodes being overridden, as in the varying-flows example (fig1) below?

``` r
library(plotly)
#> Loading required package: ggplot2
#> 
#> Attaching package: 'plotly'
#> The following object is masked from 'package:ggplot2':
#> 
#>     last_plot
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following object is masked from 'package:graphics':
#> 
#>     layout
library(tidyverse)

my_labels <-
  c(
    "Node 0",
    "Node 1",
    "Node 2",
    "Node 3",
    "Node 4",
    "Node 5",
    "Node 6",
    "Node 7",
    "Node 8",
    "Node 9"
  )

# Uses original data, which includes some flows much larger than others
source_ids <-
  c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4)
target_ids <-
  c(5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9)
varying_flows <-
  c(60, 23, 1, 0, 9, 15, 33, 13, 4, 3, 0, 9, 8, 2, 1, 0, 4, 12, 127, 9, 4, 4, 1, 11, 1)

my_varying_flows <- data.frame(source_ids, target_ids, varying_flows)

fig1 <- plot_ly(
  type = "sankey",
  arrangement = "snap",
  node = list(
    label = my_labels), 
  link = list(
    source = my_varying_flows$source_ids,
    target = my_varying_flows$target_ids,
    value = my_varying_flows$varying_flows))

fig1 <- fig1 %>%
  layout(
    title = list(
      text = "fig1 - varying flows out of order"
    )
  )

# Nodes do not appear in intended order. Node 3, the largest node, appears below
# Node 4, and the right hand nodes are also out of order.

fig1
```
![fig1](https://user-images.githubusercontent.com/49799129/151683469-ba11a91d-91ed-40dc-8a14-3468b935a797.png)

```r
# Build a new set of test data with even, identical flows
even_flows <- rep(10, times = 25)
my_even_flows <- data.frame(source_ids, target_ids, even_flows)

fig2 <- plot_ly(
  type = "sankey",
  arrangement = "snap",
  node = list(
    label = my_labels),
  link = list(
    source = my_even_flows$source_ids,
    target = my_even_flows$target_ids,
    value = my_even_flows$even_flows))

fig2 <- fig2 %>%
  layout(
    title = list(
      text = "fig2 - even flows in order"
    )
  )

# Displays nodes in intended order, apparently because something behind the
# scenes likes the even flows and keeps the default arrangement.
fig2
```
![fig2](https://user-images.githubusercontent.com/49799129/151683487-f01148c1-25eb-4049-b63b-2b5bb2e7edcc.png)

```r
# Workaround to dynamically determine node.y positions relative to size of nodes
# and sorting order in original data. But even this behaves in unexpected ways,
# and in the node.y argument we need to take the complement of them (i.e., 1 -
# the value generated here).

label_pos_dfs <-
  list(
    # Label positions of source node labels
    my_varying_flows %>%
      group_by(source_ids) %>%
      summarize(n = sum(varying_flows)) %>%
      rename(node.name = source_ids) %>%
      mutate(label.pos = 1 - (cumsum(n) - n/2) / sum(n)),
    
    # Label positions of target node labels
    my_varying_flows %>%
      group_by(target_ids) %>%
      summarize(n = sum(varying_flows)) %>%
      rename(node.name = target_ids) %>%
      mutate(label.pos = 1 - (cumsum(n) - n/2) / sum(n))
  )

my_node_label_y_positions <- 
  lapply(label_pos_dfs, "[", "label.pos") %>% 
  bind_rows() %>% 
  pull(label.pos) 

fig3 <- plot_ly(
  type = "sankey",
  arrangement = "snap",
  node = list(
    label = my_labels,
    
    # Avoiding 0 values seemed to help
    x = c(1e-03, 1e-03, 1e-03, 1e-03, 1e-03, 1, 1, 1, 1, 1),
    
    # Not clear to me why these didn't work and we instead need their
    # complements (e.g., 1 - original value) for correct placement, as if the
    # node.y positions were the distance from the top, not the bottom?
    y = 1 - my_node_label_y_positions),
  
  link = list(
    source = my_varying_flows$source_ids,
    target = my_varying_flows$target_ids,
    value = my_varying_flows$varying_flows))

fig3 <- fig3 %>%
  layout(
    title = list(
      text = "fig3 - varying flows in intended order with odd workaround!"
    )
  )

# Nodes appear in intended order. 

fig3
```
![fig3](https://user-images.githubusercontent.com/49799129/151683494-ade9c00c-3c7f-4e1e-811d-af96a21f4ea4.png)

<sup>Created on 2022-01-29 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>
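
For what it's worth, here is a minimal sketch of how the fig3 workaround could be wrapped into a reusable base-R helper for a batch of graphs. `node_y_midpoints()` and its `eps` argument are hypothetical names, not part of plotly. The sketch assumes, as fig3 suggests, that node.y is measured from the top of the plot, and it ignores node padding, so the positions are only approximate.

```r
# Hypothetical helper (not a plotly function): y midpoints for one column of
# Sankey nodes, ordered by node id and measured from the top of the plot.
node_y_midpoints <- function(node_ids, values, eps = 1e-3) {
  # Total flow through each node; tapply() returns groups sorted by id
  totals <- tapply(values, node_ids, sum)
  # Midpoint of each node's share of the column, as a fraction of the total
  mid <- (cumsum(totals) - totals / 2) / sum(totals)
  # Assumption: keep values strictly inside (0, 1), since exact 0 is
  # reported to misbehave (plotly.py issue #3002)
  pmin(pmax(as.numeric(mid), eps), 1 - eps)
}

# Same data as the varying-flows example above
source_ids <- rep(0:4, each = 5)
target_ids <- rep(5:9, times = 5)
varying_flows <-
  c(60, 23, 1, 0, 9, 15, 33, 13, 4, 3, 0, 9, 8, 2, 1, 0, 4, 12, 127, 9, 4, 4, 1, 11, 1)

# Left-hand (source) column first, then right-hand (target) column
node_y <- c(
  node_y_midpoints(source_ids, varying_flows),
  node_y_midpoints(target_ids, varying_flows)
)
node_x <- rep(c(1e-3, 1), each = 5)
```

`node_x` and `node_y` could then be passed as `x` and `y` in `node = list(...)`, in place of the hand-built vectors in fig3. Computing the midpoints from the top directly avoids taking the complement twice.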


