
[FEATURE]: Access values calculated / derived by plotly.js as part of plotting #5552

@alexcjohnson

Description

There are a lot of things that plotly.js has to calculate in the course of creating a plot from your data, and many of these would be useful for other purposes, either (1) to avoid repeating the calculation, (2) because there's some nuance about how the calculation should be done, or (3) because as a user you don't even have access to the information needed to do the calculation.

This is similar to full_figure_for_development, but that feature is intended (per the name) only for use during development, to understand how plotly.js sets attribute defaults and which attributes are available and meaningful in a particular situation. It should not be used in production because it's slow and the resulting object can be very large. It also only includes information from the "full" figure data and layout objects, which generally means it captures interactions between attributes but NOT values that are derived from data arrays. One exception is automatically-calculated axis ranges, which clearly depend on the data and ARE available from full_figure_for_development, but we'd love a more efficient (and more accurate, see below) way to get these as well.

The intent here is to allow users to retrieve calculated values that may be useful for other purposes in production situations, without all the overhead of full_figure_for_development. I'm imagining the usage (and implementation) would be similar to full_figure_for_development, with one exception: that method is implemented via Kaleido only, whereas here, at least if you've displayed your figure as a FigureWidget, it should be possible to query the displayed figure directly rather than remaking it in Kaleido. This would ensure that any calculations that depend on the display environment (for example the actual displayed size, if that's inherited from the container, and the exact fonts used by the browser) are properly reflected in the result.

Why should this feature be added?

Some examples of information to be included here:

  • Axis ranges
  • tick0 and dtick (or perhaps the list of precise tick values and associated text that we displayed?)
  • Tick rotation angle
  • Margins, so you can align neighboring elements with specific data positions or ensure multiple plots are properly aligned
  • Box and violin plots: the statistics shown on the plot (min, max, median, quartiles, standard deviation, fences, outliers); see the discussion in [FEATURE]: Support for Method #7 (Pandas/NumPy default) in quartilemethod for Box and Violin plots #5550 re: the ambiguity of quartile values
  • Grouped plots (bar, histogram, box, violin): width and position offset for each trace
  • Stacked plots (bar, histogram, scatter): final base & top for each trace (this could be large, as it'll be two values per data point, so may need to be opt-in?)
  • Automatic colorscales: the min and max values
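
To illustrate the quartile ambiguity mentioned in the box and violin item above, here is a minimal sketch using only Python's standard library. It is not how plotly.js computes quartiles; the sample data is made up, and the method names "exclusive" and "inclusive" are those of Python's statistics.quantiles, which I'm not assuming map one-to-one onto plotly's quartilemethod options. It only shows that two common conventions give different Q1 and Q3 for the same data, which is why reading back the values actually drawn is more reliable than recomputing them.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Two common quartile conventions, as implemented by the standard library.
# Each call returns the three cut points [Q1, median, Q3].
q_exclusive = statistics.quantiles(values, n=4, method="exclusive")
q_inclusive = statistics.quantiles(values, n=4, method="inclusive")

print("exclusive:", q_exclusive)
print("inclusive:", q_inclusive)

# The median agrees, but Q1 and Q3 differ between the two conventions.
print("same median:", q_exclusive[1] == q_inclusive[1])
print("same Q1:", q_exclusive[0] == q_inclusive[0])
```

A user who recomputes quartiles with a different convention than the one used for the plot would get box edges that don't match what is displayed, even though the median lines up.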

I'm sure there are many more. These values would also be useful to JavaScript users. For them, perhaps we can just put the values into a new data structure attached to the graph div, or into new objects attached to gd._fullLayout and to each trace in gd._fullData; they could then access the values directly, with no need for a new method. Python users, though, would need a new method to retrieve this information from JS.
