Commit 8d07b26

automatic color identity scale (#673)
* automatic color identity scale
* only identity if all values are colors (#676)
* only identity if all values are colors
* comments
* don’t use identity if scheme or range
* comments on scale type inference
* DRY
* Update README
1 parent f37cbd2 commit 8d07b26

6 files changed: +2849 -29 lines changed

README.md

Lines changed: 7 additions & 7 deletions
@@ -145,9 +145,9 @@ Plot.plot({
 })
 ```
 
-Plot supports many scale types. Some scale types are for quantitative data - values that can be added or subtracted, such as temperature or time. Other scale types are for ordinal or categorical data - unquantifiable values that can only be ordered, such as t-shirt sizes, or values with no inherent order that can only be tested for equality, such as types of fruit. Some scale types are further intended for specific visual encodings - for example, as [position](#position-options) or [color](#color-options).
+Plot supports many scale types. Some scale types are for quantitative data: values that can be added or subtracted, such as temperature or time. Other scale types are for ordinal or categorical data: unquantifiable values that can only be ordered, such as t-shirt sizes, or values with no inherent order that can only be tested for equality, such as types of fruit. Some scale types are further intended for specific visual encodings: for example, as [position](#position-options) or [color](#color-options).
 
-You can set the scale type explicitly via the *scale*.**type** option, but typically the scale type is inferred automatically from data: strings and booleans imply an ordinal scale; dates imply a UTC scale; anything else is linear. Unless they represent text, we recommend explicitly converting strings to more specific types when loading data (*e.g.*, with d3.autoType or Observable’s FileAttachment). For simplicity’s sake, Plot assumes that data is consistently typed; type inference is based solely on the first non-null, non-undefined value. Certain mark types also imply a scale type; for example, the [Plot.barY](#plotbarydata-options) mark implies that the *x* scale is a *band* scale.
+You can set the scale type explicitly via the *scale*.**type** option, though typically the scale type is inferred automatically. Some marks mandate a particular scale type: for example, [Plot.barY](#plotbarydata-options) requires that the *x* scale is a *band* scale. Some scales have a default type: for example, the *radius* scale defaults to *sqrt* and the *opacity* scale defaults to *linear*. Most often, the scale type is inferred from associated data, pulled either from the domain (if specified) or from associated channels. A *color* scale defaults to *identity* if no range or scheme is specified and all associated defined values are valid CSS color strings. Otherwise, strings and booleans imply an ordinal scale; dates imply a UTC scale; and anything else is linear. Unless they represent text, we recommend explicitly converting strings to more specific types when loading data (*e.g.*, with d3.autoType or Observable’s FileAttachment). For simplicity’s sake, Plot assumes that data is consistently typed; type inference is based solely on the first non-null, non-undefined value.
 
 For quantitative data (*i.e.* numbers), a mathematical transform may be applied to the data by changing the scale type:
 
@@ -252,7 +252,7 @@ Similarly, the *y* and *fy* scales support asymmetric insets with:
 
 The inset scale options can provide “breathing room” to separate marks from axes or the plot’s edge. For example, in a scatterplot with a Plot.dot with the default 3-pixel radius and 1.5-pixel stroke width, an inset of 5 pixels prevents dots from overlapping with the axes. The *scale*.round option is useful for crisp edges by rounding to the nearest pixel boundary.
 
-In addition to the generic *ordinal* scale type, which requires an explicit output range value for each input domain value, Plot supports special *point* and *band* scale types for encoding ordinal data as position. These scale types accept a [*min*, *max*] range similar to quantitative scales, and divide this continuous interval into discrete points or bands based on the number of distinct values in the domain (*i.e.*, the domain’s cardinality). If the associated marks have no effective width along the ordinal dimension - such as a dot, rule, or tick - then use a *point* scale; otherwise, say for a bar, use a *band* scale. In the image below, the top *x*-scale is a *point* scale while the bottom *x*-scale is a *band* scale; see [Plot: Scales](https://observablehq.com/@observablehq/plot-scales) for an interactive version.
+In addition to the generic *ordinal* scale type, which requires an explicit output range value for each input domain value, Plot supports special *point* and *band* scale types for encoding ordinal data as position. These scale types accept a [*min*, *max*] range similar to quantitative scales, and divide this continuous interval into discrete points or bands based on the number of distinct values in the domain (*i.e.*, the domain’s cardinality). If the associated marks have no effective width along the ordinal dimension - such as a dot, rule, or tick - then use a *point* scale; otherwise, say for a bar, use a *band* scale. In the image below, the top *x*-scale is a *point* scale while the bottom *x*-scale is a *band* scale; see [Plot: Scales](https://observablehq.com/@observablehq/plot-scales) for an interactive version.
 
 <img src="./img/point-band.png" width="640" height="144" alt="point and band scales">
 
@@ -287,7 +287,7 @@ Top-level options are also supported as shorthand: **grid** (for *x* and *y* onl
 
 ### Color options
 
-The normal scale types - *linear*, *sqrt*, *pow*, *log*, *symlog*, and *ordinal* - can be used to encode color. In addition, Plot supports special scale types for color:
+The normal scale types - *linear*, *sqrt*, *pow*, *log*, *symlog*, and *ordinal* - can be used to encode color. In addition, Plot supports special scale types for color:
 
 * *categorical* - equivalent to *ordinal*, but defaults to the *tableau10* scheme
 * *sequential* - equivalent to *linear*
@@ -986,7 +986,7 @@ The following channels are optional:
 
 Typically either **x1** and **x2** are specified, or **y1** and **y2**, or both.
 
-If an **interval** is specified, such as d3.utcDay, **x1** and **x2** can be derived from **x**: *interval*.floor(*x*) is invoked for each *x* to produce *x1*, and *interval*.offset(*x1*) is invoked for each *x1* to produce *x2*. The same is true for *y*, *y1*, and *y2*, respectively. If the interval is specified as a number *n*, *x1* and *x2* are taken as the two consecutive multiples of *n* that bracket *x*. The interval may be specified either as as {x, interval} or x: {value, interval} - typically to apply different intervals to x and y.
+If an **interval** is specified, such as d3.utcDay, **x1** and **x2** can be derived from **x**: *interval*.floor(*x*) is invoked for each *x* to produce *x1*, and *interval*.offset(*x1*) is invoked for each *x1* to produce *x2*. The same is true for *y*, *y1*, and *y2*, respectively. If the interval is specified as a number *n*, *x1* and *x2* are taken as the two consecutive multiples of *n* that bracket *x*. The interval may be specified either as as {x, interval} or x: {value, interval} to apply different intervals to x and y.
 
 The rect mark supports the [standard mark options](#marks), including insets and rounded corners. The **stroke** defaults to none. The **fill** defaults to currentColor if the stroke is none, and to none otherwise.
 
@@ -1269,7 +1269,7 @@ Filters the data given the specified *test*. The test can be given as an accesso
 
 [<img src="./img/bin.png" width="320" height="198" alt="a histogram of athletes by weight">](https://observablehq.com/@observablehq/plot-bin)
 
-[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@observablehq/plot-bin) · Aggregates continuous data - quantitative or temporal values such as temperatures or times - into discrete bins and then computes summary statistics for each bin such as a count or sum. The bin transform is like a continuous [group transform](#group) and is often used to make histograms. There are separate transforms depending on which dimensions need binning: [Plot.binX](#plotbinxoutputs-options) for *x*; [Plot.binY](#plotbinyoutputs-options) for *y*; and [Plot.bin](#plotbinoutputs-options) for both *x* and *y*.
+[Source](./src/transforms/bin.js) · [Examples](https://observablehq.com/@observablehq/plot-bin) · Aggregates continuous data - quantitative or temporal values such as temperatures or times - into discrete bins and then computes summary statistics for each bin such as a count or sum. The bin transform is like a continuous [group transform](#group) and is often used to make histograms. There are separate transforms depending on which dimensions need binning: [Plot.binX](#plotbinxoutputs-options) for *x*; [Plot.binY](#plotbinyoutputs-options) for *y*; and [Plot.bin](#plotbinoutputs-options) for both *x* and *y*.
 
 Given input *data* = [*d₀*, *d₁*, *d₂*, …], by default the resulting binned data is an array of arrays where each inner array is a subset of the input data [[*d₀₀*, *d₀₁*, …], [*d₁₀*, *d₁₁*, …], [*d₂₀*, *d₂₁*, …], …]. Each inner array is in input order. The outer array is in ascending order according to the associated dimension (*x* then *y*). Empty bins are skipped. By specifying a different aggregation method for the *data* output, as described below, you can change how the binned data is computed. The outputs may also include *filter* and *sort* options specified as aggregation methods, and a *reverse* option to reverse the order of generated bins. By default, empty bins are omitted, and non-empty bins are generated in ascending threshold order.
 
@@ -1407,7 +1407,7 @@ Bins on *y*. Also groups on *x* and first channel of *z*, *fill*, or *stroke*, i
 
 [<img src="./img/group.png" width="320" height="198" alt="a histogram of penguins by species">](https://observablehq.com/@observablehq/plot-group)
 
-[Source](./src/transforms/group.js) · [Examples](https://observablehq.com/@observablehq/plot-group) · Aggregates ordinal or categorical data - such as names - into groups and then computes summary statistics for each group such as a count or sum. The group transform is like a discrete [bin transform](#bin). There are separate transforms depending on which dimensions need grouping: [Plot.groupZ](#plotgroupzoutputs-options) for *z*; [Plot.groupX](#plotgroupxoutputs-options) for *x* and *z*; [Plot.groupY](#plotgroupyoutputs-options) for *y* and *z*; and [Plot.group](#plotgroupoutputs-options) for *x*, *y*, and *z*.
+[Source](./src/transforms/group.js) · [Examples](https://observablehq.com/@observablehq/plot-group) · Aggregates ordinal or categorical data - such as names - into groups and then computes summary statistics for each group such as a count or sum. The group transform is like a discrete [bin transform](#bin). There are separate transforms depending on which dimensions need grouping: [Plot.groupZ](#plotgroupzoutputs-options) for *z*; [Plot.groupX](#plotgroupxoutputs-options) for *x* and *z*; [Plot.groupY](#plotgroupyoutputs-options) for *y* and *z*; and [Plot.group](#plotgroupoutputs-options) for *x*, *y*, and *z*.
 
 Given input *data* = [*d₀*, *d₁*, *d₂*, …], by default the resulting grouped data is an array of arrays where each inner array is a subset of the input data [[*d₀₀*, *d₀₁*, …], [*d₁₀*, *d₁₁*, …], [*d₂₀*, *d₂₁*, …], …]. Each inner array is in input order. The outer array is in natural ascending order according to the associated dimension (*x* then *y*). Empty groups are skipped. By specifying a different aggregation method for the *data* output, as described below, you can change how the grouped data is computed. The outputs may also include *filter* and *sort* options specified as aggregation methods, and a *reverse* option to reverse the order of generated groups. By default, all (non-empty) groups are generated in ascending natural order.
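
Note on the README hunk at line 150: the fallback rule in the rewritten paragraph (strings and booleans imply ordinal, dates imply UTC, anything else is linear, decided by the first non-null, non-undefined value) can be modeled in a few lines. This is a minimal sketch and not Plot's code: `inferTypeFromValues` is a hypothetical name, and returning "linear" when no defined value exists is an assumption made here for the demo.

```javascript
// Hypothetical helper (not part of Plot): models only the data-driven fallback
// rule from the README paragraph, ignoring marks, defaults, and color identity.
function inferTypeFromValues(values) {
  for (const value of values) {
    if (value == null) continue; // skip null and undefined
    if (typeof value === "string" || typeof value === "boolean") return "ordinal";
    if (value instanceof Date) return "utc";
    return "linear"; // first defined value is neither string, boolean, nor date
  }
  return "linear"; // ASSUMPTION for this sketch: no defined values means linear
}

console.log(inferTypeFromValues([null, "a", 1]));       // "ordinal"
console.log(inferTypeFromValues([new Date(0), "a"]));   // "utc"
console.log(inferTypeFromValues([1, "a"]));             // "linear"
```

The last call shows why the README recommends consistently typed data: only the first defined value is inspected, so the later string is never seen.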

src/options.js

Lines changed: 33 additions & 4 deletions
@@ -25,9 +25,6 @@ export const first = d => d[0];
 export const second = d => d[1];
 export const constant = x => () => x;
 
-// A few extra color keywords not known to d3-color.
-const colors = new Set(["currentColor", "none"]);
-
 // Some channels may allow a string constant to be specified; to differentiate
 // string constants (e.g., "red") from named fields (e.g., "date"), this
 // function tests whether the given value is a CSS color string and returns a
@@ -37,7 +34,7 @@ const colors = new Set(["currentColor", "none"]);
 export function maybeColorChannel(value, defaultValue) {
   if (value === undefined) value = defaultValue;
   return value === null ? [undefined, "none"]
-    : typeof value === "string" && (colors.has(value) || color(value)) ? [undefined, value]
+    : isColor(value) ? [undefined, value]
     : [value, undefined];
 }
 
@@ -210,6 +207,38 @@ export function isNumeric(values) {
   }
 }
 
+export function isColors(values) {
+  for (const value of values) {
+    if (value == null) continue;
+    return isColor(value);
+  }
+}
+
+// Whereas isColors only tests the first defined value and returns undefined for
+// an empty array, this tests all defined values and only returns true if all of
+// them are valid colors. It also returns true for an empty array, and thus
+// should generally be used in conjunction with isColors.
+export function isAllColors(values) {
+  for (const value of values) {
+    if (value == null) continue;
+    if (!isColor(value)) return false;
+  }
+  return true;
+}
+
+// Mostly relies on d3-color, with a few extra color keywords. Currently this
+// strictly requires that the value be a string; we might want to apply string
+// coercion here, though note that d3-color instances would need to support
+// valueOf to work correctly with InternMap.
+export function isColor(value) {
+  if (!(typeof value === "string")) return false;
+  value = value.toLowerCase();
+  return value === "currentcolor" || value === "none" || color(value) !== null;
+}
+
+// Like a sort comparator, returns a positive value if the given array of values
+// is in ascending order, a negative value if the values are in descending
+// order. Assumes monotonicity; only tests the first and last values.
 export function order(values) {
   if (values == null) return;
   const first = values[0];
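
Note on `isColors` and `isAllColors`: the comment in the hunk above says the two should generally be used together, because `isAllColors` is vacuously true for an empty array while `isColors` returns undefined when there is no defined value. The sketch below tries to make that concrete. It copies the two loops from the diff, but `isColor` here is a stand-in: the real one delegates to d3-color, whereas this version assumes only hex notation and three named colors. `looksLikeColors` is a hypothetical name for the combined check.

```javascript
// Stand-in for the diff's isColor. ASSUMPTION: instead of d3-color, this demo
// recognizes only #rgb / #rrggbb hex strings and a tiny set of color names.
const knownNames = new Set(["red", "green", "blue"]);
function isColor(value) {
  if (typeof value !== "string") return false;
  value = value.toLowerCase();
  return value === "currentcolor" || value === "none"
    || knownNames.has(value) || /^#(?:[0-9a-f]{3}|[0-9a-f]{6})$/.test(value);
}

// Same logic as the diff: test only the first defined value.
function isColors(values) {
  for (const value of values) {
    if (value == null) continue;
    return isColor(value);
  }
}

// Same logic as the diff: test every defined value; true for an empty array.
function isAllColors(values) {
  for (const value of values) {
    if (value == null) continue;
    if (!isColor(value)) return false;
  }
  return true;
}

// Hypothetical combined check: at least one defined value, and all are colors.
const looksLikeColors = values => Boolean(isColors(values) && isAllColors(values));

console.log(looksLikeColors(["red", "#00f", null])); // true
console.log(looksLikeColors(["red", "Adelie"]));     // false
console.log(isAllColors([]));                        // true
console.log(looksLikeColors([]));                    // false
```

Running the cheap first-value test before the full scan also means that typical non-color data is rejected after looking at a single value.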

src/scales.js

Lines changed: 49 additions & 18 deletions
@@ -1,5 +1,5 @@
 import {parse as isoParse} from "isoformat";
-import {isOrdinal, isTemporal, order} from "./options.js";
+import {isAllColors, isColors, isOrdinal, isTemporal, order} from "./options.js";
 import {registry, color, position, radius, opacity, symbol, length} from "./scales/index.js";
 import {ScaleLinear, ScaleSqrt, ScalePow, ScaleLog, ScaleSymlog, ScaleQuantile, ScaleThreshold, ScaleIdentity} from "./scales/quantitative.js";
 import {ScaleDiverging, ScaleDivergingSqrt, ScaleDivergingPow, ScaleDivergingLog, ScaleDivergingSymlog} from "./scales/diverging.js";
@@ -187,36 +187,67 @@ function Scale(key, channels = [], options = {}) {
   }
 }
 
-function inferScaleType(key, channels, {type, domain, range}) {
+function inferScaleType(key, channels, {type, domain, range, scheme}) {
+  // The facet scales are always band scales; this cannot be changed.
   if (key === "fx" || key === "fy") return "band";
-  if (type !== undefined) {
-    for (const {type: t} of channels) {
-      if (t !== undefined && type !== t) {
-        throw new Error(`scale incompatible with channel: ${type} !== ${t}`);
-      }
-    }
-    return type;
+
+  // If a channel dictates a scale type, make sure that it is consistent with
+  // the user-specified scale type (if any) and all other channels. For example,
+  // barY requires x to be a band scale and disallows any other scale type.
+  for (const {type: t} of channels) {
+    if (t === undefined) continue;
+    else if (type === undefined) type = t;
+    else if (type !== t) throw new Error(`scale incompatible with channel: ${type} !== ${t}`);
   }
-  if (registry.get(key) === radius) return "sqrt";
-  if (registry.get(key) === opacity || registry.get(key) === length) return "linear";
-  if (registry.get(key) === symbol) return "ordinal";
-  for (const {type} of channels) if (type !== undefined) return type;
-  if ((domain || range || []).length > 2) return asOrdinalType(key);
+
+  // If the scale, a channel, or user specified a (consistent) type, return it.
+  if (type !== undefined) return type;
+
+  // Some scales have default types.
+  const kind = registry.get(key);
+  if (kind === radius) return "sqrt";
+  if (kind === opacity || kind === length) return "linear";
+  if (kind === symbol) return "ordinal";
+
+  // For color scales, if no range or scheme is specified and all associated
+  // defined values (from the domain if present, and otherwise from channels)
+  // are valid colors, then default to the identity scale. This allows, for
+  // example, a fill channel to return literal colors; without this, the colors
+  // would be remapped to a categorical scheme!
+  if (kind === color
+      && range === undefined
+      && scheme === undefined
+      && (domain !== undefined
+        ? isColors(domain) && isAllColors(domain)
+        : channels.some(({value}) => value !== undefined && isColors(value))
+        && channels.every(({value}) => value === undefined || isAllColors(value)))) return "identity";
+
+  // If the domain or range has more than two values, assume it’s ordinal. You
+  // can still use a “piecewise” (or “polylinear”) scale, but you must set the
+  // type explicitly.
+  if ((domain || range || []).length > 2) return asOrdinalType(kind);
+
+  // Otherwise, infer the scale type from the data! Prefer the domain, if
+  // present, over channels. (The domain and channels should be consistently
+  // typed, and the domain is more explicit and typically much smaller.) We only
+  // check the first defined value for expedience and simplicitly; we expect
+  // that the types are consistent.
   if (domain !== undefined) {
-    if (isOrdinal(domain)) return asOrdinalType(key);
+    if (isOrdinal(domain)) return asOrdinalType(kind);
     if (isTemporal(domain)) return "utc";
     return "linear";
   }
+
   // If any channel is ordinal or temporal, it takes priority.
   const values = channels.map(({value}) => value).filter(value => value !== undefined);
-  if (values.some(isOrdinal)) return asOrdinalType(key);
+  if (values.some(isOrdinal)) return asOrdinalType(kind);
   if (values.some(isTemporal)) return "utc";
   return "linear";
 }
 
 // Positional scales default to a point scale instead of an ordinal scale.
-function asOrdinalType(key) {
-  switch (registry.get(key)) {
+function asOrdinalType(kind) {
+  switch (kind) {
     case position: return "point";
     case color: return "categorical";
     default: return "ordinal";
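
Note on the reconciliation loop in `inferScaleType`: channel-mandated types are now checked first, against both the user-specified type and each other, before any scale default applies. A standalone sketch of that loop may help when reading the hunk. `reconcileType` is a hypothetical name, and channels are modeled here as plain objects with an optional `type` property, which is an assumption about their shape based only on the destructuring in the diff.

```javascript
// Hypothetical standalone copy of the loop at the top of inferScaleType.
// Returns the agreed type, or undefined if neither the user nor any channel
// specified one; throws if two sources disagree.
function reconcileType(type, channels) {
  for (const {type: t} of channels) {
    if (t === undefined) continue;          // channel has no opinion
    else if (type === undefined) type = t;  // adopt the first declared type
    else if (type !== t) throw new Error(`scale incompatible with channel: ${type} !== ${t}`);
  }
  return type;
}

console.log(reconcileType(undefined, [{}, {type: "band"}])); // "band"
console.log(reconcileType("band", [{type: "band"}]));        // "band"
console.log(reconcileType(undefined, [{}]));                 // undefined

try {
  reconcileType("linear", [{type: "band"}]);
} catch (error) {
  console.log(error.message); // "scale incompatible with channel: linear !== band"
}
```

One consequence visible in the diff: two channels that mandate different types now throw even when the user sets no type, whereas the removed code only ran the consistency check inside `if (type !== undefined)`.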
