Commit

docs: add more examples for other relevant techniques
kanitw committed May 18, 2024
1 parent 744c346 commit 8dcaeae
Showing 7 changed files with 116 additions and 11 deletions.
18 changes: 9 additions & 9 deletions build/vega-lite-schema.json

Large diffs are not rendered by default.

29 changes: 29 additions & 0 deletions examples/specs/layer_null_data.vl.json
@@ -0,0 +1,29 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"width": 300,
"data": {
"values": [
{"a": "Jan 1, 2000", "b": 28},
{"a": "Jan 2, 2000", "b": 55},
{"a": "Jan 3, 2000", "b": null},
{"a": "Jan 4, 2000", "b": 55},
{"a": "Jan 5, 2000", "b": 43},
{"a": "Jan 6, 2000", "b": null},
{"a": "Jan 7, 2000", "b": 55},
{"a": "Jan 8, 2000", "b": 43}
]
},
"layer": [{
"mark": "line",
"encoding": {
"x": {"timeUnit": "yearmonthdate", "field": "a", "type": "temporal", "axis": {"format": "%d %b"}},
"y": {"field": "b", "type": "quantitative"}
}
}, {
"transform": [{"filter": "datum.b === null"}],
"mark": {"type": "bar", "color": "red", "opacity": 0.2},
"encoding": {
"x": {"timeUnit": "yearmonthdate", "field": "a", "type": "temporal", "bandPosition": 0}
}
}]
}
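A minimal sketch, in TypeScript rather than Vega-Lite, of which rows the second layer ends up drawing. The `DailySample` type and the `isNullSample` helper are hypothetical names introduced only for illustration; the predicate mirrors the spec's filter expression `datum.b === null`, and the first layer has no filter, so it still receives every row.

```typescript
// Hypothetical row type matching the inline data of layer_null_data.vl.json.
interface DailySample {
  a: string;
  b: number | null;
}

const samples: DailySample[] = [
  {a: 'Jan 1, 2000', b: 28},
  {a: 'Jan 2, 2000', b: 55},
  {a: 'Jan 3, 2000', b: null},
  {a: 'Jan 4, 2000', b: 55},
  {a: 'Jan 5, 2000', b: 43},
  {a: 'Jan 6, 2000', b: null},
  {a: 'Jan 7, 2000', b: 55},
  {a: 'Jan 8, 2000', b: 43}
];

// Same test as the bar layer's filter: strict equality with null.
// It matches null only, so NaN or undefined values would not be selected.
function isNullSample(d: DailySample): boolean {
  return d.b === null;
}

// Dates that the red bar layer would highlight.
const highlightedDates: string[] = samples.filter(isNullSample).map(d => d.a);
console.log(highlightedDates);
```

With the inline data above, only Jan 3 and Jan 6 pass the filter, so those are the two dates that receive a translucent red bar.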
36 changes: 36 additions & 0 deletions examples/specs/window_impute_null.vl.json
@@ -0,0 +1,36 @@
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "Using window transform to impute missing values in a line chart by averaging the previous and next values.",
"width": 300,
"data": {
"values": [
{"a": "Jan 1, 2000", "b": 28},
{"a": "Jan 2, 2000", "b": 55},
{"a": "Jan 3, 2000", "b": null},
{"a": "Jan 4, 2000", "b": 65},
{"a": "Jan 5, 2000", "b": 43},
{"a": "Jan 6, 2000", "b": null},
{"a": "Jan 7, 2000", "b": 55},
{"a": "Jan 8, 2000", "b": 43}
]
},
"transform": [{
"window": [{
"op": "lag",
"field": "b",
"as": "prev"
},{
"op": "lead",
"field": "b",
"as": "next"
}]
}, {
"calculate": "datum.b === null ? (datum.prev + datum.next)/2 : datum.b",
"as": "b"
}],
"mark": {"type": "line", "point": true},
"encoding": {
"x": {"timeUnit": "yearmonthdate", "field": "a", "type": "temporal", "axis": {"format": "%d %b"}},
"y": {"field": "b", "type": "quantitative"}
}
}
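The two transform steps in this spec can be sketched in plain TypeScript to check the arithmetic by hand. This is only an illustration under stated assumptions, not how Vega-Lite evaluates the spec: `Reading` and `imputeByNeighbours` are hypothetical names, rows are assumed to arrive in date order, and, unlike the spec's `calculate` expression, the sketch leaves a gap untouched when either neighbour is also missing.

```typescript
// Hypothetical row type mirroring the inline data of window_impute_null.vl.json.
interface Reading {
  a: string;
  b: number | null;
}

// Step 1 (window): look up the previous (lag) and next (lead) value of `b`.
// Step 2 (calculate): replace a null `b` with the mean of those two values.
function imputeByNeighbours(readings: Reading[]): Reading[] {
  return readings.map((reading, i) => {
    const prev = i > 0 ? readings[i - 1].b : null;
    const next = i < readings.length - 1 ? readings[i + 1].b : null;
    if (reading.b !== null) return {...reading};
    // Assumption that differs from the spec: keep the gap when a neighbour is
    // missing. In plain JavaScript arithmetic, null would coerce to 0 instead.
    if (prev === null || next === null) return {...reading};
    return {...reading, b: (prev + next) / 2};
  });
}

const readings: Reading[] = [
  {a: 'Jan 1, 2000', b: 28},
  {a: 'Jan 2, 2000', b: 55},
  {a: 'Jan 3, 2000', b: null},
  {a: 'Jan 4, 2000', b: 65},
  {a: 'Jan 5, 2000', b: 43},
  {a: 'Jan 6, 2000', b: null},
  {a: 'Jan 7, 2000', b: 55},
  {a: 'Jan 8, 2000', b: 43}
];

const imputed = imputeByNeighbours(readings);
console.log(imputed.map(r => r.b)); // [28, 55, 60, 65, 43, 49, 55, 43]
```

For the inline data, Jan 3 becomes (55 + 65) / 2 = 60 and Jan 6 becomes (43 + 55) / 2 = 49. The sketch also makes the technique's limits visible: two consecutive nulls, or a null in the first or last row, have no valid pair of neighbours to average.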
4 changes: 4 additions & 0 deletions site/_data/examples.json
@@ -520,6 +520,10 @@
{
"name": "layer_point_line_loess",
"title": "Loess Regression"
},
{
"name": "window_impute_null",
"title": "Using window transform to impute missing values by averaging the previous and next values."
}
]
},
33 changes: 32 additions & 1 deletion site/docs/invaliddata.md
@@ -7,6 +7,9 @@ permalink: /docs/invalid-data.html

This page discusses modes in Vega-Lite for handling invalid data (`null` and `NaN` in continuous scales).

The main configurations are [`mark.invalid`](#mark) and [`config.scale.invalid`](#scale).
In addition, you can use [other Vega-Lite features, including conditional encodings, layering, and the window transform, to handle invalid and missing data](#other).

Note: Vega-Lite does _not_ consider `null` and `NaN` in categorical scales and text encodings as invalid data:

- Categorical scales can treat nulls and NaNs as separate categories
@@ -25,7 +28,7 @@ Note: Vega-Lite does _not_ consider `null` and `NaN` in categorical scales and t

{:#mark}

- You can set the invalid data mode via `mark.invalid` (or `config.mark.invalid`) to configure how Vega-Lite handles invalid data (`null` and `NaN` in continuous scales).
+ You can use `mark.invalid` (or `config.mark.invalid`) to configure how marks and their corresponding scales handle invalid data (`null` and `NaN` in continuous scales).

{% include table.html props="invalid" source="MarkDef" %}

@@ -101,3 +104,31 @@ A visualization with `"filter"` invalid data mode will not filter (not exclude)
Compare this with a similar spec, but without `config.scale.invalid`:

<div class="vl-example" data-name="test_invalid_color_size_mark_filter_only"></div>


## Other solutions
{:#other}

Note that `mark.invalid` and `config.scale.invalid` are options for handling invalid data *without* changing data or marks.

However, you may use other Vega-Lite features to encode invalid data.


### Example: Conditional Encoding

If the color channel is not otherwise in use, you can add a conditional color encoding that gives invalid values a specific color (e.g., gray).

<div class="vl-example" data-name="point_invalid_color"></div>


### Example: Layering

You may also layer a different mark (such as bars) on top of the chart to encode null data.

<div class="vl-example" data-name="layer_null_data"></div>


### Example: Using window transform to impute missing values


<div class="vl-example" data-name="window_impute_null"></div>
5 changes: 5 additions & 0 deletions site/docs/transform/window.md
@@ -117,3 +117,8 @@ Here we use window transform to visualize how the average MPG for vehicles have
### Percent of Total

The window transform _can_ be used to compute an aggregate and attach it to all records in order to derive a percent of total. However, a simpler approach is to use the [join aggregate](joinaggregate.html) transform instead.


### Using window transform to impute missing values

<div class="vl-example" data-name="window_impute_null"></div>
2 changes: 1 addition & 1 deletion src/invalid.ts
@@ -8,7 +8,7 @@ import {isObject} from 'vega-util';
*/
export interface MarkInvalidMixins {
/**
- * Invalid data mode for marks, which defines how the visualization should represent invalid values (`null` and `NaN` in continuous scales without defined output for invalid values) in the marks and their scale domains.
+ * Invalid data mode, which defines how the marks and corresponding scales should represent invalid values (`null` and `NaN` in continuous scales *without* defined output for invalid values).
*
* - `"filter"` — *Exclude* all invalid values from the visualization's *marks* and *scales*.
* For path marks (for line, area, trail), this option will create paths that connect valid points, as if the data rows with invalid values do not exist.
