The following sets of options for breaks and legends offer expanded
functionality over the basic use of the biscale package, allowing users
to customize how their data are sorted into different bins on the
legend as well as how that legend appears.
As of v1.0.0, biscale functions accept factors as well as numeric
vectors. This allows users to exert far greater control over how
bivariate classes are ultimately calculated. To start, we’ll load our
dependencies and sample data:
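A minimal sketch of that loading step, assuming the sample data is the stl_race_income object bundled with biscale (it contains the pctWhite and medInc columns used throughout) and that sf is installed so the spatial object can be handled:

```r
# load dependencies
library(biscale)
library(sf)  # assumption: the sample data is an sf object

# load the sample data bundled with biscale (assumed to be the data used here)
data <- stl_race_income

# quick look at the variable we are about to bin
summary(data$pctWhite)
```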
If we investigate the pctWhite vector in our sample data, we’ll see
that the data are percentage values. Using style = "quantile" will
group these based on the distribution of values, yielding breaks at
approximately 14% and 62%. Perhaps we would rather group our values
manually, making breaks at 33.3% and 66.6% instead.
Now that biscale accepts factors, we can construct our breaks ahead of
time and pass them to the cut() function from base R. We need to ensure
that the vector of breaks has one more value than the value we will use
for the dim argument in biscale’s functions. Therefore, if we intend to
create a three-by-three bivariate map, the vector passed to cut()’s
breaks argument needs to have four values.
data$pctWhite_bin <- cut(data$pctWhite,
                         breaks = c(0, 33.3, 66.6, max(data$pctWhite)),
                         include.lowest = TRUE)
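To make the "one more break than dim" rule concrete, here is a small self-contained sketch in base R. The pct_white vector and the n_dim name are invented for illustration and stand in for data$pctWhite and the dim argument:

```r
# hypothetical percentage values standing in for data$pctWhite
pct_white <- c(0, 5.2, 13.9, 28.4, 33.3, 47.0, 61.8, 66.6, 80.1, 96.7)

# intended size of the bivariate grid (the value later given to dim)
n_dim <- 3

# n_dim + 1 break points produce n_dim bins
manual_breaks <- c(0, 33.3, 66.6, max(pct_white))
stopifnot(length(manual_breaks) == n_dim + 1)

# include.lowest = TRUE keeps the minimum value (0) in the first bin
pct_white_bin <- cut(pct_white, breaks = manual_breaks, include.lowest = TRUE)

nlevels(pct_white_bin)  # number of bins
table(pct_white_bin)    # observations per bin
```

Without include.lowest = TRUE, any observation equal to the lowest break would become NA, which is why the example sets it.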
Using a similar approach, you can use classInt::classIntervals() to
calculate your breaks as well. For example, "kmeans" is not included as
one of the styles in biscale, but we can apply it to our data and use
it as the basis for constructing breaks:
## calculate breaks (namespace-qualified, so classInt need not be attached)
breaks <- classInt::classIntervals(data$pctWhite, n = 3, style = "kmeans")$brks
## cut data
data$pctWhite_bin <- cut(data$pctWhite, breaks = breaks, include.lowest = TRUE)
The classInt::classIntervals() function is what bi_class() uses
internally to calculate breaks for continuous variables, so it is also
possible to manually replicate these calculations using this approach.
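As a rough illustration of replicating quantile breaks by hand, the sketch below uses stats::quantile() with its default settings. That this matches what classIntervals() does for style = "quantile" is an assumption, not something confirmed here, and the pct_white values are invented:

```r
# hypothetical percentage values standing in for data$pctWhite
pct_white <- c(2.1, 4.8, 9.5, 14.0, 22.3, 37.6, 51.2, 62.0,
               78.4, 88.9, 93.5, 96.7)

# intended size of the bivariate grid
n_dim <- 3

# n_dim + 1 evenly spaced probabilities give n_dim quantile bins
quantile_breaks <- quantile(pct_white,
                            probs = seq(0, 1, length.out = n_dim + 1),
                            names = FALSE)

# convert the numeric vector into a factor with n_dim levels
pct_white_bin <- cut(pct_white, breaks = quantile_breaks,
                     include.lowest = TRUE)

table(pct_white_bin)  # roughly equal counts per bin
```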
No matter the approach we’ve used to create it, we can use our factor
pctWhite_bin with bi_class():
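A sketch of such a call, assuming data is the stl_race_income sample data bundled with biscale and that pctWhite_bin is built with the manual breaks from earlier:

```r
library(biscale)
library(sf)  # assumption: the sample data is an sf object

# sample data bundled with biscale (assumed to be the data used here)
data <- stl_race_income

# pre-made factor with three levels, matching dim = 3
data$pctWhite_bin <- cut(data$pctWhite,
                         breaks = c(0, 33.3, 66.6, max(data$pctWhite)),
                         include.lowest = TRUE)

# x is a factor, y is numeric and is binned with the "quantile" style
data <- bi_class(data, x = pctWhite_bin, y = medInc,
                 style = "quantile", dim = 3)

# the bivariate classes are stored in a new bi_class column
table(data$bi_class)
```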
The bi_class() function will ensure that the number of factor levels in
pctWhite_bin matches the value given for dim. Since medInc is a
continuous measure, it will be binned using the "quantile" approach. If
both the x and y variables are factors, style can be omitted. From this
point forward, the biscale workflow is the same as in the basic
examples.
As of v1.0.0, biscale provides two sets of tools for further
customizing your legends. These include the addition of breaks or
labels to each axis as well as the addition of padding between each
grid square on the legend. To start, we’ll load our dependencies and
sample data:
To take advantage of biscale’s new functionality for adding labels or
breaks to legends, there is a companion function to bi_class() named
bi_class_breaks(). The arguments are largely the same, though
bi_class_breaks() contains some additional arguments for formatting the
output. These options will significantly influence what your legend
looks like. Of particular note are dig_lab, which controls the number
of digits returned, and split, which determines whether you create
labels (if split = FALSE) or breaks (if split = TRUE):
## example 1
labels1 <- bi_class_breaks(data, x = pctWhite, y = medInc, style = "quantile",
                           dim = 3, dig_lab = 3, split = FALSE)
## example 2
breaks2 <- bi_class_breaks(data, x = pctWhite, y = medInc, style = "quantile",
                           dim = 3, dig_lab = c(x = 2, y = 5), split = TRUE)
What is crucial here is that you use the same style, and the same x and
y columns, that you use with bi_class(), so that the values on the
legend match the classes on the map.
The results illustrate important differences between the two examples:
> ## example 1
> labels1
$bi_x
[1] "0-14" "14-62" "62-96.7"
$bi_y
[1] "1.05e+04-2.62e+04" "2.62e+04-4.39e+04" "4.39e+04-7.44e+04"
>
> ## example 2
> breaks2
$bi_x
[1] 0 14 62 97
$bi_y
[1] 10545 26185 43913 74425
In the first example, dig_lab = 3 is applied to both the x and y
vectors, and split = FALSE creates labels where the range of values for
each bin is shown separated by a dash. For these specific vectors,
dig_lab = 3 produces inconsistently rounded values for x and scientific
notation for y.
In the second example, dig_lab = c(x = 2, y = 5) uses a named vector to
apply different dig_lab values to x and y. This results in consistently
rounded values for x and no scientific notation for y, a big
improvement! Since split = TRUE, we get breaks instead of labels.
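The scientific notation seen with dig_lab = 3 can be reproduced with the dig.lab argument of cut() in base R. This sketch assumes bi_class_breaks() formats its values in a similar way, which the output above is consistent with but which is not confirmed here. The income values are hypothetical, while the break points are copied from breaks2:

```r
# break points copied from the breaks2 output; income values are made up
income_breaks <- c(10545, 26185, 43913, 74425)
income <- c(12000, 30000, 50000)

# three digits are too few for five-digit values, so the labels
# fall back to scientific notation
labels_3 <- levels(cut(income, breaks = income_breaks,
                       include.lowest = TRUE, dig.lab = 3))

# five digits are enough to print each break in full
labels_5 <- levels(cut(income, breaks = income_breaks,
                       include.lowest = TRUE, dig.lab = 5))

labels_3
labels_5
```

The same reasoning explains why a single dig_lab value rarely suits both axes when x is a percentage and y is a dollar amount.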
The specific values needed for dig_lab are entirely dependent on your
data, and some experimentation will likely be necessary to produce
values you are happy with. We’ll recreate labels1 before proceeding,
using what we learned about the best dig_lab values:
## example 1 (modified)
labels1 <- bi_class_breaks(data, x = pctWhite, y = medInc, style = "quantile",
                           dim = 3, dig_lab = c(2, 5), split = FALSE)
Notice here that we use an unnamed vector for the dig_lab argument.
bi_class_breaks() will accept either a named or an unnamed vector.
If you are using pre-made factors, these can be passed to
bi_class_breaks() as well. Picking up from the example above, the
factor variable pctWhite_bin is passed to the x argument:
bi_class_breaks(data, x = pctWhite_bin, y = medInc, style = "quantile",
                dim = 3, dig_lab = c(x = NA, y = 5), split = FALSE)
Note that an NA value is passed to dig_lab for x since pctWhite_bin has
already been created as a factor. If you are using cut() to create your
factor, whether from manual breaks or from breaks calculated with
classInt::classIntervals(), use cut()’s dig.lab argument instead to
format your labels or breaks with the desired number of digits.
Once you have values that are ready to use, they can be passed to
bi_legend(). To illustrate the difference between labels and breaks,
we’ll place the legends next to each other for comparison. First, our
code:
## example 1 (modified)
legend1 <- bi_legend(pal = "PurpleGrn",
                     xlab = "% White",
                     ylab = "Income",
                     size = 12,
                     breaks = labels1,
                     arrows = FALSE)
## example 2
legend2 <- bi_legend(pal = "PurpleGrn",
                     xlab = "% White",
                     ylab = "Income",
                     size = 12,
                     breaks = breaks2,
                     arrows = FALSE)
We have passed our objects containing labels or breaks, labels1 and
breaks2 respectively, to the optional breaks argument. Since the legend
now shows how the values change along each axis, we can simplify the
axis labels. In both cases, we set arrows = FALSE to suppress the
default arrows and pass shorter text to both xlab and ylab.
Here are the results:
For comparison, here is the default legend:
If you desire a clearer delineation of the classifications within the
palette, you can use the optional pad_width and pad_color arguments to
style the legend.
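A minimal sketch of a padded legend follows. The specific values chosen for pad_width and pad_color are illustrative assumptions, not recommendations from the package, and will likely need adjusting for your own layout:

```r
library(biscale)

# pad_width sets the gap between tiles; pad_color sets the gap's color
# (both values below are illustrative choices)
legend_padded <- bi_legend(pal = "PurpleGrn",
                           dim = 3,
                           xlab = "% White",
                           ylab = "Income",
                           size = 12,
                           pad_width = 1.5,
                           pad_color = "#ffffff")

legend_padded
```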
As of v1.1.0, biscale’s legends accept non-Latin characters. The
bi_legend() function now has a base_family argument that can be used to
set the legend’s font family. It defaults to "sans", which has always
been the font family used in biscale. However, users who wish to use
non-Latin characters may find that "sans" will not print their inputs.
By setting base_family = "", those characters can now be used to create
legends in biscale if the suggested package showtext is installed.
If you want to use non-Latin characters, you can either install
showtext individually (faster) or install all of the suggested
dependencies at once (slower, and it will also give you a number of
other packages you may or may not want):
## install just showtext
install.packages("showtext")
## install all suggested dependencies
install.packages("biscale", dependencies = TRUE)
Once you have showtext installed, you should call
showtext::showtext_auto() prior to using bi_legend():
# turn on showtext rendering so non-Latin text can be drawn
showtext::showtext_auto()
# create legend
bi_legend(pal = "GrPink",
          dim = 3,
          xlab = "白人",
          ylab = "收入",
          size = 12,
          arrows = TRUE,
          base_family = "")
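Because showtext is only a suggested package, a script that may run on machines without it can guard the call. This is a minimal base R sketch, and the has_showtext name is purely illustrative:

```r
# check for the suggested package without attaching it
has_showtext <- requireNamespace("showtext", quietly = TRUE)

if (has_showtext) {
  # turn on showtext rendering for subsequent plots
  showtext::showtext_auto()
} else {
  message("showtext is not installed; non-Latin legend text may not render")
}
```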
When you use bi_theme(), be sure to set base_family = "" as well so
that you can use non-Latin characters there, too.