# double and integer
0.0 == 0L
#> [1] TRUE
# double and logical
0 == FALSE
#> [1] TRUE
# double and character
0 == "0"
#> [1] TRUE
# date and character
as.Date("2024-05-08") == "2024-05-08"
#> [1] TRUE
February 26, 2025
This interesting behavior was related to {fuj} v0.2.0 lst() behavior. An issue was created to track the fix: jmbarbone/fuj#60.
Short answer: Yes. Feel free to read through the journey of discovery.
The original post showed the behavior with substitute(), which made this feel a little more confusing than it really is.
When trying to compare two values, R makes an attempt to find a common data type. This shouldn’t be new:
In the above examples, all the values are coerced to the same type before the comparison is made. These all result in the values being equal. However, there is a special behavior invoked when we compare lists.
Here’s the wat simplified:
Let’s rewrite this as direct character conversions:
There it is: list(NA) turns into "NA". wat? Well, these two help files provide the relevant information on what is happening here:
Language objects such as symbols and calls are deparsed to character strings before comparison.
?base::Comparison, Details
For lists and pairlists (including language objects such as calls) it deparses the elements individually, except that it extracts the first element of length-one character vectors.
?base::character, Value
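The extraction of length-one character vectors is the documented behavior that I was missing. When calling deparse() on a non-list vector, we get a different result for NA ("NA") and NA_character_ ("NA_character_").
So, here's what happens:
1. R detects that one side of the comparison is a list
2. R converts the list to a character vector
3. For each element in the list, R checks whether it is a length-one character vector
4. If it is, R simply returns the (first element of the) value
5. If it is not, R deparses the value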
We can replicate this with a custom function:
my_convert <- function(x) {
  do <- function(i) {
    if (is.character(i) && length(i) == 1L) {
      # extracts the first element of length-one character vectors
      return(i[1L])
    }
    # deparses elements individually
    deparse1(i, control = NULL)
  }
  # determine routine for each element
  vapply(x, do, "", USE.NAMES = FALSE)
}
x <- list(1:3, NA, NA_real_, NA_character_)
waldo::compare(as.character(x), sapply(x, deparse, control = NULL))
#> `old`: "1:3" "NA" "NA" NA
#> `new`: "1:3" "NA" "NA" "NA"
waldo::compare(as.character(x), my_convert(x))
#> ✔ No differences
There is a default control of "keepNA", which retains additional information about our NA values. The list to character conversion doesn’t seem to use this as all the values are resolved to "NA" rather than "NA_character_", or the like.
---
title: "wat? Character NAs"
subtitle: "But it is documented"
date: 2024-05-20
categories: ["R", "{fuj}", "wat"]
draft: false
---
::: {.callout-note}
This interesting behavior was related to [`{fuj} v0.2.0 lst()`](https://github.com/jmbarbone/fuj/blob/v0.2.0/R/list.R) behavior.
An issue was created to track the fix: [jmbarbone/fuj#60](https://github.com/jmbarbone/fuj/issues/60).
:::
<iframe src="https://fosstodon.org/@barbone/112401680792851241/embed" class="mastodon-embed" style="max-width: 100%; border: 0" width="800" allowfullscreen="allowfullscreen"></iframe><script src="https://fosstodon.org/embed.js" async="async"></script>
Short answer: Yes.
Feel free to read through the journey of discovery.
The original post showed the behavior with [`substitute()`](https://rdrr.io/r/base/substitute.html), which made this feel a little more confusing than it really is.
When trying to compare two values, **R** makes an attempt to find a common data type.
This shouldn't be new:
```{r}
# double and integer
0.0 == 0L
# double and logical
0 == FALSE
# double and character
0 == "0"
# date and character
as.Date("2024-05-08") == "2024-05-08"
```
In the above examples, all the values are coerced to the same type before the comparison is made.
These all result in the values being _equal_.
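The direction of that coercion matters.
`?base::Comparison` gives the order of precedence as character, complex, numeric, integer, logical, raw, so a number compared with a string is converted to a string; the string is not parsed as a number.
A small sketch, with values chosen for illustration rather than taken from the original post:

```{r}
# the double is converted with as.character(), so this is "0" == "0"
0 == "0"
# "0.0" is never parsed as a number; as.character(0) is "0", not "0.0"
0 == "0.0"
# logicals become "TRUE"/"FALSE" when compared with a character
TRUE == "TRUE"
TRUE == "1"
```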
However, there is a special behavior invoked when we compare lists.
Here's the _wat_ simplified:
```{r}
# NA resolves to NA
NA == ""
NA_character_ == ""
# but not in a list
list(NA) == ""
# unless it's a character
list(NA_character_) == ""
```
Let's rewrite this as direct character conversions:
```{r}
as.character(NA)
as.character(NA_character_)
as.character(list(NA))
as.character(list(NA_character_))
```
There it is: `list(NA)` turns into `"NA"`.
_wat_?
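The same conversion can be seen in the comparison itself.
A minimal sketch, assuming only base **R**: if `list(NA)` really becomes `"NA"`, then comparing it against the string `"NA"` should be `TRUE`, while the character `NA` should stay missing:

```{r}
# list(NA) is converted to "NA", so it equals the string "NA"
list(NA) == "NA"
# a length-one character NA is extracted as-is, so the result is NA
list(NA_character_) == "NA"
# the comparison agrees with an explicit as.character() conversion
identical(list(NA) == "", as.character(list(NA)) == "")
```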
Well, these two help files provide the relevant information on what is happening here:
> Language objects such as symbols and calls are deparsed to character strings before comparison.
[`?base::Comparison`, Details](https://rdrr.io/r/base/Comparison.html)
> For lists and pairlists (including language objects such as calls) it deparses the elements individually, except that it extracts the first element of length-one character vectors.
[`?base::character`, Value](https://rdrr.io/r/base/character.html)
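Both quoted rules can be checked directly.
A short sketch with made-up inputs (the symbol `x`, the call `x + y`, and the small list are illustrative and not from the original post):

```{r}
# symbols and calls are deparsed to a single string before comparison
quote(x) == "x"
quote(x + y) == "x + y"
# as.character() on a list: the length-one character vector is extracted,
# while the length-two character vector is deparsed
as.character(list("a", c("a", "b")))
```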
The extraction of length-one character vectors is the documented behavior that I was missing.
When calling [`deparse()`](https://rdrr.io/r/base/deparse.html) on a non-list vector, we get a different result for `NA` and `NA_character_`:
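A minimal check of that claim, using `deparse()` with its default `control` settings:

```{r}
# logical NA deparses to plain "NA"
deparse(NA)
# character NA keeps its type information under the default control
deparse(NA_character_)
```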
So, here's what happens:
1. **R** detects that one side of the comparison is a list
2. **R** converts the list to a character vector
3. For each element in the list, **R** checks whether it is a length-one character vector
4. If it is, **R** simply returns the (first element of the) value
5. If it is not, **R** _deparses_ the value
We can replicate this with a custom function:
```{r}
my_convert <- function(x) {
  do <- function(i) {
    if (is.character(i) && length(i) == 1L) {
      # extracts the first element of length-one character vectors
      return(i[1L])
    }
    # deparses elements individually
    deparse1(i, control = NULL)
  }
  # determine routine for each element
  vapply(x, do, "", USE.NAMES = FALSE)
}
x <- list(1:3, NA, NA_real_, NA_character_)
waldo::compare(as.character(x), sapply(x, deparse, control = NULL))
waldo::compare(as.character(x), my_convert(x))
```
::: {.callout-tip}
There is a default `control` of `"keepNA"`, which retains additional information about our `NA` values.
The `list` to `character` conversion doesn't seem to use this as all the values are resolved to `"NA"` rather than `"NA_character_"`, or the like.
```{r}
sapply(x, deparse, control = "keepNA")
```
:::