Skip to contents

EXPERIMENTAL! This function joins two dataframes together by key, and then coalesces any columns that have shared names (i.e. fills in NAs). A utility function primarily used internally within nflverse to help build player IDs

Usage

join_coalesce(
  x,
  y,
  by = NULL,
  type = c("left", "inner", "full"),
  ...,
  by.x = NULL,
  by.y = NULL,
  sort = TRUE,
  incomparables = c(NA, NaN)
)

Arguments

x, y

dataframes. Will be coerced to data.table

by

keys to join on, as a plain or named character vector

type

one of "left" (all rows of x and matching rows of y), "inner" (matching rows of x and y), "full" (all rows of x and y)

...

other args passed to merge.data.frame()

by.x, by.y

alternate form of keys to join on - if provided, will override by.

sort

whether to sort output by the join keys

incomparables

keys to NOT match on, i.e. NA should not match on NA.

Value

a data.table joining x and y dataframes together, with every column from both x and y and patching NA values in x with those in y.

Examples


x <- data.frame(id1 = c(NA_character_,letters[1:4]), a = c(1,NA,3,NA,5), b = 1:5 * 10)
y <- data.frame(id2 = c(letters[3:11],NA_character_), a = -(1:10), c = 1:10 * 100)

join_coalesce(x,y, by = c("id1"="id2"))
#> Warning: Join `by` keys in x have NAs
#> Join `by` keys in y have NAs
#>     id1  a  b   c
#> 1:    a NA 20  NA
#> 2:    b  3 30  NA
#> 3:    c -1 40 100
#> 4:    d  5 50 200
#> 5: <NA>  1 10  NA
join_coalesce(x,y, by.x = "id1", by.y = "id2")
#> Warning: Join `by` keys in x have NAs
#> Join `by` keys in y have NAs
#>     id1  a  b   c
#> 1:    a NA 20  NA
#> 2:    b  3 30  NA
#> 3:    c -1 40 100
#> 4:    d  5 50 200
#> 5: <NA>  1 10  NA
join_coalesce(x,y, by = c("id1"="id2"), type = "inner")
#> Warning: Join `by` keys in x have NAs
#> Join `by` keys in y have NAs
#>    id1  a  b   c
#> 1:   c -1 40 100
#> 2:   d  5 50 200
join_coalesce(x,y, by = c("id1"="id2"), type = "full")
#> Warning: Join `by` keys in x have NAs
#> Join `by` keys in y have NAs
#>      id1   a  b    c
#>  1:    a  NA 20   NA
#>  2:    b   3 30   NA
#>  3:    c  -1 40  100
#>  4:    d   5 50  200
#>  5:    e  -3 NA  300
#>  6:    f  -4 NA  400
#>  7:    g  -5 NA  500
#>  8:    h  -6 NA  600
#>  9:    i  -7 NA  700
#> 10:    j  -8 NA  800
#> 11:    k  -9 NA  900
#> 12: <NA>   1 10   NA
#> 13: <NA> -10 NA 1000