The question "Including null values in an Apache Spark Join" has answers for Scala, PySpark and SparkR, but not for sparklyr. I've been unable to figure out how to have inner_join in sparklyr treat null values in a join column as equal. Does anyone know how this can be done in sparklyr?
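A minimal sketch of the behaviour in question (local connection, data and column names are purely illustrative): a plain inner_join drops the rows whose keys are NULL on both sides, because Spark's = comparison never matches NULL.

```r
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

df1 <- copy_to(sc, tibble::tibble(id = c(NA, "foo", "bar"), x = 1:3), "df1")
df2 <- copy_to(sc, tibble::tibble(id = c(NA, "foo", "baz"), y = 4:6), "df2")

# Only the "foo" rows match; the NA/NA pair is silently dropped because
# Spark's `=` never evaluates to TRUE when either side is NULL.
df1 %>% inner_join(df2, by = "id")
```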
You can invoke an implicit cross join (a join with no join condition) and filter the result with IS NOT DISTINCT FROM, Spark SQL's null-safe equality comparison.
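A minimal sketch of that approach (data and names are illustrative; note that I call the JVM-side crossJoin explicitly through invoke(), which avoids having to enable spark.sql.crossJoin.enabled):

```r
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

df1 <- copy_to(sc, tibble::tibble(id1 = c(NA, "foo", "bar"), val1 = 1:3), "df1")
df2 <- copy_to(sc, tibble::tibble(id2 = c(NA, "foo", "baz"), val2 = 4:6), "df2")

# Cross join on the underlying JVM Datasets, then register the result
# back as a sparklyr table so dplyr verbs can be used on it.
joined <- sdf_register(
  invoke(spark_dataframe(df1), "crossJoin", spark_dataframe(df2)),
  "df1_cross_df2"
)

# dbplyr passes a %...% infix through as a SQL operator, so this filter
# becomes `id1 IS NOT DISTINCT FROM id2` (null-safe equality).
joined %>%
  filter(id1 %IS NOT DISTINCT FROM% id2)
```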
The optimized execution plan shows that the filter is pushed into the join, so Spark ends up performing a single join on the null-safe condition rather than materializing the full Cartesian product.
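One way to inspect that plan from R, continuing from the sketch above (the helper name is mine, not part of the sparklyr API):

```r
# Print the Catalyst-optimized logical plan of a sparklyr table.
show_optimized_plan <- function(tbl) {
  tbl %>%
    spark_dataframe() %>%
    invoke("queryExecution") %>%
    invoke("optimizedPlan") %>%
    invoke("toString") %>%
    cat()
}

joined %>%
  filter(id1 %IS NOT DISTINCT FROM% id2) %>%
  show_optimized_plan()
```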
The <=> operator (Spark's null-safe equality operator) should work the same way.
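Continuing with the same joined table, that would look like:

```r
# %<=>% is passed through to Spark SQL as the null-safe `<=>` operator.
joined %>%
  filter(id1 %<=>% id2)
```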
Please note that it is possible to use a dplyr-style cross join (joining both tables on a constant dummy column such as _const), but I'd advise against that, as it is less robust: depending on the context, the optimizer might be unable to recognize that _const is constant.
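For completeness, a sketch of that dplyr-style variant (again with illustrative names; _const is just a dummy constant column):

```r
# Emulate a cross join by joining on a constant helper column, then
# apply the null-safe filter. This relies on the optimizer recognizing
# that `_const` is constant, which is why it is less robust.
df1 %>%
  mutate(`_const` = TRUE) %>%
  inner_join(mutate(df2, `_const` = TRUE), by = "_const") %>%
  select(-`_const`) %>%
  filter(id1 %IS NOT DISTINCT FROM% id2)
```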