Short version
Can I replace
source(filename, local = TRUE, encoding = 'UTF-8')
with
eval(parse(filename, encoding = 'UTF-8'))
without any risk of breakage, to make UTF-8 source files work on Windows?
Long version
I am currently loading specific source files via
source(filename, local = TRUE, encoding = 'UTF-8')
However, it is well known that this does not work on Windows, full stop.
As a workaround, Joe Cheng suggested using instead
eval(parse(filename, encoding = 'UTF-8'))
This seems to work quite well¹, but even after consulting the source code of `source`, I don’t understand how they differ in one crucial detail:
Neither `source` nor `sys.source` simply `parse`s and then `eval`s the file content. Instead, they parse the file content, then iterate manually over the parsed expressions and `eval` them one by one. I do not understand why this would be necessary in `sys.source` (`source` at least uses it to show verbose diagnostics, if so instructed; but `sys.source` does nothing of the kind):
for (i in seq_along(exprs)) eval(exprs[i], envir)
What is the purpose of `eval`ing statements separately? And why does it iterate over indices instead of directly over the sub-expressions? What other caveats are there?
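For concreteness, the two evaluation styles being compared can be sketched as follows. The two-line script and the environment names are made up for illustration; for a script that runs without errors, both styles appear to leave the same bindings behind.

```r
# Hypothetical two-line script standing in for a parsed source file.
exprs <- parse(text = c("a <- 1", "b <- a + 1"))

# Style 1: evaluate the whole expression vector in a single call.
env_all <- new.env()
eval(exprs, env_all)

# Style 2: the loop from sys.source, one single-element expression at a time.
env_loop <- new.env()
for (i in seq_along(exprs)) eval(exprs[i], env_loop)

# Compare the bindings created by the two styles.
same <- identical(mget(c("a", "b"), envir = env_all),
                  mget(c("a", "b"), envir = env_loop))
same
```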
To clarify: I am not concerned about the additional parameters of `source` and `parse`, some of which may be set via options.
¹ The reason that `source` is tripped up by the encoding but `parse` isn’t boils down to the fact that `source` attempts to convert the input text. `parse` does no such thing: it reads the file’s byte content as-is and simply marks its `Encoding` as `UTF-8` in memory.
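The marking behaviour described in the footnote can be checked with a small sketch. The temporary file and the variable `x` are assumptions for illustration; the script is written with `useBytes = TRUE` so that the UTF-8 bytes reach the file unchanged regardless of the session locale.

```r
# Build a one-line script containing a non-ASCII string literal (U+00FC)
# and write its UTF-8 bytes to a temporary file without re-encoding them.
script <- enc2utf8("x <- '\u00fc'")
tmp <- tempfile(fileext = ".R")
writeLines(script, tmp, useBytes = TRUE)

# Parse with the encoding declared, then evaluate in a scratch environment.
env <- new.env()
eval(parse(tmp, encoding = "UTF-8"), env)

Encoding(env$x)   # how the string is marked in memory
utf8ToInt(env$x)  # code point of the character that was read
unlink(tmp)
```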
This is not a full answer, as it primarily addresses the `seq_along` part of the question, but it is too lengthy to post as comments.
One key difference between `seq_along` followed by `[` and just using `for (i in x)` (which I believe is similar to `seq_along` followed by `[[` instead of `[`) is that the former preserves the expression. Here is an example to illustrate the difference:
Alternatively:
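A minimal sketch of the `[` versus `[[` distinction, using a made-up two-line script rather than the answer's original example:

```r
exprs <- parse(text = c("a <- 1", "b <- a + 1"))

# `[` returns an expression vector of length one ...
is.expression(exprs[1])    # TRUE
# ... while `[[` extracts the underlying call.
is.expression(exprs[[1]])  # FALSE
is.call(exprs[[1]])        # TRUE

# A for loop over the vector hands out the elements themselves, as `[[` does.
for (e in exprs) stopifnot(is.call(e))

# eval() accepts either form; compare the values it returns.
same_value <- identical(eval(exprs[1]), eval(exprs[[1]]))
same_value
```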
As for whether this has any practical impact compared to `eval(parse(..., keep.source=T))`, I can only say that it could, but I can't imagine a situation where it does.
Note that subsetting the expression separately also leads to the `srcref` attribute being subset along with it, which could conceivably be useful (...maybe?).
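The `srcref` behaviour can be inspected with a short sketch, again on a made-up script. `keep.source = TRUE` is passed explicitly here because its default depends on whether R is running interactively.

```r
exprs <- parse(text = c("a <- 1", "b <- a + 1"), keep.source = TRUE)

# One srcref per top-level expression on the full vector.
n_all <- length(attr(exprs, "srcref"))

# Subsetting with `[` keeps only the matching srcref.
second <- exprs[2]
n_second <- length(attr(second, "srcref"))
as.character(attr(second, "srcref")[[1]])

# Extracting with `[[` yields the bare call, which carries no srcref.
bare_has_srcref <- !is.null(attr(exprs[[2]], "srcref"))

c(n_all, n_second)
bare_has_srcref
```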