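RStudio crashed when I tried to reshape a particular data frame using dcast (from the reshape2 package). I found that the crash was actually happening in R itself, so I ran my casting code in R.app and got the type of error that gives this site its name: Error: segfault from C stack overflow. With the help of Google and SO, I learned that this is a memory access error.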
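OK, I got that far, but I don't know where to go from here. I can't provide a truly reproducible example, because my data frame has about 558,000 rows and the problem doesn't occur with small toy examples. For example, even if I take, say, a 50,000-row subset of the data, dcast works fine. Could particular rows of the data be causing the problem? If so, can anyone suggest function(s) for finding the rows that might cause the type of error I'm getting?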
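Below is a subset of the data frame I'm casting from (with fake values for some variables), followed by the cast call I'm using. I've also included the dput output for this small snippet after that, in case it helps to play around with it. The real data set has about 700 values of prog, 15 values of prog1, and 5 values of fa.type.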
  id        term   yr    nslds acad.lev    prog            prog1 fa.type amount
1  1   Fall 2009 2010 Graduate Graduate  loan 1      Other Loans    Loan   5000
2  1 Spring 2010 2010 Graduate Graduate  loan 1      Other Loans    Loan   5000
3  2   Fall 2009 2010 Graduate Graduate  loan 2    Stafford Loan    Loan   8781
4  2 Spring 2010 2010 Graduate Graduate  loan 2    Stafford Loan    Loan   8781
5  3   Fall 2007 2008 Graduate Graduate  loan 3    Stafford Loan    Loan   4250
6  3   Fall 2007 2008 Graduate Graduate grant 1 University Grant   Grant   1707
fa.wide = dcast(id + term + yr + nslds + acad.lev ~ prog1 + fa.type , data=fa, value.var="amount", fun.aggregate=sum)
fa = structure(list(id = c(1, 1, 2, 2, 3, 3), term = structure(c(7L,
8L, 7L, 8L, 1L, 1L), .Label = c("Fall 2007", "Spring 2008", "Summer 2008",
"Fall 2008", "Spring 2009", "Summer 2009", "Fall 2009", "Spring 2010",
"Summer 2010", "Fall 2010", "Spring 2011", "Summer 2011", "Fall 2011",
"Spring 2012", "Summer 2012", "Fall 2012", "Spring 2013"), class = c("ordered",
"factor")), yr = c(2010L, 2010L, 2010L, 2010L, 2008L, 2008L),
nslds = structure(c(7L, 7L, 7L, 7L, 7L, 7L), .Label = c("1st Year, Never Attended",
"1st Year, Previously Attended", "2nd Year", "3rd Year",
"4th Year", "5th Year+", "Graduate"), class = c("ordered",
"factor")), acad.lev = structure(c(6L, 6L, 6L, 6L, 6L, 6L
), .Label = c("Freshman", "Sophomore", "Junior", "Senior",
"PB Undergrad", "Graduate"), class = c("ordered", "factor"
)), prog = c("loan 1", "loan 1", "loan 2", "loan 2", "loan 3",
"grant 1"), prog1 = c("Other Loans", "Other Loans", "Stafford Loan",
"Stafford Loan", "Stafford Loan", "University Grant"), fa.type = structure(c(3L,
3L, 3L, 3L, 3L, 2L), .Label = c("Athletic", "Grant", "Loan",
"Scholarship", "Waiver", "Work/Study"), class = "factor"),
amount = c(5000, 5000, 8781, 8781, 4250, 1707)), .Names = c("id",
"term", "yr", "nslds", "acad.lev", "prog", "prog1", "fa.type",
"amount"), row.names = c(NA, 6L), class = "data.frame")
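This is not an answer, but a simple (if nonsensical) reproducible example that won't fit in a comment. You can produce this error with the following simple example (on my MacBook Pro).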
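library(reshape2)
n <- 1448
# Two rows per student, with random grades between 1 and 100
df <- data.frame(Student = rep(1:n, each = 2), Grade = sample(100, n * 2, replace = TRUE))
# Casting Student against itself builds an n-by-n result
df2 <- dcast(df, Student ~ Student, value.var = "Grade", fun.aggregate = sum)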
Error: segfault from C stack overflow
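The error occurs at the boundary n = 1448, i.e. it does not occur for n = 1447 and below. The error appears to come from split_indices in split-numeric.c from the plyr package. It might have to do with the number of grouping levels being assigned to an (unsigned?) integer value, so that a memory access error results if the number of groups goes over 32767, but TBH I am clutching at straws right now.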
My sessionInfo(), in case anyone can't reproduce this error, is:
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reshape2_1.2.2
loaded via a namespace (and not attached):
[1] plyr_1.8 stringr_0.6.2
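Interestingly, if I run the df2 <- command again after getting the first error, R crashes completely and I get an error report generated by the OS. I have included the relevant part of the crash log here: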
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_PROTECTION_FAILURE at 0x00007fff5f3ff120
VM Regions Near 0x7fff5f3ff120:
JS JIT generated code 00004d431a401000-00004d431a402000 [ 4K] ---/rwx SM=NUL
--> STACK GUARD 00007fff5bc00000-00007fff5f400000 [ 56.0M] ---/rwx SM=NUL stack guard for thread 0
Stack 00007fff5f400000-00007fff5fc00000 [ 8192K] rw-/rwx SM=COW thread 0
Application Specific Information:
objc[57147]: garbage collection is OFF
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_c.dylib 0x00007fff897c4632 small_free_scan_madvise_free + 41
1 libsystem_c.dylib 0x00007fff897c5f06 szone_free_definite_size + 4186
2 libsystem_c.dylib 0x00007fff897fe789 free + 194
3 libR.dylib 0x0000000100222dbf R_gc_internal + 7327 (memory.c:952)
4 libR.dylib 0x0000000100224919 Rf_allocVector + 841 (memory.c:2356)
5 plyr.so 0x000000010144bd2c split_indices + 204 (split-numeric.c:23)
6 libR.dylib 0x00000001001b4cc7 do_dotcall + 16311 (dotcode.c:593)
7 libR.dylib 0x00000001001e4448 Rf_eval + 1672 (eval.c:494)
8 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
9 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
10 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
11 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
12 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
13 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
14 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
15 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
16 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
17 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
18 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
19 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
20 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
21 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
22 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
23 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
24 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
25 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
26 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
27 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
28 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
29 libR.dylib 0x00000001001e5edd do_begin + 141 (eval.c:1415)
30 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
31 libR.dylib 0x00000001001e93b1 Rf_applyClosure + 849 (eval.c:861)
32 libR.dylib 0x00000001001e41b2 Rf_eval + 1010 (eval.c:512)
33 libR.dylib 0x00000001001e74e5 do_set + 709 (eval.c:1717)
34 libR.dylib 0x00000001001e429c Rf_eval + 1244 (eval.c:468)
35 libR.dylib 0x000000010021c761 R_ReplDLLdo1 + 481 (main.c:362)
36 org.R-project.R 0x0000000100022c24 run_REngineRmainloop + 196
37 org.R-project.R 0x00000001000159b7 -[REngine runREPL] + 119
38 org.R-project.R 0x0000000100001f24 main + 852
39 org.R-project.R 0x0000000100001914 start + 52
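As a rough sanity check on the n = 1448 boundary, the arithmetic below compares a hypothetical per-group buffer with the 8192K stack shown in the crash log. Two things here are assumptions and are not confirmed by anything above: that the Student ~ Student cast creates one group per cell of an n-by-n grid, and that split_indices keeps one 4-byte C int per group on the C stack. The name buffer_bytes is illustrative.

```r
# Stack size reported in the crash log above ("Stack ... [ 8192K]").
stack_bytes <- 8192 * 1024

# ASSUMPTION: one 4-byte C int per group is placed on the C stack.
int_bytes <- 4

# ASSUMPTION: casting Student ~ Student creates one group per cell
# of an n-by-n grid, i.e. n^2 groups.
buffer_bytes <- function(n) n^2 * int_bytes

# Compare the hypothetical buffer with the stack for a few values of n.
for (n in c(1000, 1447, 1448)) {
  share <- buffer_bytes(n) / stack_bytes
  cat(sprintf("n = %d: %.0f bytes, %.2f%% of the stack\n",
              n, buffer_bytes(n), 100 * share))
}
```

Under these assumptions the buffer for n = 1448 would fill nearly the whole stack, which would be consistent with a failure near that value. It does not establish the cause, and the 32767 figure mentioned above plays no part in this estimate.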
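I had the same problem when pivoting a long table to a wide one using dcast from the reshape2 package. I found the solution in this post: plyr split_indices function crashes on long vectors. Specifically, you can download split-numeric.c and loop-apply.c from this page: https://github.com/hadley/plyr/tree/master/src . Uninstall the plyr package from the R console, and finally reinstall the package locally: install.packages("/path/to/source", repos = NULL, type = "source").
This solved my problem; I hope it helps.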
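The uninstall and reinstall steps described above could be sketched roughly as follows. The variable plyr_src and its value are placeholders for wherever the patched plyr sources were saved, and the guard means the script does nothing if that directory does not exist.

```r
# Placeholder: directory holding the plyr sources with the updated C files.
plyr_src <- "/path/to/source"

if (dir.exists(plyr_src)) {
  # Remove the installed plyr first, if there is one.
  if ("plyr" %in% rownames(installed.packages())) {
    remove.packages("plyr")
  }
  # Build and install from the local source directory instead of a repository.
  install.packages(plyr_src, repos = NULL, type = "source")
} else {
  message("Source directory not found: ", plyr_src)
}
```

Installing from source needs a working compiler toolchain, and R should be restarted afterwards so that the newly built plyr is the one that gets loaded.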