我有一个大的数据集,我想编写自定义合并功能与适用于使用,但我解决不了一定的问题。 我不能用一个循环,因为它会花费太长的时间。 数据大致是这样的;
# [ Name, Strand, Start, End ]
R1 = c( 'GeneA', '+', 1000, 1500 )
R2 = c( 'GeneA', '+', 1510, 2000 )
R3 = c( 'GeneA', '+', 2001, 2500 )
R4 = c( 'GeneB', '-', 3100, 4000 )
中的数据是与行R1一个data.frame:R4
到目前为止,我能得到哪些比较Ri和Rj的功能(J = +1),并合并他们,如果名称是一样的,钢绞线是一样的,他们之间的差距是小于100。
GAP = Ri[End] - Rj[Start]
如果我申请的功能,每一行,建立输出。 该函数然后输出应该创建
R1 = c( 'GeneA', '+', 1000, 2500 )
R4 = c( 'GeneB', '-', 3100, 4000 )
我能得到哪些可以使用适用于合并两个连续的元素融入其中工作的功能,但我无法弄清楚如何合并连续的三个要素。 一个丑陋的解决方案是,直到没有额外的合并是由运行连续两个元素合并功能,但这不是有效的。 我坚持的想法有点和任何见解将不胜感激。
编辑:澄清,该数据是通过在连续的染色体起始位置排序(未示出),并且每个基因名称在数据中设置在不同的位置出现多次(即基因A可以是1000-2500,然后再在4000-5000。我不希望合并这两个基因,只有连续的)。
编辑2:我已经使用以下添P溶液。 一个问题出现在合并的效率。 也就是有没有办法上传文本文件堆栈溢出,所以我可以显示什么数据真的看起来像,然后张贴我的脚本这么远?
# Let ValidMerge be a logical vector of MergeDown corresponding to the rows in the data
# as defined by TimP below
# The data.frame is called RMDB
# TeMerge function is previously defined to merge two rows Ri and Rj into one entry
# which spans the start of Ri to the end of Rj (with same name and strand)
COUNT = 0
RMDB.OUT = 0
while (COUNT < nrow(RMDB)) { # Cycle through all rows of RMDB
COUNT = COUNT + 1
# Is this position a merger start?
# If yes, then returns position in startpt
# which will be the end position in endpt
# If no, returns 0
Merge = match(COUNT,startpt,nomatch=0)
if ( Merge == 0 ){
# No merge starts at this position
RMDB.OUT = rbind(RMDB.OUT,RMDB[COUNT,]) # Append COUNT Row to output
}else{
# Merge COUNT row in RMDB to its endpoint
RMDB.OUT = rbind(RMDB.OUT,
TeMerge(RMDB[COUNT,],RMDB[endpt[Merge],]))
#print(paste('Merging Row',COUNT,' and Row',endpt[Merge]))
COUNT = endpt[Merge] # Move Count to the endpoint
}
}
这个脚本正好给我感兴趣的结果,唯一的问题是,我的data.frame有5 000 000项,我想用几个参数,间隙大小,比较的结果运行分析。 有没有我可以重写代码的这一部分更有效的方法吗? 一切直到这一点在合理的时间(〜几分钟)中运行。 这部分已经进行了3小时以上的数据的700 000子集。
编辑3:其具有所有的情况下在顶部(MIR3)的加标数据。 忽略列1:4,8,11:15。
dput(RMDB)结构(列表(V1 = C(3612L,318L,318L,318L,318L,318L,318L,318L,318L,318L,318L,318L,318L,318L,741L,444L,741L,407L,2059L, 407L,656L,2058L,257L,4051L,456L,351L,850L,335L,1000L,1566L,236L,588L,3877L,750L,2292L,783L,747L,666L,260L,1118L,341L,7010L,320L,7010L, 249L,458L,24L,631L,631L,875L),V2 = C(11.4,23,23,23,23,23,23,23,23,23,23,23,23,23,18.7,11.6,24.9 ,28.8,14.1,28.8,23.6,18.3,25,如图7所示,8.2,23.6,24.9,29.5,13.5,19.4,34.8,17.4,22.9,27.6,12,26.6,30.4,12.9,38.5,35.4,27.8,19.2 ,0,17.3,21.2,19.3,0,3.9,26.6,22.6),V3 = C(27,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8,3.8, 4.5,0,0,7.3,0.3,7.3,12.2,4.9,0,0.4,0,1.8,15.1,12.7,0,3.6,3,7.5,14,9.4,0,14.1,4.1,4,1.4, 9.4,5.9,2.7,0,9.3,1,0,0,0,1.8,9.5),V4 = C(1.3,0,0,0,0,0,0,0,0,0,0,0 ,0,0,0,3.1,1.1,3,0.3%,3,3,0,0,0,0,3.6,0.8,2.7,0,2.8,0.4,0.8,1.6,6.4,0,3.8,3.7 ,0,0,2.6 ,1.5,2.5,2.4,1.4,2,5,0,0,0.6,0.5),V5 =结构(C(1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L, 1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L, 1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L),.Label = “CHR1” 的class = “因子”),V6 = C(10469L,20001L ,20210L,21000L,21201L,22000L,22201L,23000L,23201L,20000L,20001L,24001L,24205L,24405L,0L,30855L,30953L,31293L,31436L,31734L,32841L,33048L,33466L,33529L,34048L,34451L,34565L ,35217L,35367L,37045L,37733L,38059L,38256L,39465L,39624L,39925L,40333L,40629L,40736L,41380L,42370L,43243L,44836L,44877L,45887L,46079L,46217L,46416L,46553L,46893L),V7 = C(11447L,20200L,20400L,21200L,21400L,22200L,22400L,23200L,23400L,20001L,20200L,24200L,24400L,24600L,0L,30952L,31131L,31435L,31733L,31754L,33037L,33456L,33509L,34041L, 34108L,34560L,34921L,35366L,35499L,37431L,37861L,38191L,39464L,39623L,39924L,40294L,40626L ,40729L,40878L,42285L,42504L,44835L,44876L,45753L,45987L,46198L,46240L,46493L,46722L,47092L),V8 =结构(C(38L,37L,37L,37L,37L,37L,37L,37L, 37L,37L,37L,37L,37L,37L,36L,35L,34L,33L,32L,31L,30L,29L,28L,27L,26L,25L,24L,23L,22L,21L,20L,19L,18L, 17L,16L,15L,14L,13L,12L,11L,10L,9L,8L,7L,6L,5L,4L,3L,2L,1L),.Label = C( “(249203529)”,“(249203899) ”, “(249204128)”, “(249204381)”, “(249204423)”, “(249204634)”, “(249204868)”, “(249205745)”, “(249205786)”, “(249208117)”, “(249208336)”, “(249209743)”, “(249209892)”, “(249209995)”, “(249210327)”, “(249210697)”, “(249210998)”, “(249211157)”,“( 249212430) “ ”(249212760)“, ”(249213190)“, ”(249215122)“, ”(249215255)“, ”(249215700)“, ”(249216061)“, ”(249216513)“,”(249216580) ”, “(249217112)”, “(249217165)”, “(249217584)”, “(249218867)”, “(249218888)”, “(249219186)”, “(249219490)”, “(249219669)”, “(249219773)”, “(249235266)”, “(249239174)”),类= “因子”),V9 =结构(C(2L,1L,1L,2L, 2L,2L,2L,2L,1L,2L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,2L,1L,2L,2L,2L,1L, 1L,1L,1L,1L,1L,1L,1L,1L,2L,2L,2L,1L,1L,1L,1L,1L,1L,1L,2L,1L,2L),.Label = C(“+ ”, “C”)中,class = “因素”),V10 =结构(C(32L,23L,23L,23L,23L,23L,24L,23L,23L,23L,23L,23L,23L,23L,34L, 33L,27L,26L,1L,26L,22L,12L,5L,14L,13L,16L,30L,25L,2L,7L,16L,15L,29L,28L,3L,28L,15L,4L,18L,8L, 19L,10L,31L,10L,9L,11L,6L,17L,20L,21L),.Label = C( “AluJo”, “AluJr”, “AluSx”, “AluSz6”, “AluYc”, “AT_rich”, “Charlie5”, “ERVL-E-INT”, “L1M5”, “L1MA8”, “L1MA9”, “L1MB5”, “L1P1”, “L1PA6”, “L2a而”, “L2c的”, “LTR12F”,“LTR16C “ ”MamRep1527“, ”MER45A“, ”MER58A“, ”MIR“, ”MIR3“, ”MIR3a“, ”MIRb“, ”MIRC“, ”MLT1A“, ”MLT1E1A“, ”MLT1E1A-INT“,” MLT1J2 ”, “(TAAA)n” 个, “TAR1”, “(TC)N”, “XXXXX”),类= “因子”),V11 =结构(C(10L,13L,13L,13L,13L,13L, 13L,13L,13L,13L,13L,13L,13L,13L,14L,11L,9L,13L,12L,13L,13L,3L,12L,3L,3L,4L,9L,13L,12L, 1L,4L,4L,9L,9L,12L,9L,4L,12L,8L,8L,6L,3L,11L,3L,3L,3L,5L,7L,2L,1L),.Label = C(“DNA / HAT-查理 “ ”DNA /帽Tip100“, ”LINE / L1“, ”LINE / L2“, ”Low_complexity“, ”LTR“, ”LTR / ERV1“, ”LTR / ERVL“,” LTR / ERVL -MaLR”, “卫星/ TELO”, “Simple_repeat”, “SINE /铝”, “SINE / MIR”, “XXXXXXXX”),类= “因子”),V12 =结构(C(19L,5L,5L, 5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,2L,9L,8L,25L,2L,10L,9L,23L,2L,1L,15L,12L,1L,6L, 2L,11L,16L,13L,7L,2L,2L,8L,14L,1L,3L,26L,17L,18L,9L,21L,22L,24L,2L,4L,27L,20L),.Label = C( “(0)”, “1”, “(113)”, “(115)”, “(119)”, “(13)”, “131”, “173”, “2”, “218”, “2234”, “(231)”, “2705”, “2923”, “2970”, “3242”, “359”, “3715”, “(399)”, “(4)”, “5306”, “5334”, “5746”, “6167”, “67”, “(685)”, “7”),类= “因子”),V13 = C(1712L,143L,143L,143L,143L,143L, 143L,143L,143L,143L,143L,143L,143L,143L,162L,96L,349L,217L,298L,238L,216L,6174L,44L,6154L,3030L,3156L,448L,255L,133L,2623L,3375L, 2846L,1489L, 172L,301L,666L,3217L,312L,376L,4982L,499L,5305L,41L,6290L,5433L,6280L,24L,404L,178L,220L),V14 =结构(C(23L,24L,24L,24L,24L ,24L,24L,24L,24L,24L,24L,24L,24L,24L,8L,1L,10L,25L,4L,13L,21L,1L,11L,26L,15L,14L,20L,30L,5L,2L ,1L,27L,1L,18L,3L,4L,7L,6L,9L,19L,22L,29L,1L,2L,28L,16L,1L,17L,1L,12L),.Label = C(“(0 ) “ ”(1)“, ”(11)“, ”(14)“, ”(179)“, ”208“, ”(209)“, ”(212)“, ”232“,”(25 ) “ ”(255)“, ”3“, ”(30)“, ”3049“, ”(3116)“, ”(32)“, ”327“, ”(388)“, ”4016“,” 41" , “(46)”, “(470)”, “483”, “49”, “(51)”, “5640”, “(573)”, “(691)”, “(838)” , “91”),类= “因子”),V15 = C(2L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,22L,23L,22L ,24L,25L,24L,26L,27L,28L,29L,30L,31L,32L,33L,34L,35L,36L,37L,38L,38L,39L,38L,37L,40L,41L,42L,43L,44L ,45L,44L,46L,47L,48L,49L,50L,51L)),.Names = C( “V1”, “V2”, “V3”, “V4”, “V5”, “V6”,“V7 “ ”V8“, ”V9“, ”V10“, ”V11“, ”V12“, ”V13“, ”V14“, ”V15“),类=” data.frame ”,row.names = C(NA,-50L))
这里是应用ValidMerge后的输出。
dput(RMDB.OUT) structure(list(V1 = c(3612L, NA, NA, 318L, 318L, 318L, 318L, 318L, 318L, NA, 741L, 444L, 741L, 407L, 2059L, 407L, 656L, 2058L, 257L, 4051L, 456L, 351L, 850L, 335L, 1000L, 1566L, 236L, 588L, 3877L, 750L, 2292L, 783L, 747L, 666L, 260L, 1118L, 341L, 7010L, 320L, 7010L, 249L, 458L, 24L, 631L, 631L, 875L), V2 = c(11.4, NA, NA, 23, 23, 23, 23, 23, 23, NA, 18.7, 11.6, 24.9, 28.8, 14.1, 28.8, 23.6, 18.3, 25, 7, 8.2, 23.6, 24.9, 29.5, 13.5, 19.4, 34.8, 17.4, 22.9, 27.6, 12, 26.6, 30.4, 12.9, 38.5, 35.4, 27.8, 19.2, 0, 17.3, 21.2, 19.3, 0, 3.9, 26.6, 22.6), V3 = c(27, NA, NA, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, NA, 4.5, 0, 0, 7.3, 0.3, 7.3, 12.2, 4.9, 0, 0.4, 0, 1.8, 15.1, 12.7, 0, 3.6, 3, 7.5, 14, 9.4, 0, 14.1, 4.1, 4, 1.4, 9.4, 5.9, 2.7, 0, 9.3, 1, 0, 0, 0, 1.8, 9.5 ), V4 = c(1.3, NA, NA, 0, 0, 0, 0, 0, 0, NA, 0, 3.1, 1.1, 3, 0.3, 3, 3, 0, 0, 0, 0, 3.6, 0.8, 2.7, 0, 2.8, 0.4, 0.8, 1.6, 6.4, 0, 3.8, 3.7, 0, 0, 2.6, 1.5, 2.5, 2.4, 1.4, 2, 5, 0, 0, 0.6, 0.5), V5 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"), V6 = c(10469L, 20001L, 21000L, 22000L, 22201L, 23000L, 23201L, 20000L, 20001L, 24001L, 0L, 30855L, 30953L, 31293L, 31436L, 31734L, 32841L, 33048L, 33466L, 33529L, 34048L, 34451L, 34565L, 35217L, 35367L, 37045L, 37733L, 38059L, 38256L, 39465L, 39624L, 39925L, 40333L, 40629L, 40736L, 41380L, 42370L, 43243L, 44836L, 44877L, 45887L, 46079L, 46217L, 46416L, 46553L, 46893L), V7 = c(11447L, 20400L, 21400L, 22200L, 22400L, 23200L, 23400L, 20001L, 20200L, 24600L, 0L, 30952L, 31131L, 31435L, 31733L, 31754L, 33037L, 33456L, 33509L, 34041L, 34108L, 34560L, 34921L, 35366L, 35499L, 37431L, 37861L, 38191L, 39464L, 39623L, 39924L, 40294L, 40626L, 40729L, 40878L, 42285L, 42504L, 44835L, 44876L, 45753L, 45987L, 46198L, 46240L, 46493L, 46722L, 47092L), V8 = structure(c(38L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 36L, 35L, 34L, 33L, 32L, 31L, 30L, 29L, 28L, 27L, 26L, 25L, 24L, 23L, 22L, 21L, 20L, 19L, 18L, 17L, 16L, 15L, 14L, 13L, 12L, 11L, 10L, 9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L), .Label = c("(249203529)", "(249203899)", "(249204128)", "(249204381)", "(249204423)", "(249204634)", "(249204868)", "(249205745)", "(249205786)", "(249208117)", "(249208336)", "(249209743)", "(249209892)", "(249209995)", "(249210327)", "(249210697)", "(249210998)", "(249211157)", "(249212430)", "(249212760)", "(249213190)", "(249215122)", "(249215255)", "(249215700)", "(249216061)", "(249216513)", "(249216580)", "(249217112)", "(249217165)", "(249217584)", "(249218867)", "(249218888)", "(249219186)", "(249219490)", "(249219669)", "(249219773)", "(249235266)", "(249239174)"), class = "factor"), V9 = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L), .Label = c("+", "C"), class = "factor"), V10 = structure(c(32L, 23L, 23L, 23L, 24L, 23L, 23L, 23L, 23L, 23L, 34L, 33L, 27L, 26L, 1L, 26L, 22L, 12L, 5L, 14L, 13L, 16L, 30L, 25L, 2L, 7L, 16L, 15L, 29L, 28L, 3L, 28L, 15L, 4L, 18L, 8L, 19L, 10L, 31L, 10L, 9L, 11L, 6L, 17L, 20L, 21L), .Label = c("AluJo", "AluJr", "AluSx", "AluSz6", "AluYc", "AT_rich", "Charlie5", "ERVL-E-int", "L1M5", "L1MA8", "L1MA9", "L1MB5", "L1P1", "L1PA6", "L2a", "L2c", "LTR12F", "LTR16C", "MamRep1527", "MER45A", "MER58A", "MIR", "MIR3", "MIR3a", "MIRb", "MIRc", "MLT1A", "MLT1E1A", "MLT1E1A-int", "MLT1J2", "(TAAA)n", "TAR1", "(TC)n", "XXXXX"), class = "factor"), V11 = structure(c(10L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 14L, 11L, 9L, 13L, 12L, 13L, 13L, 3L, 12L, 3L, 3L, 4L, 9L, 13L, 12L, 1L, 4L, 4L, 9L, 9L, 12L, 9L, 4L, 12L, 8L, 8L, 6L, 3L, 11L, 3L, 3L, 3L, 5L, 7L, 2L, 1L), .Label = c("DNA/hAT-Charlie", "DNA/hAT-Tip100", "LINE/L1", "LINE/L2", "Low_complexity", "LTR", "LTR/ERV1", "LTR/ERVL", "LTR/ERVL-MaLR", "Satellite/telo", "Simple_repeat", "SINE/Alu", "SINE/MIR", "XXXXXXXX"), class = "factor"), V12 = structure(c(19L, NA, NA, 5L, 5L, 5L, 5L, 5L, 5L, NA, 2L, 9L, 8L, 25L, 2L, 10L, 9L, 23L, 2L, 1L, 15L, 12L, 1L, 6L, 2L, 11L, 16L, 13L, 7L, 2L, 2L, 8L, 14L, 1L, 3L, 26L, 17L, 18L, 9L, 21L, 22L, 24L, 2L, 4L, 27L, 20L), .Label = c("(0)", "1", "(113)", "(115)", "(119)", "(13)", "131", "173", "2", "218", "2234", "(231)", "2705", "2923", "2970", "3242", "359", "3715", "(399)", "(4)", "5306", "5334", "5746", "6167", "67", "(685)", "7"), class = "factor"), V13 = c(1712L, NA, NA, 143L, 143L, 143L, 143L, 143L, 143L, NA, 162L, 96L, 349L, 217L, 298L, 238L, 216L, 6174L, 44L, 6154L, 3030L, 3156L, 448L, 255L, 133L, 2623L, 3375L, 2846L, 1489L, 172L, 301L, 666L, 3217L, 312L, 376L, 4982L, 499L, 5305L, 41L, 6290L, 5433L, 6280L, 24L, 404L, 178L, 220L), V14 = structure(c(23L, NA, NA, 24L, 24L, 24L, 24L, 24L, 24L, NA, 8L, 1L, 10L, 25L, 4L, 13L, 21L, 1L, 11L, 26L, 15L, 14L, 20L, 30L, 5L, 2L, 1L, 27L, 1L, 18L, 3L, 4L, 7L, 6L, 9L, 19L, 22L, 29L, 1L, 2L, 28L, 16L, 1L, 17L, 1L, 12L), .Label = c("(0)", "(1)", "(11)", "(14)", "(179)", "208", "(209)", "(212)", "232", "(25)", "(255)", "3", "(30)", "3049", "(3116)", "(32)", "327", "(388)", "4016", "41", "(46)", "(470)", "483", "49", "(51)", "5640", "(573)", "(691)", "(838)", "91"), class = "factor"), V15 = c(2L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 22L, 23L, 22L, 24L, 25L, 24L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 38L, 39L, 38L, 37L, 40L, 41L, 42L, 43L, 44L, 45L, 44L, 46L, 47L, 48L, 49L, 50L, 51L)), .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11", "V12", "V13", "V14", "V15"), class = "data.frame", row.names = c(NA, -46L))
编辑4:对不起,这里是简化版本:初始Data.frame
dput(RMDB.cut)
结构(列表(染色体=结构(C(1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L) ,.Label = “CHR1” 的class = “因子”),起始= C(10469L,20001L,20210L,21000L,21201L,22000L,22201L,23000L,23201L,20000L,20001L,24001L,24205L,24405L,0L,30855L ,30953L,31293L,31436L,31734L),结束= C(11447L,20200L,20400L,21200L,21400L,22200L,22400L,23200L,23400L,20001L,20200L,24200L,24400L,24600L,0L,30952L,31131L,31435L, 31733L,31754L),
(Left)
=结构(C(38L,37L,37L,37L,37L,37L,37L,37L,37L,37L,37L,37L,37L,37L,36L,35L,34L,33L,32L ,31L),.Label = C( “(249203529)”, “(249203899)”, “(249204128)”, “(249204381)”, “(249204423)”, “(249204634)”, “(249204868)” “(249205745)”, “(249205786)”, “(249208117)”, “(249208336)”, “(249209743)”, “(249209892)”, “(249209995)”, “(249210327)”,“ (249210697) “ ”(249210998)“, ”(249211157)“, ”(249212430)“, ”(249212760)“, ”(249213190)“, ”(249215122)“, ”(249215255)“,”(249215700 ) “ ”(249216061)“,”(2 49216513) “ ”(249216580)“, ”(249217112)“, ”(249217165)“, ”(249217584)“, ”(249218867)“, ”(249218888)“, ”(249219186)“,”(249219490) ”, “(249219669)”, “(249219773)”, “(249235266)”, “(249239174)”),类= “因子”),链=结构(C(2L,1L,1L,2L,2L, 2L,2L,2L,1L,2L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L),.Label = C( “+”, “C”)中,class = “因子” ),repName =结构(C(32L,23L,23L,23L,23L,23L,24L,23L,23L,23L,23L,23L,23L,23L,34L,33L,27L,26L,1L,26L)。标签= C( “AluJo”, “AluJr”, “AluSx”, “AluSz6”, “AluYc”, “AT_rich”, “Charlie5”, “ERVL-E-INT”, “L1M5”, “L1MA8”,“L1MA9 ”, “L1MB5”, “L1P1”, “L1PA6”, “L2a而”, “L2c的”, “LTR12F”, “LTR16C”, “MamRep1527”, “MER45A”, “MER58A”, “MIR”, “MIR3”, “MIR3a”, “MIRb”, “MIRC”, “MLT1A”, “MLT1E1A”, “MLT1E1A-INT”, “MLT1J2”, “(TAAA)n” 个, “TAR1”, “(TC)N”,“XXXXX “),类= “因子”),ValidMerge = C(FALSE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE, FALSE,FALSE)),.Names = C(“CH romosome”, “开始”, “结束”, “(左)”, “海滩”, “repName”, “ValidMerge”),row.names = C(NA,20L)中,class = “data.frame”)
而合并后的输出
dput(RMDB.out.cut)
结构(列表(=染色体结构(C(1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L),.Label = “CHR1”,类= “因子”),起始= C( “10469”, “20001”, “21000”, “22000”, “22201”, “23000”, “23201”, “20000”, “20001”, “24001” , “0”, “30855”, “30953”, “31293”, “31436”, “31734”),结束= C( “11447”, “20400”, “21400”, “22200”, “22400”, “23200”, “23400”, “20001”, “20200”, “24600”, “0”, “30952”, “31131”, “31435”, “31733”, “31754”),
(Left)
=结构(C(38L,37L,37L,37L,37L,37L,37L,37L,37L,37L,36L,35L,34L,33L,32L,31L),.Label = C( “(249203529)”,“(249203899 )”, “(249204128)”, “(249204381)”, “(249204423)”, “(249204634)”, “(249204868)”, “(249205745)”, “(249205786)”, “(249208117)” “(249208336)”, “(249209743)”, “(249209892)”, “(249209995)”, “(249210327)”, “(249210697)”, “(249210998)”, “(249211157)”,“ (249212430) “ ”(249212760)“, ”(249213190)“, ”(249215122)“, ”(249215255)“, ”(249215700)“, ”(249216061)“, ”(249216513)“,”(249216580 ) “ ”(249217112)“, ”(249217165)“,”(249217584 )”, “(249218867)”, “(249218888)”, “(249219186)”, “(249219490)”, “(249219669)”, “(249219773)”, “(249235266)”, “(249239174)” ),类= “因子”),链=结构(C(2L,1L,2L,2L,2L,2L,1L,2L,1L,1L,1L,1L,1L,1L,1L,1L),.Label = C( “+”, “C”)中,class = “因素”),repName =结构(C(32L,23L,23L,23L,24L,23L,23L,23L,23L,23L,34L,33L,27L ,26L,1L,26L),.Label = C( “AluJo”, “AluJr”, “AluSx”, “AluSz6”, “AluYc”, “AT_rich”, “Charlie5”, “ERVL-E-INT”,“ L1M5" , “L1MA8”, “L1MA9”, “L1MB5”, “L1P1”, “L1PA6”, “L2a而”, “L2c的”, “LTR12F”, “LTR16C”, “MamRep1527”, “MER45A”, “MER58A” , “MIR”, “MIR3”, “MIR3a”, “MIRb”, “MIRC”, “MLT1A”, “MLT1E1A”, “MLT1E1A-INT”, “MLT1J2”, “(TAAA)n” 个, “TAR1”, “(TC)N”, “XXXXX”),类= “因子”),ValidMerge = C(FALSE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE ,FALSE,FALSE)),.Names = C( “染色体”, “开始”, “结束”, “(左)”, “海滩”, “repName”, “ValidMerge”),row.names = C(” 2" , “3”, “4”, “6”, “7”, “8”, “9”, “10”,“1 1" , “111”, “15”, “16”, “17”, “18”, “19”, “20”)中,class = “data.frame”)