Conditionally assign the first n elements in a vector

Posted 2019-03-02 10:44

Question:

I have a vector of randomly sampled numbers. For example,

vec1 <- sample(1:574)

I would like to label the first 25 percent of the elements in this sampled vector as H and the remaining elements as L. I tried ifelse, but that labels the values 1 to 143 as H and the values greater than 143 as L. What I want is for the first 143 elements, by position, to be H and the rest to be L. For the elements below, the first 143 elements (i.e. the first five printed rows and part of the sixth, up to about the value 44) should be H and all the rest L. That means the first element, whose value is 393, gets H, the second element, 518, also gets H, and so on.

> vec1
  [1] 393 518 179 358  78 185 168 386 321 114 163  18 217 302 191 167 465 427 342 422 406 144 183 438 546
 [26] 500 336 123  99 208 130   9 477 197  95  75 122 320 101 175 117 388 414 475 353 499  66 558 298 277
 [51]  35 522 293 343 165 194 563 482 219 274 104 164 484  11 333  67 180  57 221 470 211 447  63 212 148
 [76] 267 426 118  10  84 459 463  51  91 432 569 303 442 390 446 218  34 338 464 201 232 398 385 365 510
[101] 412 400  38 295 514 220 430 372  15 308 366 268 557 110 467 474  32 364 515  41 418 419 300 542 523
[126] 317 307 554 458 263  47 128 445 360 528  72 431 171 368  44 160 433 202 502  62 550 417  89 337 503
[151] 233 401 115  42 113 257 340 555 323  76 196 533  69 261  86 292  77 273 297 454 479 508   7  50 357
[176] 296 473 476 516 551  83 309 215 213 488 556 262 278 289 265 370 151 310   4 423 126 506  55 564 424
[201]  40  60 493 544  30 312  65 490 529 214 456 319  98 198 391 376 112 284 572 304 247 560 249 486 345
[226] 527 250 382 346 428  49 331 169 166 565 159 384 324 371 548 222 325 107 209 453 504 269 538 305   6
[251] 125  82 238 481 487 339 129 237 108 139 141 441 471 562 178 517 146 468 480 182 352 460  28 543 363
[276]  96   8 252 301 526 281 119  16  13 157 280 530 143 524 276 256 316 254 205 200 332 425 440 573   2
[301] 149 539 206 497  31 315 228   1 199 552  21 436 498 109  26 158 111  90 330 313 491  54 204 103 520
[326]  61 466 253 329  64 241 124  37  19 351 505 405 513 246 429 258 547 176 496 381 451 186 172 350 354
[351] 347 469 416 449  68 525 404 549 413 134 367 216 373 136 306 568 121 190 355 156 361 566 181 411 236
[376] 161  43 311 138 359  81 439 155 374 570 223 344   3 264  36  27  71 348 452 288 174 407  52 396 255
[401] 187 455 369 224 314 127 379  12  94 328 341 271  48 327 402 443 478 450  87  17 380 287 279  39  85
[426] 509  45 483 457 521 231 349 152 105  73 494 251 207 420 135 120 435  33  22 286 153 535 421 285 235
[451]  80  93 395 540 489 154 322  46  59 392 326 195 409  25 177 100 545  24 170 472 142 437  74 270 495
[476]  56 410  20 188 394 561 415  58 229  79 227 531 203 242 162 574 272 541   5 283 448  88 137 239 507
[501] 356 383 243 290 536 444 387 234  14 461 282  29 534 132 192 275 133 230 259 375 537 532  70 184 512
[526] 260  92 106 131 434 225 102 362 147 408 173 248 334 226 403 511 294 299 378 389 399 189 492 519 240
[551] 116 266 501  23 193 145 567 140  97 335  53 210 291 553 571 377 485 150 318 244 245 397 559 462
> 
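
The difference between labelling by value and labelling by position can be sketched roughly as follows. This is only an illustration of the problem described above: the seed and the names n_H, by_value and by_position are assumptions, not part of the original post.

```r
set.seed(1)                          # assumed seed, only for reproducibility
vec1 <- sample(1:574)
n_H <- floor(0.25 * length(vec1))    # 143 elements make up the first 25 percent

# Labelling by value: "H" lands wherever the numbers 1..143 happen to sit
by_value <- ifelse(vec1 <= n_H, "H", "L")

# Labelling by position: "H" goes to the first 143 slots, whatever they contain
by_position <- ifelse(seq_along(vec1) <= n_H, "H", "L")

head(data.frame(vec1, by_value, by_position))
```

Both vectors contain 143 "H" labels, but only by_position puts them at the front of the vector.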

Answer 1:

I think you mean this:

vec1 <- sample(1:574)
L <- vec1[1:143]               # first 143 elements (swap the names if the first part should be H)
H <- vec1[144:length(vec1)]    # remaining elements

To split by percentage instead, you can do this:

vec1 <- sample(1:574)
num_to_25 <- floor(0.25 * length(vec1))     # 143 for 574 elements
L <- vec1[1:num_to_25]                      # first 25 percent
H <- vec1[(num_to_25 + 1):length(vec1)]     # remaining elements
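
Since 25 percent of 574 is not a whole number, the rounding direction decides the cut-off. A small worked check, where the names n, k_floor and k_ceiling are only illustrative:

```r
n <- 574
k_floor   <- floor(0.25 * n)      # 143
k_ceiling <- ceiling(0.25 * n)    # 144

# Actual share of the vector covered by each choice
100 * k_floor / n                 # about 24.91 percent
100 * k_ceiling / n               # about 25.09 percent
```

Using floor, as above, gives the 143 elements mentioned in the question.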

As a function, you can use:

assigner <- function(input_vector, percentage_of_L = 25) {
  # Number of leading elements that go into L
  n_L <- floor(percentage_of_L * 0.01 * length(input_vector))
  # head/tail also behave correctly when n_L is 0 or the full length
  L <- head(input_vector, n_L)
  H <- tail(input_vector, length(input_vector) - n_L)
  list(L = L, H = H)
}

And use it like this:

vec1 <- sample(1:574)
parts <- assigner(vec1)   # call once, then pick out both pieces
h <- parts$H
l <- parts$L

Edit (in response to your edit): For your edited question, change the second function to this:

sensitivity.rand <- function(vector, threshold) {
  # threshold is the percentage (0 to 100) of leading elements labelled "L"
  l <- length(vector)
  num_to_thres <- floor(threshold * 0.01 * l)
  score <- c(rep("L", num_to_thres), rep("H", l - num_to_thres))
  return(score)
}
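
A quick check of the function above, assuming a threshold of 25 and a 574-element vector as in the question. The function body is repeated so the snippet runs on its own; the seed and the data frame named labelled are assumptions added only to show the labels next to the values.

```r
sensitivity.rand <- function(vector, threshold) {
  l <- length(vector)
  num_to_thres <- floor(threshold * 0.01 * l)
  c(rep("L", num_to_thres), rep("H", l - num_to_thres))
}

set.seed(42)                      # assumed seed, only for reproducibility
vec1 <- sample(1:574)
score <- sensitivity.rand(vec1, 25)

# Pair each sampled value with the label for its position
labelled <- data.frame(value = vec1, score = score)
head(labelled, 3)

table(score)                      # 143 "L" and 431 "H"
```

To match the wording of the question (first quarter H, rest L), swap the two strings in the rep() calls.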


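Answer 2:

The caret package includes the createDataPartition function, which will do the job by drawing a random partition of roughly the requested proportion (it selects elements at random rather than the first ones by position):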

library("caret")

SAMPLE_SIZE <- 0.25
SIZE_VECT <- 574

# Set seed for reproducibility
set.seed(pi)

vec1 <- sample(1:SIZE_VECT)

in_sample <- createDataPartition(y = vec1, p = SAMPLE_SIZE, list = FALSE)
H <- vec1[in_sample]
L <- vec1[-in_sample]
# Actual sample size
100 * length(H) / (length(L) + length(H))
#  [1] 25.08711
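
For comparison, here is a base-R sketch that does not need caret. It is not equivalent to createDataPartition, which also stratifies on y; it only contrasts a random 25 percent of the indices with the first 25 percent by position. The seed and the names random_idx and first_idx are assumptions.

```r
set.seed(3)                       # assumed seed, only for reproducibility
vec1 <- sample(1:574)
n <- floor(0.25 * length(vec1))   # 143

# Random 25 percent: indices are scattered across the whole vector
random_idx <- sort(sample(seq_along(vec1), size = n))

# Positional 25 percent: always indices 1..n
first_idx <- seq_len(n)

range(random_idx)
range(first_idx)                  # 1 143
```

If the goal is the first 143 elements as in the question, the positional indices are the ones to use.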