Confused about the result of my check null value co

Posted 2019-09-11 23:40

Question:

I tried this to check whether a row is null or not.

package org.apache.spark.h2o.utils

import water.fvec.{NewChunk, Frame, Chunk}
import water._

class Miss extends MRTask {
  override def map(c: Chunk, nc: NewChunk): Unit = {
    for (row <- 0 until c.len()) {
      if (c.atd(row) == 0) {
        nc.addNum(0)
      } else {
        nc.addNum(1)
      }
    }
  }
}

I cannot understand the result of my code:

           A    B    C    D            E   check
    min                                     0
   mean                                     0
 stddev                                     0
    max                                     1
missing                                     0
      0  5.1  3.5  1.4  0.2  Iris-setosa    1
      1  4.9    3  1.4  0.2  Iris-setosa    1
      2  4.7  3.2  1.3  0.2  Iris-setosa    1
      3  4.6  3.1  1.5  0.2  Iris-setosa    1
      4    5  3.6  1.4  0.2  Iris-setosa    1
      5  5.4  3.9  1.7  0.4  Iris-setosa    1
      6  4.6  3.4  1.4  0.3  Iris-setosa    1
      7    5  3.4  1.5  0.2  Iris-setosa    1
      8  4.4  2.9  1.4  0.2  Iris-setosa    1
      9  4.9  3.1  1.5  0.1  Iris-setos...

In the generated check column, why is the max row 1? I'm new to H2OFrame. Can anyone help me understand this? Is there something wrong with my code? Thanks.

Answer 1:

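You are appending a new column that contains only the values 0 and 1. The rows labelled min, mean, stddev, max and missing at the top of the printout are per-column summary statistics, not data rows. So the minimum value stored in the check column is 0 and the maximum is 1. A mean of 0 is suspicious in this case, though: the rows shown all have check = 1, so that is probably a bug.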
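Note also that the question's map() compares each value to 0, which is not the same as testing for a missing value. The sketch below is plain Scala with no H2O dependency, so it only models the logic: it assumes a missing value shows up as NaN when read as a double, and the object name, helper names and sample column are all made up for illustration. Inside a real MRTask the per-row missing test would be c.isNA(row), as far as I know.

```scala
object CheckColumnSketch {
  // Mirrors the question's map(): 0 when the value equals 0, otherwise 1.
  // NaN == 0 is false, so a missing value (modeled as NaN) is also flagged 1.
  def zeroFlag(v: Double): Double = if (v == 0) 0.0 else 1.0

  // Missing-value check: 1 when the value is missing (NaN here), otherwise 0.
  def missingFlag(v: Double): Double = if (v.isNaN) 1.0 else 0.0

  // Hypothetical sample column: two ordinary values, one zero, one missing.
  val column: Seq[Double] = Seq(5.1, 0.0, Double.NaN, 4.7)

  val zeroFlags: Seq[Double] = column.map(zeroFlag)
  val missingFlags: Seq[Double] = column.map(missingFlag)

  // Summary statistic of the kind printed in the header rows of the frame.
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size
}
```

With this sample the zero-comparison flags have min 0 and max 1, just like the check column in the question, while only the NaN-based flags actually single out the missing row.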



Tags: scala h2o