This is a question about an SO question; I don't think it belongs on Meta despite being, by definition, about SO content, but if someone feels it should go to Math, Cross Validated, etc., please let me know.
Background: @ForceBru asked this question about how to generate a 64-bit random number using rand(). @nwellnhof provided an answer, which was accepted, that basically takes the low 15 bits of 5 random numbers (because RAND_MAX is apparently only guaranteed to be 15 bits on at least some compilers), glues them together, and then drops the top 11 bits (15*5 - 64 = 11). @NikBougalis commented that while this seems reasonable, it won't pass many statistical tests of randomness. @Foon (me) asked for a citation or an example of a test it would fail. @NikBougalis replied with an answer that didn't enlighten me; @DavidSwartz suggested running it against dieharder.
So I ran dieharder. First, I ran it against the algorithm in question:
unsigned long long llrand() {
    unsigned long long r = 0;
    for (int i = 0; i < 5; ++i) {
        /* take the low 15 bits of each rand() call and append them */
        r = (r << 15) | (rand() & 0x7FFF);
    }
    /* mask to 64 bits; the top 11 of the 75 generated bits are dropped */
    return r & 0xFFFFFFFFFFFFFFFFULL;
}
For comparison, I also ran it against just rand() and against just 8 bits of rand() at a time:
#include <stdio.h>
#include <stdlib.h>

void rand_test(void)
{
    int x;
    srand(1);
    while (1)
    {
        x = rand();
        fwrite(&x, sizeof(x), 1, stdout);
    }
}

void rand_byte_test(void)
{
    int x;
    unsigned char c;
    srand(1);
    while (1)
    {
        x = rand();
        c = x % 256;  /* keep only the low 8 bits */
        fwrite(&c, sizeof(c), 1, stdout);
    }
}
The algorithm in question came back with two tests showing weaknesses: rgb_lagged_sum with ntuple=28 and one of the sts_serial tests with ntuple=8.
Just using rand() failed horribly on many tests, presumably because I'm taking a number that has only 15 bits of randomness and passing it off as 32 bits of randomness.
Using the low 8 bits of rand() at a time came back as weak for rgb_lagged_sum with ntuple=2, and (edit) failed dab_monobit with ntuple=12.
My questions are:
- Am I interpreting the results for the 8-bits-of-rand() stream correctly, namely that since it came back weak on one test rated "good" (and, for the record, also weak on one of the dieharder tests rated "suspect") and failed another, rand()'s randomness should be suspect?
- Am I interpreting the results for the algorithm under test correctly, namely that it should also be marginally suspect?
- Given the descriptions of the tests that came back weak (e.g., sts_serial looks at whether the distribution of bit patterns of a given size is plausible), should I be able to work out what the bias likely is? (My reading of what sts_serial does is sketched after this list.)
- If the answer to the previous question is yes: since I can't work it out, can someone point out what I should be seeing?
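To be clear about my mental model for question 3: as I understand it, sts_serial tallies how often each overlapping n-bit pattern occurs in the bit stream and compares that against the expected uniform count. The toy counter below is only a sketch of that idea, not dieharder's actual test; it ignores patterns that straddle 64-bit word boundaries and does no chi-square/p-value machinery, and the sample size and seed are arbitrary.

#include <stdio.h>
#include <stdlib.h>

unsigned long long llrand();  /* the generator above */

/* Toy version of the sts_serial idea: tally overlapping 8-bit
   patterns in the llrand() output and report the worst relative
   deviation from the expected uniform count. */
int main(void)
{
    enum { NBITS = 8, NPAT = 1 << NBITS };
    static unsigned long long counts[NPAT];
    unsigned long long nwords = 10000000ULL;  /* arbitrary sample size */
    unsigned long long total = 0;

    srand(1);
    for (unsigned long long i = 0; i < nwords; ++i) {
        unsigned long long r = llrand();
        for (int b = 0; b + NBITS <= 64; ++b) {   /* overlapping windows */
            counts[(r >> b) & (NPAT - 1)]++;
            total++;
        }
    }

    double expected = (double)total / NPAT;
    double worst = 0.0;
    for (int p = 0; p < NPAT; ++p) {
        double dev = (counts[p] - expected) / expected;
        if (dev < 0) dev = -dev;
        if (dev > worst) worst = dev;
    }
    printf("expected %.0f per pattern, worst relative deviation %.6f\n",
           expected, worst);
    return 0;
}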
Edit: understood that rand() isn't guaranteed to be great. Also, I tried to think about which values would be less likely than expected, and surmised that zero, the maximum value, or a repeat of the previous value might be. But in a test of 1,000,000,000 draws, the ratios are very near the expected 1 in 2^15 (e.g., in 1,000,000,000 draws we saw 30512 zeros, 30444 maxes, and 30301 repeats, and bc says 30512 * 2^15 is 999817216); other runs had similar ratios, including cases where the max and/or repeat counts were larger than the zero count.
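That test amounts to a counting loop roughly like the one below (a minimal sketch; the fixed seed and the exact bookkeeping for repeats are my assumptions):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const long long N = 1000000000LL;   /* number of draws, as in the text */
    long long zeros = 0, maxes = 0, repeats = 0;
    int prev = -1;                      /* sentinel: no previous draw yet */

    srand(1);                           /* seed chosen arbitrarily */
    for (long long i = 0; i < N; ++i) {
        int x = rand() & 0x7FFF;        /* keep the low 15 bits, as llrand() does */
        if (x == 0)      zeros++;
        if (x == 0x7FFF) maxes++;
        if (x == prev)   repeats++;
        prev = x;
    }
    /* each event should occur about N / 2^15, i.e. ~30518 times */
    printf("zeros=%lld maxes=%lld repeats=%lld (expected ~%lld)\n",
           zeros, maxes, repeats, N >> 15);
    return 0;
}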