An interview question: About Probability

2019-03-07 16:54发布

An interview question:

Given a function f(x) that 1/4 times returns 0, 3/4 times returns 1. Write a function g(x) using f(x) that 1/2 times returns 0, 1/2 times returns 1.

My implementation is:

function g(x) = {
    if (f(x) == 0){ // 1/4 
        var s = f(x) 
        if( s == 1) {// 3/4 * 1/4
            return s  //   3/16
        } else {
            g(x)
        } 
    } else { // 3/4
            var k = f(x)
            if( k == 0) {// 1/4 * 3/4
                return k // 3/16 
            }  else {
                g(x)
            }       
    }
}

Am I right? What's your solution?(you can use any language)

10条回答
疯言疯语
2楼-- · 2019-03-07 17:22

Here is a solution based on central limit theorem, originally due to a friend of mine:

/*
Given a function f(x) that 1/4 times returns 0, 3/4 times returns 1. Write a function g(x) using f(x) that 1/2 times returns 0, 1/2 times returns 1.
*/
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <cstdio>
using namespace std;

int f() {
  if (rand() % 4 == 0) return 0;
  return 1;
}

int main() {
  srand(time(0));
  int cc = 0;
  for (int k = 0; k < 1000; k++) { //number of different runs
    int c = 0;
    int limit = 10000; //the bigger the limit, the more we will approach %50 percent
    for (int i=0; i<limit; ++i) c+= f();
    cc += c < limit*0.75 ? 0 : 1; // c will be 0, with probability %50
  }
  printf("%d\n",cc); //cc is gonna be around 500
  return 0;
}
查看更多
走好不送
3楼-- · 2019-03-07 17:24

A refinement of the same approach used in btilly's answer, achieving an average ~1.85 calls to f() per g() result (further refinement documented below achieves ~1.75, tbilly's ~2.6, Jim Lewis's accepted answer ~5.33). Code appears lower in the answer.

Basically, I generate random integers in the range 0 to 3 with even probability: the caller can then test bit 0 for the first 50/50 value, and bit 1 for a second. Reason: the f() probabilities of 1/4 and 3/4 map onto quarters much more cleanly than halves.


Description of algorithm

btilly explained the algorithm, but I'll do so in my own way too...

The algorithm basically generates a random real number x between 0 and 1, then returns a result depending on which "result bucket" that number falls in:

result bucket      result
         x < 0.25     0
 0.25 <= x < 0.5      1
 0.5  <= x < 0.75     2
 0.75 <= x            3

But, generating a random real number given only f() is difficult. We have to start with the knowledge that our x value should be in the range 0..1 - which we'll call our initial "possible x" space. We then hone in on an actual value for x:

  • each time we call f():
    • if f() returns 0 (probability 1 in 4), we consider x to be in the lower quarter of the "possible x" space, and eliminate the upper three quarters from that space
    • if f() returns 1 (probability 3 in 4), we consider x to be in the upper three-quarters of the "possible x" space, and eliminate the lower quarter from that space
    • when the "possible x" space is completely contained by a single result bucket, that means we've narrowed x down to the point where we know which result value it should map to and have no need to get a more specific value for x.

It may or may not help to consider this diagram :-):

    "result bucket" cut-offs 0,.25,.5,.75,1

    0=========0.25=========0.5==========0.75=========1 "possible x" 0..1
    |           |           .             .          | f() chooses x < vs >= 0.25
    |  result 0 |------0.4375-------------+----------| "possible x" .25..1
    |           | result 1| .             .          | f() chooses x < vs >= 0.4375
    |           |         | .  ~0.58      .          | "possible x" .4375..1
    |           |         | .    |        .          | f() chooses < vs >= ~.58
    |           |         ||.    |    |   .          | 4 distinct "possible x" ranges

Code

int g() // return 0, 1, 2, or 3                                                 
{                                                                               
    if (f() == 0) return 0;                                                     
    if (f() == 0) return 1;                                                     
    double low = 0.25 + 0.25 * (1.0 - 0.25);                                    
    double high = 1.0;                                                          

    while (true)                                                                
    {                                                                           
        double cutoff = low + 0.25 * (high - low);                              
        if (f() == 0)                                                           
            high = cutoff;                                                      
        else                                                                    
            low = cutoff;                                                       

        if (high < 0.50) return 1;                                              
        if (low >= 0.75) return 3;                                              
        if (low >= 0.50 && high < 0.75) return 2;                               
    }                                                                           
}

If helpful, an intermediary to feed out 50/50 results one at a time:

int h()
{
    static int i;
    if (!i)
    {
        int x = g();
        i = x | 4;
        return x & 1;
    }
    else
    {
        int x = i & 2;
        i = 0;
        return x ? 1 : 0;
    }
}

NOTE: This can be further tweaked by having the algorithm switch from considering an f()==0 result to hone in on the lower quarter, to having it hone in on the upper quarter instead, based on which on average resolves to a result bucket more quickly. Superficially, this seemed useful on the third call to f() when an upper-quarter result would indicate an immediate result of 3, while a lower-quarter result still spans probability point 0.5 and hence results 1 and 2. When I tried it, the results were actually worse. A more complex tuning was needed to see actual benefits, and I ended up writing a brute-force comparison of lower vs upper cutoff for second through eleventh calls to g(). The best result I found was an average of ~1.75, resulting from the 1st, 2nd, 5th and 8th calls to g() seeking low (i.e. setting low = cutoff).

查看更多
虎瘦雄心在
4楼-- · 2019-03-07 17:26

The problem with your algorithm is that it repeats itself with high probability. My code:

function g(x) = {
    var s = f(x) + f(x) + f(x); 
    // s = 0, probability:  1/64
    // s = 1, probability:  9/64
    // s = 2, probability: 27/64
    // s = 3, probability: 27/64
    if (s == 2) return 0;
    if (s == 3) return 1;

    return g(x); // probability to go into recursion = 10/64, with only 1 additional f(x) calculation
}

I've measured average number of times f(x) was calculated for your algorithm and for mine. For yours f(x) was calculated around 5.3 times per one g(x) calculation. With my algorithm this number reduced to around 3.5. The same is true for other answers so far since they are actually the same algorithm as you said.

P.S.: your definition doesn't mention 'random' at the moment, but probably it is assumed. See my other answer.

查看更多
5楼-- · 2019-03-07 17:26

This is much like the Monty Hall paradox.

In general.

Public Class Form1

    'the general case
    '
    'twiceThis = 2 is 1 in four chance of 0
    'twiceThis = 3 is 1 in six chance of 0
    '
    'twiceThis = x is 1 in 2x chance of 0

    Const twiceThis As Integer = 7
    Const numOf As Integer = twiceThis * 2

    Private Sub Button1_Click(ByVal sender As System.Object, _
                              ByVal e As System.EventArgs) Handles Button1.Click

        Const tries As Integer = 1000
        y = New List(Of Integer)

        Dim ct0 As Integer = 0
        Dim ct1 As Integer = 0
        Debug.WriteLine("")
        ''show all possible values of fx
        'For x As Integer = 1 To numOf
        '    Debug.WriteLine(fx)
        'Next

        'test that gx returns 50% 0's and 50% 1's
        Dim stpw As New Stopwatch
        stpw.Start()
        For x As Integer = 1 To tries
            Dim g_x As Integer = gx()
            'Debug.WriteLine(g_x.ToString) 'used to verify that gx returns 0 or 1 randomly
            If g_x = 0 Then ct0 += 1 Else ct1 += 1
        Next
        stpw.Stop()
        'the results
        Debug.WriteLine((ct0 / tries).ToString("p1"))
        Debug.WriteLine((ct1 / tries).ToString("p1"))
        Debug.WriteLine((stpw.ElapsedTicks / tries).ToString("n0"))

    End Sub

    Dim prng As New Random
    Dim y As New List(Of Integer)

    Private Function fx() As Integer

        '1 in numOf chance of zero being returned
        If y.Count = 0 Then
            'reload y
            y.Add(0) 'fx has only one zero value
            Do
                y.Add(1) 'the rest are ones
            Loop While y.Count < numOf
        End If
        'return a random value 
        Dim idx As Integer = prng.Next(y.Count)
        Dim rv As Integer = y(idx)
        y.RemoveAt(idx) 'remove the value selected
        Return rv

    End Function

    Private Function gx() As Integer

        'a function g(x) using f(x) that 50% of the time returns 0
        '                           that 50% of the time returns 1
        Dim rv As Integer = 0
        For x As Integer = 1 To twiceThis
            fx()
        Next
        For x As Integer = 1 To twiceThis
            rv += fx()
        Next
        If rv = twiceThis Then Return 1 Else Return 0

    End Function
End Class
查看更多
登录 后发表回答