do we need volitile when implementing singleton us

suppose we use double-check lock to implement singleton pattern:

    private static Singleton instance;

    private static Object lock = new Object();

    public static Singleton getInstance() {
        if(instance == null) {
            synchronized (lock) {
                if(instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }

Do we need to set variable "instance" as "volitile"? I hear a saying that we need it to disable reordering:

When an object is created , reordering may happen:

address=alloc
instance=someAddress
init(someAddress)

They say that if the last two steps are reordered, we need a votiled instance to disable reordering otherwise other threads may get an object that is not completely initiaized.

However since we are in a synchronized code block, do we really need voltile? Or generally, can I say that synchronized block can guarantee that a shared variable is transparent to other threads and there is no reordering even if it is not volatiled variable?

Before I go into this explanation, you need to understand one optimization that compilers do (my explanation is very simplified). Suppose that somewhere in your code you have such a sequence:

 int x = a;
 int y = a;

It is perfectly valid for a compiler to re-order these into:

 // reverse the order
 int y = a;
 int x = a;

No one writes to a here, there are only two reads of a, as such this type of re-ordering is allowed.

A slightly more complicated example would be:

// someone, somehow sets this
int a;

public int test() {

    int x = a;

    if(x == 4) {
       int y = a;
       return y;
    }

    int z = a;
    return z;
}

A compiler might look at this code and notice that if this is entered if(x == 4) { ... }, this : int z = a; never happens. But, at the same time, you can think about it slightly different: if that if statement is entered, we do not care if int z = a; is executed or not, it does not change the fact that:

 int y = a;
 return y;

would still happen. As such let's make that int z = a; to be eager:

public int test() {

   int x = a;
   int z = a; // < --- this jumped in here

   if(x == 4) {
       int y = a;
       return y;
   }

   return z;
}

And now a compiler can further re-order:

// < --- these two have switched places
int z = a;
int x = a;

if(x == 4) { ... }

Armed with this knowledge, we can try to understand now what is going on.

Let's look at your example:

 private static Singleton instance; // non-volatile     

 public static Singleton getInstance() {
    if (instance == null) {  // < --- read (1)
        synchronized (lock) {
            if (instance == null) { // < --- read (2)
                instance = new Singleton(); // < --- write 
            }
        }
    }
    return instance; // < --- read (3)
}

There are 3 reads of instance (also called load) and a single write to it (also called store). As weird at it may sound, but if read (1) has seen an instance that is not null (meaning that if (instance == null) { ... } is not entered), it does not mean that read (3) will return a non-null instance, it is perfectly valid for read (3) to still return null. This should melt your brain (it did mine a few times). Fortunately, there is a way to prove this.

A compiler might add such a small optimization to your code:

public static Singleton getInstance() {
    if (instance == null) {
        synchronized (lock) {
            if (instance == null) {
                instance = new Singleton();
                // < --- we added this
                return instance;
            }
        }
    }
    return instance;
}

It inserted a return instance, semantically this does not change the logic of the code in any way.

Then, there is a certain optimization that compilers do that will help us here. I am not going to dive into the details, but it introduces some local fields (the benefit is in that link) to do all the reads and writes (stores and loads).

public static Singleton getInstance() {
    Singleton local1 = instance;   // < --- read (1)
    if (local1 == null) {
        synchronized (lock) {
            Singleton local2 = instance; // < --- read (2)
            if (local2 == null) {
                Singleton local3 = new Singleton();
                instance = local3; // < --- write (1)
                return local3;
            }
        }
    }

    Singleton local4 = instance; // < --- read (3)
    return local4;
}

Now a compiler might look at this and see that: if if (local2 == null) { ... } is entered, Singleton local4 = instance; never happens (or as said in the example I started this answer with: it does not really matter if Singleton local4 = instance; happens at all). But in order to enter if (local2 == null) {...}, we need to enter this if (local1 == null) { ... } first. And now let's reason about this as a whole:

if (local1 == null) { ... } NOT ENTERED => NEED to do : Singleton local4 = instance

if (local1 == null) { ... } ENTERED && if (local2 == null) { ... } NOT ENTERED 
=> MUST DO : Singleton local4 = instance. 

if (local1 == null) { ... } ENTERED && if (local2 == null) { ... } ENTERED
=> CAN DO : Singleton local4 = instance.  (remember it does not matter if I do it or not)

You can see that in all the cases, there is no harm in doing that : Singleton local4 = instance before any if checks.

After all this madness, your code could become:

 public static Singleton getInstance() {

    Singleton local4 = instance; // < --- read (3)
    Singleton local1 = instance;   // < --- read (1)

    if (local1 == null) {
        synchronized (lock) {
            Singleton local2 = instance; // < --- read (2)
            if (local2 == null) {
                Singleton local3 = new Singleton();
                instance = local3; // < --- write (1)
                return local3;
            }
        }
    }

    return local4;
}

There are two independent reads of instance here:

Singleton local4 = instance; // < --- read (3)
Singleton local1 = instance;   // < --- read (1)

if(local1 == null) {
   ....
}

return local4;

You read instance into local4 (let's suppose a null), then you read instance into local1 (let's assume that some thread already changed this into a non-null) and ... your getInstance will return a null, not a Singleton. q.e.d.

Conclusion: these optimizations are only possible when private static Singleton instance; is non-volatile, otherwise much of the optimization are prohibited and nothing like this would be even possible. So, yes, using volatile is a MUST for this pattern to work correctly.