Number of Increasing Subsequences of length k

I am trying to understand the algorithm that counts the increasing subsequences of length k in an array in O(n*k*log(n)) time. I already know how to solve the same problem with the O(k*n^2) algorithm. From what I have read, the faster solution combines a BIT (Fenwick tree) with DP. I have also found some code, but I have not been able to understand it.

Here are some links I've visited that have been helpful.

Here in SO
Topcoder forum
Random webpage

I would really appreciate it if someone could help me understand this algorithm.

I am reproducing my algorithm from here, where its logic is explained:

dp[i, j] = same as before
num[i] = how many subsequences that end with i (element, not index this time)
         have a certain length

for i = 1 to n do
  dp[i, 1] = 1

for p = 2 to k do // for each length this time
  num = {0}

  for i = 2 to n do
    // note: dp[1, p > 1] = 0 

    // how many that end with the previous element
    // have length p - 1
    num[ array[i - 1] ] += dp[i - 1, p - 1] *1*   

    // append the current element to all those smaller than it
    // that end an increasing subsequence of length p - 1,
    // creating an increasing subsequence of length p
    for j = 1 to array[i] - 1 do *2*       
      dp[i, p] += num[j]
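
For reference, here is a minimal Python transcription of the pseudocode above, before any optimization. It is only a sketch: the function name `count_increasing_naive` is made up for illustration, indices are 0-based instead of 1-based, and it assumes every array value is a positive integer so that values can index `num` directly.

```python
def count_increasing_naive(array, k):
    """Count strictly increasing subsequences of length k.

    Assumption: all values in `array` are positive integers.
    """
    n = len(array)
    if n == 0 or k < 1:
        return 0
    max_value = max(array)

    # dp[i][p] = number of increasing subsequences of length p ending at index i
    dp = [[0] * (k + 1) for _ in range(n)]
    for i in range(n):
        dp[i][1] = 1

    for p in range(2, k + 1):
        # num[v] = number of subsequences of length p - 1 ending with value v
        num = [0] * (max_value + 1)
        for i in range(1, n):
            # *1*: record the subsequences that end with the previous element
            num[array[i - 1]] += dp[i - 1][p - 1]
            # *2*: sum over all values strictly smaller than array[i]
            dp[i][p] = sum(num[1:array[i]])

    return sum(dp[i][k] for i in range(n))
```

The slice sum in step *2* is the part that costs time proportional to the largest value, which is what the faster data structure is meant to replace.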

You can optimize *1* and *2* by using segment trees or binary indexed trees. These will be used to efficiently process the following operations on the num array:

Given (x, v) add v to num[x] (relevant for *1*);
Given x, find the sum num[1] + num[2] + ... + num[x] (relevant for *2*).

Both are standard operations for either data structure, and each takes time logarithmic in the size of the num array.
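
As a rough sketch of how the two operations plug into the algorithm, here is one possible binary indexed tree version in Python. The names `FenwickTree`, `add`, `prefix_sum` and `count_increasing` are illustrative choices, not taken from any library. The sketch assumes all array values are positive integers, and it keeps only the dp column for the previous length, since that is all each pass reads.

```python
class FenwickTree:
    """Binary indexed tree over positions 1..size (1-based)."""

    def __init__(self, size):
        self.size = size
        self.tree = [0] * (size + 1)

    def add(self, x, v):
        # Add v to num[x]; x must be >= 1.
        while x <= self.size:
            self.tree[x] += v
            x += x & -x

    def prefix_sum(self, x):
        # Return num[1] + num[2] + ... + num[x]; returns 0 when x == 0.
        total = 0
        while x > 0:
            total += self.tree[x]
            x -= x & -x
        return total


def count_increasing(array, k):
    """Count strictly increasing subsequences of length k.

    Assumption: all values in `array` are positive integers.
    """
    n = len(array)
    if n == 0 or k < 1:
        return 0
    max_value = max(array)

    prev = [1] * n  # dp[i, 1] = 1 for every i
    for p in range(2, k + 1):
        num = FenwickTree(max_value)  # fresh, all-zero num for this length
        cur = [0] * n
        for i in range(1, n):
            num.add(array[i - 1], prev[i - 1])        # operation for *1*
            cur[i] = num.prefix_sum(array[i] - 1)     # operation for *2*
        prev = cur
    return sum(prev)
```

For example, `count_increasing([1, 2, 3, 4], 2)` counts every pair in the array, since every pair is increasing.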

Note: This will have complexity O(n*k*log S), where S is the upper bound on the values in your array. This may or may not be good enough. To make it O(n*k*log n), you need to normalize the values of your array prior to running the above algorithm. Normalization means converting all of your array values into values lower than or equal to n while preserving their relative order. So this: