I'm trying to understand the concept of Dynamic Programming, via the course on MIT OCW here. The explanation on OCW video is great and all, but I feel like I don't really understand it until I implemented the explanation into code. While implementiing, I refer to some notes from the lecture note here, particularly page 3 of the note.
The problem is, I have no idea how to translate some of the mathematical notation to code. Here's some part of the solution I've implemented (and think it's implemented right):
import math
paragraph = "Some long lorem ipsum text."
words = paragraph.split(" ")
# Count total length for all strings in a list of strings.
# This function will be used by the badness function below.
def total_length(str_arr):
total = 0
for string in str_arr:
total = total + len(string)
total = total + len(str_arr) # spaces
return total
# Calculate the badness score for a word.
# str_arr is assumed be send as word[i:j] as in the notes
# we don't make i and j as argument since it will require
# global vars then.
def badness(str_arr, page_width):
line_len = total_length(str_arr)
if line_len > page_width:
return float('nan')
else:
return math.pow(page_width - line_len, 3)
Now the part I don't understand is on point 3 to 5 in the lecture notes. I literally don't understand and don't know where to start implementing those. So far, I've tried iterating the list of words, and counting the badness of each allegedly end of line, like this:
def justifier(str_arr, page_width):
paragraph = str_arr
par_len = len(paragraph)
result = [] # stores each line as list of strings
for i in range(0, par_len):
if i == (par_len - 1):
result.append(paragraph)
else:
dag = [badness(paragraph[i:j], page_width) + justifier(paragraph[j:], page_width) for j in range(i + 1, par_len + 1)]
# Should I do a min(dag), get the index, and declares it as end of line?
But then, I don't know how I can continue the function, and to be honest, I don't understand this line:
dag = [badness(paragraph[i:j], page_width) + justifier(paragraph[j:], page_width) for j in range(i + 1, par_len + 1)]
and how I'll return justifier
as an int
(since I already decided to store return value in result
, which is a list. Should I make another function and recurse from there? Should there be any recursion at all?
Could you please show me what to do next, and explain how this is dynamic programming? I really can't see where the recursion is, and what the subproblem is.
Thanks before.
Java Implementation Given the maximum line width as L, the idea to justify the Text T, is to consider all suffixes of the Text (consider words instead of characters for forming suffixes to be precise.) Dynamic Programming is nothing but "Careful Brute-force". If you consider the brute force approach, you need to do the following.
Instead lets just consider the problem to find out the cost of putting a word at the beginning of a line. In general we can define DP(i) to be the cost for considering (i- 1)th word as the beginning of a Line.
How can we form a recurrence relation for DP(i)?
If jth word is the beginning of the next line, then the current line will contain words[i:j) (j exclusive) and the cost of jth word being the beginning of the next line will be DP(j). Hence DP(i) = DP(j) + cost of putting words[i:j) in the current line As we want to minimise the total cost, DP(i) can be defined as follows.
Recurrence relation:
Note j = n signify that no words are left to be put in the next line.
The base Case: DP(n) = 0 => at this point there is no word left to be written.
To summarise:
Now even though we derived the minimum cost for justifying the text, we also need to solve the original problem by keeping track of the j value for chosen as minimum in the above expression, so that we can later use the same to print out the justified text. The idea is of keeping parent pointer.
Hope this helps you understand the solution. Below is the simple implementation of the above idea.
In case you have trouble understanding the core idea of dynamic programming itself here is my take on it:
Dynamic programming is essentially sacrificing space complexity for time complexity (but the extra space you use is usually very little compared to the time you save, making dynamic programming totally worth it if implemented correctly). You store the values from each recursive call as you go (e.g. in an array or a dictionary) so you can avoid computing for the second time when you run into the same recursive call in another branch of the recursion tree.
And no you do not have to use recursion. Here is my implementation of the question you were working on using just loops. I followed the TextAlignment.pdf linked by AlexSilva very closely. Hopefully you find this helpful.
For anyone else still interested in this: The key is to move backwards from the end of the text (as mentioned here). If you do so, you just compare already memorized elements.
Say,
words
is a list of strings to be wrapped according totextwidth
. Then, in the notation of the lecture, the task reduces to three lines of code:With:
He mentions that one can add a second list to keep track of the breaking positions. You can do so by altering to code to:
To recover the text, just use the list of breaking positions:
Result: (
text = reconstruct_text(words,breaks)
)One might be tempted to add some whitespaces. This is pretty tricky (since one might come up with various aesthetic rules) but a naive try might be:
What gives you: (
text = spacing(text,textwidth)
)This is what I think according to your definition.
i just saw the lecture and thought would put here whatever I could understand. I have put in the code in the similar format as that of the questioner. I have used recursion here, as the lecture had explained.
Point #3, defines recurrence. This is basically a bottom to approach, where in you calculate a value of the function pertaining to a higher input earlier, and then use it to calculate the for the lower valued input.
The lecture explains it as :
DP(i) = min(DP(j) + badness(i, j))
for j which varies from i+1 to n.
Here, i varies from n to 0 (bottom to top!).
As DP(n) = 0 ,
DP(n-1) = DP(n) + badness(n-1, n)
and then you calculate D(n-2) from D(n-1) and D(n) and take a minimum out of them.
This way you can go down till i=0 and that's the final answer of badness!
In point #4, as you can see, there are two loops going on here. One for i and the other inside i for j.
Hence, when i=0, j(max) = n, i = 1, j(max) = n-1, ... i = n , j(max) = 0.
hence total time = addition of these = n(n+1)/2.
Hence O(n^2).
Point #5 just identifies the solution which DP[0]!
Hope this helps!