How does the Brainfuck Hello World actually work?

Posted 2019-03-07 10:44

Someone sent this to me and claimed it is a hello world in Brainfuck (and I hope so...)

++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.

I know the basics: it works by moving a pointer and incrementing and decrementing values...

Yet I still want to know, how does it actually work? How does it print anything on the screen in the first place? How does it encode the text? I do not understand at all...

6 answers
Bombasti
#2 · 2019-03-07 11:12

To answer the question of how it knows what to print, I have added the calculation of ASCII values to the right of the code where the printing happens:

> just means move to the next cell
< just means move to the previous cell
+ and - are used for increment and decrement respectively. The value of the cell is updated when the increment/decrement happens

+++++ +++++             initialize counter (cell #0) to 10

[                       use loop to set the next four cells to 70/100/30/10

> +++++ ++              add  7 to cell #1

> +++++ +++++           add 10 to cell #2 

> +++                   add  3 to cell #3

> +                     add  1 to cell #4

<<<< -                  decrement counter (cell #0)

]            

> ++ .                  print 'H' (ascii: 70+2 = 72) //70 is value in current cell. The two +s increment the value of the current cell by 2

> + .                   print 'e' (ascii: 100+1 = 101)

+++++ ++ .              print 'l' (ascii: 101+7 = 108)

.                       print 'l' dot prints same thing again

+++ .                   print 'o' (ascii: 108+3 = 111)

> ++ .                  print ' ' (ascii: 30+2 = 32)

<< +++++ +++++ +++++ .  print 'W' (ascii: 72+15 = 87)

> .                     print 'o' (ascii: 111)

+++ .                   print 'r' (ascii: 111+3 = 114)

----- - .               print 'l' (ascii: 114-6 = 108)

----- --- .             print 'd' (ascii: 108-8 = 100)

> + .                   print '!' (ascii: 32+1 = 33)

> .                     print '\n'(ascii: 10)
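The arithmetic in the comments above can be checked with a short Python sketch. It does not interpret the Brainfuck code; it only replays the cell values after the setup loop and the amount added before each dot. The `steps` table is my own transcription of the annotated lines, so treat it as an assumption.

```python
# Cell values after the setup loop: ten passes adding 7, 10, 3 and 1.
cells = [0] + [10 * step for step in (7, 10, 3, 1)]   # [0, 70, 100, 30, 10]

# (cell index, amount added before printing) for each '.' in the program,
# transcribed by hand from the annotated listing.
steps = [(1, 2), (2, 1), (2, 7), (2, 0), (2, 3), (3, 2), (1, 15),
         (2, 0), (2, 3), (2, -6), (2, -8), (3, 1), (4, 0)]

output = ""
for index, delta in steps:
    cells[index] += delta          # the + or - run before the dot
    output += chr(cells[index])    # the dot: cell value -> ASCII character

print(repr(output))   # 'Hello World!\n'
```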
对你真心纯属浪费
#3 · 2019-03-07 11:13

All the answers are thorough, but they lack one tiny detail: printing. When building your Brainfuck interpreter you also have to handle the . character, which is what a print statement looks like in Brainfuck. Whenever the interpreter encounters a . character, it should print the byte the pointer currently points to.

Example:

Suppose char *ptr points at the start of the cells [0] [0] [0] [97] [0]... and the Brainfuck code is >>>. The pointer moves 3 cells to the right and lands on [97], so now *ptr == 97. When the interpreter then encounters the ., it should call

write(1, ptr, 1)

or any equivalent printing statement to print the byte currently pointed to, which has the value 97, so the letter a is printed on standard output.
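For comparison, here is roughly the same idea as a Python sketch instead of the C call above. The five-cell `tape` and the variable names are made up for illustration; only > and . are handled.

```python
tape = bytearray([0, 0, 0, 97, 0])   # the example cells from above
ptr = 0
out = bytearray()                    # collects what would go to standard output

for command in ">>>.":
    if command == ">":
        ptr += 1                     # move to the next cell
    elif command == ".":
        out.append(tape[ptr])        # emit the byte under the pointer

print(out.decode("ascii"))           # a
```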

狗以群分
#4 · 2019-03-07 11:29

1. Basics

To understand Brainfuck, imagine an infinite array of cells, each initialized to 0.

...[0][0][0][0][0]...

When a Brainfuck program starts, its pointer points to one of these cells.

...[0][0][*0*][0][0]...

If you move the pointer right with >, it moves from cell X to cell X+1:

...[0][0][0][*0*][0]...

If you increase the cell value with +, you get:

...[0][0][0][*1*][0]...

If you increase the cell value again with +, you get:

...[0][0][0][*2*][0]...

If you decrease the cell value with -, you get:

...[0][0][0][*1*][0]...

If you move the pointer left with <, it moves from cell X to cell X-1:

...[0][0][*0*][1][0]...
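The same sequence of moves (> + + - <) can be replayed in Python. This is only a sketch: the `defaultdict` stands in for the infinite tape, and starting at index 0 is an arbitrary choice.

```python
from collections import defaultdict

tape = defaultdict(int)   # unbounded in both directions; unseen cells read as 0
ptr = 0                   # arbitrary starting cell

for command in ">++-<":
    if command == ">":
        ptr += 1          # cell X -> cell X+1
    elif command == "<":
        ptr -= 1          # cell X -> cell X-1
    elif command == "+":
        tape[ptr] += 1
    elif command == "-":
        tape[ptr] -= 1

print(ptr, tape[ptr], tape[ptr + 1])   # 0 0 1
```

The pointer is back on the starting cell, and the cell to its right holds 1, matching the last diagram.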

2. Input

To read a character, use the comma ,. It reads a character from standard input and writes its decimal ASCII code to the current cell.

Take a look at an ASCII table. For example, the decimal code of ! is 33, while a is 97.

Well, let's imagine your BF program memory looks like this:

...[0][0][*0*][0][0]...

Assuming standard input contains a, the comma , operator reads its decimal ASCII code 97 into memory:

...[0][0][*97*][0][0]...

You generally want to think of it that way; however, the truth is a bit more complex. BF does not read a character but a byte (whatever that byte is). Let me show you an example.

On Linux,

$ printf ł

prints:

ł

which is a Polish character. It is not part of ASCII. Here it is encoded in UTF-8, where it takes more than one byte in memory. We can prove it by making a hexadecimal dump:

$ printf ł | hd

which shows:

00000000  c5 82                                             |..|

The zeroes are the offset. c5 is the first and 82 is the second byte representing ł (in the order we will read them). |..| is the printable representation, which shows dots here because neither byte is a printable ASCII character.

Well, if you pass ł as input to a BF program that reads a single byte, the program memory will look like:

...[0][0][*197*][0][0]...

Why 197? Because 197 decimal is c5 hexadecimal. Seems familiar? Of course. It's the first byte of ł!
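You can reproduce this without a BF interpreter. In the Python sketch below, `io.BytesIO` is just a stand-in for standard input, and reading one byte mimics what a single , would store.

```python
import io

raw = "\u0142".encode("utf-8")   # "\u0142" is the character ł
print([hex(b) for b in raw])     # ['0xc5', '0x82']

stdin = io.BytesIO(raw)          # stand-in for the program's standard input
cell = stdin.read(1)[0]          # one , reads one byte, not one character
print(cell)                      # 197
```

A second , would then read the remaining byte, 0x82 (130 decimal).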

3. Output

To print a character, use the dot .. It treats the current cell value as a decimal ASCII code and prints the corresponding character to standard output.

Well, let's imagine your BF program memory looks like this:

...[0][0][*97*][0][0]...

If you use the dot (.) operator now, BF prints:

a

Because the decimal ASCII code of a is 97.

So, for example, a BF program like this (97 pluses, 2 dots):

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++..

will increase the value of the cell it points to up to 97 and print it 2 times:

aa
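A rough Python equivalent of that program, handling only + and . on a single cell (the variable names are mine):

```python
program = "+" * 97 + ".."   # 97 pluses, 2 dots

cell = 0
output = ""
for command in program:
    if command == "+":
        cell += 1             # increment the current cell
    elif command == ".":
        output += chr(cell)   # cell value -> ASCII character

print(output)   # aa
```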

4. Loops

In BF a loop consists of a loop begin [ and a loop end ]. You can think of it as a while loop in C/C++ whose condition is the current cell value: the body repeats as long as that value is not 0.

Take a look at the BF program below:

++[]

++ increments the current cell value twice:

...[0][0][*2*][0][0]...

And [] is like while(2) {}, so it's an infinite loop.

Let's say we don't want this loop to be infinite. We can do, for example:

++[-]

Each time the loop runs, it decrements the current cell value. Once the current cell value is 0, the loop ends:

...[0][0][*2*][0][0]...        loop starts
...[0][0][*1*][0][0]...        after first iteration
...[0][0][*0*][0][0]...        after second iteration (loop ends)

Let's consider yet another example of a finite loop:

++[>]

This example shows that we don't have to finish a loop on the cell it started on:

...[0][0][*2*][0][0]...        loop starts
...[0][0][2][*0*][0]...        after first iteration (loop ends)

However, it is good practice to end where we started. Why? Because if a loop ends on a different cell than the one it started on, we can't assume where the cell pointer will be afterwards. To be honest, this practice makes Brainfuck less brainfuck.
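The two finite loops translate almost literally into Python while loops. This is a sketch with a five-cell list standing in for the tape; the condition is always "the cell under the pointer is not 0", re-checked on whatever cell the pointer is on at that moment.

```python
# ++[-] : the loop body decrements the cell it tests
tape = [0, 0, 2, 0, 0]
ptr = 2
while tape[ptr] != 0:      # [
    tape[ptr] -= 1         # -
                           # ]
print(tape, ptr)           # [0, 0, 0, 0, 0] 2

# ++[>] : the loop body moves the pointer instead
tape2 = [0, 0, 2, 0, 0]
ptr2 = 2
while tape2[ptr2] != 0:    # [
    ptr2 += 1              # >
                           # ]
print(tape2, ptr2)         # [0, 0, 2, 0, 0] 3
```

In the second case the loop ends one cell to the right of where it started, and the 2 is left untouched.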

仙女界的扛把子
#5 · 2019-03-07 11:29

Wikipedia has a commented version of the code.

+++++ +++++             initialize counter (cell #0) to 10
[                       use loop to set the next four cells to 70/100/30/10
    > +++++ ++              add  7 to cell #1
    > +++++ +++++           add 10 to cell #2 
    > +++                   add  3 to cell #3
    > +                     add  1 to cell #4
    <<<< -                  decrement counter (cell #0)
]                   
> ++ .                  print 'H'
> + .                   print 'e'
+++++ ++ .              print 'l'
.                       print 'l'
+++ .                   print 'o'
> ++ .                  print ' '
<< +++++ +++++ +++++ .  print 'W'
> .                     print 'o'
+++ .                   print 'r'
----- - .               print 'l'
----- --- .             print 'd'
> + .                   print '!'
> .                     print '\n'

To answer your questions, the , and . characters are used for I/O. The text is ASCII.

The Wikipedia article goes on in some more depth, as well.

The first line initialises a[0] = 10 by simply incrementing ten times from 0. The loop from line 2 effectively sets the initial values for the array: a[1] = 70 (close to 72, the ASCII code for the character 'H'), a[2] = 100 (close to 101 or 'e'), a[3] = 30 (close to 32, the code for space) and a[4] = 10 (newline). The loop works by adding 7, 10, 3, and 1, to cells a[1], a[2], a[3] and a[4] respectively each time through the loop - 10 additions for each cell in total (giving a[1]=70 etc.). After the loop is finished, a[0] is zero. >++. then moves the pointer to a[1], which holds 70, adds two to it (producing 72, which is the ASCII character code of a capital H), and outputs it.

The next line moves the array pointer to a[2] and adds one to it, producing 101, a lower-case 'e', which is then output.

As 'l' happens to be the seventh letter after 'e', to output 'll' another seven are added (+++++++) to a[2] and the result is output twice.

'o' is the third letter after 'l', so a[2] is incremented three more times and the result is output.

The rest of the program goes on in the same way. For the space and capital letters, different array cells are selected and incremented or decremented as needed.

来,给爷笑一个
#6 · 2019-03-07 11:31

I think what you are asking is how Brainfuck knows what to do with all the code. There is an interpreter, written in a higher-level language such as Python, that decides what a dot or a plus sign in the code means.

The interpreter reads your code character by character and says: OK, there is a > symbol, so I have to advance the memory location. In the higher-level language the code is simply: if (current character) == ">", then memlocation = memlocation + 1. Similarly, if (current character) == ".", then print (contents of the memory location).
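To make that concrete, here is a minimal interpreter sketch in Python. It is not any particular existing implementation; `run_bf` is a name I made up, and the 30000-cell tape, 8-bit wrap-around and "store 0 at end of input" behaviour are assumptions that real interpreters handle in different ways.

```python
def run_bf(code, input_bytes=b""):
    """Run a Brainfuck program and return its output as bytes."""
    # Pair up brackets first so loops can jump in one step.
    jump, stack = {}, []
    for pos, ch in enumerate(code):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jump[start], jump[pos] = pos, start

    tape = [0] * 30000          # assumption: fixed-size tape
    ptr = 0                     # memory location
    pc = 0                      # position in the code
    inp = iter(input_bytes)
    out = bytearray()
    while pc < len(code):
        ch = code[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # assumption: 8-bit cells
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(tape[ptr])               # "print" the current cell
        elif ch == ",":
            tape[ptr] = next(inp, 0)            # assumption: 0 at end of input
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]                       # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]                       # go back to the loop start
        pc += 1
    return bytes(out)


# The program from the question, assembled piecewise to avoid miscounting.
HELLO = ("+" * 10 + "[>" + "+" * 7 + ">" + "+" * 10 + ">+++>+<<<<-]"
         + ">++.>+." + "+" * 7 + "..+++.>++.<<" + "+" * 15
         + ".>.+++." + "-" * 6 + "." + "-" * 8 + ".>+.>.")

print(run_bf(HELLO).decode("ascii"), end="")   # Hello World!
```

Any character that is not one of the eight commands falls through every branch and is ignored, which is why commented Brainfuck listings still run.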

Hope this clears it up. tc

疯言疯语
#7 · 2019-03-07 11:38

Brainfuck lives up to its name. It uses only 8 characters, > < + - . , [ ], which makes it one of the quickest programming languages to learn but one of the hardest to write programs in and understand... and in the end it leaves your brain f*cked.

It stores values in an array: [72 ][101 ][108 ][108 ][111 ]

Let the pointer initially point to cell 1 of the array:

  1. > move pointer to right by 1

  2. < move pointer to left by 1

  3. + increment the value of cell by 1

  4. - decrement the value of cell by 1

  5. . print value of current cell.

  6. , take input to current cell.

  7. [ ] loop: in +++[-] the cell acts as a counter of 3, because there are three + before the loop, and each - decrements the counter by 1, so the loop body runs 3 times.

The values stored in the cells are ASCII values.

So, referring to the array above, [72 ][101 ][108 ][108 ][111 ], if you look up the ASCII values you'll find that it spells Hello.

Congrats! You have learned the syntax of BF.

--- Something more ---

Let us write our first program, Hello World, after which you will be able to write your name in this language.

+++++ +++++[> +++++ ++ >+++++ +++++ >+++ >+ <<<<-]>++.>+.+++++ ++..+++.>++.<<+++++ +++++ +++++.>.+++.----- -.----- ---.>+.>.

Breaking it into pieces:

+++++ +++++[> +++++ ++ 
                  >+++++ +++++ 
                  >+++ 
                  >+ 
                  <<<<-]

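This fills 4 cells (one per >) using a counter of 10, something like this in Python:

increments = [7, 10, 3, 1]      # added to cells 1..4 on every pass
cells = [0, 0, 0, 0]
i = 10                          # counter kept in cell 0
while i > 0:
    for k, step in enumerate(increments):
        cells[k] += step
    i -= 1
# cells is now [70, 100, 30, 10]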

The counter value is stored in cell 0. > moves to cell 1 and adds 7 to it, > moves to cell 2 and adds 10 to its previous value, and so on.

<<<< returns to cell 0 and - decrements its value by 1.

Hence, after the loop completes, cells 1 to 4 hold: [70,100,30,10]

>++. 

Moves to cell 1, increments its value by 2 (two +) and then prints (.) the character with that ASCII value. For example, in Python: chr(70+2)  # gives 'H'

>+.

Moves to cell 2, adds 1 to its value (100+1) and prints (.) it, i.e. chr(101)  # gives 'e'. There is no > or < in the next piece, so it keeps working on the current cell:

+++++ ++..

The current cell holds 101, so 101+7 = 108 is printed twice (because there are two dots); chr(108) is 'l'. In Python terms:

value = 101 + 7
for _ in range("..".count(".")):    # one print per '.'
    print(chr(value), end="")       # prints l twice

--- Where is it used? ---

It is just a joke language made to challenge programmers and is not used practically anywhere.
