I'm working through an exercise from Udacity's Full Stack Foundations course. I have a do_POST method inside my subclass of BaseHTTPRequestHandler. Basically, I want to get a POST value named message that is submitted with a multipart form. This is the code for the method:
    def do_POST(self):
        try:
            if self.path.endswith("/Hello"):
                self.send_response(200)
                self.send_header('Content-type', 'text/html')
                self.end_headers()
                ctype, pdict = cgi.parse_header(self.headers['content-type'])
                if ctype == 'multipart/form-data':
                    fields = cgi.parse_multipart(self.rfile, pdict)
                    messagecontent = fields.get('message')
                output = ""
                output += "<html><body>"
                output += "<h2>Ok, how about this?</h2>"
                output += "<h1>{}</h1>".format(messagecontent)
                output += "<form method='POST' enctype='multipart/form-data' action='/Hello'>"
                output += "<h2>What would you like to say?</h2>"
                output += "<input name='message' type='text'/><br/><input type='submit' value='Submit'/>"
                output += "</form></body></html>"
                self.wfile.write(output.encode('utf-8'))
                print(output)
                return
        except:
            self.send_error(404, "{}".format(sys.exc_info()[0]))
            print(sys.exc_info())
The problem is that cgi.parse_multipart(self.rfile, pdict) throws an exception: TypeError: can't concat bytes to str. The implementation was provided in the course videos, but they use Python 2.7 and I'm using Python 3. I've looked for a solution all afternoon but couldn't find anything useful. What is the correct way to read data passed from a multipart form in Python 3?
I came across this question while trying to solve the same problem.
I found a simple workaround:
convert the 'boundary' item in the dictionary from a string to bytes, specifying an encoding.
    ctype, pdict = cgi.parse_header(self.headers['content-type'])
    pdict['boundary'] = bytes(pdict['boundary'], "utf-8")
    if ctype == 'multipart/form-data':
        fields = cgi.parse_multipart(self.rfile, pdict)
In my case, it seems to work properly.
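The reason this conversion helps is presumably that the multipart parser builds its part markers from bytes literals, and Python 3 refuses to concatenate bytes with str. A minimal sketch of that mismatch, using a made-up boundary value (the real one comes from the request's Content-Type header):

```python
# Hypothetical boundary, as it would come out of the parsed header (a str).
boundary = "----WebKitFormBoundaryabc123"

# Mixing bytes and str raises TypeError on Python 3.
try:
    marker = b"--" + boundary
except TypeError as exc:
    error = str(exc)

# Converting the boundary to bytes first makes the concatenation valid.
marker = b"--" + bytes(boundary, "utf-8")
print(marker)
```

The exact wording of the TypeError message differs between Python versions, so it is not shown here.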
To make the tutor's code work on Python 3, there are three error messages you'll have to deal with.
If you get any of these error messages
    c_type, p_dict = cgi.parse_header(self.headers.getheader('Content-Type'))
    AttributeError: 'HTTPMessage' object has no attribute 'getheader'
or
    boundary = pdict['boundary'].decode('ascii')
    AttributeError: 'str' object has no attribute 'decode'
or
    headers['Content-Length'] = pdict['CONTENT-LENGTH']
    KeyError: 'CONTENT-LENGTH'
when running
    c_type, p_dict = cgi.parse_header(self.headers.getheader('Content-Type'))
    if c_type == 'multipart/form-data':
        fields = cgi.parse_multipart(self.rfile, p_dict)
        message_content = fields.get('message')
this applies to you.
Solution
First of all change the first line to accommodate Python 3:
    - c_type, p_dict = cgi.parse_header(self.headers.getheader('Content-Type'))
    + c_type, p_dict = cgi.parse_header(self.headers.get('Content-Type'))
Secondly, the error about 'str' object having no attribute 'decode' occurs because strings are Unicode strings as of Python 3, rather than being equivalent to byte strings as in Python 2. Add this line just below the one above:
    p_dict['boundary'] = bytes(p_dict['boundary'], "utf-8")
Thirdly, to fix the error of not having 'CONTENT-LENGTH' in pdict just add these lines before the if statement:
    content_len = int(self.headers.get('Content-length'))
    p_dict['CONTENT-LENGTH'] = content_len
Full solution on my Github:
https://github.com/rSkogeby/web-server
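The second and third changes in this answer can be collected into one small function. This is only a sketch: fix_pdict is a hypothetical helper name, and it assumes p_dict is the dictionary returned by cgi.parse_header and headers is any mapping with a get method, such as self.headers in the request handler.

```python
def fix_pdict(p_dict, headers):
    """Hypothetical helper: return a copy of p_dict adjusted for Python 3."""
    fixed = dict(p_dict)
    # The parser expects the boundary as bytes, not str.
    fixed["boundary"] = bytes(fixed["boundary"], "utf-8")
    # The parser also looks up the request length under this key.
    fixed["CONTENT-LENGTH"] = int(headers.get("Content-Length", 0))
    return fixed


# Stand-in values for illustration; a real handler would pass self.headers.
example = fix_pdict({"boundary": "abc123"}, {"Content-Length": "42"})
print(example)
```

In the handler, the result would then be passed on as cgi.parse_multipart(self.rfile, fix_pdict(p_dict, self.headers)), on Python versions that still ship the cgi module.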
I am doing the same course and ran into the same problem. Instead of getting it to work with cgi, I am now using the urllib.parse library, which was shown in the same course just a few lessons earlier.
    from urllib.parse import parse_qs

    length = int(self.headers.get('Content-length', 0))
    body = self.rfile.read(length).decode()
    params = parse_qs(body)
    messagecontent = params["message"][0]
You also have to remove enctype='multipart/form-data' from your form, so that the browser sends the body URL-encoded, which is the format parse_qs expects.
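To see what parse_qs produces, here is a small standalone sketch with a made-up request body in place of the data read from self.rfile:

```python
from urllib.parse import parse_qs

# Made-up URL-encoded body, as a browser would send it without the enctype attribute.
body = b"message=Hello+world%21&lang=en"

# parse_qs maps each field name to a list of decoded values.
params = parse_qs(body.decode())
messagecontent = params["message"][0]
print(messagecontent)
```

Each value is a list because a form can submit the same field name more than once, which is why the answer indexes with [0].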
Another hack is to edit the source of the cgi module.
At the very beginning of parse_multipart (around line 226), change the uses of boundary to boundary.encode('ascii'), so that the str taken from the header is turned into bytes before it is concatenated with the bytes literals:
    ...
    boundary = b""
    if 'boundary' in pdict:
        boundary = pdict['boundary']
    if not valid_boundary(boundary):
        raise ValueError('Invalid boundary in multipart form: %r'
                         % (boundary,))
    nextpart = b"--" + boundary.encode('ascii')
    lastpart = b"--" + boundary.encode('ascii') + b"--"
    ...
In my case I used cgi.FieldStorage to extract the file and name instead of cgi.parse_multipart:
    form = cgi.FieldStorage(
        fp=self.rfile,
        headers=self.headers,
        environ={'REQUEST_METHOD': 'POST',
                 'CONTENT_TYPE': self.headers['Content-Type'],
                 })
    print('File', form['file'].file.read())
    print('Name', form['name'].value)
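The approaches above rely on the cgi module, which was deprecated in Python 3.11 and removed in Python 3.13. On interpreters without it, one possible substitute is the standard library's email parser, since multipart/form-data uses the same MIME framing. The sketch below is an illustration under assumptions rather than a drop-in replacement: parse_multipart_form is a hypothetical helper name, the boundary and body are made up, and every value is returned as raw bytes.

```python
from email.parser import BytesParser
from email.policy import HTTP


def parse_multipart_form(content_type, body):
    """Hypothetical helper: map each form field name to a list of raw byte values."""
    # The email parser learns the boundary from a Content-Type header,
    # so prepend one to the raw request body.
    header = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = BytesParser(policy=HTTP).parsebytes(header + body)
    fields = {}
    for part in message.iter_parts():
        # The field name lives in each part's Content-Disposition header.
        name = part.get_param("name", header="content-disposition")
        fields.setdefault(name, []).append(part.get_payload(decode=True))
    return fields


# Made-up request data for illustration.
content_type = "multipart/form-data; boundary=XyZ123"
body = (
    b"--XyZ123\r\n"
    b'Content-Disposition: form-data; name="message"\r\n'
    b"\r\n"
    b"hello\r\n"
    b"--XyZ123--\r\n"
)
fields = parse_multipart_form(content_type, body)
print(fields)
```

In a request handler, content_type would come from self.headers.get('Content-Type') and body from self.rfile.read(int(self.headers.get('Content-Length', 0))).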