The DOI system places basically no useful limitations on what constitutes a reasonable identifier. However, being able to pull DOIs out of PDFs, web pages, and similar sources is quite useful for gathering citation information.
Is there a reliable way to identify a DOI in a block of text without assuming the 'doi:' prefix? (Any language is acceptable, regexes are preferred, and avoiding false positives is a must.)
This is a really old question that already has answers, but here's another potential alternative.
This assumes that white space is not part of the DOI.
I haven't tested this for false positives, but it seems to find all the edge cases mentioned on this page.
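For anyone who wants something concrete to start from, here is a minimal Python sketch of the same idea. The pattern is an assumption, not a quote of any pattern posted on this page: it treats a DOI as `10.`, a 4 to 9 digit registrant code, optional dotted sub-prefixes, a slash, and then any run of non-whitespace characters. The names `DOI_PATTERN`, `TRAILING`, and `find_dois` are hypothetical.

```python
import re

# Assumed shape of a DOI: "10." + 4-9 digit registrant code
# + optional ".digits" sub-prefixes + "/" + a suffix with no whitespace.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}(?:\.\d+)*/\S+')

# Punctuation that usually belongs to the surrounding sentence, not the DOI.
# Stripping it is a heuristic: a DOI that really ends in one of these
# characters will be clipped.
TRAILING = '.,;:)]}>"\''


def find_dois(text):
    """Return candidate DOIs found in text, in order of appearance."""
    return [m.group(0).rstrip(TRAILING) for m in DOI_PATTERN.finditer(text)]
```

With this, `find_dois("See doi:10.1000/182.")` returns `['10.1000/182']`, and the `doi:` prefix is not required. Since the DOI syntax allows nearly any printable character in the suffix, a pattern like this can only produce candidates; resolving each candidate against a DOI resolver is probably the only way to rule out false positives for certain.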