I am writing PDF files with embedded subset fonts. As required, I include the ToUnicode and CIDSet objects. To test, I created a simple PDF containing two Hebrew characters. I can select the two characters, copy them to the clipboard, and paste them correctly into another application such as Word. However, I cannot search for a word containing these two characters: Adobe Reader (and Acrobat) reports that the word was not found. In essence, I have created a PDF whose text can be copied correctly but cannot be searched. Any idea what I might be missing when creating the document?
Additional information:
1. The file in question is a minimal file with just two characters. I have tested many such files in many different languages, including English. None of them are searchable.
2. Curiously, if I search for the letter 'e', Adobe Reader highlights an incorrect word, even though the letter 'e' does not exist in the file.
3. Adobe Acrobat is also unable to search within this file. However, when I save the file to another file on disk, the saved copy is searchable. I confirmed that the major objects (the font file, the ToUnicode object, the CID object, and the font descriptor objects) are the same in the saved file. However, one of the font objects has been moved closer to the top of the file.
4. Foxit is able to search these files properly.
Relevant PDF objects:
5 0 obj
<>
stream
q 0.750000 0 0 0.750000 0.000000 792.000000 cm
q q q 0.160000 0.000000 0.000000 0.160000 0.000000 0.000000 cm
BT /F0 100.000000 Tf 0 g 750.000000 -690 Td[<02B0>] TJ 35.000000 0 Td[<02B9>] TJ ET Q
Q
Q
Q
endstream
endobj
10 0 obj
<>
endobj
11 0 obj
<> /FontDescriptor 10 0 R/Subtype/CIDFontType2/Type/Font>>
endobj
12 0 obj
<>
endobj
8 0 obj
<>
stream
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo
<< /Registry (Adobe)
/Ordering (UCS) /Supplement 0 >> def
/CMapName /Adobe-Identity-UCS def
/CMapType 2 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
3 beginbfchar
<0000> <0000>
<02B0> <05E0>
<02B9> <05E9>
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
end
end
endstream
endobj
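As a sanity check on the objects above, here is a minimal Python sketch that parses the `bfchar` entries from the ToUnicode CMap and applies them to the hex strings shown by the `TJ` operators. The two snippets are copied from objects 8 and 5. The helper names `parse_bfchar` and `shown_codes` are made up for illustration, and the sketch assumes fixed two-byte character codes, as the `<0000> <FFFF>` codespace range suggests. It only checks that the mapping itself is consistent. It does not reproduce whatever Adobe Reader does when searching.

```python
import re

# ToUnicode bfchar entries copied from object 8 in the question.
CMAP_SNIPPET = """
3 beginbfchar
<0000> <0000>
<02B0> <05E0>
<02B9> <05E9>
endbfchar
"""

# Text-showing operators copied from the content stream in object 5.
CONTENT_SNIPPET = (
    "BT /F0 100.000000 Tf 0 g 750.000000 -690 Td[<02B0>] TJ "
    "35.000000 0 Td[<02B9>] TJ ET Q"
)


def parse_bfchar(cmap_text):
    """Return {character code: Unicode string} for every bfchar entry."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            # Destination values in a ToUnicode CMap are UTF-16BE code units.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def shown_codes(content, code_bytes=2):
    """Return character codes from hex strings (assumption: fixed-width codes)."""
    codes = []
    for hex_str in re.findall(r"<([0-9A-Fa-f]+)>", content):
        step = 2 * code_bytes
        for i in range(0, len(hex_str), step):
            codes.append(int(hex_str[i:i + step], 16))
    return codes


to_unicode = parse_bfchar(CMAP_SNIPPET)
codes = shown_codes(CONTENT_SNIPPET)
# Codes without a mapping are replaced by U+FFFD so gaps are easy to spot.
text = "".join(to_unicode.get(code, "\ufffd") for code in codes)
print([f"U+{ord(ch):04X}" for ch in text])  # prints ['U+05E0', 'U+05E9']
```

Both codes resolve to Hebrew letters (U+05E0 is nun, U+05E9 is shin), which is consistent with copy and paste working. Judging by this sketch alone, the ToUnicode mapping for these two glyphs looks intact, so the search failure presumably has another cause.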