Question:
I have a program that could be useful to me, but the documentation and all tooltips are in a language I can't read. The source code is available, and the entire thing is about 84,000 lines of code. Is there a way to export or grab just the user-visible text (for example tooltip text, button text, and anything else that appears to the end user as part of a readable message) so that it can be translated easily?
Answer 1:
One possible approach is to use something that knows how to transform a VB6 program. It would need to parse VB6, pull out all the literal text strings, offer them to you for translation, and substitute your replacements for the original strings. Actually, you want two passes: the first produces the set of strings so you can translate the ones of interest, and the second substitutes your designated translations, if any. Expect some debugging afterwards, because usually something in the code depends on the length of a string.
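As a rough illustration of the first pass only, here is a minimal Python sketch that lists the string literals in VB6 source text. It is not DMS and not a real VB6 parser: the function name find_string_literals is made up for this example, and it assumes that a literal never spans lines, that a doubled quote inside a literal stands for one quote character, and that an apostrophe outside a literal starts a comment. It does not recognize Rem comments.

```python
def find_string_literals(source):
    """Return (line_no, start_col, end_col, text) for each string literal.

    line_no is 1-based; columns are 0-based and end_col is exclusive,
    so line[start_col:end_col] is the literal including its quotes.
    """
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        i = 0
        while i < len(line):
            ch = line[i]
            if ch == "'":
                break  # apostrophe outside a literal: rest of line is a comment
            if ch == '"':
                start = i
                i += 1
                chars = []
                while i < len(line):
                    if line[i] == '"':
                        if i + 1 < len(line) and line[i + 1] == '"':
                            chars.append('"')  # doubled quote = one quote char
                            i += 2
                            continue
                        break  # closing quote
                    chars.append(line[i])
                    i += 1
                end = min(i + 1, len(line))
                hits.append((line_no, start, end, "".join(chars)))
            i += 1
    return hits
```

The resulting list is what you would hand to a translator; the recorded positions are what a second pass needs in order to put the translations back.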
How you go about converting the strings from one language to another is up to you. As other posters suggest, you could use an online translator and take what you get. I would expect you will do better if you have a human being do it. They generally only have to focus on the meaning of the strings, since they are extracted from the code, but you will also find cases where the translation depends on what the code is doing, and so a programmer will need to be involved.
Our DMS Software Reengineering Toolkit with its Visual Basic Front End could be easily configured to do this. DMS provides generic parsing and transformation machinery; the VB front end provides the details about Visual Basic 6 (in your case).
A variation of this idea is to replace the literal strings with references to "resources" (what amounts to a lookup table indexed by a string number) which contain either the original (French) or the new (English) text. This solution produces something close to what people doing internationalization want to do. (This doesn't take care of dates and currency formats; those require data flow analysis to determine computations leading to/from date or currency operations. While not needed for literal string conversion, DMS provides flow analysis, so it could be configured to do this, too.)
If you have precise information about the location of the strings in the text (e.g., starting line/column, ending line/column), you can do this another way: use that precise information to extract the strings, and then use that same precise information to re-insert the translations. To avoid invalidating the recorded string locations, you should replace strings starting at the end of each file, working backwards through the file. This should be straightforward to do on a buffer of text.
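A minimal Python sketch of that backwards re-insertion follows. The names vb_quote and apply_translations are invented for this example, and the edit format (1-based line, 0-based start column, exclusive end column, replacement literal including its quotes) is an assumption, not something any particular tool emits.

```python
def vb_quote(text):
    """Wrap text in double quotes, doubling any embedded quote characters."""
    return '"' + text.replace('"', '""') + '"'


def apply_translations(source, edits):
    """Apply (line_no, start_col, end_col, new_literal) edits to source.

    Edits are applied from the last position to the first, so replacing
    one literal never shifts the columns recorded for an earlier one.
    Edits are assumed not to overlap.
    """
    lines = source.split("\n")
    ordered = sorted(edits, key=lambda e: (e[0], e[1]), reverse=True)
    for line_no, start_col, end_col, new_literal in ordered:
        line = lines[line_no - 1]
        lines[line_no - 1] = line[:start_col] + new_literal + line[end_col:]
    return "\n".join(lines)
```

For example, applying edits for "oui", "non" and "bonjour" in any input order gives the same result, because the sort puts the rightmost, lowest literal first.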
Our Source Code Search Engine (SCSE) can be used to find such strings and their locations trivially. The SCSE indexes source code according to its lexical structure (and thus sees the string literals exactly), and then allows queries across the source code for arbitrary sequences of tokens. It uses DMS's language front ends (for your purpose, the VB6 front end) to pick out the lexemes accurately.
One might hunt for statements that assign a constant greater than 10 (using a range constraint) to a variable whose name contains an X (using a wildcard) with a query like this:
I=*x* '=' N>10
The SCSE will find all matches, show you the hits, and enable seeing the hit in the source code with one additional click.
The query you want to find literal strings is extremely simple:
S=*
meaning, "find all strings regardless of content". You can turn on SCSE logging, and it will write a list of all the hits, along with precise positions, to a log file. At this point you have all the precise string information. (SCSE cannot do flow analysis, so it can't help internationalize dates as well as DMS could, but it could find
N 'mod' 4 '==' 0
patterns, which tend to be leap year adjustments).
Answer 2:
I don't have the perfect suggestion but it might be your best shot: use a document translator and get yourself close. Then compile the new source and see what breaks. Most likely it will ignore items that are not proper words (like btnOK, etc.)
There are a number of sites out there that use Google Translator. Here is just one random one I chose:
http://www.onlinedoctranslator.com/
If you wanted to build your own translator, you could look at this CodeProject article:
http://www.codeproject.com/KB/IP/GoogleTranslator.aspx
I know that isn't in the scope of your initial question, but since using a translator might be your best option, maybe you could build something to make it work better.
Answer 3:
One option would be
- Upgrade the program to VB.Net (which may not be easy but could extend the lifetime of the program)
- Then use VB.Net's built-in localization features, which should make life easier.
Answer 4:
Just my 2c. Automated translation of source code text is VERY problematic. Basically, it just doesn't ever work well. Why not? Context. For an automated translator to work well, it has to have some amount of context. But when you're talking about source code text, you're talking about little snippets of text that either have no obvious context or that are strung together via code, and hence lose their context.
You'll get "something" from an automated translation, but it's almost guaranteed to make native language speakers either 1) snicker or 2) scratch their head wondering what the heck that button caption means...
Answer 5:
If you open up a VB6 .frm file, you'll see at the top all the form controls, like this excerpt:
Begin VB.Frame frShipmentDetails
   BackColor = &H00FFC0C0&
   Caption = "Shipment Details by Part"
   BeginProperty Font
      Name = "Verdana"
      Size = 9.75
      Charset = 0
      Weight = 700
      Underline = 0 'False
      Italic = 0 'False
      Strikethrough = 0 'False
   EndProperty
   ForeColor = &H000000C0&
   Height = 2895
   Left = 480
   TabIndex = 28
   Top = 6960
   Width = 10935
Note that each entry is just a name/value pair. In VB6 you'll typically be looking for properties like Caption and ToolTipText. A simple grep will get you a good list to start with. You could run the result through some automated technical translator if you want, or send it to a real technical translator (they're expensive though).
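If grep isn't handy, the same search can be sketched in a few lines of Python. This is only an illustration: find_ui_strings is a made-up name, the property list is limited to Caption and ToolTipText (the spelling VB6 form files use for tooltips), and the pattern only matches values written as a quoted literal on the same line. As far as I know, values that the form stores in the companion .frx file are written differently and will not match.

```python
import re

# Matches lines such as:   Caption = "Shipment Details by Part"
# Inside the literal, a doubled quote stands for one quote character.
PROPERTY_RE = re.compile(
    r'^\s*(Caption|ToolTipText)\s*=\s*"((?:[^"]|"")*)"',
    re.IGNORECASE,
)


def find_ui_strings(frm_text):
    """Return (line_no, property_name, text) for each matching property line."""
    results = []
    for line_no, line in enumerate(frm_text.splitlines(), start=1):
        match = PROPERTY_RE.match(line)
        if match:
            text = match.group(2).replace('""', '"')
            results.append((line_no, match.group(1), text))
    return results
```

Run over each .frm file, this yields a list you can paste into a spreadsheet for the translator, with line numbers to find your way back.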
However, there are two very big caveats:
First, there's likely some code in the app like this:
If a = b Then
    lblSomeLabel.Caption = "yes"
Else
    lblSomeLabel.Caption = "no"
End If
... in which case, it's not static text anymore, it's dynamic.
What's worse is that you'll sometimes find this in some event handler:
If lblSomeLabel.Caption = "yes" Then
    ... do something ...
End If
Which means that even if you fix the first lines, where the code sets the Caption, you'll break the later line, where it does the comparison. Trust me, this happens a lot in VB6 code.
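One way to get a list of places worth checking by hand is to flag every condition that mentions a string literal. The Python sketch below is a crude heuristic, not a parser: find_literal_comparisons is a made-up name, the keyword list is my own guess at the common cases, and it will miss conditions split across continuation lines or literals that themselves contain the word Then.

```python
import re

# Statement openers whose remainder is (or starts with) a condition.
CONDITION_RE = re.compile(
    r'^\s*(?:If|ElseIf|Case|Select\s+Case|While|Do\s+While|Do\s+Until'
    r'|Loop\s+While|Loop\s+Until)\b(.*)$',
    re.IGNORECASE,
)
LITERAL_RE = re.compile(r'"(?:[^"]|"")*"')


def find_literal_comparisons(source):
    """Return (line_no, stripped_line, literals) for suspicious conditions."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        match = CONDITION_RE.match(line)
        if not match:
            continue
        # Only look at the condition, not at a statement following Then.
        condition = re.split(
            r'\bThen\b', match.group(1), maxsplit=1, flags=re.IGNORECASE
        )[0]
        literals = LITERAL_RE.findall(condition)
        if literals:
            hits.append((line_no, line.strip(), literals))
    return hits
```

On the two snippets above, this reports only the line that compares the Caption against "yes"; the assignments are left alone.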
I've done a translation of a VB6 app from English to Spanish before. Beware. It's much more work than you first think.
The right way to do it is to find all of these strings, put them in some kind of lookup table (with one column for each target language), and do the lookup every time.
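To make the shape of such a table concrete, here is a small Python sketch. It only models the data: the column names (id, en, es), the function names and the fallback rule are all assumptions for illustration, and in the VB6 program itself the lookup would live in the application, for instance in a resource file or a small module of its own.

```python
import csv
import io


def load_table(csv_text):
    """Parse CSV with an 'id' column plus one column per language."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["id"]): row for row in reader}


def lookup(table, string_id, language, fallback="en"):
    """Return the text for string_id in language.

    Falls back to the fallback language when the requested column is
    missing or the cell is still empty (not translated yet).
    """
    row = table[string_id]
    return row.get(language) or row[fallback]


TABLE_CSV = "id,en,es\n1,yes,sí\n2,no,\n"
table = load_table(TABLE_CSV)
print(lookup(table, 1, "es"))  # translated entry
print(lookup(table, 2, "es"))  # untranslated entry falls back to English
```

Because the code then compares and assigns string numbers rather than display text, a check like the Caption comparison above no longer depends on which language is shown.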