Question:
Does anyone know of a Java library that will let me parse .PO files? I simply want to create a Map of IDs and values so I can load them into a database.
Answer 1:
According to the Java gettext utilities manual, you can convert a PO file to a ResourceBundle class using the msgfmt --java2 program and read it using java.util.ResourceBundle or gnu.gettext.GettextResource. I suppose that is the most efficient way. Gettext Commons does exactly the same thing, including spawning an intermediate process to call msgfmt, because it is positioned as follows:
Gettext Commons is Java library that makes use of GNU gettext utilities.
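Since the goal is a Map for a database load, here is a minimal sketch of the reading side, assuming msgfmt --java2 has already produced a bundle class. The class name BundleToMapSketch and the helper toMap are made up for illustration; only standard java.util.ResourceBundle calls are used. I am assuming plural entries come back as something other than a plain String, so the sketch skips every non-String value rather than guessing at its shape.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ResourceBundle;

// Hypothetical helper, not part of gettext or Gettext Commons.
final class BundleToMapSketch {

    /** Copies every String-valued entry of the bundle into a Map. */
    static Map<String, String> toMap(ResourceBundle bundle) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String key : Collections.list(bundle.getKeys())) {
            // getObject avoids the ClassCastException that getString
            // throws when a value is not a plain String.
            Object value = bundle.getObject(key);
            if (value instanceof String) {
                result.put(key, (String) value);
            }
        }
        return result;
    }
}
```

With a bundle compiled under a hypothetical base name Messages, the call would look like BundleToMapSketch.toMap(ResourceBundle.getBundle("Messages", Locale.FRENCH)).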
If you still want a pure Java library, the only way I see is to write your own parser for this format, i.e. port the msgfmt source code from C to Java. But I'm not sure it would be faster than creating a process and running the C program.
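For the simple "Map of IDs and values" case, a full port of msgfmt may be more than is needed. Below is a rough sketch of a hand-written reader, under these assumptions: only singular msgid/msgstr pairs matter, the header entry (empty msgid) can be dropped, and msgctxt, msgid_plural and msgstr[n] lines can be ignored. The class name PoMapSketch is made up, and this is not how msgfmt itself works.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal reader; handles singular entries only.
final class PoMapSketch {

    /** Parses PO text into a msgid -> msgstr map. */
    static Map<String, String> parse(String po) {
        Map<String, String> result = new LinkedHashMap<>();
        StringBuilder id = null;      // msgid being collected
        StringBuilder str = null;     // msgstr being collected
        StringBuilder current = null; // target of continuation lines

        for (String raw : po.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and all comment kinds
            }
            if (line.startsWith("msgid ")) {
                store(result, id, str); // flush the previous entry
                id = new StringBuilder(unquote(line.substring(6)));
                str = null;
                current = id;
            } else if (line.startsWith("msgstr ")) {
                str = new StringBuilder(unquote(line.substring(7)));
                current = str;
            } else if (line.startsWith("\"")) {
                // Continuation of a multi-line string.
                if (current != null) {
                    current.append(unquote(line));
                }
            } else {
                // msgctxt, msgid_plural, msgstr[n]: ignored in this sketch.
                current = null;
            }
        }
        store(result, id, str);
        return result;
    }

    private static void store(Map<String, String> out,
                              StringBuilder id, StringBuilder str) {
        // Skips the header (empty msgid) and entries without a plain msgstr.
        if (id != null && str != null && id.length() > 0) {
            out.put(id.toString(), str.toString());
        }
    }

    /** Strips the surrounding quotes and decodes common escapes. */
    private static String unquote(String s) {
        String t = s.trim();
        if (t.length() < 2 || t.charAt(0) != '"'
                || t.charAt(t.length() - 1) != '"') {
            throw new IllegalArgumentException("expected quoted string: " + s);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < t.length() - 1; i++) {
            char c = t.charAt(i);
            if (c == '\\' && i + 1 < t.length() - 1) {
                char next = t.charAt(++i);
                switch (next) {
                    case 'n': sb.append('\n'); break;
                    case 't': sb.append('\t'); break;
                    case 'r': sb.append('\r'); break;
                    default:  sb.append(next); // covers \" and \\
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Reading the file content into a String and passing it to PoMapSketch.parse gives the Map to load into the database. Entries with plural forms are silently dropped, so this only fits catalogs where that is acceptable.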
Answer 2:
I searched the Internet and couldn't find an existing library, either. If you use Scala, it's quite easy to write a parser yourself, thanks to its parser combinator feature.
Call PoParser.parsePo("po file content"). The result is a list of Translation.
I have made this code into a library (it can be used from any JVM language, including Java, of course!): https://github.com/ngocdaothanh/scaposer
import scala.util.parsing.combinator.JavaTokenParsers

trait Translation

case class SingularTranslation(
  msgctxto: Option[String],
  msgid:    String,
  msgstr:   String) extends Translation

case class PluralTranslation(
  msgctxto:    Option[String],
  msgid:       String,
  msgidPlural: String,
  msgstrNs:    Map[Int, String]) extends Translation

// http://www.gnu.org/software/hello/manual/gettext/PO-Files.html
object PoParser extends JavaTokenParsers {
  // Removes the first and last quote (") character of strings
  // and concats them.
  private def unquoted(quoteds: List[String]): String =
    quoteds.foldLeft("") { (acc, quoted) =>
      acc + quoted.substring(1, quoted.length - 1)
    }

  // Scala regex is single line by default
  private def comment = rep(regex("^#.*".r))

  private def msgctxt = "msgctxt" ~ rep(stringLiteral) ^^ {
    case _ ~ quoteds => unquoted(quoteds)
  }

  private def msgid = "msgid" ~ rep(stringLiteral) ^^ {
    case _ ~ quoteds => unquoted(quoteds)
  }

  private def msgidPlural = "msgid_plural" ~ rep(stringLiteral) ^^ {
    case _ ~ quoteds => unquoted(quoteds)
  }

  private def msgstr = "msgstr" ~ rep(stringLiteral) ^^ {
    case _ ~ quoteds => unquoted(quoteds)
  }

  private def msgstrN = "msgstr[" ~ wholeNumber ~ "]" ~ rep(stringLiteral) ^^ {
    case _ ~ number ~ _ ~ quoteds => (number.toInt, unquoted(quoteds))
  }

  private def singular =
    (opt(comment) ~ opt(msgctxt) ~
     opt(comment) ~ msgid ~
     opt(comment) ~ msgstr ~ opt(comment)) ^^ {
      case _ ~ ctxto ~ _ ~ id ~ _ ~ s ~ _ =>
        SingularTranslation(ctxto, id, s)
    }

  private def plural =
    (opt(comment) ~ opt(msgctxt) ~
     opt(comment) ~ msgid ~
     opt(comment) ~ msgidPlural ~
     opt(comment) ~ rep(msgstrN) ~ opt(comment)) ^^ {
      case _ ~ ctxto ~ _ ~ id ~ _ ~ idp ~ _ ~ tuple2s ~ _ =>
        PluralTranslation(ctxto, id, idp, tuple2s.toMap)
    }

  private def exp = rep(singular | plural)

  // Returns Nil when the input cannot be parsed
  def parsePo(po: String): List[Translation] = {
    val parseRet = parseAll(exp, po)
    if (parseRet.successful) parseRet.get else Nil
  }
}
Answer 3:
gettext-commons is the only one I found when I researched this some time ago.
Answer 4:
The tennera project on GitHub contains an ANTLR-based parser for GNU gettext PO/POT files. I think Red Hat uses it for web-based translation software.
Answer 5:
An .MO parser (Scala, not Java) that parses into a Map: http://scalamagic.blogspot.com/2013/03/simple-gettext-parser.html (source: http://pastebin.com/csWx5Sbb)
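For anyone who wants the same idea in Java, here is a rough sketch of reading the compiled .MO layout into a Map. It is not a port of the linked Scala code. It assumes the standard GNU MO header (magic number 0x950412de, string count at byte 8, original-string table offset at byte 12, translation table offset at byte 16), assumes the strings are UTF-8 rather than reading the charset from the catalog header, and skips the header entry and plural entries. The class name MoMapSketch is made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal reader for the GNU .MO binary layout.
final class MoMapSketch {
    private static final int MAGIC = 0x950412de;

    /** Parses the bytes of an .MO file into an original -> translation map. */
    static Map<String, String> parse(byte[] data) {
        if (data.length < 28) {
            throw new IllegalArgumentException("too short for an MO header");
        }
        // The magic number also tells us the byte order of the file.
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != MAGIC) {
            buf.order(ByteOrder.BIG_ENDIAN);
            if (buf.getInt(0) != MAGIC) {
                throw new IllegalArgumentException("not an MO file");
            }
        }
        int count = buf.getInt(8);       // number of strings
        int origTable = buf.getInt(12);  // offset of original-string table
        int transTable = buf.getInt(16); // offset of translation table

        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            String id = read(buf, data, origTable + 8 * i);
            String value = read(buf, data, transTable + 8 * i);
            // Skip the header (empty id) and plural entries, whose
            // forms are separated by NUL bytes inside one string.
            if (id.isEmpty() || id.indexOf('\0') >= 0) {
                continue;
            }
            result.put(id, value);
        }
        return result;
    }

    // Each table entry is a (length, offset) pair of 32-bit integers.
    private static String read(ByteBuffer buf, byte[] data, int entry) {
        int length = buf.getInt(entry);
        int offset = buf.getInt(entry + 4);
        return new String(data, offset, length, StandardCharsets.UTF_8);
    }
}
```

Entries that carry a msgctxt keep the context and the EOT separator character (U+0004) in front of the msgid in the key, so split on that character if contexts matter.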
Answer 6:
I have found some Java classes to read and write PO files: https://launchpad.net/po-parser