Natural language date and time parser for Java

Posted 2020-01-28 14:40

Question:

I am working on a natural language parser that examines a sentence in English and extracts information such as names and dates.

For example: "Let's meet next Tuesday at 5 PM at the beach."

The output would be something like: "Let's meet 15/09/2009 at 1700 hr at the beach"

Basically, is there any framework or library available for Java that can parse dates out of a sentence like this and output them in a specified format?

Regards, Pranav


Thanks for the replies. I have looked at a few NLP libraries such as LingPipe, OpenNLP, and Stanford NLP. Do any of them offer date parsing for Java?

Answer 1:

Natty is a really good replacement for JChronic.



Answer 2:

You can use JChronic, the Java port of Chronic.



Answer 3:

Have you tried JChronic? However, I doubt any library can work directly with whole sentences: you would have to extract sentence fragments and feed them to an NLP date-parsing framework yourself, perhaps on a trial-and-error basis (larger and larger fragments until the framework throws an error).
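The trial-and-error idea above can be sketched roughly as follows. This is only an illustration, not JChronic's API: `FragmentScanner`, `scan`, and `standInParser` are hypothetical names, and the stand-in parser accepts only dates written like "15 September 2009". In practice you would pass in a function that wraps whatever date-parsing library you choose.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.function.Function;

class FragmentScanner {
    // Hypothetical stand-in for a real date parser: accepts only "d MMMM uuuu".
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("d MMMM uuuu", Locale.ENGLISH);

    static Optional<LocalDate> standInParser(String fragment) {
        try {
            return Optional.of(LocalDate.parse(fragment, FORMAT));
        } catch (DateTimeParseException e) {
            return Optional.empty(); // the fragment is not a date in this format
        }
    }

    // Tries fragments of 1..maxWords words at each position and keeps the longest that parses.
    static List<LocalDate> scan(String sentence,
                                Function<String, Optional<LocalDate>> parser,
                                int maxWords) {
        String[] words = sentence.trim().split("\\s+");
        List<LocalDate> found = new ArrayList<>();
        int start = 0;
        while (start < words.length) {
            Optional<LocalDate> best = Optional.empty();
            int consumed = 0;
            for (int len = 1; len <= maxWords && start + len <= words.length; len++) {
                String fragment = String.join(" ", Arrays.copyOfRange(words, start, start + len));
                Optional<LocalDate> parsed = parser.apply(fragment);
                if (parsed.isPresent()) {
                    best = parsed;
                    consumed = len;
                }
            }
            if (best.isPresent()) {
                found.add(best.get());
                start += consumed; // skip the words that were part of the date
            } else {
                start++;           // no date starts here, move on one word
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(scan("Lets meet on 15 September 2009 at the beach",
                                FragmentScanner::standInParser, 4));
    }
}
```

Punctuation attached to a word (for example "2009.") would defeat the stand-in parser, so a real implementation would probably want to tokenize more carefully than splitting on whitespace.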



Answer 4:

I don't think there's any framework out there that does that out of the box. What you can do is create a set of regular expressions to match those patterns.
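As a minimal sketch of the regular-expression approach, the following handles just one pattern, "next <weekday> at <hour> AM/PM", using only the JDK. The class and method names (`DateRewriter`, `rewrite`) are made up for illustration, and it assumes "next Tuesday" means the first Tuesday strictly after the reference date, which may not match every speaker's intent.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateRewriter {
    // Matches phrases such as "next tuesday at 5 PM" (case-insensitive).
    private static final Pattern NEXT_DAY_AT_TIME = Pattern.compile(
        "\\bnext\\s+(monday|tuesday|wednesday|thursday|friday|saturday|sunday)"
        + "\\s+at\\s+(1[0-2]|0?[1-9])\\s*(am|pm)\\b",
        Pattern.CASE_INSENSITIVE);

    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("dd/MM/yyyy");
    private static final DateTimeFormatter TIME = DateTimeFormatter.ofPattern("HHmm");

    // Replaces each matched phrase with an absolute date and a 24-hour time.
    static String rewrite(String sentence, LocalDate today) {
        Matcher m = NEXT_DAY_AT_TIME.matcher(sentence);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            DayOfWeek day = DayOfWeek.valueOf(m.group(1).toUpperCase(Locale.ROOT));
            int hour = Integer.parseInt(m.group(2)) % 12; // 12 AM -> 0, 12 PM -> 12 after the shift below
            if (m.group(3).equalsIgnoreCase("pm")) {
                hour += 12;
            }
            // Assumption: "next <day>" is the first such day strictly after today.
            LocalDate date = today.with(TemporalAdjusters.next(day));
            String replacement = DATE.format(date) + " at " + TIME.format(LocalTime.of(hour, 0)) + " hr";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Reference date chosen so that "next Tuesday" is 15/09/2009, as in the question.
        System.out.println(rewrite("Let's meet next Tuesday at 5 PM at the beach.",
                                   LocalDate.of(2009, 9, 10)));
    }
}
```

Each additional phrasing ("tomorrow", "in two weeks", "5:30 pm") needs its own pattern, which is the main cost of this approach compared with a dedicated library.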



Answer 5:

I would suggest using UIMA with the OpenNLP connectors and some handmade regexp rules.



Answer 6:

I wrote an NLP script with Python's NLTK and fed the results to Ruby's Chronic.



Answer 7:

For my use case, I had more luck with chrono-java. Sadly, it looks stale and is not available in any Maven repository (nor via https://jitpack.io/, since the build is broken), so you have to fix and build it yourself.

However, checking out the code and fixing one dependency (maven-javadoc-plugin was missing its groupId, and I also updated its version) allowed me to build and run a simple example successfully:

List<ParsedResult> results = Chrono.Parse("Datum  Freitag, 08.04. bis einschl. Sonntag 10.04.2016");
results.forEach(result -> System.out.println(result));

This resulted in two dates being extracted:

ParsedResult: " 08.04" > 04/08/2018 12:00
ParsedResult: "10.04.2016" > 04/10/2016 12:00