Java: what's the best data structure to search?

Posted 2019-04-17 02:25

Suppose I have a "journal article" class with fields such as year, author(s), title, journal name, keyword(s), etc.

Fields such as authors and keywords might be declared as String[] authors and String[] keywords.

What's the best data structure for searching a group of "journal article" objects by one or several keywords, by one of several author names, or by part of the title?

Thanks!

==========================================================================

With everybody's help, I wrote the test code below in the Processing environment. Any advice is greatly appreciated. Thanks!

ArrayList<Paper> papers = new ArrayList<Paper>();

HashMap<String, ArrayList<Paper>> hm = new HashMap<String, ArrayList<Paper>>();

void setup(){
  Paper paperA = new Paper();
  paperA.title = "paperA";
  paperA.keywords.append("cat");
  paperA.keywords.append("dog");
  paperA.keywords.append("egg");
  //println(paperA.keywords);
  papers.add(paperA);

  Paper paperC = new Paper();
  paperC.title = "paperC";
  paperC.keywords.append("egg");
  paperC.keywords.append("cat");
  //println(paperC.keywords);
  papers.add(paperC);

  Paper paperB = new Paper();
  paperB.title = "paperB";
  paperB.keywords.append("dog");
  paperB.keywords.append("egg");
  //println(paperB.keywords); 
  papers.add(paperB);

  for (Paper p : papers) {
    // get a list of keywords for the current paper
    StringList keywords = p.keywords;

    // go through each keyword of the current paper
    for (int i=0; i<keywords.size(); i++) {
      String keyword = keywords.get(i);

      // look up the paper list for this keyword, creating it on first use
      // (named "matches" so it does not shadow the global "papers" list)
      ArrayList<Paper> matches = hm.get(keyword);
      if (matches == null) {
        matches = new ArrayList<Paper>();
        hm.put(keyword, matches);
      }
      // the map stores a reference to the list, so no second put() is needed
      matches.add(p);
    }

  }

  // look up the papers tagged "egg"; get() returns null for an unknown keyword
  ArrayList<Paper> paperList = hm.get("egg");
  if (paperList != null) {
    for (Paper p : paperList) {
      println(p.title);
    }
  }
}

void draw(){}

class Paper 
{
  //===== variables =====
  int ID;
  int year;
  String title;
  StringList authors  = new StringList();
  StringList keywords = new StringList();
  String DOI;
  String typeOfRef;
  String nameOfSource;
  String abs; // abstract


  //===== constructor =====

  //===== update =====

  //===== display =====
}

3 answers
淡お忘
#2 · 2019-04-17 02:52

Use a HashMap<String, JournalArticle> data structure.

For example:

Map<String, JournalArticle> journals = new HashMap<String, JournalArticle>();
journals.put("keyword1", testJA); // testJA is a JournalArticle created elsewhere

if (journals.containsKey("keyword1"))
{
    return journals.get("keyword1");
}

You can store your keywords as the String keys of this map. However, it only supports exact-match searches: you have to search with exactly the keyword that is stored as a key in the HashMap.

If you are looking for a "like" kind of search, I suggest saving your objects in a database that supports "like" queries.

Edit: on second thought, you can do a "like"-style query (similar to the LIKE clause in SQL), but it will not be efficient, because every query iterates through all the keys in the HashMap. If you know regex, you can run all kinds of queries by modifying the following example code (e.g. key.matches(pattern)):

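    // must be a real list: calling add() on null would throw a NullPointerException
    List<JournalArticle> results = new ArrayList<JournalArticle>();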

    for (String key : journals.keySet())
    {
        if (key.contains("keyword"))  /* keyword has to be part of the key stored in the HashMap, but does not have to be an exact match any more */
            results.add(journals.get(key));
    }

    return results;
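
The regex variant mentioned above (key.matches(pattern)) could look roughly like the sketch below. It is only an illustration under a few assumptions: articles are stood in for by plain title strings, the class and method names (KeyScanSketch, findByKeyPattern) are made up for this example, and Matcher.find() is used instead of String.matches() so that the pattern may match any part of the key rather than the whole key. It still visits every key on each query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: values are article titles instead of JournalArticle objects.
class KeyScanSketch {

    // Returns the values of all entries whose key contains a match for the regex.
    // Every key is visited, so the cost grows linearly with the size of the map.
    static List<String> findByKeyPattern(Map<String, String> index, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, String> entry : index.entrySet()) {
            if (pattern.matcher(entry.getKey()).find()) {
                results.add(entry.getValue());
            }
        }
        // HashMap iteration order is unspecified, so sort for a stable result.
        Collections.sort(results);
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> index = new HashMap<>();
        index.put("machine learning", "paperA");
        index.put("deep learning", "paperB");
        index.put("databases", "paperC");

        System.out.println(findByKeyPattern(index, "learn"));  // keys containing "learn"
        System.out.println(findByKeyPattern(index, "^data"));  // keys starting with "data"
    }
}
```

For a few hundred keys this linear scan is usually fine; for larger collections a database or a search library is the more realistic choice.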
贼婆χ
#3 · 2019-04-17 02:55

For simple cases you can use a Multimap<String, Article>. There's one in the Guava library.

For larger amounts of data, Apache Lucene will be a better fit.

我只想做你的唯一
#4 · 2019-04-17 03:01

I would create a map from each keyword to a set of JournalArticles, and likewise for author, title, etc.

Map<String, Set<JournalArticle>> keyWordMap = new HashMap<>();
Map<String, Set<JournalArticle>> authorMap = new HashMap<>();

When you create a new JournalArticle, you'd add that article to the appropriate set for each of its keywords.

JournalArticle ja = new JournalArticle();
for (String keyWord : ja.getKeyWords())
{
    // create the set the first time this keyword is seen
    if (!keyWordMap.containsKey(keyWord))
        keyWordMap.put(keyWord, new HashSet<JournalArticle>());
    keyWordMap.get(keyWord).add(ja);
}

To do a lookup, you'd do something like:

String keyWord = "....";
Set<JournalArticle> matchingSet = keyWordMap.get(keyWord);
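
Since the question also asks about searching by several keywords at once, the per-keyword sets can be combined with ordinary set operations. The following is a minimal sketch rather than a definitive implementation: the class KeywordIndexSketch and its methods add, searchAll and searchAny are hypothetical names, and articles are represented by their titles to keep the example short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: an article is represented only by its title.
class KeywordIndexSketch {
    private final Map<String, Set<String>> keywordMap = new HashMap<>();

    // Registers one article under each of its keywords.
    void add(String title, String... keywords) {
        for (String keyword : keywords) {
            keywordMap.computeIfAbsent(keyword, k -> new HashSet<>()).add(title);
        }
    }

    // Articles tagged with every one of the given keywords (set intersection).
    Set<String> searchAll(String... keywords) {
        Set<String> result = null;
        for (String keyword : keywords) {
            Set<String> matches = keywordMap.getOrDefault(keyword, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(matches); // copy, so the index itself is not modified
            } else {
                result.retainAll(matches);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    // Articles tagged with at least one of the given keywords (set union).
    Set<String> searchAny(String... keywords) {
        Set<String> result = new TreeSet<>();
        for (String keyword : keywords) {
            result.addAll(keywordMap.getOrDefault(keyword, Collections.emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        KeywordIndexSketch index = new KeywordIndexSketch();
        index.add("paperA", "cat", "dog", "egg");
        index.add("paperC", "egg", "cat");
        index.add("paperB", "dog", "egg");

        System.out.println(index.searchAll("cat", "egg")); // [paperA, paperC]
        System.out.println(index.searchAny("cat", "dog")); // [paperA, paperB, paperC]
    }
}
```

The same pattern works for an author map. Copying into a new TreeSet before calling retainAll matters, because retainAll on the set stored in the map would silently delete entries from the index.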