Mapping and indexing a path hierarchy in Elastic NEST

Posted 2019-09-11 07:29

Question:

I need to search for files and folders within specific directories. To do that, Elasticsearch asks us to create an analyzer and set its tokenizer to path_hierarchy:

PUT /fs
{
    "settings": {
        "analysis": {
            "analyzer": {
                "paths": {
                    "tokenizer": "path_hierarchy"
                }
            }
        }
    }
}
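
To see what the "paths" analyzer emits, the path can be run through the _analyze API. This is only a sketch, assuming the Elasticsearch 1.x-era request syntax that the rest of this example uses; newer versions take the analyzer and text in a JSON body instead.

```
GET /fs/_analyze?analyzer=paths
/clinton/projects/elasticsearch
```

With the tokenizer's default "/" delimiter, the response should list one token per ancestor directory: /clinton, /clinton/projects and /clinton/projects/elasticsearch. That is what makes it possible to match a file by any directory above it.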

Then, create the mapping as illustrated below with two properties: name (holding the name of the file) and path (to store the directory path):

PUT /fs/_mapping/file
{
    "properties": {
        "name": {
            "type": "string",
            "index": "not_analyzed"
        },
        "path": {
            "type": "string",
            "index": "not_analyzed",
            "fields": {
                "tree": {
                    "type": "string",
                    "analyzer": "paths"
                }
            }
        }
    }
}

This requires us to index the path of the directory where the file lives:

PUT /fs/file/1
{
  "name": "README.txt",
  "path": "/clinton/projects/elasticsearch"
}
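
With a document indexed this way, restricting a search to a directory comes down to a term filter on the path.tree sub-field. A minimal sketch, assuming the Elasticsearch 1.x "filtered" query syntax that goes with the string / not_analyzed mapping above; the directory value is only an illustration:

```
GET /fs/file/_search
{
    "query": {
        "filtered": {
            "filter": {
                "term": {
                    "path.tree": "/clinton/projects"
                }
            }
        }
    }
}
```

This should return README.txt, because /clinton/projects is one of the tokens produced for its path. The same term filter on the plain path field, which is not_analyzed, would only match files stored directly in /clinton/projects.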

The question: how can I create this mapping with NEST in C#?

Answer 1:

Create the analyzer by declaring a custom analyzer and setting its tokenizer to "path_tokenizer", the name under which the path hierarchy tokenizer is registered on the index:

// Name under which the tokenizer is registered on the index
string pathTokenizerName = "path_tokenizer";

// Name of the analyzer
string pathAnalyzerName = "path";

// The path_hierarchy tokenizer itself
PathHierarchyTokenizer pathTokenizer = new PathHierarchyTokenizer();

// Custom analyzer that refers to the tokenizer by its registered name
AnalyzerBase pathAnalyzer = new CustomAnalyzer
{
    Tokenizer = pathTokenizerName
};

The second step is to create the index with the required analyzer and mapping. In the code below, the property "LogicalPath" holds the directory locations in the system:

// Create the index with a single shard and no replicas.
elastic.CreateIndex(_indexName, i => i
    .NumberOfShards(1)
    .NumberOfReplicas(0)
    // Register the path tokenizer and the path analyzer on the index.
    .Analysis(an => an
        .Tokenizers(tokenizers => tokenizers
            .Add(pathTokenizerName, pathTokenizer)
        )
        .Analyzers(analyzers => analyzers
            .Add(pathAnalyzerName, pathAnalyzer)
        )
    )
    // Add the mapping for the Document type.
    .AddMapping<Document>(t => t
        .MapFromAttributes()
        .Properties(props => props
            // Associate the path analyzer with the "LogicalPath" property.
            .String(myPathProperty => myPathProperty
                .Name(_t => _t.LogicalPath)
                .IndexAnalyzer(pathAnalyzerName)
            )
        )
    )
);