Create an index with MongoDB

Posted 2020-06-08 13:28

Question:

I'm a beginner with MongoDB and I'm trying a few things out. I want to store URLs, and to avoid duplicates I create a unique index on the url field, like this:

collection.createIndex(new BasicDBObject("url", type).append("unique", true));

But is the index created again each time I launch my program?

Right now my program only inserts one URL, "http://site.com", and if I restart the program this URL is inserted again, as if there were no index.

Is creating the index on every start the wrong way to handle an index?

Here is an example of my code:

mongo.getCollection().ensureIndex(new BasicDBObject("url", 1).append("unique", "true"));

mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));

mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));

And the output:

{ "_id" : { "$oid" : "50d627cf44ae5d6b5e9cf106"} , "url" : "http://site.com" , "crawled" : 0}
{ "_id" : { "$oid" : "50d627cf44ae5d6b5e9cf107"} , "url" : "http://site.com" , "crawled" : 0}

Thanks

EDIT :

Here is my class Mongo, which handles MongoDB:

import java.net.UnknownHostException;
import java.util.List;
import java.util.Set;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class Mongo {

    private MongoClient mongoClient;
    private DB db;
    private DBCollection collection;
    private String db_name;

    public Mongo(String db){

        try {
            mongoClient = new MongoClient( "localhost" , 27017 );

            this.db = mongoClient.getDB(db);
            this.db_name = db;
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }

    }

    public void drop(){
        mongoClient.dropDatabase(db_name);
    }

    public void listCollections(){
        Set<String> colls = db.getCollectionNames();

        for (String s : colls) {
            System.out.println(s);
        }
    }

    public void listIndex(){
         List<DBObject> list = collection.getIndexInfo();

            for (DBObject o : list) {
                System.out.println("\t" + o);
            }
    }

    public void setCollection(String col){
        this.collection = db.getCollection(col);
    }

    public void insert(BasicDBObject doc){

        this.collection.insert(doc);

    }

    public DBCollection getCollection(){
        return collection;
    }

    public void createIndex(String on, int type){
        collection.ensureIndex(new BasicDBObject(on, type).append("unique", true));
    }


}

And here is the class that runs my program:

import com.mongodb.BasicDBObject;
import com.mongodb.DBCursor;

public class Explorer {

    private final static boolean DEBUG = false;
    private final static boolean RESET = false;

    private Mongo mongo;

    private String host;

    public Explorer(String url){
        mongo = new Mongo("explorer");
        mongo.setCollection("page");

        if (RESET){
            mongo.drop();
            System.out.println("Set RESET to FALSE and restart the program.");
            System.exit(1);
        }

        if (DEBUG) {
            mongo.listCollections();

        }

        this.host = url.toLowerCase();



        BasicDBObject doc = new BasicDBObject("url", "http://site.com").append("crawled", 0);

        mongo.getCollection().ensureIndex(new BasicDBObject("url", 1).append("unique", true));

        mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));

        mongo.getCollection().insert(new BasicDBObject("url", "http://site.com").append("crawled", 0));




        process();
    }


    private void process(){


        BasicDBObject query = new BasicDBObject("crawled", 0);

        DBCursor cursor = mongo.getCollection().find(query);

        try {
            while(cursor.hasNext()) {
                System.out.println(cursor.next());
            }
        } finally {
            cursor.close();
        }

    }
}

Answer 1:

You need to pass unique as the boolean value true, not as a string, and the options go in a separate second parameter:

...ensureIndex(new BasicDBObject("url", 1), new BasicDBObject("unique", true));

Also, I tested it manually using the mongo interpreter:

> db.createCollection("sa")
{ "ok" : 1 }
> db.sa.ensureIndex({"url":1},{unique:true})
> db.sa.insert({url:"http://www.example.com", crawled: true})
> db.sa.insert({url:"http://www.example.com", crawled: true})
E11000 duplicate key error index: test.sa.$url_1  dup key: { : "http://www.example.com" }
> db.sa.insert({url:"http://www.example2.com/", crawled: false})
> db.sa.insert({url:"http://www.example.com", crawled: false})
E11000 duplicate key error index: test.sa.$url_1  dup key: { : "http://www.example.com" }
>

Only the two distinct objects were stored:

> db.sa.find()
{ "_id" : ObjectId("50d636baa050939da1e4c53b"), "url" : "http://www.example.com", "crawled" : true }
{ "_id" : ObjectId("50d636dba050939da1e4c53d"), "url" : "http://www.example2.com/", "crawled" : false }
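
Why the one-argument call in the question did not give a unique index can be modeled in plain Java. In the legacy driver, append adds another field to the same document, so the single-argument form hands over { url: 1, unique: true } as the key specification and passes no options at all. The sketch below is only a model under that assumption: Doc is a hypothetical stand-in for BasicDBObject, and describe is a made-up helper, not a driver method.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IndexSpecModel {
    // Hypothetical stand-in for BasicDBObject: an ordered field map with a chainable append.
    static class Doc extends LinkedHashMap<String, Object> {
        Doc(String key, Object value) { put(key, value); }
        Doc append(String key, Object value) { put(key, value); return this; }
    }

    // Models the one-argument form: every field of the document is treated as an index key.
    static String describe(Map<String, Object> keys) {
        return describe(keys, new LinkedHashMap<String, Object>());
    }

    // Models the two-argument form: uniqueness is read only from the options document.
    static String describe(Map<String, Object> keys, Map<String, Object> options) {
        boolean unique = Boolean.TRUE.equals(options.get("unique"));
        return "keys=" + keys.keySet() + " unique=" + unique;
    }

    public static void main(String[] args) {
        // Call shape from the question: "unique" ends up among the keys.
        System.out.println(describe(new Doc("url", 1).append("unique", true)));
        // Call shape from this answer: keys and options are separate documents.
        System.out.println(describe(new Doc("url", 1), new Doc("unique", true)));
    }
}
```

In this model the first call reports keys=[url, unique] with unique=false, while the second reports keys=[url] with unique=true, which matches the behavior seen in the question and in the shell session above.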


Answer 2:

I don't fully understand your problem, but you should most likely use ensureIndex instead of createIndex: the latter always tries to create the index, while the former only ensures that it exists.



Answer 3:

I just stumbled over this question; there have been some changes since version 3.0.0:

db.collection.ensureIndex(keys, options)

Deprecated since version 3.0.0: db.collection.ensureIndex() is now an alias for db.collection.createIndex().

Creates an index on the specified field if the index does not already exist.
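
The "create only if it does not already exist" behavior quoted above is what makes it safe to request the index on every program start. A toy in-memory model of that behavior, in plain Java, might look like the following. It does not use the driver or a server; UniqueIndexModel and its methods are hypothetical names chosen for illustration, and the index name "url_1" is taken from the error message in Answer 1.

```java
import java.util.HashSet;
import java.util.Set;

public class UniqueIndexModel {
    private final Set<String> indexNames = new HashSet<>(); // toy index catalog
    private final Set<String> urls = new HashSet<>();       // distinct url values stored

    // Create-if-absent: returns true only when the index did not exist yet.
    boolean createUniqueUrlIndex() {
        return indexNames.add("url_1");
    }

    // Returns false when the unique index rejects the URL as a duplicate.
    // Without the index, duplicates are accepted.
    boolean insert(String url) {
        boolean isNew = urls.add(url);
        return isNew || !indexNames.contains("url_1");
    }

    public static void main(String[] args) {
        // One instance plays the role of the persistent collection across restarts.
        UniqueIndexModel page = new UniqueIndexModel();
        for (int start = 1; start <= 2; start++) {
            System.out.println("start " + start
                + ": index created=" + page.createUniqueUrlIndex()
                + ", insert accepted=" + page.insert("http://site.com"));
        }
    }
}
```

On the second simulated start the index request is a no-op and the repeated URL is rejected. The model leaves out one real-world detail: a unique index cannot be built while the collection already holds duplicate values, so duplicates inserted earlier have to be removed first.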



Answer 4:

To use MongoDB's unique index, call the three-parameter method, whose third (boolean) parameter marks the index as unique:

mongo.getCollection().ensureIndex(new BasicDBObject("url", 1), "unq_url", true);



Answer 5:

I also see that you don't pass a collection name to getCollection().

Which collection does that select? Just curious.