[Solved] mongoose: No retries are made after "failed to connect on first connect"

By default, mongoose throws an Error if the first connect fails, which crashes node.

So to reproduce this bug, you will need the following code in your app, to catch the error and prevent the crash:

// db is your mongoose connection object (for example, mongoose.connection)
db.on('error', console.error.bind(console, 'connection error:'));

Now we can reproduce this bug as follows:

  1. Shut down your MongoDB
  2. Start up your node app that uses mongoose
  3. Your app will log: [MongoError: failed to connect to server [localhost:27017] on first connect [MongoError: connect ECONNREFUSED 127.0.0.1:27017]]
  4. Start up your MongoDB again
  5. Observe that mongoose still does not connect to the now-working MongoDB. The only way to get reconnected is to restart your app, or to use a manual workaround.

Expected behaviour: Since autoreconnect defaults to true, I would expect mongoose to establish a connection soon after the MongoDB is accessible again.

Note: If the first connect succeeds, but the connection to MongoDB is lost during runtime, then autoreconnect works fine, as expected. The problem is the inconsistency if MongoDB is not available when the app starts up.

(If this is the desired behaviour, and developers are recommended to handle this situation by not catching the error, and letting node crash, then I can accept that, but it is worth making it clear.)

node v4.4.1, mongoose@4.9.4, mongodb@2.2.19, mongodb-core@2.1.4

37 Answers

✔️Accepted Answer

For anyone wanting auto-reconnection when first connect fails, this is how I handle it:

function createConnection (dbURL, options) {
    var db = mongoose.createConnection(dbURL, options);

    db.on('error', function (err) {
        // If first connect fails because mongod is down, try again later.
        // This is only needed for first connect, not for runtime reconnects.
        // See: https://github.com/Automattic/mongoose/issues/5169
        if (err.message && err.message.match(/failed to connect to server .* on first connect/)) {
            console.log(new Date(), String(err));

            // Wait for a bit, then try to connect again
            setTimeout(function () {
                console.log("Retrying first connect...");
                db.openUri(dbURL).catch(() => {});
                // Why the empty catch?
                // Errors from db.openUri() are also passed to .on('error'),
                // so they are handled there and need no logging in the catch here.
                // The empty catch is still needed to avoid unhandled rejections.
            }, 20 * 1000);
        } else {
            // Some other error occurred.  Log it.
            console.error(new Date(), String(err));
        }
    });

    db.once('open', function () {
        console.log("Connection to db established.");
    });

    return db;
}

// Use it like
var db = createConnection('mongodb://...', options);
var User = db.model('User', userSchema);

For mongoose < 4.11 use db.open() instead of db.openUri()
For mongoose 4.11.7 this technique does not work.
For mongoose 4.13.4 it is working again!


Edit 2019/09/02: There is also a shorter solution using promiseRetry here.
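
The promiseRetry solution itself is not reproduced in this thread. As a rough, library-free sketch of the same promise-based idea, the helper below calls a connect function until it resolves or the attempts run out. The name retryConnect and the options attempts and delayMs are hypothetical; they are not part of mongoose or promiseRetry.

```javascript
// Hypothetical helper (not a mongoose or promiseRetry API).
// connect: a function that returns a promise for one connection attempt.
// options.attempts: maximum number of attempts (assumed default: 5).
// options.delayMs: wait between attempts in milliseconds (assumed default: 2000).
function retryConnect(connect, options) {
    const opts = Object.assign({ attempts: 5, delayMs: 2000 }, options);

    function attempt(n) {
        // Promise.resolve().then(...) also turns a synchronous throw into a rejection.
        return Promise.resolve().then(connect).catch(function (err) {
            if (n >= opts.attempts) {
                // Out of attempts: pass the last error on to the caller.
                throw err;
            }
            console.log(new Date(), 'Connect attempt ' + n + ' failed: ' + String(err));
            return new Promise(function (resolve) {
                setTimeout(resolve, opts.delayMs);
            }).then(function () {
                return attempt(n + 1);
            });
        });
    }

    return attempt(1);
}
```

With mongoose this might be called as `retryConnect(function () { return mongoose.connect(dbURL, options); })`. That assumes the installed version returns a promise from mongoose.connect() and tolerates being called again after a failure, which the version notes above suggest is not true of every 4.x release.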

Other Answers:

This recent post from the Strongloop Loopback mongodb connector may be relevant. Their lazyConnect flag defers first connection until the endpoint is hit. If first connection fails in that case, the default connection loss settings will take effect (it will retry).

My interest is container orchestration, where "container startup order" can often be set and expected but "order of service availability" cannot. An orchestration tool might confirm that the mongo container is "up" even though the mongo service isn't available yet.

So, if my mongo container takes 1s to start but 5s for the service to become available, and my app container takes 1s to start and 1s for the service to be available, the app service will outrun the mongo service, causing a first connection failure as originally described.

The Docker Compose documentation has this to say:

Compose will not wait until a container is “ready” (whatever that means for your particular application) - only until it’s running. There’s a good reason for this.

The problem of waiting for a database (for example) to be ready is really just a subset of a much larger problem of distributed systems. In production, your database could become unavailable or move hosts at any time. Your application needs to be resilient to these types of failures.

To handle this, your application should attempt to re-establish a connection to the database after a failure. If the application retries the connection, it should eventually be able to connect to the database.

The best solution is to perform this check in your application code, both at startup and whenever a connection is lost for any reason.
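
One possible shape for such a startup check, as a sketch only: poll the database's TCP port with Node's built-in net module until it accepts a connection, and only then hand over to mongoose. The helpers probePort and waitForPort and their options (attempts, delayMs, timeoutMs) are hypothetical names. An open port is also only an approximation of "the service is ready".

```javascript
const net = require('net');

// Try one TCP connection; resolves true if it is accepted, false on error or timeout.
function probePort(host, port, timeoutMs) {
    return new Promise(function (resolve) {
        const socket = net.connect({ host: host, port: port });
        let settled = false;

        function finish(result) {
            if (settled) return;
            settled = true;
            socket.destroy();
            resolve(result);
        }

        socket.setTimeout(timeoutMs);
        socket.on('connect', function () { finish(true); });
        socket.on('timeout', function () { finish(false); });
        socket.on('error', function () { finish(false); });
    });
}

// Probe repeatedly; resolves true as soon as the port accepts a connection,
// or false once all attempts have failed.
async function waitForPort(host, port, options) {
    const opts = Object.assign({ attempts: 10, delayMs: 1000, timeoutMs: 1000 }, options);

    for (let attempt = 1; attempt <= opts.attempts; attempt++) {
        if (await probePort(host, port, opts.timeoutMs)) {
            return true;
        }
        if (attempt < opts.attempts) {
            await new Promise(function (resolve) { setTimeout(resolve, opts.delayMs); });
        }
    }
    return false;
}
```

An app could call something like `waitForPort('localhost', 27017)` before its first connect and decide what to do if it resolves to false. This only covers the startup half of the advice; connections lost at runtime are still left to the driver's reconnect behaviour.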

So there's a definite gap here in the context of container orchestration, but both of these stances appear to be valid:

  1. Mongoose could support an option to retry on first connect (perhaps defaulted to false with some cautionary documentation), or
  2. Mongoose could place the responsibility on the developer to write code to retry if first connect fails.

Sure, there's a gap, but then the responsibility falls to you to decide whether to retry if initial connection fails. All mongoose tells you is that it failed. If you make the questionable decision to use docker compose in production (or in any context for that matter), it's up to you to handle retrying initial connection failures.