2018-12-19T08:07:12.719+0000 I STORAGE [initandlisten] Detected data files in /mnt/mdb/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2018-12-19T08:07:12.719+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=14815M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2018-12-19T08:07:13.406+0000 E STORAGE [initandlisten] WiredTiger error (-31804) [1545206833:406454][2259:0x7f5870f02a40], connection: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic Raw: [1545206833:406454][2259:0x7f5870f02a40], connection: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic
2018-12-19T08:07:13.406+0000 F - [initandlisten] Fatal Assertion 50853 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 409
2018-12-19T08:07:13.406+0000 F - [initandlisten]
***aborting after fassert() failure
2018-12-19T08:07:13.419+0000 F - [initandlisten] Got signal: 6 (Aborted).
mongod 가 잘 띄워진 상태에서 replica set 에 추가되면 발생되는 오류
ulimit 제한을 65535 로 고쳐도 동일 오류 발생.
$ tail -n 100 /var/log/mongodb/mongod.log
2018-12-21T08:20:21.997+0000 I REPL [replexec-0] This node is mdb4:27017 in the config
2018-12-21T08:20:21.997+0000 I REPL [replexec-0] transition to RECOVERING from REMOVED
2018-12-21T08:20:21.998+0000 I REPL [replexec-0] Resetting sync source to empty, which was :27017
2018-12-21T08:20:21.998+0000 I ASIO [Replication] Connecting to mdb0:27017
2018-12-21T08:20:21.998+0000 I ASIO [Replication] Connecting to mdb2:27017
2018-12-21T08:20:21.998+0000 I REPL [replexec-0] Member mdb1:27017 is now in state PRIMARY
2018-12-21T08:20:21.999+0000 I REPL [replexec-4] Member mdb0:27017 is now in state ARBITER
2018-12-21T08:20:21.999+0000 I REPL [replexec-2] Member mdb2:27017 is now in state SECONDARY
2018-12-21T08:20:21.999+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59038 #17 (2 connections now open)
2018-12-21T08:20:22.000+0000 I NETWORK [conn17] received client metadata from 172.31.2.12:59038 conn17: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:20:22.004+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59040 #18 (3 connections now open)
2018-12-21T08:20:22.004+0000 I NETWORK [conn18] received client metadata from 172.31.2.12:59040 conn18: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:20:22.592+0000 I REPL [rsSync-0] transition to SECONDARY from RECOVERING
2018-12-21T08:20:22.592+0000 I REPL [rsSync-0] Resetting sync source to empty, which was :27017
2018-12-21T08:20:23.219+0000 I REPL [rsBackgroundSync] sync source candidate: mdb2:27017
2018-12-21T08:20:23.219+0000 I ASIO [RS] Connecting to mdb2:27017
2018-12-21T08:20:23.222+0000 I REPL [rsBackgroundSync] Changed sync source from empty to mdb2:27017
2018-12-21T08:20:23.223+0000 I ASIO [RS] Connecting to mdb2:27017
2018-12-21T08:21:22.102+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59042 #21 (4 connections now open)
2018-12-21T08:21:23.223+0000 I ASIO [RS] Ending idle connection to host mdb2:27017 because the pool meets constraints; 1 connections to that host remain open
2018-12-21T08:22:27.006+0000 I NETWORK [conn18] end connection 172.31.2.12:59040 (3 connections now open)
2018-12-21T08:24:01.022+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59043 #22 (4 connections now open)
2018-12-21T08:24:01.022+0000 I NETWORK [conn22] received client metadata from 172.31.2.12:59043 conn22: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:24:33.605+0000 I NETWORK [LogicalSessionCacheRefresh] Starting new replica set monitor for thingplus/mdb1:27017,mdb2:27017,mdb4:27017
2018-12-21T08:24:33.605+0000 I NETWORK [listener] connection accepted from 172.31.2.159:44750 #23 (5 connections now open)
2018-12-21T08:24:33.611+0000 I NETWORK [ReplicaSetMonitor-TaskExecutor] Successfully connected to mdb4:27017 (1 connections now open to mdb4:27017 with a 5 second timeout)
2018-12-21T08:24:33.762+0000 I NETWORK [LogicalSessionCacheRefresh] Starting new replica set monitor for thingplus/mdb1:27017,mdb2:27017,mdb4:27017
2018-12-21T08:24:33.922+0000 I NETWORK [LogicalSessionCacheRefresh] Starting new replica set monitor for thingplus/mdb1:27017,mdb2:27017,mdb4:27017
2018-12-21T08:26:25.031+0000 I NETWORK [conn22] end connection 172.31.2.12:59043 (4 connections now open)
2018-12-21T08:26:35.032+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59046 #25 (5 connections now open)
2018-12-21T08:26:35.032+0000 I NETWORK [conn25] received client metadata from 172.31.2.12:59046 conn25: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:27:53.044+0000 I NETWORK [conn25] end connection 172.31.2.12:59046 (4 connections now open)
2018-12-21T08:29:39.050+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59048 #26 (5 connections now open)
2018-12-21T08:29:39.050+0000 I NETWORK [conn26] received client metadata from 172.31.2.12:59048 conn26: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:31:19.056+0000 I NETWORK [conn17] end connection 172.31.2.12:59038 (4 connections now open)
2018-12-21T08:33:27.056+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59050 #27 (5 connections now open)
2018-12-21T08:33:27.056+0000 I NETWORK [conn27] received client metadata from 172.31.2.12:59050 conn27: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:34:35.058+0000 I NETWORK [conn26] end connection 172.31.2.12:59048 (4 connections now open)
2018-12-21T08:35:05.062+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59051 #28 (5 connections now open)
2018-12-21T08:35:05.062+0000 I NETWORK [conn28] received client metadata from 172.31.2.12:59051 conn28: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:37:17.072+0000 I NETWORK [conn27] end connection 172.31.2.12:59050 (4 connections now open)
2018-12-21T08:37:39.092+0000 I NETWORK [listener] connection accepted from 172.31.2.12:59053 #29 (5 connections now open)
2018-12-21T08:37:39.092+0000 I NETWORK [conn29] received client metadata from 172.31.2.12:59053 conn29: { driver: { name: "NetworkInterfaceTL", version: "4.0.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "14.04" } }
2018-12-21T08:38:39.089+0000 I NETWORK [conn28] end connection 172.31.2.12:59051 (4 connections now open)
2018-12-21T08:42:51.000+0000 E - [ftdc] Assertion: Location13538: couldn't open [/proc/6501/stat] Unknown error src/mongo/util/processinfo_linux.cpp 81
2018-12-21T08:42:51.240+0000 E STORAGE [thread14] WiredTiger error (24) [1545381771:240958][6501:0x7f8c82b67700], log-server: __directory_list_worker, 48: /mnt/mdb/data/journal: directory-list: opendir: Too many open files Raw: [1545381771:240958][6501:0x7f8c82b67700], log-server: __directory_list_worker, 48: /mnt/mdb/data/journal: directory-list: opendir: Too many open files
2018-12-21T08:42:51.241+0000 E STORAGE [thread14] WiredTiger error (24) [1545381771:241037][6501:0x7f8c82b67700], log-server: __log_prealloc_once, 469: log pre-alloc server error: Too many open files Raw: [1545381771:241037][6501:0x7f8c82b67700], log-server: __log_prealloc_once, 469: log pre-alloc server error: Too many open files
2018-12-21T08:42:51.241+0000 E STORAGE [thread14] WiredTiger error (24) [1545381771:241049][6501:0x7f8c82b67700], log-server: __log_server, 1015: log server error: Too many open files Raw: [1545381771:241049][6501:0x7f8c82b67700], log-server: __log_server, 1015: log server error: Too many open files
2018-12-21T08:42:51.241+0000 E STORAGE [thread14] WiredTiger error (-31804) [1545381771:241058][6501:0x7f8c82b67700], log-server: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic Raw: [1545381771:241058][6501:0x7f8c82b67700], log-server: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic
2018-12-21T08:42:51.241+0000 F - [thread14] Fatal Assertion 50853 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 409
2018-12-21T08:42:51.241+0000 F - [thread14]
***aborting after fassert() failure
위 오류가 발생 한 후 mongod 데몬을 다시 띄우면 발생되는 오류
$ tail -n 100 /var/log/mongodb/mongod.log
2018-12-24T05:58:26.698+0000 E STORAGE [initandlisten] WiredTiger error (-31803) [1545631106:698316][8131:0x7f89eb866a40], file:WiredTiger.wt, WT_CURSOR.next: __schema_create_collapse, 111: metadata information for source configuration "colgroup:collection/22196--1865163962578037616" not found: WT_NOTFOUND: item not found Raw: [1545631106:698316][8131:0x7f89eb866a40], file:WiredTiger.wt, WT_CURSOR.next: __schema_create_collapse, 111: metadata information for source configuration "colgroup:collection/22196--1865163962578037616" not found: WT_NOTFOUND: item not found
2018-12-24T06:01:47.503+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/mnt/mdb/data/diagnostic.data'
2018-12-24T06:01:47.969+0000 I REPL [initandlisten] Rollback ID is 1
2018-12-24T06:01:47.970+0000 I REPL [initandlisten] Starting recovery oplog application at the stable timestamp: Timestamp(1545615489, 621)
2018-12-24T06:01:47.970+0000 I REPL [initandlisten] Replaying stored operations from { : Timestamp(1545615489, 621) } (exclusive) to { : Timestamp(1545615492, 12136) } (inclusive).
2018-12-24T06:01:48.009+0000 I FTDC [ftdc] Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost. OK
2018-12-24T06:02:02.000+0000 E - [ftdc] Assertion: Location13538: couldn't open [/proc/8131/stat] Unknown error src/mongo/util/processinfo_linux.cpp 81
2018-12-24T06:02:02.481+0000 E STORAGE [repl writer worker 7] WiredTiger error (24) [1545631322:481214][8131:0x7f89d5de6700], file:index/323--1865163962578037616.wt, WT_SESSION.open_cursor: __posix_open_file, 715: /mnt/mdb/data/index/323--1865163962578037616.wt: handle-open: open: Too many open files Raw: [1545631322:481214][8131:0x7f89d5de6700], file:index/323--1865163962578037616.wt, WT_SESSION.open_cursor: __posix_open_file, 715: /mnt/mdb/data/index/323--1865163962578037616.wt: handle-open: open: Too many open files
2018-12-24T06:02:02.481+0000 E STORAGE [repl writer worker 7] Failed to open a WiredTiger cursor: table:index/323--1865163962578037616
2018-12-24T06:02:02.481+0000 E STORAGE [repl writer worker 7] This may be due to data corruption. Please read the documentation for starting MongoDB with --repair here: http://dochub.mongodb.org/core/repair
2018-12-24T06:02:02.481+0000 F - [repl writer worker 7] Fatal Assertion 50882 at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 137
2018-12-24T06:02:02.481+0000 F - [repl writer worker 7]
***aborting after fassert() failure
Troubleshooting
$ ulimit -n 3129048
$ vi /etc/security/limits.conf
soft nproc 65535 hard nproc 65535 soft nofile 1000000 hard nofile 1000000 root soft nproc 65535 root hard nproc 65535 root soft nofile 1000000 root hard nofile 1000000