The memory usage of vCenter consistently spikes every Thursday at 12:30 PM.
haewon83
2025. 1. 9. 16:40
This post looks at a symptom in which vCenter memory usage, as reported by VAMI, rises every Thursday and then returns to normal, and approaches it from the OS memory perspective.
[Symptom]
vCenter memory usage increases every Thursday.
(VAMI screenshot)
[Troubleshooting Notes]
1. Check the vCenter version
    $ cat ./etc/applmgmt/appliance/update.conf
    {
      ...
      "latestPatch": {
        "header": {
          ...
          "buildnumber": "20051473",
          ...
          "name": "VC-7.0U3f",
          "productname": "VMware vCenter Server",
          ...
          "version": "7.0.3.00700"
        }
2. How VAMI defines memory usage
    mem_physical_used_host = MemTotal - MemFree - Buffers - Cached
    mem.usage = mem_physical_used_host / MemTotal
3. In the /proc/meminfo output collected while memory usage was high, check the values used by the definition in step 2
    $ cat cat_procmeminfo.txt
    MemTotal: 53571176 kB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    MemFree: 422128 kB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    MemAvailable: 24711568 kB
    Buffers: 1026004 kB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    Cached: 12405188 kB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    SwapCached: 759536 kB
    Active: 25660220 kB
    Inactive: 12474924 kB
    Active(anon): 19491860 kB
    Inactive(anon): 7352544 kB
    Active(file): 6168360 kB
    Inactive(file): 5122380 kB
    Unevictable: 15328 kB
    Mlocked: 15328 kB
    SwapTotal: 52420604 kB
    SwapFree: 45373904 kB
    Dirty: 2616 kB
    Writeback: 0 kB
    AnonPages: 24508156 kB
    Mapped: 2453660 kB
    Shmem: 2127988 kB
    Slab: 13909404 kB
    SReclaimable: 13542892 kB
    SUnreclaim: 366512 kB
    KernelStack: 212560 kB
    PageTables: 180612 kB
    NFS_Unstable: 0 kB
    Bounce: 0 kB
    WritebackTmp: 0 kB
    CommitLimit: 105456068 kB
    Committed_AS: 57469212 kB
    VmallocTotal: 34359738367 kB
    VmallocUsed: 0 kB
    VmallocChunk: 0 kB
    Percpu: 54272 kB
    HardwareCorrupted: 0 kB
    AnonHugePages: 19963904 kB
    ShmemHugePages: 0 kB
    ShmemPmdMapped: 0 kB
    HugePages_Total: 0
    HugePages_Free: 0
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    Hugepagesize: 2048 kB
    Hugetlb: 0 kB
    DirectMap4k: 366400 kB
    DirectMap2M: 17459200 kB
    DirectMap1G: 38797312 kB
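As a quick sanity check, the definition from step 2 can be applied to the four marked values in the capture. The following is a minimal sketch in Python; the function name `vami_mem_usage` is made up for illustration. The SReclaimable comparison at the end is an extra observation about the numbers and not part of VAMI's definition.

```python
# Values copied from the /proc/meminfo capture above (all in kB).
MEMINFO_KB = {
    "MemTotal": 53571176,
    "MemFree": 422128,
    "Buffers": 1026004,
    "Cached": 12405188,
    "SReclaimable": 13542892,  # not used by the VAMI formula; comparison only
}


def vami_mem_usage(meminfo_kb):
    """Return (used_kb, usage_percent) following the definition in step 2."""
    total = meminfo_kb["MemTotal"]
    used = total - meminfo_kb["MemFree"] - meminfo_kb["Buffers"] - meminfo_kb["Cached"]
    return used, 100.0 * used / total


used_kb, usage_pct = vami_mem_usage(MEMINFO_KB)
print(f"mem_physical_used_host = {used_kb} kB")
print(f"mem.usage = {usage_pct:.2f} %")

# Comparison only (an assumption of this sketch, not VAMI's definition):
# reclaimable slab is not subtracted by the formula, so it counts as "used".
slab_share_pct = 100.0 * MEMINFO_KB["SReclaimable"] / MEMINFO_KB["MemTotal"]
print(f"SReclaimable / MemTotal = {slab_share_pct:.2f} %")
```

With these inputs the formula gives roughly 74% usage. Because only MemFree, Buffers, and Cached are subtracted, the large SReclaimable value (about a quarter of MemTotal in this capture) is counted as used memory.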
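4. In the vCenter Support Bundle, check the vCenter Memory Usage value that is shown in the UI
※ This confirms the rise in memory usage around 12:30 PM every Thursday that the customer reported. The log timestamps are in UTC, so 03:30Z corresponds to 12:30 PM at UTC+9 (KST).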
    ./var/log/vmware/applmgmt/StatsMonitor.log
    2024-11-28T03:25:31.687Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 50.1159 status: green
    2024-11-28T03:26:31.893Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 49.9106 status: green
    2024-11-28T03:27:32.075Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 49.9124 status: green
    2024-11-28T03:28:32.286Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 49.9929 status: green
    2024-11-28T03:29:32.467Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 65.2786 status: green >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    2024-11-28T03:30:32.700Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 59.7125 status: green
    2024-11-28T03:31:32.900Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 67.2441 status: green
    2024-11-28T03:32:33.150Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 74.3291 status: green >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    2024-11-28T03:33:33.462Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 73.9225 status: green
    2024-11-28T03:34:33.680Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 73.8729 status: green
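On a long StatsMonitor.log, finding the moment of the jump by eye is tedious. The sketch below shows one possible way to pull the mem.usage samples out of the log and flag a sudden rise. The helper names, the 10-point threshold, and the UTC+9 conversion are assumptions made for this example; the sample lines are taken from the excerpt above.

```python
import re
from datetime import datetime, timedelta, timezone

# A few lines copied from the StatsMonitor.log excerpt above.
LOG = """\
2024-11-28T03:27:32.075Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 49.9124 status: green
2024-11-28T03:28:32.286Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 49.9929 status: green
2024-11-28T03:29:32.467Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 65.2786 status: green
2024-11-28T03:30:32.700Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 59.7125 status: green
2024-11-28T03:31:32.900Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 67.2441 status: green
2024-11-28T03:32:33.150Z info StatsMonitor[02334] [Originator@6876 sub=StatsMonitor] [Unit Check] Alarm Check: mem.usage value: 74.3291 status: green
"""

LINE_RE = re.compile(r"^(\S+Z) info StatsMonitor\[\d+\] .*mem\.usage value: ([\d.]+)")
KST = timezone(timedelta(hours=9))  # assumption: local time is UTC+9


def parse_samples(text):
    """Return a list of (utc_timestamp, mem_usage_percent) tuples."""
    samples = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%fZ")
            samples.append((ts.replace(tzinfo=timezone.utc), float(m.group(2))))
    return samples


def find_jumps(samples, threshold=10.0):
    """Return (timestamp, previous, current) where usage rose by more than threshold points."""
    return [
        (ts, prev, cur)
        for (_, prev), (ts, cur) in zip(samples, samples[1:])
        if cur - prev > threshold
    ]


samples = parse_samples(LOG)
jumps = find_jumps(samples)
for ts, prev, cur in jumps:
    local = ts.astimezone(KST)
    print(f"{ts.isoformat()} ({local:%H:%M} at UTC+9): {prev:.2f} -> {cur:.2f}")
```

For these sample lines the only jump above the threshold is the 03:29:32Z sample, which is the first line marked in the excerpt.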
5. The messages log shows a cron job running at that time
    ./var/log/vmware/messages
    2024-11-28T03:29:01.769034+00:00 <HOSTNAME> anacron[31870]: Job `cron.weekly' started
    2024-11-28T03:29:01.778149+00:00 <HOSTNAME> run-parts[15395]: (/etc/cron.weekly) starting devcheck
    ...
6. Confirmed that the following four files under the /etc/cron.weekly directory are executed
※ These four files exist on vCenter 6.x, but are not found on fresh installs of vCenter 7.x or later
    $ grep -i "cron" * | grep weekly
    journalctl_-b--0.txt:Nov 28 03:01:01 <HOSTNAME> anacron[31870]: Will run job `cron.weekly' in 28 min.
    journalctl_-b--0.txt:Nov 28 03:29:01 <HOSTNAME> anacron[31870]: Job `cron.weekly' started
    journalctl_-b--0.txt:Nov 28 03:29:01 <HOSTNAME> run-parts[15398]: (/etc/cron.weekly) starting devcheck
    journalctl_-b--0.txt:Nov 28 03:29:25 <HOSTNAME> run-parts[15395][23675]: (/etc/cron.weekly) finished devcheck
    journalctl_-b--0.txt:Nov 28 03:29:25 <HOSTNAME> run-parts[23677]: (/etc/cron.weekly) starting rpmcheck
    journalctl_-b--0.txt:Nov 28 03:31:02 <HOSTNAME> run-parts[15395][54434]: (/etc/cron.weekly) finished rpmcheck
    journalctl_-b--0.txt:Nov 28 03:31:02 <HOSTNAME> run-parts[54436]: (/etc/cron.weekly) starting sgidcheck
    journalctl_-b--0.txt:Nov 28 03:31:45 <HOSTNAME> run-parts[15395][4359]: (/etc/cron.weekly) finished sgidcheck
    journalctl_-b--0.txt:Nov 28 03:31:45 <HOSTNAME> run-parts[4362]: (/etc/cron.weekly) starting suidcheck
    journalctl_-b--0.txt:Nov 28 03:32:23 <HOSTNAME> run-parts[15395][19657]: (/etc/cron.weekly) finished suidcheck
    journalctl_-b--0.txt:Nov 28 03:32:23 <HOSTNAME> anacron[31870]: Job `cron.weekly' terminated (produced output)

    $ ls -al
    total 138
    drwxrwxr-x 2 106 Dec 2 15:58 .
    drwxrwxr-x 33 1180 Dec 2 16:02 ..
    -rw-rw-r-- 1 92 Jul 7 2021 devcheck
    -rw-rw-r-- 1 88 Jul 7 2021 rpmcheck
    -rw-rw-r-- 1 93 Jul 7 2021 sgidcheck
    -rw-rw-r-- 1 93 Jul 7 2021 suidcheck
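To line the cron.weekly scripts up against the memory samples, it helps to know how long each script ran. The sketch below pairs the "starting" and "finished" run-parts entries and computes a duration per script. The function name and regular expression are made up for this example, the grep filename prefix has been dropped from the sample lines, and the year 2024 is an assumption taken from the other logs because the journal lines carry no year.

```python
import re
from datetime import datetime

# run-parts lines from the journal output above (filename prefix removed).
JOURNAL = """\
Nov 28 03:29:01 <HOSTNAME> run-parts[15398]: (/etc/cron.weekly) starting devcheck
Nov 28 03:29:25 <HOSTNAME> run-parts[15395][23675]: (/etc/cron.weekly) finished devcheck
Nov 28 03:29:25 <HOSTNAME> run-parts[23677]: (/etc/cron.weekly) starting rpmcheck
Nov 28 03:31:02 <HOSTNAME> run-parts[15395][54434]: (/etc/cron.weekly) finished rpmcheck
Nov 28 03:31:02 <HOSTNAME> run-parts[54436]: (/etc/cron.weekly) starting sgidcheck
Nov 28 03:31:45 <HOSTNAME> run-parts[15395][4359]: (/etc/cron.weekly) finished sgidcheck
Nov 28 03:31:45 <HOSTNAME> run-parts[4362]: (/etc/cron.weekly) starting suidcheck
Nov 28 03:32:23 <HOSTNAME> run-parts[15395][19657]: (/etc/cron.weekly) finished suidcheck
"""

EVENT_RE = re.compile(
    r"(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ run-parts(?:\[\d+\])+: "
    r"\(/etc/cron\.weekly\) (starting|finished) (\S+)"
)
ASSUMED_YEAR = 2024  # assumption: journal timestamps omit the year


def script_durations(text):
    """Return {script_name: seconds} for each starting/finished pair."""
    started, durations = {}, {}
    for line in text.splitlines():
        m = EVENT_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(f"{ASSUMED_YEAR} {m.group(1)}", "%Y %b %d %H:%M:%S")
        event, name = m.group(2), m.group(3)
        if event == "starting":
            started[name] = ts
        elif name in started:
            durations[name] = int((ts - started[name]).total_seconds())
    return durations


durations = script_durations(JOURNAL)
for name, seconds in durations.items():
    print(f"{name}: {seconds} s")
```

The whole cron.weekly run spans 03:29:01 to 03:32:23, which covers the window in which mem.usage climbs in StatsMonitor.log.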
7. Confirmed that the job is run by anacron, not by crond
    # cat /etc/anacrontab
    # /etc/anacrontab: configuration file for anacron
    # See anacron(8) and anacrontab(5) for details.
    SHELL=/bin/sh
    PATH=/sbin:/bin:/usr/sbin:/usr/bin
    MAILTO=
    # the maximal random delay added to the base delay of the jobs
    RANDOM_DELAY=45
    # the jobs will be started during the following hours only
    START_HOURS_RANGE=3-22
    #period in days   delay in minutes   job-identifier   command
    1         5    cron.daily     nice run-parts /etc/cron.daily
    7         25   cron.weekly    nice run-parts /etc/cron.weekly >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    @monthly  45   cron.monthly   nice run-parts /etc/cron.monthly
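The anacrontab values line up with the timing seen in the journal. The cron.weekly entry has a 7-day period and a base delay of 25 minutes, and the comment in the file describes RANDOM_DELAY as the maximal random delay added to that base delay. The sketch below works out the resulting start window and compares it with the "Will run job `cron.weekly' in 28 min." message logged at 03:01:01. The function name is hypothetical, and the sketch assumes the random part is simply added to the base delay, as the file comment suggests.

```python
from datetime import datetime, timedelta

# Values from the cron.weekly line and RANDOM_DELAY in /etc/anacrontab above.
BASE_DELAY_MIN = 25
RANDOM_DELAY_MAX_MIN = 45


def start_window(announced_at, base_delay_min, random_delay_max_min):
    """Earliest and latest start if 0..random_delay_max_min minutes are added to the base delay."""
    earliest = announced_at + timedelta(minutes=base_delay_min)
    latest = earliest + timedelta(minutes=random_delay_max_min)
    return earliest, latest


# From the journal: "Nov 28 03:01:01 ... Will run job `cron.weekly' in 28 min."
announced_at = datetime(2024, 11, 28, 3, 1, 1)
announced_delay_min = 28

earliest, latest = start_window(announced_at, BASE_DELAY_MIN, RANDOM_DELAY_MAX_MIN)
planned_start = announced_at + timedelta(minutes=announced_delay_min)
random_part_min = announced_delay_min - BASE_DELAY_MIN

print("possible window:", earliest.time(), "to", latest.time())
print("planned start  :", planned_start.time())
print("random part    :", random_part_min, "min")
```

Under that assumption the announced 28 minutes is the 25-minute base delay plus a 3-minute random part, and the planned start of 03:29:01 matches the "Job `cron.weekly' started" entry. The 7-day period is consistent with the spike recurring on the same weekday.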
8. After moving the four files under the /etc/cron.weekly directory to another location, confirmed that memory usage no longer rises at that time