Networking

[NSX] Search service doesn't work after upgrading to NSX 4.1.2.4.0 from NSX-T 3.2.3.0.0

haewon83 2024. 7. 18. 17:55

 

 

 

[Symptom]
After upgrading to NSX 4.1.2.4, the search feature in the NSX UI does not work correctly.
Running the "start search resync all" command does not resolve it; the same symptom recurs shortly afterward.

 

[Troubleshooting Notes]

1. Check whether the "INDEXING_FAILURES_EXHAUSTED_RETRIES" message seen in the UI is actually recorded in the Search Service log

./var/log/search
 
$ zgrep "INDEXING_FAILURES_EXHAUSTED_RETRIES" *
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
search-manager.log:com.vmware.nsx.management.search.common.exceptions.SearchException: INDEXING_FAILURES_EXHAUSTED_RETRIES, params: [all]
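When the log directory also holds rotated or compressed files, a per-file count can be easier to read than the raw match list. The following is a minimal sketch, not an NSX-provided tool: the function name `count_search_errors` is made up for illustration, and it assumes the logs are plain text or gzip-compressed.

```shell
# Hypothetical helper (not an NSX command): per-file match counts for a pattern.
# Reads plain and gzip-compressed logs alike via `gzip -cdf`.
count_search_errors() {
    log_dir="${1:-/var/log/search}"
    pattern="${2:-INDEXING_FAILURES_EXHAUSTED_RETRIES}"
    for f in "$log_dir"/*; do
        [ -f "$f" ] || continue
        # grep -c exits non-zero on zero matches, so guard it with || true
        n=$(gzip -cdf -- "$f" 2>/dev/null | grep -c -- "$pattern" || true)
        if [ "${n:-0}" -gt 0 ]; then
            printf '%s:%s\n' "$(basename "$f")" "$n"
        fi
    done
}
```

From the root of an extracted Support Bundle it could be called as `count_search_errors ./var/log/search`; files without any match are omitted from the output.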

 

 

2. Based on the details documented in the Known Issue, check the logs in the Support Bundle

./var/log/search

Caused by: com.vmware.nsx.management.policy.policyframework.exceptions.InvalidPolicyPathException
Caused by: com.vmware.nsx.management.policy.policyframework.exceptions.InvalidPolicyPathException
Caused by: com.vmware.nsx.management.policy.policyframework.exceptions.InvalidPolicyPathException
 
2024-07-01T04:23:21.828Z ERROR pool-1023-thread-1 UfoGenericConverter 86312 - [nsx@6876 comp="nsx-manager" errorCode="MP60511" level="ERROR" subcomp="manager"] [Indexing: DtoConversion] Could not convert UFO object to Dto by DTO converter UfoObject{operationType=CREATE, descriptor=IndexingTypeDescriptor{tableName='Alarm', streamTag=POLICY}, identifier=string_id: "/infra/realized-state/enforcement-points/alb-endpoint/alb-pools/###/UUID", timestamp.sequence =4775284957, timestamp.epoch=12520}
java.lang.reflect.InvocationTargetException: null
        at sun.reflect.GeneratedMethodAccessor3577.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_382]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382]
        at com.vmware.nsx.management.search.provider.UfoGenericConverter.getDtoByUfoObject(UfoGenericConverter.java:217) ~[?:?]
        at com.vmware.nsx.management.search.provider.UfoGenericConverter.convertToDtoByConverterOnly(UfoGenericConverter.java:166) ~[?:?]
        at com.vmware.nsx.management.search.provider.UfoGenericConverter.internalConvert(UfoGenericConverter.java:115) ~[?:?]
        at com.vmware.nsx.management.search.provider.UfoGenericConverter.convertToDataToIndex(UfoGenericConverter.java:71) ~[?:?]
        at com.vmware.nsx.management.search.service.impl.UfoIndexingServiceImpl.processForIndexing(UfoIndexingServiceImpl.java:625) ~[?:?]
        at com.vmware.nsx.management.search.service.impl.UfoIndexingServiceImpl.processUfoObjectForIndexing(UfoIndexingServiceImpl.java:557) ~[?:?]
        at com.vmware.nsx.management.search.service.impl.UfoIndexingServiceImpl.access$1(UfoIndexingServiceImpl.java:555) ~[?:?]
        at com.vmware.nsx.management.search.service.impl.UfoIndexingServiceImpl$1.process(UfoIndexingServiceImpl.java:526) ~[?:?]
        at com.vmware.nsx.management.search.configuration.DataStore.processDynamicTxBatch(DataStore.java:120) ~[?:?]
        at com.vmware.nsx.management.search.service.impl.UfoIndexingServiceImpl.processBatchForIndexing(UfoIndexingServiceImpl.java:514) ~[?:?]
        at com.vmware.nsx.management.search.service.impl.UfoIndexingServiceImpl.indexTable(UfoIndexingServiceImpl.java:379) ~[?:?]
        at com.vmware.nsx.management.search.manager.ReindexingTask.call(ReindexingTask.java:46) ~[?:?]
        at com.vmware.nsx.management.search.manager.ReindexingTask.call(ReindexingTask.java:1) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_382]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_382]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_382]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_382]
Caused by: com.vmware.nsx.management.policy.policyframework.exceptions.InvalidPolicyPathException
        at com.vmware.nsx.management.policy.policyframework.service.PolicyPathUtil.getPolicyPath(PolicyPathUtil.java:413) ~[?:?]
        at com.vmware.nsx.management.policy.policyframework.service.AlarmServiceImpl.getSourceReference(AlarmServiceImpl.java:78) ~[?:?]
        at com.vmware.nsx.management.policy.policyframework.converter.AlarmConverter.toDto(AlarmConverter.java:97) ~[?:?]
        at com.vmware.nsx.management.policy.policyframework.converter.AlarmConverter.toDto(AlarmConverter.java:72) ~[?:?]
    ... 20 more
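The ERROR line above names the object that failed conversion in its `string_id: "..."` field. To see which object paths fail and how often, the identifiers can be pulled out of the logs and tallied. This is a sketch under assumptions: `list_failed_index_paths` is a hypothetical name, and it relies on the message text and `string_id` layout matching the sample line shown above.

```shell
# Hypothetical helper (not an NSX command): tally the object paths named in
# "Could not convert UFO object to Dto" errors, most frequent first.
list_failed_index_paths() {
    log_dir="${1:-/var/log/search}"
    for f in "$log_dir"/*; do
        [ -f "$f" ] || continue
        # gzip -cdf passes plain files through and decompresses .gz files
        gzip -cdf -- "$f" 2>/dev/null
    done |
        # keep only the quoted path that follows string_id: on matching lines
        sed -n 's/.*Could not convert UFO object to Dto.*string_id: "\([^"]*\)".*/\1/p' |
        sort | uniq -c | sort -rn
}
```

Each output line is a count followed by a path; paths under `/infra/realized-state/enforcement-points/alb-endpoint/` would point at the ALB-related objects discussed in the next step.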

 

3. In this case, ALB objects left over from before the NSX upgrade caused the problem

Therefore, to identify the remaining ALB objects, collect the table entries from Corfu DB with the following commands

/opt/vmware/bin/corfu_tool_runner.py -o showTable -n nsx -t ALBControllerAdminCreds >> albcreds.dump
/opt/vmware/bin/corfu_tool_runner.py -o showTable -n nsx -t GenericPolicyRealizedResource >> gprr.dump
/opt/vmware/bin/corfu_tool_runner.py -o showTable -n nsx -t Alarm >> alarm.dump
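Once the dumps exist, a quick count of lines mentioning the ALB enforcement point can show which tables still hold ALB-related entries. The sketch below assumes the dumps are plain text that contain the policy paths literally (the exact dump layout is not shown in this post), and `count_dump_matches` is a hypothetical helper name.

```shell
# Hypothetical helper: for each dump file, print "file:count" where count is
# the number of lines containing the given fixed string.
count_dump_matches() {
    pattern="$1"
    shift
    for dump in "$@"; do
        [ -f "$dump" ] || continue
        # grep -c prints 0 and exits non-zero when nothing matches
        printf '%s:%s\n' "$dump" "$(grep -cF -- "$pattern" "$dump" || true)"
    done
}
```

For example, `count_dump_matches 'alb-endpoint' albcreds.dump gprr.dump alarm.dump`. Because the collection commands use `>>`, rerunning them appends to the existing dump files, so counts can be inflated unless old dumps are removed first.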

 

4. Cleaning up the affected ALB objects requires opening a case on the Broadcom Support Portal and getting assistance from support