[NSX] LB traffic fails intermittently (1)

haewon83 2024. 6. 2. 10:17

 

Among the bugs reported in the 3.2.4 release notes is a load balancer issue in which LB connections drop. This post walks through the lab environment and the steps for reproducing that issue.

 

VMware NSX-T Data Center 3.2.4 Release Notes

https://docs.vmware.com/en/VMware-NSX/3.2.4/rn/vmware-nsxt-data-center-324-release-notes/index.html

 

Fixed Issue 3288062: There is intermittent Load Balancer traffic failure.

 

First, to mirror the customer's environment, we build the configuration in an internal lab as shown in the following figure.

 

The key elements of the configuration below are:

1) Static routes between the Tier-1 gateways over service interfaces

2) DNAT for the VIP on the Tier-1 gateway, applied when a client accesses the VIP

3) SNAT for the server pool attached to the load balancer

 

 

 

1. Create the LB T1 and configure its service interface and static routes

The LB T1 (one-arm-02-for-svc) is not connected to the Tier-0.

 

Service Interface

Set the IP address to 172.31.3.254/24 and, for the connected segment, select the segment on which 172.31.3.0/24 is defined.

 

Static Route

On the LB T1, add a route entry toward the back-end servers and a default route entry.

For traffic destined for 172.31.1.0/24, set the next hop to 172.31.3.1.

Likewise, set the next hop of the default route to 172.31.3.1.
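To make the effect of these two entries concrete, here is a minimal sketch of a longest-prefix-match lookup over the LB T1 static routes. It is illustrative only: writing the default route as `0.0.0.0/0` and the helper name `next_hop` are assumptions, and connected routes such as 172.31.3.0/24 are deliberately left out.

```python
import ipaddress

# Static routes configured on the LB T1 in this lab (prefix -> next hop).
# The default route is written as 0.0.0.0/0 here (assumption for illustration).
LB_T1_STATIC_ROUTES = {
    "172.31.1.0/24": "172.31.3.1",  # back-end server network
    "0.0.0.0/0": "172.31.3.1",      # default route
}


def next_hop(dest_ip, routes):
    """Return the next hop of the most specific route that contains dest_ip."""
    dest = ipaddress.ip_address(dest_ip)
    best = None
    for prefix, hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


print(next_hop("172.31.1.51", LB_T1_STATIC_ROUTES))  # back-end server
print(next_hop("192.168.1.2", LB_T1_STATIC_ROUTES))  # client PC, via the default route
```

Both lookups resolve to the same next hop, which reflects the one-arm design: everything leaving the LB T1 goes back through the VPC T1 service interface.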

 

2. Create the VPC T1 and configure its service interface and static routes

The VPC T1 (tier1-01) is connected to the Tier-0.

In the VPC T1 route advertisement settings, enable "All NAT IP's" so that the DNAT IP is advertised.

 

Service Interface

Set the IP address to 172.31.3.1/24 and, for the connected segment, select the segment on which 172.31.3.0/24 is defined.

 

Static Route

On the VPC T1, add a route entry that sends packets arriving for the VIP, 172.31.4.70, to the LB T1.

Accordingly, set the next hop for 172.31.4.70/32 to 172.31.3.254 (the service interface of the LB T1).
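A static route only works if its next hop is reachable on a directly connected subnet. The sketch below checks that for the VIP route, using the addresses from this lab; the helper name `next_hop_is_on_link` is hypothetical and the check is a simplification, not how NSX validates routes.

```python
import ipaddress

# Values taken from the lab configuration described above.
VPC_T1_SERVICE_INTERFACE = ipaddress.ip_interface("172.31.3.1/24")
LB_T1_SERVICE_INTERFACE = ipaddress.ip_interface("172.31.3.254/24")
VIP_ROUTE = (ipaddress.ip_network("172.31.4.70/32"), ipaddress.ip_address("172.31.3.254"))


def next_hop_is_on_link(interface, hop):
    """True when the next hop sits in the interface subnet and is not the interface itself."""
    return hop in interface.network and hop != interface.ip


prefix, hop = VIP_ROUTE
print(next_hop_is_on_link(VPC_T1_SERVICE_INTERFACE, hop))  # next hop shares 172.31.3.0/24
print(hop == LB_T1_SERVICE_INTERFACE.ip)                   # and it is the LB T1 service interface
```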

 

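3. Configure DNAT

Configure DNAT so that the client targets 172.31.5.70 (a stand-in for an external IP) instead of the actual VIP, 172.31.4.70.

This DNAT rule can be verified on the uplink interface of the VPC T1.

In the output below, Rule ID 536870917 is the DNAT rule created above.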

edge-node-01(tier1_sr[2])> get firewall interfaces
Tue May 28 2024 UTC 12:54:16.857
Interface           : c8f68ee2-6b45-4017-ba3c-1d676b05f4a6 >>>>>>>>>>>>>>>>>>>>>>>>
Type                : UPLINK
Sync enabled        : true
Name                : Tier0-01-tier1-01-t1_lrp
VRF ID              : 2
Context entity      : d533b216-a47a-4200-9eb3-007e68c3a024
Context name        : SR-tier1-01

...

edge-node-01(tier1_sr[2])> exit

edge-node-01> get firewall c8f68ee2-6b45-4017-ba3c-1d676b05f4a6 ruleset rules
Tue May 28 2024 UTC 12:55:39.818
DNAT rule count: 4
    Rule ID   : 6
    Rule      : in protocol tcp natpass from any to ip 172.31.1.53 port 80 lb lbtype L4 lbidletimeout 60 lbclosetimeout 8 lboptions ec41fd15-9ac8-4bb2-b780-63ee6ed684f5 e29760db-2371-432a-8056-09b40232091e tag 'loadbalancer'
 
    Rule ID   : 7
    Rule      : in protocol tcp natpass from any to ip 172.31.1.52 port 80 lb lbtype L4 lboptions ec41fd15-9ac8-4bb2-b780-63ee6ed684f5 138614e9-2a81-446c-9329-db96c8358545 with lbrule tag 'loadbalancer'
 
    Rule ID   : 9
    Rule      : in protocol tcp natpass from any to ip 172.31.1.51 port 80 lb lbtype L4 lboptions ec41fd15-9ac8-4bb2-b780-63ee6ed684f5 138614e9-2a81-446c-9329-db96c8358545 with lbrule tag 'loadbalancer'
 
    Rule ID   : 536870917 >>>>>>>>>>>>>>>>>>>>>>>> 
    Rule      : in protocol any postnat from any to ip 172.31.5.70 dnat ip 172.31.4.70

...
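As a rough illustration of what this rule does to a packet, the sketch below parses the rule text of Rule ID 536870917 and rewrites a matching destination address. The regular expression is a hypothetical pattern that covers only this simple `to ip X dnat ip Y` form; it is not a parser for the full edge rule syntax.

```python
import re

# Rule text copied from the edge node output above (Rule ID 536870917).
DNAT_RULE = "in protocol any postnat from any to ip 172.31.5.70 dnat ip 172.31.4.70"

# Hypothetical pattern covering only the simple "to ip X dnat ip Y" form.
DNAT_PATTERN = re.compile(r"to ip (?P<match>\S+) dnat ip (?P<translated>\S+)")


def parse_dnat(rule):
    """Return (matched destination, translated destination), or None if not a DNAT rule."""
    found = DNAT_PATTERN.search(rule)
    return (found.group("match"), found.group("translated")) if found else None


def apply_dnat(dest_ip, rule):
    """Rewrite dest_ip if the rule matches it; otherwise return it unchanged."""
    parsed = parse_dnat(rule)
    if parsed and dest_ip == parsed[0]:
        return parsed[1]
    return dest_ip


print(apply_dnat("172.31.5.70", DNAT_RULE))  # destination rewritten to the real VIP
print(apply_dnat("172.31.1.51", DNAT_RULE))  # no match, left unchanged
```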

 

Check the route for the translated DNAT address (172.31.4.70) on the VPC T1.

The next hop for 172.31.4.70 is correctly set to 172.31.3.254.

edge-node-01> vrf 2
edge-node-01(tier1_sr[2])> get forwarding
Tue May 28 2024 UTC 13:05:16.813
Logical Router
UUID                                   VRF    LR-ID  Name                              Type
d533b216-a47a-4200-9eb3-007e68c3a024   2      9      SR-tier1-01                       SERVICE_ROUTER_TIER1
IPv4 Forwarding Table
IP Prefix          Gateway IP                                Type        UUID                                   Gateway MAC
0.0.0.0/0          100.64.120.0                              route       c8f68ee2-6b45-4017-ba3c-1d676b05f4a6   02:50:56:56:44:52
100.64.120.0/31                                              route       c8f68ee2-6b45-4017-ba3c-1d676b05f4a6
100.64.120.1/32                                              route       c0ce8005-5b16-5f3d-93a4-c564900f3dad
127.0.0.1/32                                                 route       8c7874ff-37bc-477f-a1e1-2a1241b56a25
169.254.0.0/28                                               route       a59de12c-1aaf-4852-834a-5229ed8643b5
169.254.0.1/32                                               route       6ad31edf-449d-5871-b330-d15908ac64b0
169.254.0.2/32                                               route       c0ce8005-5b16-5f3d-93a4-c564900f3dad
172.31.1.0/24                                                route       a62f9b69-c532-44a8-89a0-3e42c6292d94
172.31.1.1/32                                                route       6ad31edf-449d-5871-b330-d15908ac64b0
172.31.1.53/32                                               route       8c7874ff-37bc-477f-a1e1-2a1241b56a25
172.31.2.0/24                                                route       074fcf3b-056f-4b59-b19a-c0ed24c72e5a
172.31.2.1/32                                                route       6ad31edf-449d-5871-b330-d15908ac64b0
172.31.3.0/24                                                route       fff07c49-cece-4aaf-b395-91361b4dc519
172.31.3.1/32                                                route       c0ce8005-5b16-5f3d-93a4-c564900f3dad
172.31.4.70/32     172.31.3.254                              route       fff07c49-cece-4aaf-b395-91361b4dc519   02:50:56:00:2c:00 >>>>>>>>>>>>>>>>>>>>>>>>
IPv6 Forwarding Table
IP Prefix                                     Gateway IP                                Type        UUID                                   Gateway MAC
::/0                                          fc66:613:6598:8000::1                     route       c8f68ee2-6b45-4017-ba3c-1d676b05f4a6        
::1/128                                                                                 route       8c7874ff-37bc-477f-a1e1-2a1241b56a25        
fc66:613:6598:8000::/64                                                                 route       c8f68ee2-6b45-4017-ba3c-1d676b05f4a6        
fc66:613:6598:8000::2/128                                                               route       c0ce8005-5b16-5f3d-93a4-c564900f3dad
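When the forwarding table is long, the same check can be scripted. The following sketch parses a few IPv4 rows copied from the output above and looks up the gateway for the VIP prefix. It assumes the whitespace-separated column layout shown here, and `parse_forwarding` is a hypothetical helper rather than an NSX tool.

```python
# A few IPv4 rows copied from the "get forwarding" output above.
FORWARDING_ROWS = """\
0.0.0.0/0          100.64.120.0                              route       c8f68ee2-6b45-4017-ba3c-1d676b05f4a6   02:50:56:56:44:52
172.31.3.0/24                                                route       fff07c49-cece-4aaf-b395-91361b4dc519
172.31.4.70/32     172.31.3.254                              route       fff07c49-cece-4aaf-b395-91361b4dc519   02:50:56:00:2c:00
"""


def parse_forwarding(text):
    """Map each IP prefix to its gateway IP, or None for rows without a gateway."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # Rows without a gateway have the Type column ("route") in second position.
        gateway = None if fields[1] == "route" else fields[1]
        table[fields[0]] = gateway
    return table


table = parse_forwarding(FORWARDING_ROWS)
print(table["172.31.4.70/32"])  # gateway toward the LB T1 service interface
```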

 

4. Configure static routes on the client PC

In the test environment, the client PC has a separate default gateway for reaching external networks, so static route entries are added for the 172.31.x.0/24 networks to send that traffic through vyos (192.168.1.1).

# route -p add <destination network> mask <subnet mask> <next hop> if <interface number>

C:\Users\Administrator.AD>route print
===========================================================================
Interface List
 16...00 50 56 01 8c 58 ......vmxnet3 Ethernet Adapter
 14...00 50 56 05 7f 16 ......vmxnet3 Ethernet Adapter #2
  1...........................Software Loopback Interface 1
 10...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter
 11...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter #2
===========================================================================
 
IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
...
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    331
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    331
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    331
       172.31.1.0    255.255.255.0      192.168.1.1      192.168.1.2     16 >>>>>>>>>>>>>>>>>>>>>
       172.31.2.0    255.255.255.0      192.168.1.1      192.168.1.2     16 >>>>>>>>>>>>>>>>>>>>>
       172.31.3.0    255.255.255.0      192.168.1.1      192.168.1.2     16 >>>>>>>>>>>>>>>>>>>>>
       172.31.4.0    255.255.255.0      192.168.1.1      192.168.1.2     16 >>>>>>>>>>>>>>>>>>>>>
       172.31.5.0    255.255.255.0      192.168.1.1      192.168.1.2     16 >>>>>>>>>>>>>>>>>>>>>
      192.168.1.0    255.255.255.0         On-link       192.168.1.2    271
      192.168.1.2  255.255.255.255         On-link       192.168.1.2    271
    192.168.1.255  255.255.255.255         On-link       192.168.1.2    271
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    331
        224.0.0.0        240.0.0.0         On-link       192.168.1.2    271
        224.0.0.0        240.0.0.0         On-link     10.225.224.17    271
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    331
  255.255.255.255  255.255.255.255         On-link       192.168.1.2    271
  255.255.255.255  255.255.255.255         On-link     10.225.224.17    271
===========================================================================
Persistent Routes:
  Network Address          Netmask  Gateway Address  Metric
       172.31.1.0    255.255.255.0      192.168.1.1       1
          0.0.0.0          0.0.0.0   10.225.227.254  Default
       172.31.2.0    255.255.255.0      192.168.1.1       1
       172.31.3.0    255.255.255.0      192.168.1.1       1
       172.31.4.0    255.255.255.0      192.168.1.1       1
       172.31.5.0    255.255.255.0      192.168.1.1       1
===========================================================================
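The five persistent routes listed above follow one pattern, so the commands can be generated rather than typed by hand. This is a small sketch under the assumption that the optional `if <interface number>` argument is omitted and Windows picks the interface from the gateway address; `route_add_command` is a hypothetical helper.

```python
import ipaddress

NEXT_HOP = "192.168.1.1"  # vyos router from the lab
NETWORKS = [f"172.31.{n}.0/24" for n in range(1, 6)]  # 172.31.1.0/24 .. 172.31.5.0/24


def route_add_command(cidr, gateway):
    """Build a persistent Windows 'route add' command for one network."""
    network = ipaddress.ip_network(cidr)
    return f"route -p add {network.network_address} mask {network.netmask} {gateway}"


for cidr in NETWORKS:
    print(route_add_command(cidr, NEXT_HOP))
```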

 

After the network configuration, ping the gateways and related addresses to verify connectivity.

C:\Users\Administrator.AD>ping 172.31.5.70
 
Pinging 172.31.5.70 with 32 bytes of data:
Reply from 172.31.5.70: bytes=32 time=6ms TTL=61
Reply from 172.31.5.70: bytes=32 time=1ms TTL=61
Reply from 172.31.5.70: bytes=32 time=1ms TTL=61
Reply from 172.31.5.70: bytes=32 time=1ms TTL=61
 
Ping statistics for 172.31.5.70:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 1ms, Maximum = 6ms, Average = 2ms
 
C:\Users\Administrator.AD>ping 172.31.3.254
 
Pinging 172.31.3.254 with 32 bytes of data:
Reply from 172.31.3.254: bytes=32 time=2ms TTL=61
Reply from 172.31.3.254: bytes=32 time=1ms TTL=61
Reply from 172.31.3.254: bytes=32 time=1ms TTL=61
Reply from 172.31.3.254: bytes=32 time=1ms TTL=61
 
Ping statistics for 172.31.3.254:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 1ms, Maximum = 2ms, Average = 1ms
 
C:\Users\Administrator.AD>ping 172.31.3.1
 
Pinging 172.31.3.1 with 32 bytes of data:
Reply from 172.31.3.1: bytes=32 time=2ms TTL=62
Reply from 172.31.3.1: bytes=32 time=1ms TTL=62
Reply from 172.31.3.1: bytes=32 time=1ms TTL=62
Reply from 172.31.3.1: bytes=32 time=1ms TTL=62
 
Ping statistics for 172.31.3.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 1ms, Maximum = 2ms, Average = 1ms
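The TTL values in these replies give a rough hint about the path length. The sketch below estimates the number of routed hops on the return path, assuming each responder starts with a TTL of 64; that initial value is an assumption, not something the output confirms.

```python
ASSUMED_INITIAL_TTL = 64  # assumption: the responder starts at 64, not confirmed by the source

# TTL values observed in the ping replies above.
OBSERVED_TTL = {
    "172.31.5.70": 61,
    "172.31.3.254": 61,
    "172.31.3.1": 62,
}


def routed_hops(observed_ttl, initial_ttl=ASSUMED_INITIAL_TTL):
    """Number of routers that decremented the TTL on the way back to the client."""
    return initial_ttl - observed_ttl


for target, ttl in OBSERVED_TTL.items():
    print(target, routed_hops(ttl))
```

Under that assumption, 172.31.5.70 and 172.31.3.254 sit one hop farther away than 172.31.3.1, which would be consistent with replies for the NAT'ed VIP coming from behind the LB T1 service interface.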

 

5. Create the one-arm load balancer

Create a load balancer instance (one-arm-lb-02-for-svc) on the LB T1.

 

6. Create the back-end servers

Create a server pool (one-arm-server-pool-http-01) for the two back-end servers on the 172.31.1.0/24 network.

 

 

7. Create the virtual server

Create a virtual server (one-arm-virtual-server-http-01) that uses the one-arm LB created on the LB T1, and assign the server pool created in the previous step.

 

8. Test access to the VIP from the client PC

Client PC(192.168.1.2) -> NAT'ed VIP(172.31.5.70) -> VIP(172.31.4.70) -> Back-end Server(172.31.1.51, 172.31.1.52)
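Because the issue is intermittent, a single request says little; repeated requests with a tally of outcomes are more telling. The following is a minimal probe sketch, assuming the virtual server answers plain HTTP on port 80 at the NAT'ed VIP. The URL, the attempt count, and the helper names `fetch_status` and `probe` are all illustrative assumptions.

```python
import urllib.request
from collections import Counter

# Assumption: the virtual server answers plain HTTP on port 80 at the NAT'ed VIP.
VIP_URL = "http://172.31.5.70/"


def fetch_status(url, timeout=3.0):
    """Return the HTTP status code as text, or the exception name if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return str(response.status)
    except Exception as error:  # timeouts, resets, refused connections, HTTP errors
        return type(error).__name__


def probe(url, attempts, fetch=fetch_status):
    """Send `attempts` sequential requests and count each distinct outcome."""
    return Counter(fetch(url) for _ in range(attempts))
```

Running `probe(VIP_URL, attempts=20)` on the client PC returns a counter of status codes and error names, so a mix of successes and failures stands out at a glance. The `fetch` parameter is injectable so the tally logic can be exercised without network access.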

 

The environment for reproducing the issue is now in place.

Next, we will follow the reproduction steps identified for issue 3288062 in this environment and observe how the problem occurs.