[VPP-1680] VPP-LB crashed with some specific commit under heavy traffic #3143
Description
VPP-LB crashes under heavy traffic with some specific commits.
This issue is reproducible, and I need help solving it.
VPP 18.10 with 2 patches:
- https://gerrit.fd.io/r/#/c/12680/
- https://gerrit.fd.io/r/#/c/18826/
- henry_ni (Tue, 4 Jun 2019 01:26:54 +0000): Hi Tatsumi-san,
- yusuketatsumi (Mon, 3 Jun 2019 10:54:45 +0000): Hi Hongjun,
- yusuketatsumi (Thu, 30 May 2019 11:27:40 +0000): Hi Hongjun,
- henry_ni (Thu, 30 May 2019 08:13:08 +0000): Hi Tatsumi,
- yusuketatsumi (Fri, 17 May 2019 07:59:32 +0000): I used the split command to divide the large file into smaller ones.
> Reproducing steps
1. VIP setup
Create one VIP with 3 members, for example.
2. Start wget (heavy traffic)
3. Delete a member with flush via the CLI
When a VIP member is deleted with flush, VPP crashes.
When a VIP member is deleted without flush, VPP does not crash.
The crash is almost 100% reproducible. If VPP does not crash on the first attempt, repeatedly creating and deleting a member soon makes it crash.
> log & coredump
- Please see the attached dump files and vpp-*.rpms for details
When a VIP member is deleted with flush, VPP receives SIGSEGV and crashes.
According to gdb, I think calling vlib_refcount_add() with an inappropriate value at src/plugins/lb/node.c:366 leads to this issue, but I can't find the root cause.
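To illustrate the suspicion above, here is a minimal sketch of how a per-index reference counter can guard against an out-of-range index instead of writing to memory it does not own. This is a toy model only: `toy_refcount_t`, `toy_refcount_reserve`, `toy_refcount_add`, and the `max_index` parameter are hypothetical names invented for this example, not the VPP `vlib_refcount` API, and the sketch does not claim to show the actual root cause or the submitted fix.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model, NOT the VPP vlib_refcount API. */
typedef struct {
  int64_t *counters;    /* one counter slot per AS index */
  uint32_t n_counters;  /* number of allocated slots */
} toy_refcount_t;

/* Grow the counter array so that `index` is valid; new slots start at 0.
 * Returns 0 on success, -1 on allocation failure. */
static int
toy_refcount_reserve (toy_refcount_t *r, uint32_t index)
{
  if (index < r->n_counters)
    return 0;

  uint32_t new_n = index + 1;
  int64_t *p = realloc (r->counters, (size_t) new_n * sizeof (*p));
  if (p == NULL)
    return -1;

  /* Zero only the newly added tail of the array. */
  memset (p + r->n_counters, 0,
          (size_t) (new_n - r->n_counters) * sizeof (*p));
  r->counters = p;
  r->n_counters = new_n;
  return 0;
}

/* Add `delta` to counter `index`. Indices above `max_index` (for example a
 * stale or uninitialized AS index) are rejected with -1 rather than used. */
static int
toy_refcount_add (toy_refcount_t *r, uint32_t index, uint32_t max_index,
                  int32_t delta)
{
  assert (r != NULL);
  if (index > max_index)
    return -1;
  if (toy_refcount_reserve (r, index) != 0)
    return -1;
  r->counters[index] += delta;
  return 0;
}
```

In this sketch a caller would start from a zero-initialized `toy_refcount_t`, pass the largest valid AS index as `max_index`, and treat a return value of -1 as a sign that the index came from stale state.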
> Corresponding part of code
I picked out the parts of the code that correspond to the gdb result.
If you need the full source code, please see the "VPP version" part above.
src/vnet/util/refcount.h
src/plugins/lb/node.c
Assignee
Hongjun Ni
Reporter
Yusuke Tatsumi
Comments
Thank you for your confirmation and help!
Thanks,
Hongjun
I can confirm that VPP doesn't crash with this patch on my machine either!
Thanks a lot.
Thanks for your help.
I will try your patch on my machine.
I have reproduced this issue on my local server
and have submitted a patch to fix it.
I have tested this patch and it works well.
Below are my test commands and packet trace:
Packet 1 uses AS1: 192.168.50.74.
After I deleted AS1 (192.168.50.74) with flush,
Packet 2 switched to AS2: 192.168.50.75.
The issue we hit before no longer occurs.
----------------------------------------------------------------------
lb vip 90.1.2.1/32 protocol tcp port 20000 encap l3dsr dscp 7 new_len 16
lb as 90.1.2.1/32 protocol tcp port 20000 192.168.50.74
lb as 90.1.2.1/32 protocol tcp port 20000 192.168.50.75
DBGvpp# sh lb vip verbose
sh lb vip verbose
ip4-l3dsr [1] 90.1.2.1/32
new_size:16
protocol:6 port:20000
dscp:7
counters:
packet from existing sessions: 0
first session packet: 0
untracked packet: 0
no server configured: 0
#as:2
192.168.50.74 8 buckets 0 flows dpo:14 used
192.168.50.75 8 buckets 0 flows dpo:15 used
----------------------------------------------------------------------
ex /root/lb_l3dsr.pg
DBGvpp# trace add pg-input 1
DBGvpp# pa en
DBGvpp# 0: lb_get_sticky_table:154: Regenerated sticky table 0x7fffb6a7fb00
Packet 1:
00:00:53:787360: lb4-l3dsr-port
lb vip[1]: ip4-l3dsr 90.1.2.1/32 new_size:16 #as:2 protocol:6 port:20000 dscp:7
lb as[1]: 192.168.50.74 used
----------------------------------------------------------------------
DBGvpp# lb as 90.1.2.1/32 protocol tcp port 20000 192.168.50.74 del flush
lb as 90.1.2.1/32 protocol tcp port 20000 192.168.50.74 del flush
0: lb_as_command_fn:269: vip index is 1
DBGvpp#
DBGvpp# sh lb vip verbose
sh lb vip verbose
ip4-l3dsr [1] 90.1.2.1/32
new_size:16
protocol:6 port:20000
dscp:7
counters:
packet from existing sessions: 1023
first session packet: 1
untracked packet: 0
no server configured: 0
#as:2
192.168.50.74 0 buckets 0 flows dpo:14 removed
192.168.50.75 16 buckets 0 flows dpo:15 used
----------------------------------------------------------------------
DBGvpp# ex /root/lb_l3dsr.pg
trace add pg-input 1
pa en
ex /root/lb_l3dsr.pg
DBGvpp# trace add pg-input 1
DBGvpp# pa en
Packet 2
00:04:43:560694: lb4-l3dsr-port
lb vip[1]: ip4-l3dsr 90.1.2.1/32 new_size:16 #as:2 protocol:6 port:20000 dscp:7
lb as[2]: 192.168.50.75 used
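The bucket counts in the two `sh lb vip verbose` outputs above (8 and 8 before the delete, 0 and 16 after) and the switch from AS1 to AS2 can be illustrated with a small toy model. This is only a sketch under stated assumptions: it spreads buckets round-robin over the ASs still in use, which is not necessarily how the VPP LB plugin fills its table, and every name here (`toy_vip_t`, `toy_rebuild_table`, `toy_count_buckets`, `toy_flush_sticky`, `TOY_MAX_AS`, `TOY_NO_AS`) is hypothetical. The toy also numbers ASs from 0, unlike the `lb as[1]` and `lb as[2]` indices in the trace.

```c
#include <stdint.h>

#define TOY_MAX_AS 8          /* hypothetical upper bound on ASs per VIP */
#define TOY_NO_AS UINT32_MAX  /* marks a sticky entry with no AS assigned */

typedef struct {
  int used[TOY_MAX_AS];  /* 1 while the AS is in service, 0 once removed */
  uint32_t n_as;         /* number of configured ASs, at most TOY_MAX_AS */
} toy_vip_t;

/* Fill table[0..size) round-robin with the indices of the ASs still in use.
 * Returns the number of ASs in use; if none, the table is left untouched. */
static uint32_t
toy_rebuild_table (const toy_vip_t *vip, uint32_t *table, uint32_t size)
{
  uint32_t live[TOY_MAX_AS];
  uint32_t n_live = 0;

  for (uint32_t i = 0; i < vip->n_as && i < TOY_MAX_AS; i++)
    if (vip->used[i])
      live[n_live++] = i;

  if (n_live == 0)
    return 0;

  for (uint32_t b = 0; b < size; b++)
    table[b] = live[b % n_live];
  return n_live;
}

/* Count how many buckets currently point at `as_index`. */
static uint32_t
toy_count_buckets (const uint32_t *table, uint32_t size, uint32_t as_index)
{
  uint32_t n = 0;
  for (uint32_t b = 0; b < size; b++)
    if (table[b] == as_index)
      n++;
  return n;
}

/* "Flush": clear every sticky (per-flow) entry that points at the removed
 * AS, so the next packet of such a flow is looked up in the table again.
 * Returns the number of entries cleared. */
static uint32_t
toy_flush_sticky (uint32_t *sticky, uint32_t size, uint32_t as_index)
{
  uint32_t flushed = 0;
  for (uint32_t i = 0; i < size; i++)
    if (sticky[i] == as_index)
      {
        sticky[i] = TOY_NO_AS;
        flushed++;
      }
  return flushed;
}
```

With two ASs in use and a 16-entry table, each AS owns 8 buckets in this model; after one AS is marked unused and the table is rebuilt, the remaining AS owns all 16, which mirrors the counts in the trace.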
Please restore the original files as below,
Original issue: https://jira.fd.io/browse/VPP-1680