Next #20

malahal · 2019-04-22T10:27:09Z

No description provided.

Signed-off-by: Jeff Layton <[email protected]>

An xprt has a ref for the hash table (that's released by SVC_DESTROY()); but when it's first created, only 1 ref was taken, so there wasn't a ref for the caller. Add an extra ref for the caller when the xprt is first created. Signed-off-by: Daniel Gryniewicz <[email protected]>

svc_xprt_lookup - Add extra ref on create

The check to see if the given address will fit in the storage is backwards. Fix it, so that IPv4 can fit in the storage, and so that a huge address doesn't overflow. Signed-off-by: Daniel Gryniewicz <[email protected]>

timespec[add|sub|cmp] are now public in FreeBSD 12, they now have an additional variable similar to NetBSD. We use the system macros by default, otherwise the updated version has been implemented. Signed-off-by: Jack Halford <[email protected]>

Update timespec macros

Make URCU a mandatory dependency

Fix addr size check in clnt_dg_control()

When _LGPL_SOURCE is defined the lttng headers will inline some of the RCU functions with the "bulletproof" functions. This means that we can't currently select a urcu flavor at runtime that will be used universally. I'm a little unclear on what happens if you mix different flavored urcu calls, but I doubt that it's anything good. Let's switch back to using urcu-bp universally for now until this has a better solution. It's not ideal for performance, but our use of URCU is pretty minimal for now, and it's better to be safe than fast. Signed-off-by: Jeff Layton <[email protected]>

Force urcu-bp for now

XPRT refcounting could leak a ref in some cases. Fix this by dropping the extra ref in error cases, and by passing the event ref (which should have been dropped, because events are oneshot) to the event processor. This fixes the FD depletion while doing mount/unmount in a tight loop. Signed-off-by: Daniel Gryniewicz <[email protected]>

TCP has a double-ref setup, where the tree has one, and the event loop has one. This allows processors to destroy the xprt on error without freeing it, ensuring no use-after-free. UDP does not have a XPRT tree, since it re-uses a single socket, so it only has one ref. This means that destroy-on-error immediately frees. Add an extra ref to the UDP path so that it matches TCP, and to allow destroy. Signed-off-by: Daniel Gryniewicz <[email protected]>

When destroying a client, the XPRT is not also destroyed, but just unref'd. This is fine, if multiple clients share a XPRT, but local clients all have unique FDs, and therefore unique XPRTs. This means that their XPRTs are never destroyed. Ganesha uses these a lot, leading to many leaked XPRTs and therefore FDs. Add a flag on a client that indicates it's local, and use that flag to desrtoy the XPRT when the client is destroyed. Signed-off-by: Daniel Gryniewicz <[email protected]>

XPRT Refcount fixes

Signed-off-by: Frank S. Filz <[email protected]>

Instead of calling request_cb which alllocates an svc_req and then calls SVC_DECODE and then frees the request, we call alloc_cb, call SVC_DECODE ourselves, and then call free_cb. This makes it a little easier to handle suspending an async request in the future. Signed-off-by: Frank S. Filz <[email protected]>

If the protocol layer needs to save some data (it probably does), it should use SVCXPRT->xp_u1 or SVCXPRT->xp_u2 (Ganesha only uses xp_u2 so it could use xp_u1). If a request is to be suspended, the SVCXPRT and svc_req should not be used once it is possible for another thread to be processing an async completion of the request. This is because if such an async completion happened to run before the original thread returned up the stack, the request could actually be requeued and even executing on another RPC thread while the first thread is still unwinding. Signed-off-by: Frank S. Filz <[email protected]>

Async

Fix security warnings from AppScan tool

Convert unsafe strncpy to strlcpy

Signed-off-by: Daniel Gryniewicz <[email protected]>

Make a header for strl*()

Coverity Scan fix: In __rpc_taddr2uaddr_af() free the allocated memory for variable 'ret' when returning NULL from the function. Signed-off-by: Madhu Thorat <[email protected]>

Free 'ret' when returning NULL in __rpc_taddr2uaddr_af()

On modern Linux, statd only listens on TCP. Forcing UDP for connections causes them to time out. Any system that claims to support TCP but doesn't is so out-of-date we probably have other issues there. Fix it so that TCP actually uses TCP. Signed-off-by: Daniel Gryniewicz <[email protected]>

NSM - Don't force UDP portmapper lookups

We are doing accept and returning XPRT_DIED without closing fd, which will be unmonitored and cause fd leak. Client will see connection succeded but IO will hang on it.

Close fd if makefd_xprt fails.

"gss_sign" is a deprecated GSS-API function. This needs to be replaced by "gss_get_mic". Signed-off-by: Sachin Punadikar <[email protected]>

When testing asynchronous I/O, observed all the work threads eventually waiting on the same epoll event. This was because the svc worker was being rescheduled multiple times when EWOULDBLOCK occurs on send and we request evchan write (OUT) control on the fd_send. Only one svc_rqst_epoll_loop should be running per sr_rec. Signed-off-by: TweakySolution <[email protected]>

Non-Blocking I/O results in hang.

Signed-off-by: Daniel Gryniewicz <[email protected]>

Move async callback to svc_req and give it its own wpe

Signed-off-by: Daniel Gryniewicz <[email protected]>

XPRTs are generally stored in some form of lookup table. When we're destroying the XPRT, we need to remove it from it's lookup table before we drop the sentinal ref. Otherwise, another thread can find it, and increment it's ref from 0 -> 1. This is very hard to handle in an atomic fashion. Add an unlink() API method, and implement it for all backends. This allows us to remove the XPRT from it's lookup table before we drop it's sentinal ref. Signed-off-by: Daniel Gryniewicz <[email protected]>

XPRT - Unregister xprt before dropping sentinal ref

There is a race when unhooking events from epoll, where the event could be ready for delivery (or even delivered, but the thread not scheduled) and so the event is processed after the unhook, and therefore after the XPRT has been freed. To close this, stop putting a pointer to the rec in the event data, and instead put the FD in there and use it to look up the XPRT. This ensures that, if we got the XPRT from lookup, it's valid and ref'd for the duration of the event. Once we're no longer storing a XPRT pointer in the epoll event, we don't need a refcount across the hook/event/unhook series. Remove these refcounts, allowing a destroyed XPRT to just be freed. Signed-off-by: Daniel Gryniewicz <[email protected]>

clang complains about uninitialized use of xv/gv_count variables because there is no final else clause, even if we do check that svc is either of these types at the start of the function. Silence false positive by skipping the last "always-true" check

authgss_prot: silence false-positive clang warning

Take a XPRT ref when hooking events

Many error paths call DESTROY, which will unlink and drop the ref. This means that the final RELEASE will free, causing the DESTROY to use-after-free. Instead, make sure we DESTROY first. Signed-off-by: Daniel Gryniewicz <[email protected]>

VC - RELEASE after DESTROY

The check for initialization in GSS was not locked, so there could be a race where two threads both initialize. Fix this by moving the check inside the lock. Signed-off-by: Daniel Gryniewicz <[email protected]>

Avoid race checking for GSS init

In clnt_tli_ncreate() if client creation fails then we don't close FD and cause FD leak. Fixed by closing FD in case of client creation failure. Signed-off-by: Madhu Thorat <[email protected]>

Seperate out work queue and worker queue. Always submit work to the work queue and wakeup available work thread. Each work thread will always check for entry in work queue, so that it can pickup next work without scheduling. Reduce the number of idle thread count, less number of threads helps in less scheduling overhead.

work_pool: Fix for efficient scheduling.

We are updating data->ptr while rearming fd_send, which is changing the data fd since its union, cauing IO is hung. Signed-off-by: Gaurav Gangalwar <[email protected]>

rqst: Fix hang in non blocking send.

In clnt_tli_ncreate() close FD in case client creation failed

We need to free gss_buffer

Fix mem_leak in xdr_rpc_gss_buf

Neither uid nor gid should be -1. If a client sends such a value, setfsuid, setfsuid or __NR_setgroups may fail. Check for invalid uid/gid and return a failure in such a case. Signed-off-by: Malahal Naineni <[email protected]>

jtlayton and others added 30 commits October 18, 2018 15:44

cmake: move urcu library to main build

2442f77

Signed-off-by: Jeff Layton <[email protected]>

pthreads: have all thread functions register themselves with urcu

ea57c9d

Signed-off-by: Jeff Layton <[email protected]>

Merge pull request #155 from dang/push

78c409d

svc_xprt_lookup - Add extra ref on create

Fix addr size check in clnt_dg_control()

bc28438

The check to see if the given address will fit in the storage is backwards. Fix it, so that IPv4 can fit in the storage, and so that a huge address doesn't overflow. Signed-off-by: Daniel Gryniewicz <[email protected]>

Update timespec macros

56ade49

timespec[add|sub|cmp] are now public in FreeBSD 12, they now have an additional variable similar to NetBSD. We use the system macros by default, otherwise the updated version has been implemented. Signed-off-by: Jack Halford <[email protected]>

Merge pull request #158 from jzck/next

c347df0

Update timespec macros

Merge pull request #156 from jtlayton/urcu

2936387

Make URCU a mandatory dependency

Merge pull request #157 from dang/push

f7c4a3e

Fix addr size check in clnt_dg_control()

Merge pull request #159 from jtlayton/lttng

b154101

Force urcu-bp for now

Merge pull request #161 from dang/push

e866504

XPRT Refcount fixes

Simplify epoll task processing

d16697b

Signed-off-by: Frank S. Filz <[email protected]>

Merge pull request #162 from nfs-ganesha/async

929c9da

Async

Convert unsafe strncpy to strlcpy

0104cd1

Fix security warnings from AppScan tool

Merge pull request #164 from malahal/strlcpy

fddcb84

Convert unsafe strncpy to strlcpy

Make a header for strl*()

ec03704

Signed-off-by: Daniel Gryniewicz <[email protected]>

Merge pull request #165 from dang/push

57b2605

Make a header for strl*()

Free 'ret' when returning NULL in __rpc_taddr2uaddr_af()

f8b5fa5

Coverity Scan fix: In __rpc_taddr2uaddr_af() free the allocated memory for variable 'ret' when returning NULL from the function. Signed-off-by: Madhu Thorat <[email protected]>

Merge pull request #167 from madhuthorat/cid__rpc_taddr2uaddr_af

cb0cf40

Free 'ret' when returning NULL in __rpc_taddr2uaddr_af()

Merge pull request #169 from dang/push

52bb2d5

NSM - Don't force UDP portmapper lookups

Close fd if makefd_xprt fails.

1bfa572

We are doing accept and returning XPRT_DIED without closing fd, which will be unmonitored and cause fd leak. Client will see connection succeded but IO will hang on it.

Merge pull request #171 from Gaurav-Gangalwar/fd_leak

bf703cc

Close fd if makefd_xprt fails.

Use "gss_get_mic" instead of "gss_sign"

6bce534

"gss_sign" is a deprecated GSS-API function. This needs to be replaced by "gss_get_mic". Signed-off-by: Sachin Punadikar <[email protected]>

TweakySolution and others added 6 commits December 12, 2019 10:06

fix erronous bracket

7590594

Merge pull request #198 from TweakySolution/async_io_hang

f2719f6

Non-Blocking I/O results in hang.

3.1

2a14f3d

Signed-off-by: Daniel Gryniewicz <[email protected]>

Merge pull request #199 from nfs-ganesha/async

3cee347

Move async callback to svc_req and give it its own wpe

3.2

1ff6036

Signed-off-by: Daniel Gryniewicz <[email protected]>

malahal force-pushed the next branch from a588e51 to e2154e2 Compare February 19, 2020 12:05

dang and others added 19 commits February 20, 2020 11:52

Merge pull request #203 from dang/unlink

7db112a

XPRT - Unregister xprt before dropping sentinal ref

Merge pull request #207 from martinetd/clang

46acb85

authgss_prot: silence false-positive clang warning

Merge pull request #204 from dang/hook

344a636

Take a XPRT ref when hooking events

VC - RELEASE after DESTROY

2d13724

Many error paths call DESTROY, which will unlink and drop the ref. This means that the final RELEASE will free, causing the DESTROY to use-after-free. Instead, make sure we DESTROY first. Signed-off-by: Daniel Gryniewicz <[email protected]>

Merge pull request #208 from dang/push

8bfe968

VC - RELEASE after DESTROY

Avoid race checking for GSS init

d4f1c2d

The check for initialization in GSS was not locked, so there could be a race where two threads both initialize. Fix this by moving the check inside the lock. Signed-off-by: Daniel Gryniewicz <[email protected]>

Merge pull request #210 from dang/push

51ff851

Avoid race checking for GSS init

In clnt_tli_ncreate() close FD in case client creation failed

094c355

In clnt_tli_ncreate() if client creation fails then we don't close FD and cause FD leak. Fixed by closing FD in case of client creation failure. Signed-off-by: Madhu Thorat <[email protected]>

work_pool: Make sure thread max > channels.

c8310d0

Merge pull request #213 from Gaurav-Gangalwar/push

1f128c4

work_pool: Fix for efficient scheduling.

rqst: Fix hang in non blocking send.

f6f3058

We are updating data->ptr while rearming fd_send, which is changing the data fd since its union, cauing IO is hung. Signed-off-by: Gaurav Gangalwar <[email protected]>

Merge pull request #214 from Gaurav-Gangalwar/push

dadc0cd

rqst: Fix hang in non blocking send.

Merge pull request #211 from madhuthorat/upstatdfail_fdclose

925ee1e

In clnt_tli_ncreate() close FD in case client creation failed

Fix mem_leak in xdr_rpc_gss_buf

06dac58

We need to free gss_buffer

Merge pull request #216 from ashishsangwan2/next

ec6083e

Fix mem_leak in xdr_rpc_gss_buf

malahal force-pushed the next branch 2 times, most recently from 2ff5a1b to f21f32a Compare August 27, 2020 14:22

Prevent set_threadgroups() failure

031cebc

Neither uid nor gid should be -1. If a client sends such a value, setfsuid, setfsuid or __NR_setgroups may fail. Check for invalid uid/gid and return a failure in such a case. Signed-off-by: Malahal Naineni <[email protected]>

malahal force-pushed the next branch from f21f32a to 031cebc Compare August 27, 2020 18:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Next #20

Next #20

malahal commented Apr 22, 2019

Next #20

Are you sure you want to change the base?

Next #20

Conversation

malahal commented Apr 22, 2019