
[Bug] Selector.select() on a coroutine-blacklisted thread cannot be woken up by wakeup() #734

@gDreamcatcher

Description
If a thread is added to the coroutine (Wisp) blacklist and uses Selector.select() to wait for I/O, a call to Selector.wakeup() from another thread fails to wake it up.
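
For reference, the expected behavior can be sketched in isolation. The java.nio.channels.Selector documentation states that wakeup() causes a blocked selection operation (or the next one, if none is in progress) to return immediately. The class name WakeupContractSketch, the method selectReturnsAfterWakeup and its parameters are illustrative names and do not come from this issue. The sketch only checks the contract; the test case further down remains the actual reproducer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original report.
class WakeupContractSketch {

    /**
     * Blocks a helper thread named threadName in select(timeoutMs), calls
     * wakeup() from the calling thread after delayMs, and reports whether
     * select() returned within graceMs of that call.
     */
    static boolean selectReturnsAfterWakeup(String threadName, long timeoutMs,
                                            long delayMs, long graceMs) {
        CountDownLatch aboutToSelect = new CountDownLatch(1);
        CountDownLatch returned = new CountDownLatch(1);
        try {
            Selector selector = Selector.open();
            Thread waiter = new Thread(() -> {
                try {
                    aboutToSelect.countDown();
                    selector.select(timeoutMs);
                    returned.countDown();
                } catch (IOException e) {
                    // Leave the latch untouched so the caller sees a failure.
                }
            }, threadName);
            waiter.setDaemon(true); // a stuck waiter must not keep the JVM alive
            waiter.start();

            aboutToSelect.await();
            Thread.sleep(delayMs);
            selector.wakeup();

            boolean woke = returned.await(graceMs, TimeUnit.MILLISECONDS);
            if (woke) {
                // Close only after select() has returned; closing while the
                // waiter is still inside select() may block the caller.
                selector.close();
            }
            return woke;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The thread name can be passed as an argument; the default is arbitrary.
        String name = args.length > 0 ? args[0] : "Plain-Thread";
        boolean woke = selectReturnsAfterWakeup(name, 30_000, 800, 10_000);
        System.out.println(woke ? "woken by wakeup()" : "still blocked after wakeup()");
    }
}
```

Running it once with a thread name that matches the blacklist pattern and once with a name that does not may help show whether the blacklist is the deciding factor. That comparison is a suggestion and has not been run as part of this report.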

Steps to Reproduce
Run the following commands to reproduce the issue:

  • javac WispSelectorDeadlockTest.java
  • java -XX:+UnlockExperimentalVMOptions -XX:+UseWisp2 -Dcom.alibaba.wisp.threadAsWisp.black=name:IoLoopGroup* WispSelectorDeadlockTest

JDK version
openjdk version "1.8.0_462"
OpenJDK Runtime Environment (Alibaba Dragonwell Extended Edition 8.26.25) (build 1.8.0_462-b01)
OpenJDK 64-Bit Server VM (Alibaba Dragonwell Extended Edition 8.26.25) (build 25.462-b01, mixed mode)

Test case

import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class WispSelectorDeadlockTest {
    private static final String TEST_THREAD_NAME = "IoLoopGroup-Thread";
    
    public static void main(String[] args) {
       
        boolean testResult = testRealSelectorWakeup();
        if (testResult) {
            System.out.println("PASS - Selector works correctly and woke up within the expected time");
        } else {
            System.out.println("FAIL - Selector did not wake up within the expected time!");
        }
    }
    
    /**
     * Tests a real Selector wakeup call.
     * This test is closer to a real-world scenario and calls Selector.wakeup() directly.
     */
    private static boolean testRealSelectorWakeup() {
        
        long TIMEOUT_MS = 10000;
        CountDownLatch selectorReady = new CountDownLatch(1);
        CountDownLatch testCompleted = new CountDownLatch(1);
        
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            // Create the IoLoopGroup thread - this thread name matches the coroutine blacklist
            new Thread(() -> {
                try {
                    // Register a channel with the selector (simulating a real-world scenario)
                    Pipe.SourceChannel sourceChannel = pipe.source();
                    sourceChannel.configureBlocking(false);
                    sourceChannel.register(selector, SelectionKey.OP_READ);
                    selectorReady.countDown();
                    int selectedKeys = selector.select(TIMEOUT_MS * 3); // use a long timeout
                    testCompleted.countDown();
                } catch (Exception e) {
                    System.err.println("  IoLoopGroup thread exception: " + e.getMessage());
                }
            }, TEST_THREAD_NAME).start();
     
                        
            if (!selectorReady.await(2000, TimeUnit.MILLISECONDS)) {
                System.err.println("  Error: Selector setup timed out");
                return false;
            }
            Thread.sleep(800); // wait 800 ms before calling wakeup
            selector.wakeup();
    
            // Wait for the test to complete, or detect a timeout
            boolean completed = testCompleted.await(TIMEOUT_MS, TimeUnit.MILLISECONDS);
            return completed;
        } catch(Exception e) {
            System.err.println("  Error: " + e.getMessage());
        }
        return false;

    }
}

jstack

2025-11-11 15:51:50
Full thread dump OpenJDK 64-Bit Server VM (25.462-b01 mixed mode):

"Attach Listener" #12 daemon prio=9 os_prio=0 tid=0x00007f41dc0024b0 nid=0xce93 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"IoLoopGroup-Test-Thread" #11 prio=5 os_prio=0 tid=0x00007f421c1d4b60 nid=0xcbed runnable [0x00007f4220ae8000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPoll.epollWait(Native Method)
        at sun.nio.ch.EPollPort$1.eventWait(EPollPort.java:387)
        at com.alibaba.wisp.engine.WispEventPump.epollWaitForWisp(WispEventPump.java:244)
        at com.alibaba.wisp.engine.WispEventPump.access$400(WispEventPump.java:40)
        at com.alibaba.wisp.engine.WispEventPump$Pool.epollWaitForWisp(WispEventPump.java:124)
        at com.alibaba.wisp.engine.WispEngine$5.epollWait(WispEngine.java:227)
        at sun.nio.ch.EPollArrayWrapper.handleEPollWithWisp(EPollArrayWrapper.java:298)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:279)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:96)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x0000000080408848> (a sun.nio.ch.Util$3)
        - locked <0x00000000804087c0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000080408060> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at WispSelectorDeadlockTest.lambda$testRealSelectorWakeup$0(WispSelectorDeadlockTest.java:85)
        at WispSelectorDeadlockTest$$Lambda$5/81628611.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:855)

   Locked ownable synchronizers:
        - None

"Service Thread" #10 daemon prio=9 os_prio=0 tid=0x00007f421c12c8d0 nid=0xcbeb runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"C1 CompilerThread1" #9 daemon prio=9 os_prio=0 tid=0x00007f421c12a640 nid=0xcbea waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"C2 CompilerThread0" #8 daemon prio=9 os_prio=0 tid=0x00007f421c128990 nid=0xcbe9 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Signal Dispatcher" #7 daemon prio=9 os_prio=0 tid=0x00007f421c126bf0 nid=0xcbe8 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Wisp-Root-Worker-0" #4 daemon prio=5 os_prio=0 tid=0x00007f421c1251c0 nid=0xcbe7 runnable [0x00007f42210ee000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPoll.epollWait(Native Method)
        at sun.nio.ch.EPollPort$1.eventWait(EPollPort.java:387)
        at com.alibaba.wisp.engine.WispEventPump.pollAndDispatchEvents(WispEventPump.java:322)
        at com.alibaba.wisp.engine.WispScheduler$Worker.doPolling(WispScheduler.java:203)
        at com.alibaba.wisp.engine.WispScheduler$Worker.doParkOrPolling(WispScheduler.java:190)
        at com.alibaba.wisp.engine.WispScheduler$Worker.runCarrier(WispScheduler.java:170)
        at com.alibaba.wisp.engine.WispScheduler$Worker.run(WispScheduler.java:141)
        at java.lang.Thread.run(Thread.java:855)

   Locked ownable synchronizers:
        - None

"Wisp-Sysmon" #6 daemon prio=5 os_prio=0 tid=0x00007f421c123a90 nid=0xcbe6 waiting on condition [0x00007f42211ef000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park0(Native Method)
        at sun.misc.Unsafe.access$200(Unsafe.java:45)
        at sun.misc.Unsafe$1.park0(Unsafe.java:65)
        at com.alibaba.wisp.engine.WispSysmon.sysmonLoop(WispSysmon.java:72)
        at com.alibaba.wisp.engine.WispSysmon$$Lambda$4/1324119927.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:855)

   Locked ownable synchronizers:
        - None

"Wisp-Unpark-Dispatcher" #5 daemon prio=5 os_prio=0 tid=0x00007f421c121710 nid=0xcbe5 runnable [0x00007f42212f0000]
   java.lang.Thread.State: RUNNABLE
        at com.alibaba.wisp.engine.WispEngine.getProxyUnpark(Native Method)
        at com.alibaba.wisp.engine.WispEngine.access$000(WispEngine.java:60)
        at com.alibaba.wisp.engine.WispEngine$4.run(WispEngine.java:185)
        at java.lang.Thread.run(Thread.java:855)

   Locked ownable synchronizers:
        - None

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f421c084a40 nid=0xcbe4 runnable [0x00007f422184d000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000008020bb30> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
        - waiting to lock <0x000000008020bb30> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:250)

   Locked ownable synchronizers:
        - None

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f421c080cb0 nid=0xcbe3 runnable [0x00007f422194e000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000080209ec8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - waiting to lock <0x0000000080209ec8> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

   Locked ownable synchronizers:
        - None

"main" #1 prio=5 os_prio=0 tid=0x00007f421c009d40 nid=0xcbe1 waiting on condition [0x00007f422549f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park0(Native Method)
        at sun.misc.Unsafe.access$200(Unsafe.java:45)
        at sun.misc.Unsafe$1.park0(Unsafe.java:65)
        at com.alibaba.wisp.engine.WispTask.parkInternal(WispTask.java:423)
        at com.alibaba.wisp.engine.WispTask.park(WispTask.java:484)
        at sun.nio.ch.SelectorImpl.implCloseSelector(SelectorImpl.java:111)
        - waiting to lock <0x0000000080408060> (a sun.nio.ch.EPollSelectorImpl)
        at java.nio.channels.spi.AbstractSelector.close(AbstractSelector.java:111)
        at WispSelectorDeadlockTest.testRealSelectorWakeup(WispSelectorDeadlockTest.java:103)
        at WispSelectorDeadlockTest.main(WispSelectorDeadlockTest.java:57)

   Locked ownable synchronizers:
        - None

"VM Thread" os_prio=0 tid=0x00007f421c077090 nid=0xcbe2 runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007f421c1372a0 nid=0xcbec waiting on condition

JNI global references: 340
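
In the dump above, "IoLoopGroup-Test-Thread" is still inside WispEventPump.epollWaitForWisp (reached through EPollArrayWrapper.handleEPollWithWisp), while "main" is parked in SelectorImpl.implCloseSelector waiting for the EPollSelectorImpl monitor <0x0000000080408060> held by that thread. A similar check can be made in-process without jstack. The following is a minimal sketch using Thread.getAllStackTraces(); the class name StackFrameProbe and the method threadHasFrame are hypothetical and not part of the original test.

```java
import java.util.Map;

// Hypothetical diagnostic helper, not part of the original report.
class StackFrameProbe {

    /**
     * Returns true if any live thread whose name starts with threadNamePrefix
     * currently has a stack frame whose class name starts with classNamePrefix.
     */
    static boolean threadHasFrame(String threadNamePrefix, String classNamePrefix) {
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            if (!entry.getKey().getName().startsWith(threadNamePrefix)) {
                continue;
            }
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().startsWith(classNamePrefix)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Example query matching the frames seen in the dump above.
        boolean inWispPump = threadHasFrame(
                "IoLoopGroup", "com.alibaba.wisp.engine.WispEventPump");
        System.out.println("IoLoopGroup thread inside WispEventPump: " + inWispPump);
    }
}
```

Calling threadHasFrame("IoLoopGroup", "com.alibaba.wisp.engine.WispEventPump") from the test after wakeup() would indicate whether the selector thread is still waiting in the Wisp event pump. The result is a point-in-time snapshot, so it is only a hint and not proof of a hang.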
