Program gets stuck in reader.stop_reading() #110

Open
@HanYangZhao

I'm testing the program by repeatedly calling reader.start_reading() and reader.stop_reading(). At some point (after a few minutes or a few hours), reader.stop_reading() does not return, and the program hangs. I was unable to reproduce the problem with a pure C implementation, so it seems to be related to how Python threads interact with the C threads. Using gdb, I was able to confirm that one of the C threads gets stuck; the session output is below.
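For anyone trying to catch the hang in a long-running stress loop without attaching gdb, here is a minimal sketch using the standard library's `faulthandler` module. The helper name `hang_guard` and the 30-second timeout are illustrative assumptions, not part of this project. `faulthandler` runs its watchdog in a C-level thread, so it should still fire even if the blocked call never releases the GIL, which a pure-Python timer thread could not do.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_guard(timeout_s, file=sys.stderr, exit_on_hang=False):
    """Dump all Python thread tracebacks to `file` if the body runs longer than timeout_s.

    `hang_guard` is a hypothetical helper for this sketch. `file` must stay
    open (and have a real file descriptor) until the block exits.
    """
    faulthandler.dump_traceback_later(timeout_s, file=file, exit=exit_on_hang)
    try:
        yield
    finally:
        # The body returned in time (or raised): disarm the watchdog.
        faulthandler.cancel_dump_traceback_later()


# Intended use in the stress loop (needs a connected reader, so left commented out):
#
# while True:
#     reader.start_reading(callback)
#     with hang_guard(30):
#         reader.stop_reading()
```

The dump only shows Python-level frames, so it tells you which iteration hung and where in the Python code, while gdb is still needed for the C frames.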

30 Jun 06:45:18 2020 - initializeReader
30 Jun 06:45:18 2020 - stopping current reader
stopping_read
read_callback_null
stats_callback_null
stopping_read_cs
read_callback
^C
Thread 1 "python3" received signal SIGINT, Interrupt.
__libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46      ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) info thread
  Id   Target Id         Frame 
* 1    Thread 0x76fee210 (LWP 3019) "python3" __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  2    Thread 0x769b3460 (LWP 3025) "python3" __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  3    Thread 0x75fff460 (LWP 3026) "python3" __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  4    Thread 0x757fe460 (LWP 3027) "python3" __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
(gdb) bt 
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
#1  0x76ec3072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x3f22b4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#2  __pthread_cond_wait_common (abstime=0x0, mutex=0x3f21e0, cond=0x3f2288) at pthread_cond_wait.c:502
#3  __pthread_cond_wait (cond=0x3f2288, mutex=0x3f21e0) at pthread_cond_wait.c:655
#4  0x76b6757a in TMR_stopReading (reader=0x3f10e0) at tm_reader_async.c:387
#5  0x76b5fab0 in Reader_stop_reading (self=0x3f10d8) at mercury.c:976
#6  0x0009ffb2 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) thread 4
[Switching to thread 4 (Thread 0x757fe460 (LWP 3027))]
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46      in ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S
(gdb) bt
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
#1  0x76ec5194 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=1, futex_word=0x3f21a8) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#2  do_futex_wait (sem=sem@entry=0x3f21a8, abstime=0x0) at sem_waitcommon.c:115
#3  0x76ec5274 in __new_sem_wait_slow (sem=0x3f21a8, abstime=0x0) at sem_waitcommon.c:282
#4  0x76b6839e in process_async_response (reader=0x3f10e0) at tm_reader_async.c:977
#5  0x76b689b2 in do_background_reads (arg=0x3f10e0) at tm_reader_async.c:1218

tm_reader_async.c:1218  -> process_async_response(reader);
tm_reader_async.c:977 -> sem_wait(&reader->queue_slots);
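The two backtraces show two untimed waits: thread 1 waits on a condition variable inside TMR_stopReading, and thread 4 waits on the queue_slots semaphore. The toy model below shows how that pattern can hang. It is not the library's code, and it rests on two assumptions the traces do not confirm: that the queue was full, and that whatever normally frees slots was not running. All names other than queue_slots are made up for the sketch, and the final wait is timed only so the model can return.

```python
import threading


def simulate(consumer_runs, timeout_s=0.2):
    """Return True if the modelled background reader finishes, False if it stays blocked."""
    queue_slots = threading.Semaphore(0)  # assumption: no free slots, queue already full
    reader_done = threading.Event()       # stands in for the condition the stopper waits on

    def background_reader():
        queue_slots.acquire()             # like sem_wait(&reader->queue_slots)
        reader_done.set()                 # only reached once a slot is freed

    def consumer():
        queue_slots.release()             # consuming one queued item frees one slot

    threading.Thread(target=background_reader, daemon=True).start()
    if consumer_runs:
        threading.Thread(target=consumer, daemon=True).start()

    # The real stop path waits with no timeout; a timeout is used here so the model returns.
    return reader_done.wait(timeout_s)


if __name__ == "__main__":
    print("consumer running:", simulate(True))
    print("consumer stalled:", simulate(False))
```

If this is what happens in the real program, the question becomes what stops the queue from being drained only when Python is involved.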
