DSP: add dsp support #5

Open
wants to merge 3 commits into
base: siyuan-embarc_mli_v2.0-base_tempo
23 changes: 23 additions & 0 deletions arch/Kconfig
@@ -564,6 +564,11 @@ config CPU_HAS_DCLS
This option is enabled when the processor hardware is configured in
Dual-redundant Core Lock-step (DCLS) topology.

config CPU_HAS_DSP
bool
help
This option is enabled when the CPU has a hardware DSP unit.

config CPU_HAS_FPU
bool
help
@@ -742,6 +747,24 @@ config FPU_SHARING

endmenu

menu "Digital Signal Processing Options"

config DSP
bool "Digital signal processing (DSP) support"
depends on CPU_HAS_DSP
help
This option enables DSP and DSP instructions.

config DSP_SHARING
bool "DSP register sharing"
depends on DSP && MULTITHREADING
help
This option enables preservation of the hardware DSP registers
across context switches to allow multiple threads to perform concurrent
DSP operations.

endmenu
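
The `depends on` lines in this hunk can be read as simple boolean gates. The following is a minimal host-side sketch of that reading, not Zephyr or Kconfig code; the `dsp_kconfig` struct and both helper functions are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the symbols in the hunk above; not a Zephyr API. */
struct dsp_kconfig {
    bool cpu_has_dsp;    /* CPU_HAS_DSP, normally selected by the SoC */
    bool multithreading; /* MULTITHREADING */
    bool want_dsp;       /* the user asks for DSP=y */
    bool want_sharing;   /* the user asks for DSP_SHARING=y */
};

/* DSP "depends on CPU_HAS_DSP": it can only be y when the CPU has a DSP. */
static bool dsp_enabled(const struct dsp_kconfig *c)
{
    assert(c != NULL);
    return c->want_dsp && c->cpu_has_dsp;
}

/* DSP_SHARING "depends on DSP && MULTITHREADING". */
static bool dsp_sharing_enabled(const struct dsp_kconfig *c)
{
    assert(c != NULL);
    return c->want_sharing && dsp_enabled(c) && c->multithreading;
}
```

Under this model, asking for DSP_SHARING on a CPU without CPU_HAS_DSP has no effect, which matches how an unmet dependency prevents a symbol from being enabled.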

menu "Cache Options"

config DCACHE
44 changes: 44 additions & 0 deletions arch/arc/Kconfig
@@ -327,6 +327,50 @@ config ARC_NORMAL_FIRMWARE
resources of the ARC processors, and, therefore, it shall avoid
accessing them.

config ARC_DSP_BFLY_SHARING
bool "ARC complex DSP operation"
depends on DSP && CPU_ARCEM
default n
help
This option enables Zephyr to save and restore the DSP_BFLY0
and FFT_CTRL registers during a context switch. It is only
required when butterfly instructions are used in more than one
thread.

config ARC_HAS_AGU_REGS
bool "ARC address generation unit registers"
default n
help
Processors with XY memory and AGU registers can enable this
option to accelerate DSP instructions.

config AGU_SHARING
bool "ARC address generation unit register sharing"
depends on ARC_HAS_AGU_REGS && MULTITHREADING
help
This option enables preservation of the hardware AGU registers
across context switches to allow multiple threads to perform concurrent
operations on XY memory. By default, the small AGU register set is
saved and restored: 4 address pointer regs, 2 address offset regs
and 4 modifier regs.

config ARC_AGU_MEDIUM
bool "ARC AGU medium size register"
depends on AGU_SHARING
default n
help
Save and restore the medium AGU register set: 8 address pointer regs,
4 address offset regs and 12 modifier regs.

config ARC_AGU_LARGE
bool "ARC AGU large size register"
depends on AGU_SHARING
default n
select ARC_AGU_MEDIUM
help
Save and restore the large AGU register set: 12 address pointer regs,
8 address offset regs and 24 modifier regs.
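
To make the cost of each AGU option concrete, the register counts quoted in the help texts above can be added up. This is a small sketch for illustration only: the enum, struct and function names are hypothetical, and the slot size is passed in as a parameter because the PR stores each register in a `uintptr_t` whose width depends on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the three AGU configurations in the Kconfig help. */
enum agu_size { AGU_SMALL, AGU_MEDIUM, AGU_LARGE };

struct agu_counts {
    unsigned int ap;  /* address pointer registers */
    unsigned int os;  /* address offset registers */
    unsigned int mod; /* modifier registers */
};

/* Counts taken directly from the help texts of AGU_SHARING,
 * ARC_AGU_MEDIUM and ARC_AGU_LARGE.
 */
static struct agu_counts agu_counts_for(enum agu_size size)
{
    switch (size) {
    case AGU_LARGE:
        return (struct agu_counts){ 12, 8, 24 };
    case AGU_MEDIUM:
        return (struct agu_counts){ 8, 4, 12 };
    default:
        return (struct agu_counts){ 4, 2, 4 };
    }
}

/* Total number of AGU registers saved per context switch. */
static unsigned int agu_saved_regs(enum agu_size size)
{
    struct agu_counts c = agu_counts_for(size);

    return c.ap + c.os + c.mod;
}

/* Extra bytes per saved context, given the size of one save slot. */
static size_t agu_saved_bytes(enum agu_size size, size_t slot_bytes)
{
    assert(slot_bytes > 0);
    return (size_t)agu_saved_regs(size) * slot_bytes;
}
```

With 4-byte slots (an assumption about the target), the small, medium and large sets add 40, 96 and 176 bytes respectively to every saved thread context.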

menu "ARC MPU Options"
depends on CPU_HAS_MPU

63 changes: 62 additions & 1 deletion arch/arc/core/offsets/offsets.c
@@ -115,8 +115,69 @@ GEN_OFFSET_SYM(_callee_saved_stack_t, dpfp2l);
GEN_OFFSET_SYM(_callee_saved_stack_t, dpfp1h);
GEN_OFFSET_SYM(_callee_saved_stack_t, dpfp1l);
#endif

#endif
#ifdef CONFIG_DSP_SHARING
GEN_OFFSET_SYM(_callee_saved_stack_t, dsp_ctrl);
Member

Does it make sense to save the control registers (DSP_CTRL and DSP_FFT_CTRL)?
I guess it depends on the application, but the rounding/saturation etc. settings may be a global choice, not per task.

Author

I'm not very familiar with DSP, but I think that if two threads run two different DSP applications, their DSP_CTRL settings are quite likely to differ. Jacco from the MLI team also told me that MLI sets the DSP mode on every call, so we should make sure that at least the accumulator registers and the DSP mode are saved and restored.

Member

OK, if the DSP mode is also set dynamically rather than configured statically, then it should be saved. Maybe ask @JaccovG to review this PR as well?
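
The concern in this thread can be illustrated with a small host-side simulation. It is only a sketch: `sim_dsp_ctrl` stands in for the hardware DSP_CTRL register, the mode values are arbitrary numbers, and none of the names below exist in Zephyr or in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the hardware DSP_CTRL register (simulation only). */
static uint32_t sim_dsp_ctrl;

/* Hypothetical per-thread save slot for DSP_CTRL. */
struct sim_thread {
    uint32_t saved_dsp_ctrl;
};

/* Switch from one thread to another. When save_ctrl is 0 the register is
 * left untouched, as if DSP_CTRL were not part of the saved context.
 */
static void sim_switch(struct sim_thread *from, struct sim_thread *to,
                       int save_ctrl)
{
    assert(from != NULL && to != NULL);
    if (save_ctrl) {
        from->saved_dsp_ctrl = sim_dsp_ctrl;
        sim_dsp_ctrl = to->saved_dsp_ctrl;
    }
}

/* Thread A sets mode_a, is preempted by thread B, which sets mode_b, and
 * then A resumes. Returns the DSP_CTRL value that A observes afterwards.
 */
static uint32_t sim_mode_seen_by_a(uint32_t mode_a, uint32_t mode_b,
                                   int save_ctrl)
{
    struct sim_thread a = { 0 };
    struct sim_thread b = { 0 };

    sim_dsp_ctrl = mode_a;         /* A picks its rounding/saturation mode */
    sim_switch(&a, &b, save_ctrl); /* A is preempted by B */
    sim_dsp_ctrl = mode_b;         /* B picks a different mode */
    sim_switch(&b, &a, save_ctrl); /* control returns to A */
    return sim_dsp_ctrl;
}
```

When the register is saved, thread A resumes with its own mode; when it is not, A silently continues with whatever mode B left behind, which is the situation the author describes for threads that set the DSP mode dynamically.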

GEN_OFFSET_SYM(_callee_saved_stack_t, acc0_lo);
GEN_OFFSET_SYM(_callee_saved_stack_t, acc0_glo);
GEN_OFFSET_SYM(_callee_saved_stack_t, acc0_hi);
GEN_OFFSET_SYM(_callee_saved_stack_t, acc0_ghi);
#ifdef CONFIG_ARC_DSP_BFLY_SHARING
GEN_OFFSET_SYM(_callee_saved_stack_t, dsp_bfly0);
GEN_OFFSET_SYM(_callee_saved_stack_t, dsp_fft_ctrl);
#endif
#endif
#ifdef CONFIG_AGU_SHARING
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap0);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap1);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap2);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap3);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os0);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os1);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod0);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod1);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod2);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod3);
#ifdef CONFIG_ARC_AGU_MEDIUM
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap4);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap5);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap6);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap7);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os2);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os3);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod4);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod5);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod6);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod7);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod8);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod9);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod10);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod11);
#endif
#ifdef CONFIG_ARC_AGU_LARGE
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap8);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap9);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap10);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_ap11);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os4);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os5);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os6);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_os7);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod12);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod13);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod14);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod15);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod16);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod17);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod18);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod19);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod20);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod21);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod22);
GEN_OFFSET_SYM(_callee_saved_stack_t, agu_mod23);
#endif
#endif

GEN_ABSOLUTE_SYM(___callee_saved_stack_t_SIZEOF, sizeof(_callee_saved_stack_t));

GEN_ABSOLUTE_SYM(_K_THREAD_NO_FLOAT_SIZEOF, sizeof(struct k_thread));
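
As a rough illustration of what the `GEN_OFFSET_SYM` lines above record, the sketch below uses plain `offsetof` on a shortened stand-in struct. It assumes `CONFIG_DSP_SHARING` is set and `CONFIG_ARC_DSP_BFLY_SHARING` is not; `demo_callee_saved`, `DEMO_OFFSET` and `demo_slot_index` are hypothetical names, and in the real `_callee_saved_stack_t` these members follow other fields, so the actual offsets are larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shortened, hypothetical stand-in for _callee_saved_stack_t that keeps only
 * the DSP members, in the same order as the header in this PR.
 */
struct demo_callee_saved {
    uintptr_t acc0_ghi;
    uintptr_t acc0_hi;
    uintptr_t acc0_glo;
    uintptr_t acc0_lo;
    uintptr_t dsp_ctrl;
};

/* The value an offset symbol carries: the byte offset of a member. */
#define DEMO_OFFSET(type, member) offsetof(type, member)

/* The first member of a struct always sits at offset zero. */
static_assert(DEMO_OFFSET(struct demo_callee_saved, acc0_ghi) == 0,
              "first member must start the frame");

/* Convert a byte offset into a slot index within the saved frame. */
static size_t demo_slot_index(size_t byte_offset)
{
    return byte_offset / sizeof(uintptr_t);
}
```

Assembly code in the context-switch path can then address each save slot through such constants instead of hard-coded numbers, so adding a member to the struct only requires adding a matching offset line.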
33 changes: 33 additions & 0 deletions arch/arc/core/thread.c
@@ -20,6 +20,9 @@
#include <zephyr/arch/arc/v2/mpu/arc_core_mpu.h>
#endif

#if defined(CONFIG_DSP) && defined(CONFIG_DSP_SHARING)
static struct k_spinlock lock;
#endif
/* initial stack frame */
struct init_stack_frame {
uintptr_t pc;
@@ -253,3 +256,33 @@ int arch_float_enable(struct k_thread *thread, unsigned int options)
return 0;
}
#endif /* CONFIG_FPU && CONFIG_FPU_SHARING */

#if defined(CONFIG_DSP) && defined(CONFIG_DSP_SHARING)
int arch_dsp_disable(struct k_thread *thread, unsigned int options)
{
/* Ensure a preemptive context switch does not occur */

k_spinlock_key_t key = k_spin_lock(&lock);

/* Disable DSP or AGU capabilities for the thread */
thread->base.user_options &= ~(uint8_t)options;

k_spin_unlock(&lock, key);

return 0;
}

int arch_dsp_enable(struct k_thread *thread, unsigned int options)
{
/* Ensure a preemptive context switch does not occur */

k_spinlock_key_t key = k_spin_lock(&lock);

/* Enable DSP or AGU capabilities for the thread */
thread->base.user_options |= (uint8_t)options;

k_spin_unlock(&lock, key);

return 0;
}
#endif /* CONFIG_DSP && CONFIG_DSP_SHARING */
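
The bit manipulation in `arch_dsp_enable()` and `arch_dsp_disable()` can be tried on a host with the sketch below. It is a simplified model without the spinlock: `demo_thread`, the two functions and the `DEMO_*` bit values are hypothetical and chosen only for illustration, not taken from Zephyr's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option bits; the real values come from Zephyr's headers. */
#define DEMO_DSP_REGS (1u << 6)
#define DEMO_AGU_REGS (1u << 7)

/* Minimal stand-in for the user_options byte of a thread. */
struct demo_thread {
    uint8_t user_options;
};

/* Set the requested option bits, as arch_dsp_enable() does in the PR. */
static void demo_dsp_enable(struct demo_thread *t, unsigned int options)
{
    assert(t != NULL);
    t->user_options |= (uint8_t)options;
}

/* Clear the requested option bits, as arch_dsp_disable() does in the PR. */
static void demo_dsp_disable(struct demo_thread *t, unsigned int options)
{
    assert(t != NULL);
    t->user_options &= ~(uint8_t)options;
}
```

One consequence of the `(uint8_t)` cast is that any bit above bit 7 in `options` is silently dropped, so a caller that passes a wider flag changes nothing and still receives a success return in the PR's version.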
62 changes: 62 additions & 0 deletions arch/arc/include/kernel_arch_data.h
@@ -156,7 +156,69 @@ struct _callee_saved_stack {
uintptr_t dpfp1h;
uintptr_t dpfp1l;
#endif
#endif

#ifdef CONFIG_DSP_SHARING
#ifdef CONFIG_ARC_DSP_BFLY_SHARING
uintptr_t dsp_fft_ctrl;
uintptr_t dsp_bfly0;
#endif
uintptr_t acc0_ghi;
uintptr_t acc0_hi;
uintptr_t acc0_glo;
uintptr_t acc0_lo;
uintptr_t dsp_ctrl;
#endif

#ifdef CONFIG_AGU_SHARING
uintptr_t agu_ap0;
uintptr_t agu_ap1;
uintptr_t agu_ap2;
uintptr_t agu_ap3;
uintptr_t agu_os0;
uintptr_t agu_os1;
uintptr_t agu_mod0;
uintptr_t agu_mod1;
uintptr_t agu_mod2;
uintptr_t agu_mod3;
#ifdef CONFIG_ARC_AGU_MEDIUM
uintptr_t agu_ap4;
uintptr_t agu_ap5;
uintptr_t agu_ap6;
uintptr_t agu_ap7;
uintptr_t agu_os2;
uintptr_t agu_os3;
uintptr_t agu_mod4;
uintptr_t agu_mod5;
uintptr_t agu_mod6;
uintptr_t agu_mod7;
uintptr_t agu_mod8;
uintptr_t agu_mod9;
uintptr_t agu_mod10;
uintptr_t agu_mod11;
#endif
#ifdef CONFIG_ARC_AGU_LARGE
uintptr_t agu_ap8;
uintptr_t agu_ap9;
uintptr_t agu_ap10;
uintptr_t agu_ap11;
uintptr_t agu_os4;
uintptr_t agu_os5;
uintptr_t agu_os6;
uintptr_t agu_os7;
uintptr_t agu_mod12;
uintptr_t agu_mod13;
uintptr_t agu_mod14;
uintptr_t agu_mod15;
uintptr_t agu_mod16;
uintptr_t agu_mod17;
uintptr_t agu_mod18;
uintptr_t agu_mod19;
uintptr_t agu_mod20;
uintptr_t agu_mod21;
uintptr_t agu_mod22;
uintptr_t agu_mod23;
#endif
#endif
/*
* No need to save r31 (blink), it's either already pushed as the pc or