Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers: dma: intel_adsp_hda: fix intel_adsp_hda_unused() check #77805

Merged
merged 1 commit into from
Sep 4, 2024

Conversation

kv2019i
Copy link
Collaborator

@kv2019i kv2019i commented Aug 30, 2024

drivers: dma: intel_adsp_hda: fix intel_adsp_hda_unused()
check

The ringbuffer availability check is subject to race with regards to
update of BF (Buffer Full) and BNE (Buffer Not Empty) bits in DGCS
register, and status of RP (Read Position) and WP (Write Position).

Following sequence is observed without this patch when
calling dma_get_status() on multiple Intel ADSP platforms:

iter 154 pending 1536 RP 768 WP 768, BNE 1, BF 1
-> dma_reload for 384
iter 155 pending 1536 RP 1152 WP 1152, BNE 1, BF 1
-> dma_reload for 384
iter 156 pending 0 RP 0 WP 0, BNE 1, BF 0

Value of pending is not expected to go from 1536 to zero if only 384
bytes have been consumed via dma_reload() since last call to
dma_get_status().

Change the logic to read DGCS register later, after the WP and RP have
been already read, and only check the BNE bit if Read and Write
Positions are equal.

Link: thesofproject/sof#9418
Signed-off-by: Kai Vehmanen [email protected]
Co-developed-by: Peter Ujfalusi [email protected]
Signed-off-by: Peter Ujfalusi [email protected]

@kv2019i kv2019i requested a review from ujfalusi August 30, 2024 12:51
@zephyrbot zephyrbot added platform: Intel ADSP Intel Audio platforms size: XS A PR changing only a single line of code labels Aug 30, 2024
@@ -283,7 +283,7 @@ static inline uint32_t intel_adsp_hda_unused(uint32_t base, uint32_t regblock_si
int32_t wp = *DGBWP(base, regblock_size, sid);
int32_t size = rp - wp;

if (size <= 0) {
if (size < 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively you could also re-read DGCS after reading rp and wp. It's quite possible that when DGCS is read line 269, the buffer full status is not yet updated.

I also wonder if we should even use the BNE and BF bits and instead only use read and write pointers. this is super racy to read multiple registers that all can change at any time, and use their value to infer a position ....

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or the decision should be:
read RP and WP, if RP==WP, read the DGCS and see if it is empty or full?

Copy link
Collaborator

@ujfalusi ujfalusi Aug 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw, this diff also works correctly:

diff --git a/soc/intel/intel_adsp/common/include/intel_adsp_hda.h b/soc/intel/intel_adsp/common/include/intel_adsp_hda.h
index 1616d1cb67b2..782ae368b103 100644
--- a/soc/intel/intel_adsp/common/include/intel_adsp_hda.h
+++ b/soc/intel/intel_adsp/common/include/intel_adsp_hda.h
@@ -273,24 +273,25 @@ static inline bool intel_adsp_hda_is_enabled(uint32_t base, uint32_t regblock_si
  */
 static inline uint32_t intel_adsp_hda_unused(uint32_t base, uint32_t regblock_size, uint32_t sid)
 {
-	uint32_t dgcs = *DGCS(base, regblock_size, sid);
 	uint32_t dgbs = *DGBS(base, regblock_size, sid);
-
-	/* Check if buffer is empty */
-	if ((dgcs & DGCS_BNE) == 0) {
-		return dgbs;
-	}
-
-	/* Check if the buffer is full */
-	if (dgcs & DGCS_BF) {
-		return 0;
-	}
-
 	int32_t rp = *DGBRP(base, regblock_size, sid);
 	int32_t wp = *DGBWP(base, regblock_size, sid);
 	int32_t size = rp - wp;
 
-	if (size <= 0) {
+	if (size == 0) {
+		uint32_t dgcs = *DGCS(base, regblock_size, sid);
+
+		/* Check if buffer is empty */
+		if ((dgcs & DGCS_BNE) == 0) {
+			return dgbs;
+		}
+
+		/*
+		 * Buffer is not empty and pointers equal, it can only be full, no need to check
+		 * the DGCS_BF flag
+		 */
+
+	} else if (size < 0) {
 		size += dgbs;
 	}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your change can save two register read most of the time when the DGCS flags are correct, you only read the RP/WP otherwise.

If we go with that, the commit message should be updated, I don't get how we 'prioritize BNE bit' at all, what we are at the end doing is written in the comment of the diff I posted above.
Also, a comment explaining what is going on would not hurt in the code either.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack @plbossart and @ujfalusi . I now uploaded V2 to use @ujfalusi 's version. This is called once per timer tick, so I don't think avoiding register reads is critical here and reading DGCS seems like the safer approach.

@kv2019i
Copy link
Collaborator Author

kv2019i commented Aug 30, 2024

SOF test PR for V2 of this at thesofproject/sof#9423

Copy link
Collaborator

@ujfalusi ujfalusi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what to make of the LNL-HDA alsabat results, but apart from the small coding style issue, I think this looks good and I have tested it myself on TGL where I have been able to reproduce t he original issue.

The ringbuffer availability check is subject to race with regards to
update of BF (Buffer Full) and BNE (Buffer Not Empty) bits in DGCS
register, and status of RP (Read Position) and WP (Write Position).

Following sequence is observed without this patch when
calling dma_get_status() on multiple Intel ADSP platforms:

iter 154 pending 1536 RP 768 WP 768, BNE 1, BF 1
-> dma_reload for 384
iter 155 pending 1536 RP 1152 WP 1152, BNE 1, BF 1
-> dma_reload for 384
iter 156 pending 0 RP 0 WP 0, BNE 1, BF 0

Value of pending is not expected to go from 1536 to zero if only 384
bytes have been consumed via dma_reload() since last call to
dma_get_status().

Change the logic to read DGCS register later, after the WP and RP have
been already read, and only check the BNE bit if Read and Write
Positions are equal.

Link: thesofproject/sof#9418
Signed-off-by: Kai Vehmanen <[email protected]>
Co-developed-by: Peter Ujfalusi <[email protected]>
Signed-off-by: Peter Ujfalusi <[email protected]>
@kv2019i kv2019i requested a review from teburd September 3, 2024 10:42
@kv2019i
Copy link
Collaborator Author

kv2019i commented Sep 3, 2024

@teburd can you take a look?

@carlescufi carlescufi merged commit 81977f2 into zephyrproject-rtos:main Sep 4, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform: Intel ADSP Intel Audio platforms size: XS A PR changing only a single line of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants