Commit c7575f3

committed
release 1.2.3
1 parent 480507c commit c7575f3

File tree

8 files changed

+150
-35
lines changed

ERRATA.md

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
2+
# AWS EC2 FPGA HDK+SDK Errata
3+
4+
Any items in this release marked as WIP (Work-in-progress) or NA (Not available yet) are not currently supported by the 1.2.0 release.
5+
6+
## Integrated DMA in Beta Release. AWS Shell now includes DMA capabilities on behalf of the CL
7+
* The DMA bus toward the CL is multiplexed over sh_cl_dma_pcis AXI4 interface so the same address space can be accessed via DMA or directly via PCIe AppPF BAR4
8+
* DMA usage is covered in the new [CL_DRAM_DMA example](./hdk/cl/examples/cl_dram_dma) RTL verification/simulation and Software
9+
* A corresponding AWS Elastic DMA ([EDMA](./sdk/linux_kernel_drivers/edma)) driver is provided.
10+
* [EDMA Installation Readme](./sdk/linux_kernel_drivers/edma/edma_install.md) provides installation and usage guidelines
11+
* The initial release supports a single queue in each direction
12+
* DMA support is in Beta stage with a known issue for DMA READ transactions that cross 4K address boundaries. See [Kernel_Drivers_README](./sdk/linux_kernel_drivers/edma/README.md) for more information on restrictions for this release.
13+
14+
## Implementation Restrictions
15+
16+
* PCIE AXI4 interfaces between Custom Logic(CL) and Shell(SH) have the following restrictions:
17+
* All PCIe transactions must adhere to the PCIe Express Base Specification
18+
* No transaction may cross a 4Kbyte address boundary (PCIe restriction)
19+
* Multiple outstanding outbound PCIe Read transactions with same ID not supported
20+
* PCIE extended tag not supported, so read-request is limited to 32 outstanding
21+
* Address must match DoubleWord(DW) address of the transaction
22+
* WSTRB(write strobe) must reflect appropriate valid bytes for AXI write beats
23+
* Only Increment burst type is supported
24+
* AXI lock, memory type, protection type, Quality of service and Region identifier are not supported
25+
* PCIE AXI4 interfaces between Custom Logic(CL) and Shell(SH) must follow the AMBA AXI4 protocol specification.
26+
* Prior to running on F1 instance, it is highly recommended that developers run logic simulations with the ARM or Xilinx AXI4 protocol checker
27+
28+
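Several of the restrictions listed above can be checked mechanically before running a simulation. The following Python sketch tests one AXI4 request against three of them: INCR burst only, AxSIZE of 0b110, and no 4 KB boundary crossing. The function and argument names are illustrative assumptions and are not part of the HDK; the sketch is not a substitute for the ARM or Xilinx AXI4 protocol checker recommended above.

```python
AXI_BURST_INCR = 0b01   # AXI4 encoding of an incrementing burst
AXSIZE_64B = 0b110      # 2**6 = 64 bytes per beat
BOUNDARY = 4096         # 4 KB address boundary


def check_axi_request(addr, axlen, axsize=AXSIZE_64B, axburst=AXI_BURST_INCR):
    """Return a list of violated restrictions (an empty list means none found)."""
    problems = []
    if axburst != AXI_BURST_INCR:
        problems.append("only INCR bursts are supported")
    if axsize != AXSIZE_64B:
        problems.append("AxSIZE must be 0b110 (64B)")

    beat_bytes = 1 << axsize
    beats = axlen + 1  # AxLEN encodes the number of beats minus one
    # For an INCR burst, beats after the first start on beat-aligned addresses,
    # so the last byte touched is measured from the aligned start address.
    aligned_start = (addr // beat_bytes) * beat_bytes
    last_byte = aligned_start + beats * beat_bytes - 1
    if addr // BOUNDARY != last_byte // BOUNDARY:
        problems.append("burst crosses a 4 KB boundary")
    return problems
```

For example, a 64-beat burst starting at address 0 fills exactly one 4 KB page and passes, while the same burst starting at 0x40 spills into the next page and is flagged.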
29+
## Unsupported Features (Planned for future releases)
30+
31+
* PCI-M AXI interface is not supported in this release.
32+
* FPGA to FPGA communication over PCIe for F1.16xl
33+
* FPGA to FPGA over the 400Gbps Ring for F1.16xl
34+
* Aurora and Reliable Aurora modules for the FPGA-to-FPGA
35+
* Preserving the DRAM content between different AFI loads (by the same running instance)
36+
* Cadence RTL simulation tools
37+
* All AXI-4 interfaces (PCIM, DDR4) do not support AxSIZE other than 0b110 (64B)
38+
39+
## Known Bugs/Issues
40+
41+
* The PCI-M AXI interface is not supported in this release.
42+
* The interface is included in cl_ports.vh and required in a CL design, but not enabled for functional use
43+
44+
* The integrated DMA function is in Beta stage. Known issues:
45+
* DMA READ addresses crossing 4K page boundaries. The failure can be triggered by READ transfers that start on an address that is not 4K aligned AND cross a 4K page boundary. READ transfers that do not cross a 4K boundary, OR that start at the beginning of a 4K page and are larger than 4K, are not susceptible to the error. WRITE transfers are not affected by this issue. Developers should use 4K aligned start addresses on any READ transfer that can cross a 4K boundary to avoid the issue.
46+
* Transfer sizes of 8KB or less are supported with the integrated DMA engine for this revision of the Shell. Integrated DMA with large transfer sizes (16KB or greater) can cause timeouts between the Shell and CL if the Shell can’t respond with all data before the timeout. Please see the documentation on how to [detect that a timeout has occurred](./hdk/docs/HOWTO_detect_shell_timeout.md).
47+
48+
49+
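The DMA READ condition described under Known Bugs/Issues is a conjunction of two tests (unaligned start AND 4K crossing), which is easy to misread. The Python sketch below states it as a predicate and shows one way to split a transfer so that no piece meets the condition. Both function names are hypothetical and nothing here comes from the EDMA driver; it only illustrates the address arithmetic.

```python
PAGE = 4096  # 4K page size used by the erratum


def read_is_susceptible(addr, length):
    """True if a READ starts off a 4K boundary AND crosses a 4K boundary."""
    if length <= 0:
        return False
    unaligned_start = addr % PAGE != 0
    crosses = addr // PAGE != (addr + length - 1) // PAGE
    return unaligned_start and crosses


def split_read(addr, length):
    """Split a READ into (addr, length) pieces that avoid the condition.

    The first piece runs up to the next 4K boundary; the remainder then
    starts 4K aligned.
    """
    pieces = []
    head = min(length, (-addr) % PAGE)  # bytes before the next 4K boundary
    if head:
        pieces.append((addr, head))
    if length > head:
        pieces.append((addr + head, length - head))
    return pieces
```

A 32-byte READ at 0xFF0 is susceptible; `split_read(0xFF0, 0x20)` turns it into two 16-byte pieces, the second starting at 0x1000. The sketch does not address the separate transfer-size limit mentioned in the same list.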

RELEASE_NOTES.md

Lines changed: 12 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11

22
# AWS EC2 FPGA HDK+SDK Release Notes
33

4+
See [Errata](./ERRATA.md) for additional documentation of unsupported features and known bugs/issues.
45

56
## AWS EC2 F1 Platform Features:
67
* 1-8 Xilinx UltraScale+ VU9P based FPGA slots
@@ -26,8 +27,15 @@
2627
* 1 DDR controller implemented in the SH (always available)
2728
* 3 DDR controllers implemented in the CL (configurable number of implemented controllers allowed)
2829

30+
# Release 1.2.3
31+
* New [Errata](./ERRATA.md)
32+
* Added debug probes (.ltx) generation to build scripts
33+
* Fixed a bug in the simulation model that affected the AXI behavior of wlast on unaligned addresses
34+
* Added [timeout debug documentation](./hdk/docs/HOWTO_detect_shell_timeout.md)
35+
2936
# Release 1.2.2
3037
* Expanded [clock recipes](./hdk/docs/clock_recipes.csv)
38+
* Virtual JTAG documentation updates
3139
* Reduced DCP build times by 13% (34 mins) for cl_dram_dma example by adding an option to disable virtual jtag
3240
* Included encryption of .sv files for CL examples
3341

@@ -43,12 +51,12 @@
4351
## NOTE on Release 1.2.0
4452
Release 1.2.0 is the first Generally Available release of the Shell, HDK, and SDK. This release provides F1 developers with documentation and tools to start building their Custom Logic (CL) designs to work with the F1 instances.
4553

46-
Any items in this release marked as WIP (Work-in-progress) or NA (Not avaiable yet) are not currently supported by the 1.2.0 release.
54+
Any items in this release marked as WIP (Work-in-progress) or NA (Not available yet) are not currently supported by the 1.2.X release.
4755

4856

4957
## Release 1.2.0 Content Overview
5058

51-
This is the first Generally Available release of the AWS EC2 FPGA Development Kit. Major updates are included for both the HDK and SDK directories. 1.2.0 a required version for all Developers running on F1 instances, and prior releases of the FPGA Development Kit are not supported.
59+
This is the first Generally Available release of the AWS EC2 FPGA Development Kit. Major updates are included for both the HDK and SDK directories. 1.2.X is a required version for all Developers running on F1 instances, and prior releases of the FPGA Development Kit are not supported.
5260

5361
**All AFIs created with previous HDK versions will no longer correctly load on an F1 instance**, hence a `fpga-load-local-image` command executed with an AFI created prior to 1.2.0 will return an error and not load.
5462

@@ -194,7 +202,7 @@ Additional tunable auxiliary clocks are generated by the Shell and fed to the CL
194202

195203
* Matching the new Shell/CL interface
196204
* Add support for 32-bit peek/poke via ocl\_ AXI-L bus
197-
* Adding Virtual JTAG support with Xilinx ILA and VIO debug cores (WIP)
205+
* Virtual JTAG support with Xilinx ILA and VIO debug cores
198206
* Demonstrate the use of Virtual LED and Virtual DIPSwitch
199207
* Runtime software examples, leveraging fpga_pci and fpga_mgmt C-libraries
200208
* Updated PCIe Vendor ID and Device ID
@@ -208,7 +216,7 @@ Additional tunable auxiliary clocks are generated by the Shell and fed to the CL
208216
* Using SystemVerilog Bus constructs to simplify the code
209217
* Demonstrate the use of User interrupts
210218
* Demonstrate the use of bar1\_ AXI-L bus
211-
* Includes Runtime C-code application under [CL_DRAM_DMA software](./hdk/cl/examples/cl_dram_dma/software) (WIP)
219+
* Includes Runtime C-code application under [CL_DRAM_DMA software](./hdk/cl/examples/cl_dram_dma/software)
212220
* See [CL_DRAM_DMA README](./hdk/cl/examples/cl_dram_dma/README.md)
213221

214222

@@ -284,24 +292,6 @@ Additional tunable auxiliary clocks are generated by the Shell and fed to the CL
284292
* Only Increment burst type is supported
285293
* AXI lock, memory type, protection type, Quality of service and Region identifier are not supported
286294

287-
## Unsupported Features (Planned for future releases)
288-
289-
* PCI-M AXI interface is not supported in this release.
290-
* FPGA to FPGA communication over PCIe for F1.16xl
291-
* FPGA to FPGA over the 400Gbps Ring for F1.16xl
292-
* Aurora and Reliabile Aurora modules for the FPGA-to-FPGA
293-
* Preserving the DRAM content between different AFI loads (by the same running instance)
294-
* Cadence RTL simulations tools
295-
* All AXI-4 interfaces (PCIM, DDR4) do not support AxSIZE other than 0b110 (64B)
296-
297-
## Known Bugs/Issues
298-
299-
* The PCI-M AXI interface is not supported in this release. The interface is included in cl_ports.vh and required in a CL design, but not enabled for functional use in this release.
300-
301-
* The integrated DMA function is in Beta stage. There is a known issue with DMA READ addresses crossing 4K page boundaries. The failure can be triggered by READ transfers that start on an address other than 4K aligned AND cross the 4K page boundary. READ transfers that do not cross the 4K boundary OR transfers that start at the beginning of a 4K page and greater than 4K size are not susceptible to the error. WRITE transfers are not affected by this issue Developers should use 4K aligned address boundaries on any READ transfer that can cross a 4K boundary to avoid the issue.
302-
303-
* aws_dcp_verify flow (aws_dcp_verify.tcl) does not work. The script will be fixed in a future release. Currently the script will always give an error even if the DCP is OK.
304-
305295
## Supported Tools and Environment
306296

307297
* The HDK and SDK are designed for a **Linux** environment and have not been tested on other platforms

hdk/cl/examples/cl_dram_dma/build/scripts/create_dcp_from_cl.tcl

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -347,6 +347,10 @@ if { $failval==0 } {
 puts "AWS FPGA: ([clock format [clock seconds] -format %T]) writing post synth checkpoint.";

 write_checkpoint -force $CL_DIR/build/checkpoints/${timestamp}.CL.post_synth.dcp
+
+# Generate debug probes file
+write_debug_probes -force -no_partial_ltxfile -file $CL_DIR/build/checkpoints/${timestamp}.debug_probes.ltx
+
 close_project
 #Set param back to default value
 set_param sta.enableAutoGenClkNamePersistence 1

hdk/cl/examples/cl_hello_world/build/scripts/create_dcp_from_cl.tcl

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -476,6 +476,10 @@ report_timing_summary -file $CL_DIR/build/reports/${timestamp}.SH_CL_final_timin
 puts "AWS FPGA: ([clock format [clock seconds] -format %T]) writing final DCP to to_aws directory.";

 write_checkpoint -force $CL_DIR/build/checkpoints/to_aws/${timestamp}.SH_CL_routed.dcp
+
+# Generate debug probes file
+write_debug_probes -force -no_partial_ltxfile -file $CL_DIR/build/checkpoints/${timestamp}.debug_probes.ltx
+
 close_project

 # ################################################

hdk/common/verif/models/sh_bfm/sh_bfm.sv

Lines changed: 11 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1872,14 +1872,16 @@ module sh_bfm #(
 bit last_beat;
 logic [5:0] start_addr;
 bit aligned;
+bit last_data_beat;

 num_of_data_beats = 0;
-byte_cnt = 0;
-num_bytes = 0;
-aligned_addr = 0;
-last_beat = 0;
-start_addr = 0;
-aligned = 0;
+last_data_beat = 0;
+byte_cnt = 0;
+num_bytes = 0;
+aligned_addr = 0;
+last_beat = 0;
+start_addr = 0;
+aligned = 0;

 for (int chan = 0; chan < 4; chan++) begin
 if ((h2c_dma_started[chan] != 1'b0) && (h2c_dma_list[chan].size() > 0)) begin
@@ -1922,9 +1924,10 @@ module sh_bfm #(
 axi_data.data = 0;
 axi_data.strb = 64'b0;
 axi_data.id = chan;
-axi_data.last = (((num_of_data_beats - 1) - burst_cnt) == 0) ? 1 : 0;
+last_data_beat = (((num_of_data_beats - 1) - burst_cnt) == 0) ? 1 : 0;
 num_bytes = last_beat ? (dop.len + dop.cl_addr[5:0])%64 : 64;
-if(axi_data.last) begin
+axi_data.last = (j == axi_cmd.len) ? 1 : 0;
+if(last_data_beat) begin
 for(int i=0; i < num_bytes; i++) begin
 axi_data.data = axi_data.data | tb.hm_get_byte(.addr(dop.buffer + byte_cnt)) << 8*i;
 axi_data.strb = axi_data.strb | 1 << i;
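As far as can be read from this hunk, the change separates two conditions that the model previously expressed with one flag: the last beat of an AXI burst (`j == axi_cmd.len`, where WLAST belongs) and the last data beat of the whole DMA transfer (where the strobe is trimmed). The Python sketch below models the two flags independently. It assumes a transfer is issued as consecutive bursts of at most `max_beats` beats; that parameter and the function name are hypothetical and not taken from the BFM.

```python
def beat_flags(total_beats, max_beats=64):
    """Return a (wlast, last_data_beat) pair for each beat of a transfer."""
    flags = []
    sent = 0
    while sent < total_beats:
        burst_beats = min(max_beats, total_beats - sent)
        axlen = burst_beats - 1  # AXI AxLEN is the number of beats minus one
        for j in range(burst_beats):
            wlast = (j == axlen)  # last beat of this burst
            last_data_beat = (sent + j == total_beats - 1)  # last beat overall
            flags.append((wlast, last_data_beat))
        sent += burst_beats
    return flags
```

With three beats split into bursts of two, WLAST is set on the second and third beats, while only the third is the last data beat.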
Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,68 @@
1+
2+
# AXI Timeouts
3+
4+
* The Shell provides a timeout mechanism which terminates any outstanding AXI transactions after 2.5 µs. There is a separate timeout per interface. Upon the first timeout, metrics registers are updated with the offending address and a counter is incremented. Upon further timeouts the counter is incremented. These metrics registers can be read with the `fpga-describe-local-image` command described in the [Amazon FPGA Image Management Tools README](../../sdk/userspace/fpga_mgmt_tools/README.md)
5+
6+
* Timeouts can occur for three reasons:
7+
1. The CL doesn’t respond to the address (reserved address space)
8+
2. The CL has a protocol violation on AXI which hangs the bus
9+
3. The address targets the F1 card’s DDR memory and the CL design’s latency exceeds the timeout value.
10+
11+
* Best practice is to ensure addresses to reserved address space are fully decoded in your CL design.
12+
* DMA accesses to DDR will accumulate which can sometimes lead to timeouts.
13+
* CL designs which have multiple masters to DDR will also incur arbitration delays.
14+
* If you suspect a timeout, debug by reading the metrics registers. The saved offending address should help narrow whether this is to DDR or registers/RAMs inside the FPGA. If it’s inside the FPGA the developer should investigate protocol violations.
15+
16+
# How to detect that a shell timeout has occurred
17+
18+
* Shell-CL interface timeouts can be detected by checking for non-zero timeout counters. These metrics can be read using this command:
19+
```
20+
$ sudo fpga-describe-local-image -S 0 --metrics
21+
AFI 0 agfi-0f0e045f919413242 loaded 0 ok 0 0x04151701
22+
AFIDEVICE 0 0x1d0f 0xf000 0000:00:1d.0
23+
sdacl-slave-timeout=0
24+
virtual-jtag-slave-timeout=0
25+
ocl-slave-timeout=0
26+
bar1-slave-timeout=0
27+
dma-pcis-timeout=0
28+
pcim-range-error=0
29+
pcim-axi-protocol-error=0
30+
pcim-axi-protocol-4K-cross-error=0
31+
pcim-axi-protocol-bus-master-enable-error=0
32+
pcim-axi-protocol-request-size-error=0
33+
pcim-axi-protocol-write-incomplete-error=0
34+
pcim-axi-protocol-first-byte-enable-error=0
35+
pcim-axi-protocol-last-byte-enable-error=0
36+
pcim-axi-protocol-bready-error=0
37+
pcim-axi-protocol-rready-error=0
38+
pcim-axi-protocol-wchannel-error=0
39+
sdacl-slave-timeout-addr=0x0
40+
sdacl-slave-timeout-count=0
41+
virtual-jtag-slave-timeout-addr=0x0
42+
virtual-jtag-slave-timeout-count=0
43+
ocl-slave-timeout-addr=0x8001
44+
ocl-slave-timeout-count=0
45+
bar1-slave-timeout-addr=0x2001
46+
bar1-slave-timeout-count=0
47+
dma-pcis-timeout-addr=0x0
48+
dma-pcis-timeout-count=0
49+
pcim-range-error-addr=0x0
50+
pcim-range-error-count=0
51+
pcim-axi-protocol-error-addr=0x0
52+
pcim-axi-protocol-error-count=0
53+
pcim-write-count=0
54+
pcim-read-count=0
55+
DDR0
56+
write-count=0
57+
read-count=0
58+
DDR1
59+
write-count=0
60+
read-count=0
61+
DDR2
62+
write-count=29797854199
63+
read-count=4
64+
DDR3
65+
write-count=0
66+
read-count=0
67+
```
68+
* For detailed information on metrics, see [Amazon FPGA Image Management Tools README](../../sdk/userspace/fpga_mgmt_tools/README.md)
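Checking the counters by eye is error-prone with this much output. The Python sketch below scans text in the `name=value` layout of the sample output above and reports every interface whose `-timeout-count` is non-zero, together with the saved address. The function name is a hypothetical helper; it works on captured text and does not invoke the tool itself.

```python
def find_timeouts(metrics_text):
    """Return {interface: (count, addr)} for each non-zero '-timeout-count'."""
    fields = {}
    for line in metrics_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep:  # skip lines without '=', such as the AFI and DDR headers
            fields[name] = value

    suffix = "-timeout-count"
    report = {}
    for name, value in fields.items():
        if name.endswith(suffix) and int(value) > 0:
            interface = name[: -len(suffix)]
            addr = fields.get(interface + "-timeout-addr")
            report[interface] = (int(value), addr)
    return report
```

An empty dictionary means no timeout counter was above zero in the captured output.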

hdk/docs/Virtual_JTAG_XVC.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -129,10 +129,7 @@ Upon successful connection, Vivado's Hardware panel will be populated with a deb
129129

130130
5) Select the debug bridge instance from the Vivado Hardware panel
131131

132-
6) You will need a "Probes file" in the next step. Once you run the EC2 API create-fpga-image and the process of creating the AFI is complete, a "Probes file" is generated that has a ".ltx" extension.
133-
```
134-
$ aws s3 cp s3://<bucket-name>/<logs-folder-name>/*_debug_probes.ltx $CL_DIR #copy to the example directory
135-
```
132+
6) You will need a "Probes file" in the next step. A "Probes file" with an ".ltx" extension is generated during the build process and written to the checkpoints directory.
136133

137134
7) In the Hardware Device Properties window select the appropriate “Probes file” for your design by clicking the icon next to the “Probes file” entry, selecting the file, and clicking “OK”. This will refresh the hardware device and it should now show the debug cores present in your design. Note the Probes file is written out during the design implementation, and typically has the extension ".ltx".
138135
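Because the build scripts in this commit write the probes file as `${timestamp}.debug_probes.ltx` in the checkpoints directory, several such files can accumulate across builds. A small Python sketch for picking the most recently written one follows; the function name is hypothetical, and selecting by modification time (rather than by parsing the timestamp prefix) is an assumption made for simplicity.

```python
import glob
import os


def latest_probes_file(checkpoints_dir):
    """Return the newest '*.debug_probes.ltx' in checkpoints_dir, or None."""
    pattern = os.path.join(checkpoints_dir, "*.debug_probes.ltx")
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    # Choose by file modification time, newest first.
    return max(candidates, key=os.path.getmtime)
```

The directory passed in would typically be `$CL_DIR/build/checkpoints`, the location used by the build scripts above.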

hdk/hdk_version.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
HDK_VERSION=1.2.2
1+
HDK_VERSION=1.2.3
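The release notes in this commit state that AFIs created with an HDK older than 1.2.0 will not load, so a script may want to read this file before a build. A minimal Python sketch of parsing the `HDK_VERSION=` line into a comparable tuple follows; the function name is hypothetical.

```python
def parse_hdk_version(text):
    """Return the HDK version as a tuple of ints, or None if it is absent."""
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "HDK_VERSION":
            return tuple(int(part) for part in value.split("."))
    return None
```

Tuples compare element by element, so `parse_hdk_version(text) >= (1, 2, 0)` expresses the minimum-version check.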
