XLS backend#1475

Open
vasdommes wants to merge 104 commits into
fastmachinelearning:mainfrom
vasdommes:xls_backend

@vasdommes vasdommes commented May 14, 2026

Description

This PR adds an XLS backend. It is based on PR #1343, with most of the code rewritten and new features added.

Google XLS is an open-source (Apache 2) High-Level Synthesis toolchain that produces an RTL (Verilog or SystemVerilog) design from a high-level description (DSLX or C++).

Adding XLS as a new hls4ml backend makes it possible to generate RTL without vendor-specific dependencies and to benefit from the developments that XLS brings to the HLS field.

XLS workflow

XLS backend performs the following transformations:

The DSLX -> IR -> (System)Verilog conversion is done by XLS.
The IP is generated by Vivado. One can choose another vendor and generate the IP from the Verilog file manually.

XLS features

XLS backend supports the following layers: Input, ApplyAlpha, BatchNormalization, Dense, Conv1D, DepthwiseConv1D, Conv2D, DepthwiseConv2D, Pooling1D, Pooling2D, GlobalPooling1D, GlobalPooling2D, Merge, Concatenate, Dot, Activation, HardActivation, ParametrizedActivation, PReLU, Reshape, Softmax, Transpose, TernaryTanh.

You can override default codegen options as follows:

import hls4ml

# `model` is an existing Keras model
config = hls4ml.utils.config_from_keras_model(model)
# This sets hls_model.config['XLSCodegenFlags']
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, backend='XLS',
    xls_codegen_flags={'delay_model': 'asap7', 'generator': 'pipeline', 'use_system_verilog': False}
)

The DSLX standard library has only a signed FixedPoint type (similar to ap_fixed), so unsigned types are not supported.
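
To illustrate what a signed-only fixed-point type means for value ranges, here is a minimal pure-Python sketch of quantizing to a signed fixed-point format. The helper name `to_signed_fixed` is hypothetical, and the rounding and overflow behaviour (truncation toward minus infinity, wrap-around) are assumptions chosen for the example, not a statement about how DSLX FixedPoint or this backend behaves.

```python
import math


def to_signed_fixed(x: float, width: int, int_bits: int) -> float:
    """Quantize x to a signed fixed-point value.

    `width` is the total number of bits, `int_bits` the number of integer
    bits including the sign bit (width >= int_bits is assumed).
    Assumption: truncate toward minus infinity, wrap around on overflow.
    """
    frac_bits = width - int_bits
    raw = math.floor(x * (1 << frac_bits))  # scale to an integer and truncate
    raw &= (1 << width) - 1                 # keep the low `width` bits (wrap)
    if raw >= 1 << (width - 1):             # reinterpret as two's complement
        raw -= 1 << width
    return raw / (1 << frac_bits)


in_range = to_signed_fixed(1.3, width=8, int_bits=4)
negative = to_signed_fixed(-1.3, width=8, int_bits=4)
overflow = to_signed_fixed(8.0, width=8, int_bits=4)
```

With 8 bits and 4 integer bits the representable range is [-8, 8), so 8.0 wraps to the negative end. A non-negative quantity that would fit an unsigned type with I integer bits needs I + 1 integer bits in a signed type to cover the same range.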

Currently, the XLS backend implements only IOType: io_parallel, and Strategy is ignored.
All operations are fully unrolled.

io_stream could be implemented via DSLX procs. @calad0i and I are going to work on that after finishing this PR.

Other changes

I made some minor changes in non-XLS code:

Dependencies

The XLS backend uses xls-python to access the XLS API. It is enabled by the xls dependency group:

pip install hls4ml[xls]

xls-python comes with batteries included (libxls.so and the DSLX standard library); no separate XLS installation is required.
The code has been tested with xls-python=0.1.9875.

Known issues

The XLS backend doesn't work with a Dense layer imported from a PyTorch Linear layer because of a shape mismatch: PyTorch stores Linear weights as (out_features, in_features), while hls4ml Dense layers use the Keras-style layout (in_features, out_features).

Repro: add XLS backend to test_pytorch_api.py/test_squeeze and run the test.

Note that the weights in this test are constant, and other backends flatten them without checking shape.
So, it is unclear whether they handle this situation correctly or not.
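
The layout difference can be shown with a small pure-Python sketch. The helpers `transpose`, `flatten`, and `dense` are hypothetical names written for this illustration and are not hls4ml or PyTorch functions; the weights are made-up numbers.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*matrix)]


def flatten(matrix):
    """Row-major flattening, as a shape-agnostic consumer would see it."""
    return [value for row in matrix for value in row]


def dense(x, weights_in_out):
    """y[j] = sum_i x[i] * W[i][j], with W laid out as (in_features, out_features)."""
    n_in = len(weights_in_out)
    assert len(x) == n_in, "input size must match in_features"
    n_out = len(weights_in_out[0])
    return [sum(x[i] * weights_in_out[i][j] for i in range(n_in)) for j in range(n_out)]


# PyTorch-style Linear weights: (out_features=2, in_features=3)
torch_style = [[1, 2, 3],
               [4, 5, 6]]

# Keras-style layout expected by a Dense layer: (in_features=3, out_features=2)
keras_style = transpose(torch_style)

x = [1, 0, -1]
y = dense(x, keras_style)
```

Flattening the two layouts gives different element orders (`flatten(torch_style)` versus `flatten(keras_style)`), so code that flattens weights without checking the shape can only be correct if it knows which layout it received.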

Type of change

  • Documentation update
  • New feature (non-breaking change which adds functionality)

Tests


XLS has been added to the following tests:
test_activations.py, test_auto_precision.py, test_binary_cnn.py, test_causalpadding.py, test_depthconv1d.py, test_depthconv2d.py, test_keras_api.py, test_keras_v3_api.py, test_merge.py, test_multi_dense.py, test_pointwiseconv.py, test_pooling.py, test_pytorch_api.py, test_reshape.py, test_sepconv1d.py, test_sepconv2d.py, test_softmax.py.

Test Configuration

Add xls dependency, e.g.

pip install ".[da,testing,testing-keras2,sr,optimization,xls]"
# or: pip install ".[da,testing,testing-keras3,sr,xls]"

and run tests, e.g.:

pytest test/pytest --randomly-dont-reset-seed -k XLS

Notes on performance

Some test cases are very slow for XLS (e.g. ~30 minutes vs ~10 seconds on other backends).
This happens because the XLS backend generates (in model.compile()) and uses (in model.predict()) optimized XLS IR code in which all loops are fully unrolled. The resulting file can be huge, which makes layers such as Conv2D slow to compile and evaluate.

During development, I made the tests faster by reducing dimensions in some of them.
For example, in test_keras_api.py/test_conv2d I replaced

input_shape = (28, 28, 3)
filters=32

with

input_shape = (14, 14, 3)
filters=8

I haven't pushed such changes, but that could be one of the ways of speeding things up.
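
As a rough illustration of why the smaller shapes help, the sketch below counts multiply-accumulate operations for a single Conv2D layer, which is a proxy for the number of operations that end up in fully unrolled code. The helper `conv2d_macs` is hypothetical, and the 3x3 kernel, stride 1, and valid padding are assumptions; the actual test parameters may differ.

```python
def conv2d_macs(height, width, channels, filters, kernel=3):
    """Multiply-accumulate count for one Conv2D layer.

    Assumptions (not taken from the test): square kernel, stride 1,
    valid padding, no bias.
    """
    out_h = height - kernel + 1
    out_w = width - kernel + 1
    macs_per_output = kernel * kernel * channels
    return out_h * out_w * filters * macs_per_output


full = conv2d_macs(28, 28, 3, filters=32)    # original test shape
reduced = conv2d_macs(14, 14, 3, filters=8)  # reduced test shape
ratio = full / reduced
```

Under these assumptions the reduced configuration has roughly 19 times fewer multiply-accumulates, so the unrolled IR shrinks by a similar factor.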

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

Girjoaba and others added 30 commits July 16, 2025 18:04
… pass, merge of dense_relu written as an opt pass
This fixes two test cases in test_softmax.py
(one of them still fails due to another error)

TODO: check layer.class_name == 'Input' instead of taking layers[0]?
This fixes DSLX compilation error in test_softmax.py
# Conflicts:
#	docs/requirements.txt
#	hls4ml/backends/__init__.py
#	hls4ml/model/graph.py
#	hls4ml/report/__init__.py
#	test/pytest/test_activations.py
#	test/pytest/test_keras_api.py
#	test/pytest/test_softmax.py
See fastmachinelearning#1443

Setting 'strategy' for Softmax layer did not affect anything, and the code always chose the default implementation=stable.

TODO: all backends fail when implementation=latency (low accuracy, probably due to overflow).
vasdommes added 28 commits May 12, 2026 15:51
…ision, make assertion for TensorVariable precision less strict.
TODO: implement and test layers that actually use that, e.g. Bidirectional (multiple output) or Merge (multiple input).
QKeras can generate weights of XnorPrecisionType {0, 1}, which encode values {-1, 1}.
See e.g. test_binary_cnn.py
…mpile().

This avoids reparsing .opt.ir file on subsequent model.predict() calls.
Updated by running docs/attr_doc_gen.py

Added XLS backend and other things added to hls4ml since the last update of attributes.rst (Dec 2024):
- Libero backend
- New layers, e.g.: BipolarQuant, Cropping1D, Cropping2D
- New attributes, e.g.: n_inner and n_outer for Softmax.
@jmitrevs jmitrevs added the please test Trigger testing by creating local PR branch label May 16, 2026
… speed up tests.

This should fix timeout failure on CI: https://gitlab.cern.ch/fastmachinelearning/hls4ml/-/jobs/75309630

Note that XLS tests can be slow due to big (fully unrolled) IR size.

Development

Successfully merging this pull request may close these issues.

test_softmax.py does not test argmax and latency implementations; latency fails
