Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add C API header and C language build test. #5796

Merged
merged 5 commits into from
Feb 13, 2025
Merged

Conversation

mzient
Copy link
Contributor

@mzient mzient commented Jan 31, 2025

Category:

New feature (non-breaking change which adds functionality)

Description:

This PR adds C API header file.

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Copy link
Contributor

@klecki klecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicks + the gaps for the pipeline constructor parameters - are those passed only via the serialized pipeline?

Also, I suspect that we are returning pointers to internal Tensor/TensorList data directly (the layout and the shape). Any safety considerations - we break the encapsulation this way, and the user may overwrite them, if we just hand out the internal pointers.

include/dali/dali.h Show resolved Hide resolved
daliExecType_t exec_type;
daliBool enable_checkpointing;
daliBool enable_memory_stats;
} daliPipelineParams_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What were the criteria to get in this struct.
We also have set_affinity, control over queue depth and in python we can pass the information about expected output types and dimensionality. Do we consider any of those to be parts of the C API as well?
(Yes, there is stuff like callback pickler in Python which for sure doesn't reach to here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly believe that output dtype and ndim should be taken from the serialized pipeline. If they're not - then we'd better add them there instead of breaking the pipeline construction interface again.
As for set_affinity - I guess that's a simple omission and we can/should add it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added daliExecFlags_t.

daliTensorList_h input_data,
const char *data_id,
daliFeedInputFlags_t options,
const cudaStream_t *stream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of stream is that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean, missing documentation? Will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/** Gets a descriptor of the specified pipeline output.
*
* @param pipeline [in] The pipeline
* @param out_desc [out] A pointer to the returned descriptor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's me, but it reads as if we return a pointer to a descriptor.
Dunno if it is much better:

Suggested change
* @param out_desc [out] A pointer to the returned descriptor.
* @param out_desc [out] A pointer to where the output descriptor is written to/stored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used slightly different wording.

* must contain num_samples*ndim extents
* @param dtype the element type
* @param layout a layout string describing the order of axes in each sample (e.g. HWC),
* if NULL, and the TensorList's number of dimensions is equal to `ndim,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* if NULL, and the TensorList's number of dimensions is equal to `ndim,
* if NULL, and the TensorList's number of dimensions is equal to `ndim`,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

* The shape starts with the "slowest" dimension - a row-major 640x480 interleaved RGB image
* would have the shape [480, 640, 3].
*
* The shape can be NULL if ndim == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can or is? Should we limit ambiguity?

Copy link
Contributor Author

@mzient mzient Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely "can". Overvalidation is evil. If you have std::vector<int> with a nonzero capacity but zero size, then .data() will return a non-null pointer. We don't want to force the user to "fix" std::vector - that'd be daft.

@mzient
Copy link
Contributor Author

mzient commented Feb 11, 2025

Some nitpicks + the gaps for the pipeline constructor parameters - are those passed only via the serialized pipeline?

Also, I suspect that we are returning pointers to internal Tensor/TensorList data directly (the layout and the shape). Any safety considerations - we break the encapsulation this way, and the user may overwrite them, if we just hand out the internal pointers.

We are returning const pointers, so it should be impossible to overwrite those by accident.

include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved
include/dali/dali.h Outdated Show resolved Hide resolved

/* TODO(michalz): Make stream policy configurable in the pipeline */
/** Masks the part of the flags that represent the stream policy. */
DALI_EXEC_FLAGS_STREAM_POLICY_MASK = 0x00000070,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should have some blocks reserved for that? Like every byte?

Copy link
Contributor Author

@mzient mzient Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A byte would be too much. I currently reserved 3 bits and now we have 3 nonzero values, so we can squeeze 4 more. I can hardly imagine we'd have that many stream policies or concurrency policies.

@mzient mzient force-pushed the C_API2_header branch 4 times, most recently from fc4cbfe to 2a986e1 Compare February 12, 2025 15:03

/** Gets the readiness event associated with the TensorList or creates a new one.
*
* @param tensor_list [in] the tensor list to associate an even twith
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param tensor_list [in] the tensor list to associate an even twith
* @param tensor_list [in] the tensor list to associate an event with


/** Gets the readiness event associated with the Tensor or creates a new one.
*
* @param tensor [in] the tensor to associate an even twith
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param tensor [in] the tensor to associate an even twith
* @param tensor [in] the tensor to associate an event with

Signed-off-by: Michal Zientkiewicz <[email protected]>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [23844740]: BUILD STARTED

@mzient mzient merged commit d36e0ba into NVIDIA:main Feb 13, 2025
5 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants