
Generate standalone code that compares the accuracy between corresponding aten ops and lowered TTNN ops #611

Merged
merged 36 commits into main, Jan 29, 2025

Conversation

@kevinwuTT (Contributor) commented Dec 16, 2024

Ticket

None

Problem description

Some models are reported to have poor accuracy (low PCC values). We want to pinpoint the ops that cause this.

What's changed

This PR extracts the aten and ttnn graphs during compilation/runtime and creates a standalone Python script that can be run with minimal dependencies. Between each aten op and its matching lowered ttnn op(s), a function is called to compare the outputs. Currently, an assertion terminates the script if the PCC is below the desired value, which narrows down the offending set of ops. Input data values from the model run are also exported alongside the script, so synthetic values are not used.
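The per-op comparison that the generated script performs can be sketched roughly as follows. This is an illustrative stand-in, not the PR's actual code: the real generated scripts compare torch tensors, while this sketch uses plain Python sequences, and the names `compute_pcc` / `assert_with_pcc` are assumptions.

```python
import math

def compute_pcc(golden, actual):
    """Pearson correlation coefficient between two flat numeric sequences.
    Illustrative stand-in; the generated scripts operate on torch tensors."""
    n = len(golden)
    mean_g = sum(golden) / n
    mean_a = sum(actual) / n
    cov = sum((g - mean_g) * (a - mean_a) for g, a in zip(golden, actual))
    norm_g = math.sqrt(sum((g - mean_g) ** 2 for g in golden))
    norm_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    return cov / (norm_g * norm_a)

def assert_with_pcc(golden, actual, threshold=0.99):
    """Terminate the script when accuracy drops below the desired PCC."""
    pcc = compute_pcc(golden, actual)
    assert pcc >= threshold, f"PCC {pcc:.6f} is below threshold {threshold}"
    return pcc
```

Because the assertion fires at the first op pair whose PCC falls below the threshold, the traceback points directly at the offending lowering.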

  • Create GitHub action to enable/disable generating the accuracy test scripts
  • Create README or doc to explain how to use this feature
  • Refactor and cleanup

Suggestions:

  • Move generated scripts to tests, similar to autogen tests
  • Consider using safetensors instead of pickle
    There are issues with saving the tensors in this format because of shared memory. save_model won't work because this is not a torch.nn.Module. May separate this into another issue.
RuntimeError: 
            Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'arg0_1', 'arg27_1'}].
            A potential way to correctly save your model is to use `save_model`.
            More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
  • Error handling of pickle file in main
  • Remaining redundant code
  • Docstrings
  • Verbose debug
  • Considerations for inspect.getsource
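The shared-memory error above can be diagnosed before saving by grouping tensors that alias the same storage. The sketch below is an assumption, not the PR's code: it uses memoryviews over bytearrays as stand-ins for tensors; for torch tensors the storage key would instead be something like `t.untyped_storage().data_ptr()`.

```python
def storage_key(view):
    # Stand-in: memoryviews share storage when backed by the same buffer
    # object. For a torch tensor this would be based on the storage pointer.
    return id(view.obj)

def find_shared_groups(tensors):
    """Group tensor names that alias the same underlying storage,
    mirroring the sets safetensors reports (e.g. {'arg0_1', 'arg27_1'})."""
    by_storage = {}
    for name, view in tensors.items():
        by_storage.setdefault(storage_key(view), []).append(name)
    return [names for names in by_storage.values() if len(names) > 1]

buf = bytearray(16)
tensors = {
    "arg0_1": memoryview(buf),             # aliases arg27_1
    "arg27_1": memoryview(buf)[8:],        # view into the same buffer
    "arg1_1": memoryview(bytearray(8)),    # independent storage
}
```

One common workaround is to clone the aliased tensors (breaking the aliasing) before handing them to safetensors; whether that is acceptable here depends on whether the generated script relies on the aliasing.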

@@ -0,0 +1,280 @@
import inspect
Member

move to tools/?

Member

would be good to find a better location, instead of a top level module dir

Contributor Author

Moved to tools/. In a future update, I will further separate them into

tools/
|-- export code base module
    |-- generate accuracy code
    |-- generate profiling code

@ayerofieiev-tt (Member) left a comment

Great tooling! Thank you!

Here are some thoughts on how to improve this further

Code Organization

The current script is hard to navigate because it mixes core logic and utilities together.
Can you split the code into separate modules?

Error Handling:

In main_code, the try-except block silently catches all exceptions without specifying the error type. This can mask issues unrelated to file opening.
Solution:
Catch specific exceptions like FileNotFoundError or IOError:

try:
    with open("{full_input_pkl_path}", "rb") as file:
        inputs = pickle.load(file)
except FileNotFoundError:
    with open("{input_pkl_file}", "rb") as file:
        inputs = pickle.load(file)
Redundant Code

There is repetitive code, especially for string manipulations and conditions like:
if node.op != "placeholder" and node.op != "output":

Maybe create a function to encapsulate repetitive logic:

def is_valid_node(node):
    return node.op not in ["placeholder", "output"]

Documentation

Consider adding docstrings

def rename_nodes(graph, prefix):
    """
    Renames nodes in the graph to prevent conflicts with wrapper or built-in functions.

    Args:
        graph: The computational graph to process.
        prefix: A string prefix for renaming.

    Returns:
        The modified graph with renamed nodes.
    """
    for node in graph.nodes:
        if node.op not in ["placeholder", "output"]:
            opname = str(node.target) if str(node.target).startswith("aten.") else node.target.__name__
            if not opname.startswith(("aten.", "ttnn.")):
                node._rename(f"{prefix}_{node.name}")
    return graph

inspect.getsource Edge Cases

I am mildly concerned about using inspect.getsource() under the assumption that the target function is always accessible and valid. If node.target is a dynamically created or non-source-mapped function, it may fail.
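A defensive wrapper would keep code generation going when source is unavailable. This is a sketch of one way to handle it (the name `safe_getsource` and the fallback string are assumptions):

```python
import inspect

def safe_getsource(fn, fallback="# <source unavailable>"):
    """inspect.getsource raises TypeError for builtins/C extensions and
    OSError when the source file cannot be located (e.g. functions created
    via exec); fall back instead of crashing the generator."""
    try:
        return inspect.getsource(fn)
    except (OSError, TypeError):
        return fallback
```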

Verbose Debug Printing

There are many print() statements for debugging (print("aten graph args:", arg_nodes)), which might clutter outputs in production. Use Python's logging module for more controlled logging:

import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

logger.info("aten graph args: %s", arg_nodes)

Now you can control verbosity using:

python script.py --log-level DEBUG
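For a --log-level flag like that to work, the script needs a small argparse hook. A minimal sketch (the function name and choices are assumptions):

```python
import argparse
import logging

def setup_logging(argv=None):
    """Wire a --log-level CLI flag to Python's logging module."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    args = parser.parse_args(argv)
    # force=True resets any prior basicConfig (Python 3.8+).
    logging.basicConfig(level=getattr(logging, args.log_level), force=True)
    return logging.getLogger(__name__)
```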

List Operations

List comprehensions and dictionary lookups (like compute_key) are repeated within loops. Cache results outside loops.
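One lightweight way to cache such repeated lookups is functools.lru_cache. The body below is a hypothetical stand-in for the PR's compute_key (its real signature is not shown here); the point is that repeated calls inside loops hit the cache instead of recomputing:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def compute_key(opname: str) -> str:
    # Hypothetical stand-in for the PR's compute_key.
    return opname.replace("aten.", "")

keys = [compute_key(op) for op in ["aten.add", "aten.add", "aten.mul"]]
```

Note that lru_cache requires hashable arguments, so this works for op-name strings but would need care if keyed on graph nodes directly.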

@kevinwuTT kevinwuTT marked this pull request as ready for review January 6, 2025 15:32
@kevinwuTT kevinwuTT added this pull request to the merge queue Jan 29, 2025
Merged via the queue into main with commit 5fd9996 Jan 29, 2025
1 check passed
@kevinwuTT kevinwuTT deleted the kw/gen_acc_tests branch January 29, 2025 18:28