Human Autonomy - Value as Semantics data and code repo ⚖️

Aligning AI with human objectives can be facilitated by enabling it to learn and veridically represent our values. In modern AI agents, value is a scalar magnitude reflecting the desirability of a given state or action. We propose a framework, value-as-semantics, in which such magnitudes are represented within a large-scale, high-dimensional semantic representation in a large language model.

This approach allows value to be quantitative, yet also assigned to any expression in natural language and to inherit the expressivity and generalizability of the model’s ontology. We used a broad set of action concepts to evaluate several assumptions of this approach.
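As a rough illustration of how a scalar value might be read off a language model for an arbitrary natural-language action, the sketch below builds a rating prompt and parses a number from the model's reply. The prompt wording, the -10 to 10 scale, and the names `build_prompt`, `parse_value`, and `value_of` are assumptions made for illustration; they are not the prompts or code used in the paper. The model call is passed in as a plain callable, so the example runs without any API.

```python
import re
from typing import Callable, Optional

# Assumed rating scale; the paper's actual scale may differ.
SCALE_MIN, SCALE_MAX = -10, 10

# Matches the first integer or decimal number, with an optional minus sign.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def build_prompt(action: str) -> str:
    """Build a hypothetical rating prompt for one action description."""
    return (
        f"On a scale from {SCALE_MIN} (very bad) to {SCALE_MAX} (very good), "
        "how desirable is the following action? Answer with a single number.\n"
        f"Action: {action}\n"
        "Rating:"
    )


def parse_value(reply: str) -> Optional[float]:
    """Extract the first number in the reply and clamp it to the scale.

    Returns None when the reply contains no number.
    """
    match = _NUMBER.search(reply)
    if match is None:
        return None
    value = float(match.group())
    return max(SCALE_MIN, min(SCALE_MAX, value))


def value_of(action: str, ask_model: Callable[[str], str]) -> Optional[float]:
    """Query a model (any callable mapping prompt -> reply) for a scalar value."""
    return parse_value(ask_model(build_prompt(action)))


if __name__ == "__main__":
    # Stand-in for a real model call; always replies with the same rating.
    def fake_model(prompt: str) -> str:
        return " 8"

    print(value_of("help a stranger carry groceries", fake_model))
```

Keeping the model behind a callable means the same parsing logic can be reused with any backend, and replies that contain no number surface as `None` rather than as a silent default.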

Overall, we conclude that modern language models can effectively function as databases of human value. This value-as-semantics architecture can be an important contribution towards a broader, multi-faceted computational model of human-like action planning and moral reasoning.

For a human-readable description of the approach and results, see the AOI blog post!

Repo Structure

Stimuli

CSV files containing original action scenario stimuli used in the Leshinskaya & Chakroff (2023) NeurIPS paper.

Results

CSV files containing results from human rating and GPT prompting, used in the Leshinskaya & Chakroff (2023) NeurIPS paper.
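One way to compare the two sets of ratings is to correlate human and model values action by action. The sketch below is a minimal example of that idea, not the analysis from the paper or the notebook. The column names `human_rating` and `gpt_rating` are hypothetical placeholders, so check the actual CSV headers before use. The rows in the demo are made up purely to show the call.

```python
import csv
import io
import math
from typing import List, TextIO


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def human_model_correlation(
    handle: TextIO,
    human_col: str = "human_rating",  # hypothetical column name
    model_col: str = "gpt_rating",    # hypothetical column name
) -> float:
    """Correlate human and model ratings read from an open CSV file.

    Rows with a missing or non-numeric value in either column are skipped.
    A wrong column name raises KeyError instead of being silently ignored.
    """
    humans: List[float] = []
    models: List[float] = []
    for row in csv.DictReader(handle):
        try:
            human, model = float(row[human_col]), float(row[model_col])
        except (TypeError, ValueError):
            continue
        humans.append(human)
        models.append(model)
    return pearson_r(humans, models)


if __name__ == "__main__":
    # Made-up rows for demonstration only; not data from this repo.
    toy_csv = "action,human_rating,gpt_rating\na,1,2\nb,2,4\nc,3,5\n"
    print(human_model_correlation(io.StringIO(toy_csv)))
```

For a real file, open it with `open(path, newline="")` and pass the handle, along with the actual column names, to `human_model_correlation`.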
