Added automatic generation of unique_ids. fix issue #105 #134

Conversation

suryanshgargbpgc

Make unique_ids optional by auto-generating them when not provided.

@adamamer20 I implemented automatic unique_id generation for both the Polars and pandas backends. I understand the pandas implementation will eventually be removed; should I keep it as is for now, or wait for #125 to be merged?

codecov bot commented Mar 24, 2025

Codecov Report

Attention: Patch coverage is 75.67568% with 9 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@c2022dc). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| mesa_frames/concrete/pandas/agentset.py | 76.00% | 6 Missing ⚠️ |
| mesa_frames/concrete/polars/agentset.py | 75.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #134   +/-   ##
=======================================
  Coverage        ?   91.30%           
=======================================
  Files           ?       14           
  Lines           ?     2242           
  Branches        ?        0           
=======================================
  Hits            ?     2047           
  Misses          ?      195           
  Partials        ?        0           


Comment on lines +149 to +154
if len(obj._agents) == 0:
    # If no agents exist yet, start from 0
    new_id = 0
else:
    # Otherwise, use max existing ID + 1
    new_id = obj._agents["unique_id"].max() + 1
Collaborator

Wouldn't it be better to just use new_id = len(obj._agents) here? It's a minor improvement: the time complexity would be O(1) instead of O(n).

Author

Yes, it would reduce the time complexity, but it would only work when the agent IDs are sequential. It would also create duplicate IDs if agents are removed, so the current implementation is more general.

Author

@suryanshgargbpgc suryanshgargbpgc Mar 24, 2025

also it would create duplicate ids if the agents are removed

Example: if you have agents with IDs 0, 1, 2, 3, 4 and the agent with ID 2 is removed, you'd have 4 agents with IDs 0, 1, 3, 4. Using len(obj._agents) here would give 4 as the next ID, but that ID is already taken.
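
The collision can be reproduced with a minimal sketch. It uses a plain Python list as a stand-in for the unique_id column (an assumption for illustration; the PR itself works on a DataFrame), and both helper names are hypothetical.

```python
def next_id_by_len(ids):
    """Candidate ID from the agent count: O(1), but unsafe after removals."""
    return len(ids)


def next_id_by_max(ids):
    """Candidate ID from the largest existing ID: O(n), never collides with a current ID."""
    return max(ids) + 1 if ids else 0


ids = [0, 1, 2, 3, 4]
ids.remove(2)  # agent 2 is removed, leaving 0, 1, 3, 4

print(next_id_by_len(ids), next_id_by_len(ids) in ids)  # 4 True  (collision)
print(next_id_by_max(ids), next_id_by_max(ids) in ids)  # 5 False (free)
```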

Collaborator

Oh, yup, you are right! I guess if we ever really had to reduce the time complexity, we could always keep a maxval or something that tracks next_id, which would be O(1). Just something I thought of.
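
A rough sketch of that idea, assuming a small standalone helper rather than anything in mesa-frames: the class name IdCounter and its take method are hypothetical. The existing IDs are scanned once, and every later allocation is O(1).

```python
class IdCounter:
    """Hypothetical helper that hands out IDs in O(1) by remembering the next free value."""

    def __init__(self, existing_ids=()):
        # One O(n) scan at construction; every later call is O(1).
        self.next_id = max(existing_ids, default=-1) + 1

    def take(self, n=1):
        """Reserve n consecutive IDs and return them as a list."""
        start = self.next_id
        self.next_id += n
        return list(range(start, start + n))


counter = IdCounter([0, 1, 3, 4])
print(counter.take())   # [5]
print(counter.take(3))  # [6, 7, 8]
```

Because the counter never decreases, removing agents cannot cause an ID to be handed out twice, at the cost of IDs growing without bound.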

@suryanshgargbpgc suryanshgargbpgc changed the title Added automatic generation of unique_ids Added automatic generation of unique_ids. fix issue#105 Mar 25, 2025
@suryanshgargbpgc suryanshgargbpgc changed the title Added automatic generation of unique_ids. fix issue#105 Added automatic generation of unique_ids. fix issue #105 Mar 25, 2025
@adamamer20
Member

I believe it's better to generate unique IDs randomly instead of managing them manually. There's already another PR that's nearly complete and addresses this: #110

@suryanshgargbpgc
Author

I believe it's better to generate unique IDs randomly instead of managing them manually.

@adamamer20
I think generating simple integer IDs rather than UUIDs can be a good idea too, if I am not wrong: sequential IDs are straightforward to work with, make debugging easier, and take less memory and less time.
I might be missing some knowledge. Can you point me to any docs that would help me understand this?

There's already another PR that's nearly complete and addresses this: #110

Then it's fine, I will close this PR.

@adamamer20
Member

adamamer20 commented Mar 31, 2025

@adamamer20 I think generating simple integer IDs rather than UUIDs can be a good idea too, if I am not wrong: sequential IDs are straightforward to work with, make debugging easier, and take less memory and less time. I might be missing some knowledge. Can you point me to any docs that would help me understand this?

Actually, it's kind of the opposite unfortunately. In terms of space, Polars columns allocate a fixed size for integers—so if we use pl.Int64, there's no real difference in memory between sequential integers and UUIDs (since UUIDs are stored as Int64 anyway). Sure, if we used pl.Int32 we could save some space, but that would bring other limitations.
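
The fixed-width point can be illustrated without Polars. The sketch below uses the standard library's array module as a stand-in for a 64-bit integer column (an assumption; it is not how Polars stores data internally), and the nbytes helper is hypothetical. Small sequential values and large random values occupy the same number of bytes.

```python
import random
from array import array

n = 1_000
rng = random.Random(0)

# 'q' is a signed 64-bit integer, used here as a stand-in for a pl.Int64 column.
sequential = array("q", range(n))
random_ids = array("q", (rng.getrandbits(63) for _ in range(n)))


def nbytes(a):
    """Bytes used by the array's element buffer."""
    return len(a) * a.itemsize


print(nbytes(sequential) == nbytes(random_ids))  # True
```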

The main issue is more about management. We’re not only assigning IDs inside a single AgentSetDF but across multiple AgentSetDFs. If we used sequential IDs, we’d need to keep track of counters globally, handle deletions, and make sure IDs don’t collide or grow too large over time—which is a pain.

Also, every time we delete and add agents, sequential IDs would just keep increasing endlessly. On top of that, by using UUIDs with a fixed seed, we get reproducible IDs across runs, which is super useful when debugging or reproducing experiments. With sequential IDs, that would be harder to guarantee.
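
The reproducibility argument can be sketched with the standard library. This is only an illustration under stated assumptions, not the approach taken in #110: the function name generate_ids is hypothetical, and IDs are drawn as random 63-bit integers so they fit a signed 64-bit column.

```python
import random


def generate_ids(n, seed):
    """Draw n distinct 63-bit integer IDs from a seeded generator."""
    rng = random.Random(seed)
    seen = set()
    out = []
    while len(out) < n:
        candidate = rng.getrandbits(63)
        # Collisions are extremely unlikely, but skip them to guarantee uniqueness.
        if candidate not in seen:
            seen.add(candidate)
            out.append(candidate)
    return out


run_a = generate_ids(5, seed=42)
run_b = generate_ids(5, seed=42)
print(run_a == run_b)  # True
```

No counter has to be shared between agent sets here; each call only needs the generator, and the same seed yields the same IDs on every run.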

So in the end, UUIDs actually make things easier and safer for us to maintain.

@suryanshgargbpgc
Author

The main issue is more about management. We’re not only assigning IDs inside a single AgentSetDF but across multiple AgentSetDFs. If we used sequential IDs, we’d need to keep track of counters globally, handle deletions, and make sure IDs don’t collide or grow too large over time—which is a pain.

Oh yeah, then random generation makes sense.

Thanks for the explanation!

@suryanshgargbpgc suryanshgargbpgc deleted the auto-generate-uniqueIDs branch April 3, 2025 09:11