wrtnlabs/autobe-examples

AutoBE Generated Examples

Benchmark

| AI Model | Success | Score | FCSR | Status |
|---|---|---|---|---|
| z-ai/glm-5 | 4 | 100 | 72% | 🟢 |
| qwen/qwen3-coder-next | 2 | 98.21 | 51% | 🟡 |
| deepseek/deepseek-v3.1-terminus-exacto | 2 | 97.91 | 85% | 🟡 |
| openai/gpt-4.1-mini | 1 | 96.13 | 83% | 🟡 |
| qwen/qwen3-next-80b-a3b-instruct | 0 | 95.17 | 71% | 🟡 |
| qwen/qwen3-30b-a3b-thinking-2507 | 0 | 92.92 | 75% | 🟡 |

  • FCSR: Function Calling Success Rate
  • Status:
    • 🟢: All projects completed successfully
    • 🟡: Some projects failed
    • ❌: All projects failed or not executed
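
The Success and Score columns of the benchmark table appear to be derived from the per-project scores listed under each model: Success matches the number of projects that scored exactly 100, and Score matches the arithmetic mean of the four project scores rounded to two decimals. The sketch below checks that reading against the numbers in this README. The aggregation rule is an assumption inferred from the figures, and `summarize` and `PROJECT_SCORES` are hypothetical names used only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-project scores (todo, bbs, reddit, shopping) copied from this README.
PROJECT_SCORES = {
    "z-ai/glm-5": ["100", "100", "100", "100"],
    "qwen/qwen3-coder-next": ["100", "100", "97.61", "95.22"],
    "deepseek/deepseek-v3.1-terminus-exacto": ["100", "92.31", "99.34", "100"],
    "openai/gpt-4.1-mini": ["100", "96.73", "92.3", "95.48"],
    "qwen/qwen3-next-80b-a3b-instruct": ["96.84", "96.17", "92.99", "94.68"],
    "qwen/qwen3-30b-a3b-thinking-2507": ["97.6", "94.07", "90", "90"],
}


def summarize(scores):
    """Return (success_count, mean_score) for one model.

    Assumption: 'Success' counts projects scoring exactly 100 and
    'Score' is the mean of all project scores, rounded half up.
    """
    values = [Decimal(s) for s in scores]
    success = sum(1 for v in values if v == Decimal("100"))
    mean = (sum(values) / len(values)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return success, mean


for model, scores in PROJECT_SCORES.items():
    success, mean = summarize(scores)
    print(f"{model}: success={success}, score={mean}")
```

Decimal arithmetic is used instead of floats so that values such as 98.2075 round to 98.21 rather than depending on binary floating-point representation.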

z-ai/glm-5

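| Project | Score | Analyze | Prisma | Interface | Test | Realize |
|---|---|---|---|---|---|---|
| todo | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| bbs | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| reddit | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| shopping | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |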

z-ai/glm-5 - todo

  • Source Code: z-ai/glm-5/todo
  • Score: 100
  • Elapsed Time: 2h 51m 32s
  • Token Usage: 25.86M
  • Function Calling Success Rate: 73.81%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 11 | 653.8K | 13m 27s | 96% |
| 🟢 Database | namespaces: 2, models: 4 | 776.6K | 15m 0s | 96% |
| 🟢 Interface | operations: 18, schemas: 22 | 17.09M | 1h 25m 27s | 62% |
| 🟢 Test | functions: 56 | 5.75M | 34m 1s | 82% |
| 🟢 Realize | functions: 26 | 1.59M | 23m 35s | 86% |
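
The per-phase rows can be cross-checked against the project totals. As a minimal sketch, the code below parses the Token Usage and Elapsed Time strings for z-ai/glm-5 - todo and sums them. The helper names `parse_tokens` and `parse_elapsed` are hypothetical, and the code assumes that `K` means thousands and `M` means millions of tokens.

```python
import re
from decimal import Decimal

# Assumed meaning of the unit suffixes used in the Token Usage column.
UNITS = {"K": Decimal(1_000), "M": Decimal(1_000_000)}
SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_tokens(text):
    """Convert a string such as '653.8K' or '17.09M' to a token count."""
    return Decimal(text[:-1]) * UNITS[text[-1]]


def parse_elapsed(text):
    """Convert a string such as '1h 25m 27s' to a number of seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    return sum(int(number) * SECONDS[unit] for number, unit in parts)


# Phase rows for z-ai/glm-5 - todo, copied from this README.
phases = [
    ("Analyze", "653.8K", "13m 27s"),
    ("Database", "776.6K", "15m 0s"),
    ("Interface", "17.09M", "1h 25m 27s"),
    ("Test", "5.75M", "34m 1s"),
    ("Realize", "1.59M", "23m 35s"),
]

total_tokens = sum(parse_tokens(tokens) for _, tokens, _ in phases)
total_seconds = sum(parse_elapsed(elapsed) for _, _, elapsed in phases)

print("tokens:", total_tokens)
print("seconds:", total_seconds)
```

The phase token counts add up to 25,860,400, which agrees with the listed Token Usage of 25.86M. The phase times add up to 10,290 seconds (2h 51m 30s), two seconds less than the listed Elapsed Time of 2h 51m 32s, so the project total probably includes a little time outside the five phases.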

z-ai/glm-5 - bbs

  • Source Code: z-ai/glm-5/bbs
  • Score: 100
  • Elapsed Time: 17h 57m 0s
  • Token Usage: 95.01M
  • Function Calling Success Rate: 76.83%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 16 | 1.14M | 20m 9s | 93% |
| 🟢 Database | namespaces: 9, models: 28 | 3.58M | 20m 46s | 98% |
| 🟢 Interface | operations: 59, schemas: 79 | 65.37M | 11h 46m 37s | 65% |
| 🟢 Test | functions: 188 | 17.81M | 4h 41m 46s | 93% |
| 🟢 Realize | functions: 96 | 7.10M | 47m 41s | 81% |

z-ai/glm-5 - reddit

  • Source Code: z-ai/glm-5/reddit
  • Score: 100
  • Elapsed Time: 13h 10m 46s
  • Token Usage: 112.01M
  • Function Calling Success Rate: 72.37%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 11 | 718.5K | 27m 47s | 89% |
| 🟢 Database | namespaces: 7, models: 22 | 3.21M | 46m 19s | 97% |
| 🟢 Interface | operations: 73, schemas: 82 | 71.72M | 4h 39m 51s | 64% |
| 🟢 Test | functions: 188 | 24.51M | 5h 59m 46s | 80% |
| 🟢 Realize | functions: 113 | 11.85M | 1h 17m 1s | 77% |

z-ai/glm-5 - shopping

  • Source Code: z-ai/glm-5/shopping
  • Score: 100
  • Elapsed Time: 7h 51m 6s
  • Token Usage: 252.94M
  • Function Calling Success Rate: 71.21%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 17 | 1.24M | 35m 23s | 95% |
| 🟢 Database | namespaces: 8, models: 33 | 5.02M | 27m 35s | 96% |
| 🟢 Interface | operations: 119, schemas: 148 | 167.10M | 3h 45m 7s | 63% |
| 🟢 Test | functions: 325 | 56.62M | 1h 15m 45s | 83% |
| 🟢 Realize | functions: 178 | 22.96M | 1h 47m 14s | 70% |

qwen/qwen3-coder-next

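| Project | Score | Analyze | Prisma | Interface | Test | Realize |
|---|---|---|---|---|---|---|
| todo | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| bbs | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| reddit | 97.61 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| shopping | 95.22 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |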

qwen/qwen3-coder-next - todo

  • Source Code: qwen/qwen3-coder-next/todo
  • Score: 100
  • Elapsed Time: 2h 15m 51s
  • Token Usage: 38.78M
  • Function Calling Success Rate: 69.69%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 11 | 577.1K | 10m 22s | 92% |
| 🟢 Database | namespaces: 2, models: 7 | 1.38M | 3m 13s | 94% |
| 🟢 Interface | operations: 20, schemas: 25 | 25.03M | 27m 22s | 55% |
| 🟢 Test | functions: 49 | 8.39M | 58m 57s | 82% |
| 🟢 Realize | functions: 30 | 3.40M | 35m 54s | 82% |

qwen/qwen3-coder-next - bbs

  • Source Code: qwen/qwen3-coder-next/bbs
  • Score: 100
  • Elapsed Time: 3h 22m 36s
  • Token Usage: 137.74M
  • Function Calling Success Rate: 55.03%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 4, documents: 12 | 774.3K | 12m 35s | 96% |
| 🟢 Database | namespaces: 4, models: 21 | 3.18M | 5m 36s | 86% |
| 🟢 Interface | operations: 76, schemas: 77 | 92.65M | 42m 3s | 49% |
| 🟢 Test | functions: 177 | 25.45M | 1h 33m 51s | 52% |
| 🟢 Realize | functions: 107 | 15.68M | 48m 29s | 65% |

qwen/qwen3-coder-next - reddit

  • Source Code: qwen/qwen3-coder-next/reddit
  • Score: 97.61
  • Elapsed Time: 5h 37m 56s
  • Token Usage: 331.75M
  • Function Calling Success Rate: 47.50%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 4, documents: 12 | 877.8K | 18m 41s | 90% |
| 🟢 Database | namespaces: 7, models: 51 | 7.73M | 21m 37s | 86% |
| 🟢 Interface | operations: 124, schemas: 117 | 206.26M | 2h 0m 34s | 30% |
| 🟢 Test | functions: 324 | 78.84M | 1h 24m 45s | 64% |
| 🟡 Realize | functions: 176, errors: 7 | 38.04M | 1h 32m 18s | 73% |
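
For projects whose only failed phase is Realize, the listed project scores are numerically consistent with a simple penalty on the Realize error ratio: 100 minus 60 times errors divided by functions, with a lower bound of 90. This is only a pattern observed in the figures of this README, not a scoring rule confirmed by AutoBE. The weight of 60, the floor of 90, and the function name `estimated_score` are all assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def estimated_score(functions, errors, weight=60, floor=90):
    """Estimate a project score from the Realize phase counts.

    Assumption (inferred from the README figures, not documented):
    score = max(floor, 100 - weight * errors / functions).
    """
    penalty = Decimal(weight) * Decimal(errors) / Decimal(functions)
    raw = Decimal(100) - penalty
    return max(Decimal(floor), raw).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


# (functions, errors) pairs copied from Realize rows in this README.
print(estimated_score(176, 7))    # qwen/qwen3-coder-next - reddit
print(estimated_score(289, 23))   # qwen/qwen3-coder-next - shopping
print(estimated_score(104, 30))   # qwen/qwen3-30b-a3b-thinking-2507 - reddit
```

Under these assumptions the first two calls reproduce the listed scores of 97.61 and 95.22. The third lands on the floor of 90, which is the score listed for that project.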

qwen/qwen3-coder-next - shopping

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 12 | 1.13M | 32m 58s | 87% |
| 🟢 Database | namespaces: 9, models: 72 | 11.44M | 11m 33s | 86% |
| 🟢 Interface | operations: 183, schemas: 217 | 495.03M | 2h 7m 58s | 42% |
| 🟢 Test | functions: 483 | 145.61M | 2h 9m 41s | 64% |
| 🟡 Realize | functions: 289, errors: 23 | 130.66M | 2h 58m 5s | 49% |

deepseek/deepseek-v3.1-terminus-exacto

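| Project | Score | Analyze | Prisma | Interface | Test | Realize |
|---|---|---|---|---|---|---|
| todo | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| bbs | 92.31 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| reddit | 99.34 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| shopping | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |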

deepseek/deepseek-v3.1-terminus:exacto - todo

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 11 | 761.3K | 3m 17s | 100% |
| 🟢 Database | namespaces: 5, models: 44 | 5.74M | 13m 37s | 89% |
| 🟢 Interface | operations: 43, schemas: 57 | 36.74M | 2h 9m 10s | 85% |
| 🟢 Test | functions: 119 | 15.38M | 52m 55s | 94% |
| 🟢 Realize | functions: 66 | 10.27M | 53m 34s | 78% |

deepseek/deepseek-v3.1-terminus:exacto - bbs

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 11 | 682.8K | 13m 45s | 100% |
| 🟢 Database | namespaces: 7, models: 76 | 10.12M | 28m 25s | 93% |
| 🟢 Interface | operations: 438, schemas: 323 | 255.69M | 13h 37m 39s | 85% |
| 🟢 Test | functions: 1195 | 232.70M | 3h 21m 43s | 92% |
| 🟡 Realize | functions: 585, errors: 75 | 155.49M | 5h 40m 7s | 73% |

deepseek/deepseek-v3.1-terminus:exacto - reddit

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 12 | 858.7K | 13m 25s | 100% |
| 🟢 Database | namespaces: 7, models: 87 | 9.87M | 30m 54s | 93% |
| 🟢 Interface | operations: 322, schemas: 290 | 249.02M | 5h 50m 23s | 83% |
| 🟢 Test | functions: 978 | 126.57M | 3h 26m 59s | 94% |
| 🟡 Realize | functions: 457, errors: 5 | 62.17M | 2h 14m 38s | 82% |

deepseek/deepseek-v3.1-terminus:exacto - shopping

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 4, documents: 11 | 893.3K | 19m 56s | 100% |
| 🟢 Database | namespaces: 7, models: 99 | 14.61M | 34m 56s | 93% |
| 🟢 Interface | operations: 351, schemas: 305 | 238.75M | 9h 44m 1s | 82% |
| 🟢 Test | functions: 939 | 171.90M | 3h 54m 10s | 91% |
| 🟢 Realize | functions: 490 | 77.92M | 2h 22m 57s | 78% |

openai/gpt-4.1-mini

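| Project | Score | Analyze | Prisma | Interface | Test | Realize |
|---|---|---|---|---|---|---|
| todo | 100 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| bbs | 96.73 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| reddit | 92.3 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| shopping | 95.48 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |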

openai/gpt-4.1-mini - todo

  • Source Code: openai/gpt-4.1-mini/todo
  • Score: 100
  • Elapsed Time: 48m 34s
  • Token Usage: 21.35M
  • Function Calling Success Rate: 80.62%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 12 | 502.6K | 2m 14s | 100% |
| 🟢 Database | namespaces: 2, models: 6 | 2.16M | 7m 9s | 34% |
| 🟢 Interface | operations: 22, schemas: 31 | 12.82M | 18m 22s | 77% |
| 🟢 Test | functions: 52 | 3.94M | 7m 41s | 100% |
| 🟢 Realize | functions: 34 | 1.92M | 13m 6s | 93% |

openai/gpt-4.1-mini - bbs

  • Source Code: openai/gpt-4.1-mini/bbs
  • Score: 96.73
  • Elapsed Time: 3h 48m 42s
  • Token Usage: 205.88M
  • Function Calling Success Rate: 84.53%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 4, documents: 12 | 492.0K | 3m 8s | 100% |
| 🟢 Database | namespaces: 7, models: 36 | 9.66M | 17m 19s | 34% |
| 🟢 Interface | operations: 231, schemas: 175 | 110.43M | 45m 18s | 75% |
| 🟢 Test | functions: 536 | 60.65M | 1h 58m 45s | 98% |
| 🟡 Realize | functions: 312, errors: 17 | 24.65M | 44m 10s | 94% |

openai/gpt-4.1-mini - reddit

  • Source Code: openai/gpt-4.1-mini/reddit
  • Score: 92.3
  • Elapsed Time: 3h 46m 59s
  • Token Usage: 234.20M
  • Function Calling Success Rate: 81.86%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 4, documents: 12 | 522.3K | 4m 17s | 100% |
| 🟢 Database | namespaces: 7, models: 43 | 14.48M | 22m 51s | 27% |
| 🟢 Interface | operations: 220, schemas: 188 | 116.41M | 53m 9s | 75% |
| 🟢 Test | functions: 518 | 59.39M | 54m 42s | 98% |
| 🟡 Realize | functions: 304, errors: 39 | 43.40M | 1h 31m 58s | 90% |

openai/gpt-4.1-mini - shopping

  • Source Code: openai/gpt-4.1-mini/shopping
  • Score: 95.48
  • Elapsed Time: 4h 16m 46s
  • Token Usage: 394.57M
  • Function Calling Success Rate: 83.78%

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 14 | 665.7K | 3m 18s | 100% |
| 🟢 Database | namespaces: 10, models: 66 | 26.46M | 23m 5s | 25% |
| 🟢 Interface | operations: 330, schemas: 348 | 214.79M | 1h 2m 14s | 79% |
| 🟢 Test | functions: 807 | 93.42M | 1h 25m 3s | 98% |
| 🟡 Realize | functions: 491, errors: 37 | 59.24M | 1h 23m 3s | 95% |

qwen/qwen3-next-80b-a3b-instruct

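| Project | Score | Analyze | Prisma | Interface | Test | Realize |
|---|---|---|---|---|---|---|
| todo | 96.84 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| bbs | 96.17 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| reddit | 92.99 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| shopping | 94.68 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |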

qwen/qwen3-next-80b-a3b-instruct - todo

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 11 | 1.20M | 11m 19s | 72% |
| 🟢 Database | namespaces: 4, models: 7 | 974.3K | 5m 36s | 90% |
| 🟢 Interface | operations: 13, schemas: 16 | 10.95M | 1h 57m 53s | 76% |
| 🟢 Test | functions: 27 | 3.70M | 14m 35s | 91% |
| 🟡 Realize | functions: 19, errors: 1 | 2.81M | 19m 12s | 84% |

qwen/qwen3-next-80b-a3b-instruct - bbs

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 12 | 1.52M | 22m 21s | 70% |
| 🟢 Database | namespaces: 2, models: 23 | 6.45M | 44m 11s | 70% |
| 🟢 Interface | operations: 66, schemas: 72 | 74.19M | 1h 18m 12s | 55% |
| 🟢 Test | functions: 135 | 27.33M | 48m 30s | 82% |
| 🟡 Realize | functions: 94, errors: 6 | 21.20M | 1h 24m 54s | 80% |

qwen/qwen3-next-80b-a3b-instruct - reddit

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 5, documents: 13 | 1.07M | 8m 16s | 64% |
| 🟢 Database | namespaces: 5, models: 29 | 3.36M | 5m 33s | 80% |
| 🟢 Interface | operations: 117, schemas: 92 | 100.15M | 2h 11m 25s | 58% |
| 🟢 Test | functions: 231 | 46.45M | 55m 8s | 78% |
| 🟡 Realize | functions: 154, errors: 18 | 37.21M | 3h 23m 51s | 79% |

qwen/qwen3-next-80b-a3b-instruct - shopping

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 13 | 1.49M | 15m 57s | 68% |
| 🟢 Database | namespaces: 10, models: 35 | 6.51M | 10m 55s | 83% |
| 🟢 Interface | operations: 79, schemas: 106 | 113.56M | 1h 17m 43s | 56% |
| 🟢 Test | functions: 175 | 49.59M | 1h 3m 24s | 81% |
| 🟡 Realize | functions: 124, errors: 11 | 29.44M | 1h 51m 52s | 80% |

qwen/qwen3-30b-a3b-thinking-2507

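| Project | Score | Analyze | Prisma | Interface | Test | Realize |
|---|---|---|---|---|---|---|
| todo | 97.6 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| bbs | 94.07 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| reddit | 90 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| shopping | 90 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |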

qwen/qwen3-30b-a3b-thinking-2507 - todo

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 1, documents: 11 | 406.8K | 5m 11s | 100% |
| 🟢 Database | namespaces: 2, models: 6 | 882.7K | 10m 1s | 81% |
| 🟢 Interface | operations: 17, schemas: 21 | 18.07M | 1h 38m 44s | 54% |
| 🟢 Test | functions: 36 | 7.77M | 1h 42m 24s | 94% |
| 🟡 Realize | functions: 25, errors: 1 | 4.01M | 1h 30m 50s | 84% |

qwen/qwen3-30b-a3b-thinking-2507 - bbs

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 11 | 546.4K | 7m 55s | 92% |
| 🟢 Database | namespaces: 6, models: 19 | 1.95M | 12m 37s | 86% |
| 🟢 Interface | operations: 50, schemas: 71 | 61.41M | 2h 20m 12s | 57% |
| 🟢 Test | functions: 88 | 22.35M | 58m 42s | 93% |
| 🟡 Realize | functions: 81, errors: 8 | 13.69M | 2h 22m 17s | 79% |

qwen/qwen3-30b-a3b-thinking-2507 - reddit

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 2, documents: 14 | 717.4K | 5m 11s | 100% |
| 🟢 Database | namespaces: 8, models: 26 | 3.15M | 17m 36s | 82% |
| 🟢 Interface | operations: 65, schemas: 90 | 74.29M | 2h 57m 44s | 61% |
| 🟢 Test | functions: 133 | 30.67M | 1h 17m 42s | 92% |
| 🟡 Realize | functions: 104, errors: 30 | 31.58M | 3h 8m 7s | 78% |

qwen/qwen3-30b-a3b-thinking-2507 - shopping

| Phase | Generated | Token Usage | Elapsed Time | FCSR |
|---|---|---|---|---|
| 🟢 Analyze | actors: 3, documents: 12 | 804.8K | 7m 47s | 92% |
| 🟢 Database | namespaces: 6, models: 38 | 4.09M | 21m 16s | 85% |
| 🟢 Interface | operations: 80, schemas: 103 | 81.86M | 1h 58m 56s | 63% |
| 🟢 Test | functions: 147 | 39.54M | 1h 12m 31s | 89% |
| 🟡 Realize | functions: 126, errors: 27 | 34.95M | 2h 46m 51s | 78% |

About

AutoBE-generated backend application examples
