genie — Run #402 (2025-08-20)
Overall Score: 70.1%
(40% Scalability Score + 60% Test Score)
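As a rough check of the weighting above, the overall score can be recomputed from the two component scores reported on this page (7 / 7 for scalability, 158 / 315 for the test). This is a minimal sketch: the function name `overall_score` is hypothetical, and it assumes the components are combined as unrounded percentages, which the page does not state.

```python
def overall_score(scalability_pct: float, test_pct: float) -> float:
    """Weighted overall score: 40% scalability + 60% test (weights from the page)."""
    return 0.4 * scalability_pct + 0.6 * test_pct


# Component scores as reported for Run #402.
scalability_pct = 100.0 * 7 / 7    # 7 of 7 setup checks passed
test_pct = 100.0 * 158 / 315       # 158 of 315 test points earned

overall = overall_score(scalability_pct, test_pct)
print(f"Overall: {overall:.1f}%")  # assumption: the page rounds to one decimal
```

Using the rounded test score (50.2%) instead of the raw ratio gives the same result after rounding to one decimal place.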
DA-bench Setup for Run #402 — Scalability Score: 100.0% (7 / 7)
See how this tool was set up for the test.
- Setup Less Than 20 Minutes
- Connects to Data Warehouse
- Handles 1TB Table
- Handles 10+ Tables
- No Table Structure Changes
- No SQL Expertise for Setup
DA-bench Results for Run #402 — Test Score: 50.2% (158 / 315)
Data Querying (113 / 175)
28 Correct Answers, 5 Hallucinations
Domain Knowledge (5 / 5)
1 Correct Answer, 0 Hallucinations
| Question | Date Tested | Overall Score | Video Recording |
|---|---|---|---|
| dk01 | 2025-08-20 | 5 | |
Feature Engineering (25 / 40)
6 Correct Answers, 1 Hallucination
| Question | Date Tested | Overall Score | Video Recording |
|---|---|---|---|
| fe1 | 2025-08-20 | 5 | |
| fe2 | 2025-08-20 | 5 | |
| fe3 | 2025-08-20 | -5 | |
| fe4 | 2025-08-20 | 5 | |
| fe5 | 2025-08-20 | 0 | |
| fe6 | 2025-08-20 | 5 | |
| fe7 | 2025-08-20 | 5 | |
| fe8 | 2025-08-20 | 5 | |
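The per-question scores in the Feature Engineering table appear to follow a simple scheme: 5 for a correct answer, -5 for a hallucination, and 0 otherwise, with 5 points available per question. That scheme is inferred from the numbers on this page, not stated by it, so the sketch below treats it as an assumption. The helper name `tally` is hypothetical.

```python
# Per-question scores copied from the Feature Engineering table.
fe_scores = {
    "fe1": 5, "fe2": 5, "fe3": -5, "fe4": 5,
    "fe5": 0, "fe6": 5, "fe7": 5, "fe8": 5,
}

POINTS_PER_QUESTION = 5  # assumption: inferred from 8 questions and a 40-point maximum


def tally(scores: dict[str, int]) -> tuple[int, int, int, int]:
    """Return (points earned, points possible, correct count, hallucination count)."""
    earned = sum(scores.values())
    possible = POINTS_PER_QUESTION * len(scores)
    correct = sum(1 for s in scores.values() if s > 0)         # assumption: positive score = correct
    hallucinations = sum(1 for s in scores.values() if s < 0)  # assumption: negative score = hallucination
    return earned, possible, correct, hallucinations


earned, possible, correct, hallucinations = tally(fe_scores)
print(f"Feature Engineering ({earned} / {possible}): "
      f"{correct} correct, {hallucinations} hallucination(s)")
```

Under these assumptions the tally matches the category header (25 / 40, 6 correct answers, 1 hallucination).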
Insight Identification (10 / 70)
4 Correct Answers, 2 Hallucinations
| Question | Date Tested | Overall Score | Video Recording |
|---|---|---|---|
| ii1 | 2025-08-20 | 0 | |
| ii2 | 2025-08-20 | 5 | |
| ii3a | 2025-08-20 | 0 | |
| ii3b | 2025-08-20 | 0 | |
| ii3c | 2025-08-20 | 0 | |
| ii5 | 2025-08-20 | 5 | |
| ii6 | 2025-08-20 | 0 | |
| ii7 | 2025-08-20 | -5 | |
| ii8 | 2025-08-20 | -5 | |
| ii9a | 2025-08-20 | 0 | |
| ii9b | 2025-08-20 | 0 | |
| ii10 | 2025-08-20 | 0 | |
| ii12 | 2025-08-20 | 5 | |
| ii15 | 2025-08-20 | 5 | |
Learning (-5 / 10)
0 Correct Answers, 1 Hallucination
| Question | Date Tested | Overall Score | Video Recording |
|---|---|---|---|
| l1 | 2025-08-20 | 0 | |
| l2 | 2025-08-20 | -5 | |
Visualization (10 / 15)
2 Correct Answers, 0 Hallucinations
| Question | Date Tested | Overall Score | Video Recording |
|---|---|---|---|
| v1 | 2025-08-20 | 5 | |
| v2 | 2025-08-20 | 0 | |
| v3 | 2025-08-20 | 5 | |
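The six category totals reported above can be summed to cross-check the headline Test Score of 50.2% (158 / 315). The sketch below assumes the test score is simply total points earned divided by total points possible across categories, which is consistent with the figures on this page but not stated by it. The dictionary name `categories` is hypothetical.

```python
# (points earned, points possible) per category, as reported for Run #402.
categories = {
    "Data Querying": (113, 175),
    "Domain Knowledge": (5, 5),
    "Feature Engineering": (25, 40),
    "Insight Identification": (10, 70),
    "Learning": (-5, 10),   # a negative total is possible when hallucinations outweigh correct answers
    "Visualization": (10, 15),
}

total_earned = sum(earned for earned, _ in categories.values())
total_possible = sum(possible for _, possible in categories.values())
test_score_pct = 100.0 * total_earned / total_possible  # assumption: unweighted ratio of point totals

print(f"Test Score: {test_score_pct:.1f}% ({total_earned} / {total_possible})")
```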