Upload folder using huggingface_hub
- .gradio/flagged/dataset1.csv +79 -0
- README.md +3 -9
- __pycache__/defender_agent.cpython-312.pyc +0 -0
- __pycache__/judge_agent.cpython-312.pyc +0 -0
- __pycache__/reviewer_agent.cpython-312.pyc +0 -0
- app.py +71 -0
- defender_agent.py +18 -0
- judge_agent.py +24 -0
- requirements.txt +3 -0
- reviewer_agent.py +21 -0
.gradio/flagged/dataset1.csv
ADDED
@@ -0,0 +1,79 @@
+Code Input,output,timestamp
+,,2026-03-05 21:52:00.985140
+"def find_duplicates(arr):
+    duplicates = []
+    for i in arr:
+        for j in arr:
+            if i == j:
+                duplicates.append(i)
+    return duplicates","# AI Code Review Arena
+
+## Reviewer
+### Bugs
+- The code will include the original items from the array as duplicates since it compares each element to itself.
+
+### Performance
+- The current implementation has a time complexity of O(n^2) due to the nested loops, which can be inefficient for large arrays.
+- Using a set for tracking seen elements would optimize the lookup time.
+
+### Readability
+- Variable names are clear but could be more descriptive (e.g., `duplicates` could be `found_duplicates`).
+- Consider using comments to explain the logic, especially for any non-obvious parts.
+
+### Security
+- The function seems secure, but consider input validation (e.g., checking if `arr` is indeed a list).
+- Evaluate handling of non-hashable types like lists within the input array, which may cause issues.
+
+---
+
+## Defender
+Thank you for your feedback! Let's address each point raised in your review to clarify my design decisions and consider valid concerns.
+
+### Bugs
+You're correct that, due to the nested loops, the current implementation will include the original items as duplicates. This is an oversight, as we should avoid comparing each element with itself. To fix this, we can simply adjust the condition to check if an element has already been seen. I appreciate you highlighting this, as handling duplicates accurately is critical to the function's purpose.
+
+### Performance
+You're spot-on regarding the time complexity of O(n^2). Although the nested loops make it easy to identify duplicates, they indeed become inefficient for large arrays. A more optimized approach using a set would greatly improve performance, reducing the complexity to O(n). This would allow for constant time complexity checks for existing elements, which is certainly worth implementing.
+
+### Readability
+I appreciate your comments about variable naming. While `duplicates` conveys the overall intention, a name like `found_duplicates` could provide clearer context. It's always beneficial to strive for clarity in code. Additionally, I acknowledge that adding comments can help explain the logic, especially for those who may not be familiar with this algorithm.
+
+### Security
+Regarding input validation, I agree that it's wise to ensure that `arr` is indeed a list to prevent runtime errors. Implementing type checking helps make the function more robust and user-friendly. Your point about handling non-hashable types like lists is also essential. I should consider implementing a strategy to either raise errors for unsupported types or handle them in a way that fits the function's purpose.
+
+### Conclusion
+In summary, your feedback has been invaluable. Addressing the bugs and performance issues will lead to a more efficient and accurate implementation. I'm in favor of improving readability and security through better naming, comments, and input validation. Thank you for your constructive criticism, and I'll revise the code accordingly! Would you like me to provide an updated version based on your feedback?
+
+---
+
+## Judge Verdict
+**Winner:** Review
+
+**Reason:** The review provided comprehensive points addressing key issues within the original code, specifically highlighting bugs, performance inefficiencies, readability, and security concerns. It effectively identified that the original implementation would falsely count elements as duplicates and pointed out the excessive time complexity due to nested loops. The defense acknowledged these issues but did not provide a concrete improvement plan or updated code to address them.
+
+**Suggested Fix:**
+1. Replace the nested loops with a single pass using a set to track seen elements, ensuring that duplicates are captured correctly without counting each item itself.
+2. Implement input validation to ensure `arr` is a list and handle possible non-hashable types appropriately.
+3. Update variable naming for better clarity, e.g., change `duplicates` to `found_duplicates`, and add comments explaining the function’s logic.
+
+Here’s how a revised version of the code may look:
+
+```python
+def find_duplicates(arr):
+    if not isinstance(arr, list):
+        raise ValueError(""Input must be a list."")
+
+    found_duplicates = []
+    seen = set()
+
+    for item in arr:
+        if item in seen:
+            if item not in found_duplicates: # Only add duplicates once
+                found_duplicates.append(item)
+        else:
+            seen.add(item)
+
+    return found_duplicates
+```
+
+This version improves accuracy, efficiency, and clarity in accordance with the feedback provided.",2026-03-05 22:03:48.665872
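The flagged sample above turns on how the submitted `find_duplicates` behaves. The following is a minimal sketch, not part of this commit, that reproduces the nested-loop behavior and contrasts it with one possible single-pass variant. The names `find_duplicates_naive` and `find_duplicates_linear` are hypothetical, and the second `reported` set is an assumption added here so that the "only add once" check is a set lookup rather than a list scan.

```python
def find_duplicates_naive(arr):
    """The submitted logic: every element equals itself, so nothing is filtered out."""
    duplicates = []
    for i in arr:
        for j in arr:
            if i == j:
                duplicates.append(i)
    return duplicates


def find_duplicates_linear(arr):
    """Hypothetical single-pass variant; assumes every element is hashable."""
    seen = set()      # values encountered at least once
    reported = set()  # values already appended to the result
    found = []
    for item in arr:
        if item in seen and item not in reported:
            found.append(item)
            reported.add(item)
        seen.add(item)
    return found


print(find_duplicates_naive([1, 2, 2]))            # [1, 2, 2, 2, 2]
print(find_duplicates_linear([1, 2, 2, 3, 3, 3]))  # [2, 3]
```

If the input contains an unhashable element such as a nested list, the set-based variant raises `TypeError`, which is the case the review's last security bullet points at.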
README.md
CHANGED
@@ -1,12 +1,6 @@
 ---
-title:
-emoji: 🐢
-colorFrom: indigo
-colorTo: purple
-sdk: gradio
-sdk_version: 6.8.0
+title: code_review
 app_file: app.py
-
+sdk: gradio
+sdk_version: 5.49.1
 ---
-
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
__pycache__/defender_agent.cpython-312.pyc
ADDED
Binary file (465 Bytes)
__pycache__/judge_agent.cpython-312.pyc
ADDED
Binary file (477 Bytes)
__pycache__/reviewer_agent.cpython-312.pyc
ADDED
Binary file (494 Bytes)
app.py
ADDED
@@ -0,0 +1,71 @@
+from agents import Runner
+import gradio as gr
+from dotenv import load_dotenv
+
+from reviewer_agent import reviewer
+from defender_agent import defender
+from judge_agent import judge
+
+load_dotenv(override=True)
+
+
+async def run_review(code: str):
+
+    if not code.strip():
+        return "Please paste some code."
+
+    # Reviewer
+    review_result = await Runner.run(
+        reviewer,
+        f"Review this code:\n\n{code}"
+    )
+    review = review_result.final_output
+
+    # Defender
+    defense_result = await Runner.run(
+        defender,
+        f"Code:{code}. Reviewer feedback: {review}.Respond to the reviewer."
+    )
+    defense = defense_result.final_output
+
+    # Judge
+    judge_result = await Runner.run(
+        judge,
+        f"Code: {code} Review: {review} Defense: {defense} Evaluate the debate and give a final verdict."
+    )
+    verdict = judge_result.final_output
+
+    return f"""
+# AI Code Review Arena
+
+## Reviewer
+{review}
+
+---
+
+## Defender
+{defense}
+
+---
+
+## Judge Verdict
+{verdict}
+"""
+
+
+demo = gr.Interface(
+    fn=run_review,
+    inputs=gr.Textbox(
+        lines=18,
+        placeholder="Paste your code here...",
+        label="Code Input"
+    ),
+    outputs=gr.Markdown(),
+    submit_btn="Run Review",
+    title="AI Code Review Arena",
+    description="Three AI agents debate your code: Reviewer → Defender → Judge."
+)
+
+
+if __name__ == "__main__":
+    demo.launch()
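`run_review` chains three awaited calls, each feeding the previous stage's `final_output` into the next prompt. A minimal sketch of that data flow follows, assuming a stand-in `fake_run` coroutine in place of `Runner.run` so that it executes without an API key. `StageResult`, `fake_run`, and `run_debate` are hypothetical names and not part of this commit.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StageResult:
    """Stand-in for the object returned by Runner.run; only final_output is modeled."""
    final_output: str


async def fake_run(agent_name: str, prompt: str) -> StageResult:
    # Hypothetical stub: reports which agent ran and how long its prompt was.
    return StageResult(final_output=f"[{agent_name} saw {len(prompt)} chars]")


async def run_debate(code: str, run=fake_run) -> dict:
    """Run reviewer, defender, and judge in order, threading outputs forward."""
    review = (await run("reviewer", f"Review this code:\n\n{code}")).final_output
    defense = (await run(
        "defender",
        f"Code:\n{code}\n\nReviewer feedback:\n{review}\n\nRespond to the reviewer.",
    )).final_output
    verdict = (await run(
        "judge",
        f"Code:\n{code}\n\nReview:\n{review}\n\nDefense:\n{defense}\n\n"
        "Evaluate the debate and give a final verdict.",
    )).final_output
    return {"review": review, "defense": defense, "verdict": verdict}


result = asyncio.run(run_debate("print('hi')"))
print(result["verdict"])
```

The prompts in this sketch put each section on its own line. That layout is an illustrative choice, not what the committed `app.py` sends.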
defender_agent.py
ADDED
@@ -0,0 +1,18 @@
+from agents import Agent
+
+
+instruction = """
+You are the developer who wrote the code.
+
+Your job:
+- explain design decisions
+- defend reasonable choices
+- accept valid criticism
+"""
+
+
+defender = Agent(
+    name="Code defender",
+    instructions=instruction,
+    model="gpt-4o-mini"
+)
judge_agent.py
ADDED
@@ -0,0 +1,24 @@
+from agents import Agent
+
+
+instruction = """
+You are an impartial senior architect.
+
+Your task:
+1. Read the code
+2. Read the review
+3. Read the defense
+
+Then respond with:
+
+Winner:
+Reason:
+Suggested Fix:
+"""
+
+
+judge = Agent(
+    name="Judge",
+    instructions=instruction,
+    model="gpt-4o-mini"
+)
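The judge is asked to answer under three labels: `Winner:`, `Reason:`, and `Suggested Fix:`. A minimal sketch of pulling those fields out of a verdict string follows, assuming the labels may be wrapped in `**` as in the flagged sample. `parse_verdict` is a hypothetical helper and not part of this commit.

```python
import re

# Labels may appear bare ("Winner:") or bold ("**Winner:**").
_FIELD = re.compile(r"\*{0,2}(Winner|Reason|Suggested Fix):\*{0,2}[ \t]*")


def parse_verdict(text: str) -> dict:
    """Split a verdict into its labeled sections; missing labels are simply absent."""
    fields = {}
    matches = list(_FIELD.finditer(text))
    for index, match in enumerate(matches):
        # Each section runs until the next label, or to the end of the text.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        fields[match.group(1)] = text[match.end():end].strip()
    return fields


sample = "**Winner:** Review\n\n**Reason:** Found the bug.\n\n**Suggested Fix:**\n1. Use a set."
print(parse_verdict(sample))
```

Because the model's output is free text, a label that reappears inside a section body (for example the word `Reason:` inside the suggested fix) would split that section early, so a sketch like this would likely need validation before being relied on.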
requirements.txt
ADDED
@@ -0,0 +1,3 @@
+openai-agents
+gradio
+python-dotenv
reviewer_agent.py
ADDED
@@ -0,0 +1,21 @@
+from agents import Agent
+
+
+instruction = """
+You are a senior software engineer performing a strict code review.
+
+Focus on:
+- bugs
+- performance
+- readability
+- security
+
+Respond with concise bullet points.
+"""
+
+
+reviewer = Agent(
+    name="Code reviewer",
+    instructions=instruction,
+    model="gpt-4o-mini"
+)