Remyx AI

Team

company

Verified

https://remyx.ai

remyxai

AI & ML interests

Machine learning, deep learning, generative AI, LLMs

Recent Activity

salma-remyx updated a Space about 1 month ago

remyxai/remyx-explorer

salma-remyx published a Space about 1 month ago

remyxai/remyx-explorer

salma-remyx updated a model 6 months ago

remyxai/SpaceQwen3-VL-2B-Thinking

salma-remyx

posted an update 2 days ago

Post

2087

Some ask how we can recommend recent advances for improving your AI system.

We tell them "The code is the context."

Here's a demo showing how to get started with paper recommendations by connecting to a repo URL.

Your repo history describes what you've tried and what you're working on, so we ground suggested ideas in your actual development trajectory from day one.

What the loop looks like end to end:

* New ideas are sourced from arXiv papers and GitHub repos, ranked by relevance to your codebase.

* With a click, you get implementations scaffolded as feature branches.

* Validation jobs provision compute on Modal so you can measure the change against your baseline.

* Results are synced across the tools your team already uses.
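
To make the ranking step concrete, here is a minimal sketch of scoring candidate papers against repo context. It is not Remyx's ranking method: the bag-of-words cosine similarity stands in for whatever model the product uses, and `rank_candidates`, the commit-message string, and the candidate abstracts are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(repo_context: str, candidates: dict[str, str]) -> list[tuple[str, float]]:
    """Return (title, score) pairs sorted from most to least relevant."""
    context_vec = tokenize(repo_context)
    scored = [
        (title, cosine(context_vec, tokenize(abstract)))
        for title, abstract in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs: commit messages stand in for "repo history".
repo_context = "add spatial reasoning eval; fine-tune vision language model on synthetic depth data"
candidates = {
    "Synthetic data for spatial reasoning": "synthetic depth data improves spatial reasoning in a vision language model",
    "Faster sorting networks": "a new sorting network with fewer comparators",
}
ranking = rank_candidates(repo_context, candidates)
for title, score in ranking:
    print(f"{score:.2f}  {title}")
```

The design point is only that the query is derived from the repo rather than typed by the user; a production ranker would likely use learned embeddings instead of word counts.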

Try it on your repo: https://engine.remyx.ai

salma-remyx

posted an update 3 days ago

Post

799

AI is a scientific discipline. So it doesn't help that you're context-switching between tools and wrangling scattered data just to run a single experiment. Tickets in one place, branches in another, evals on whatever infra you stood up last time, metrics somewhere else.

Remyx offers one unified experiment view, capturing everything from hypothesis to decision, with every step instrumented and every decision preserved. Every experiment becomes context for the next one.

salma-remyx

posted an update 13 days ago

Post

155

We've been building Remyx to help AI teams track what's actually working across their AI development efforts.

Every experiment you and your team run is tracked in one place: where the approach came from, how it was implemented and tested, and whether it moved the metric you care about. Over time, Remyx spots patterns across your experiments and recommends new approaches worth testing based on what's proven to work.

It connects with the tools you already use (GitHub, Linear, Claude Code, HuggingFace) so experiment context doesn't get lost across six different places.

Full demo vid here: https://youtu.be/XscVmkxTACA
The free dev version is live at https://remyx.ai!
We're looking for feedback from teams actively developing AI applications. If you give it a look, would love to hear what's missing or what would make it more useful for your workflow.

1 reply

salma-remyx

posted an update 18 days ago

Post

2269

Every change you test in your AI system adds evidence about the space of possible improvements.

With these insights, we can match new methods to the problems you're facing in your application.

Check it out at Remyx: https://remyx.ai

salma-remyx

posted an update 23 days ago

Post

1452

How do you find ideas to try next?
I'm tracking multiple topics tied to the projects we're building at Remyx. Every morning I get a feed of papers ranked by relevance to those topics.
No more good ideas lost because they didn't trend on X.

Build your own feed for free: https://engine.remyx.ai
Read more: https://docs.remyx.ai/resources/explore

salma-remyx

posted an update 28 days ago

Post

3507

We built an OpenClaw 🦞 skill that sends daily ranked research recommendations to Slack using the Remyx AI CLI.

You define Research Interests (topics, HF models, GitHub repos, blogs, etc.), Remyx ranks new arXiv papers and repos to find the most relevant resources, and an OpenClaw cron job posts a formatted digest to your team's #research channel every weekday morning.

The tutorial covers the full setup end-to-end: installing the CLI, creating interests, connecting OpenClaw to Slack, installing the Remyx skill, and scheduling the cron. About 15 minutes start to finish.
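
As a rough illustration of the last step of that pipeline, the sketch below formats ranked items into a Slack message and posts it to an incoming webhook. It does not use the Remyx CLI or OpenClaw (see the tutorial for those); `build_digest`, `post_to_slack`, the example papers, and the 09:00 schedule are assumptions made for illustration.

```python
import json
import urllib.request


def build_digest(papers: list[dict], top_n: int = 5) -> dict:
    """Format the top-ranked papers as a Slack message payload."""
    ranked = sorted(papers, key=lambda p: p["score"], reverse=True)[:top_n]
    lines = ["*Daily research digest*"]
    for i, paper in enumerate(ranked, start=1):
        # <url|label> is Slack's link syntax in message text.
        lines.append(f"{i}. <{paper['url']}|{paper['title']}> (score {paper['score']:.2f})")
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Standard cron expression for 09:00, Monday through Friday (hour is an assumption).
WEEKDAY_MORNING_CRON = "0 9 * * 1-5"

# Placeholder items; real ones would come from the ranking step.
papers = [
    {"title": "Paper A", "url": "https://example.com/paper-a", "score": 0.42},
    {"title": "Paper B", "url": "https://example.com/paper-b", "score": 0.91},
]
payload = build_digest(papers)
print(payload["text"])
```

Calling `post_to_slack(webhook_url, payload)` with your own webhook URL would send the message; it is left out of the example run so the block works offline.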

Tutorial: https://docs.remyx.ai/tutorials/daily-research-digest-slack

Would love to hear how folks are tracking research for their projects. If you give this a try, let us know what you think!

salma-remyx

posted an update about 1 month ago

Post

3823

Looking to execute on your next great idea? 💡

Search for relevant papers and find pre-built Docker images to interactively explore the code with Remyx!

Check out the new space 🔍
remyxai/remyx-explorer

4 replies

salma-remyx

updated a Space about 1 month ago

Remyx Explorer

🔬

Search 10K+ arXiv papers with ready-to-run environments

salma-remyx

published a Space about 1 month ago

Remyx Explorer

🔬

Search 10K+ arXiv papers with ready-to-run environments

salma-remyx

updated a model 6 months ago

remyxai/SpaceQwen3-VL-2B-Thinking

Image-Text-to-Text • 2B • Updated Oct 23, 2025 • 48 • 3

salma-remyx

updated a collection 6 months ago

SpaceThinker

Collection

Test Time Compute for Quantitative Spatial Reasoning using synthetic reasoning traces from 3D scene graphs • 7 items • Updated Oct 23, 2025 • 2

salma-remyx

posted an update 6 months ago

Post

3353

We've built over 10K containerized reproductions of papers from arXiv!

Instead of spending all day trying to build an environment to test that new idea, just pull the Docker container from the Remyx registry.
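
For a sense of what "just pull the container" might look like, here is a small helper that builds the Docker commands. The image naming scheme (`remyxai/<arxiv_id>:latest`), the placeholder arXiv ID, and the presence of `/bin/bash` in the image are assumptions; check the Hub page for the actual image names.

```python
import shlex


def docker_commands(arxiv_id: str, namespace: str = "remyxai") -> list[str]:
    """Build pull and run commands for a paper's image (naming scheme is assumed)."""
    image = f"{namespace}/{arxiv_id}:latest"  # hypothetical naming convention
    pull = ["docker", "pull", image]
    # --rm removes the container on exit; -it opens an interactive shell.
    run = ["docker", "run", "--rm", "-it", image, "/bin/bash"]
    return [shlex.join(pull), shlex.join(run)]


# Placeholder arXiv ID, not a real paper.
commands = docker_commands("0000.00000v1")
for command in commands:
    print(command)
```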

And with Remyx, you can start experimenting faster by generating a test PR in your codebase based on the ideas found in your paper of choice.

Hub: https://hub.docker.com/u/remyxai
Remyx docs: https://docs.remyx.ai/resources/ideate
Coming soon, explore reproduced papers with AG2 + Remyx: https://github.com/ag2ai/ag2/pull/2141

1 reply

salma-remyx

posted an update 6 months ago

Post

1064

The future is arriving too fast not to use programmatic discovery and replication.
Search arXiv → Execute in 30 seconds with pre-built Docker environments

Check out our latest integration with AG2 to accelerate your discovery loop.
As easy as:

from remyxai.client.search import SearchClient
from autogen.coding import RemyxCodeExecutor

# Search by topic
papers = SearchClient().search(
    "data synthesis strategies",
    has_docker=True,  # Only papers with pre-built environments
    limit=10
)

executor = RemyxCodeExecutor(arxiv_id=papers[0].arxiv_id)

executor.explore(
    goal="Run a test with my model remyxai/SpaceThinker-Qwen2.5VL-3B",
    interactive=False  # Automated exploration
)

Tutorial: https://github.com/ag2ai/ag2/blob/4c6954e3959fe672980191f264e30d451bc23554/notebook/agentchat_remyx_executor.ipynb
PR: https://github.com/ag2ai/ag2/pull/2141

salma-remyx

posted an update 7 months ago

Post

3724

Thanks again to @ag2 for hosting us at their Community Talks!
@terry-remyx walked us through a technical deep dive into GitRank, our automated pipeline that converts research papers with code into containerized, executable environments and generates specialized tests tailored to users' specific codebases.

In case you missed it...
Full recording: https://www.youtube.com/watch?v=N_FNfZ71s2I
Deck: https://docs.google.com/presentation/d/1S0q-wGCu2dliVWb9ykGKFz61jZKZI4ipxWBv73HOFBo/edit?usp=sharing

salma-remyx

posted an update 7 months ago

Post

2930

We're joining the @ag2 team on Discord for their Community Talks to present a deep dive into how we've used the framework to build GitRank.

The GitRank pipeline is used to:
📰 power personalized paper recommendations
🐳 build environments as Docker Images
🎯 implement core methods as PRs for your target repo

Don't miss it! Tomorrow, Sept 25 at 9:00 am PST: https://calendar.app.google/3soCpuHupRr96UaF8

salma-remyx

posted an update 7 months ago

Post

1512

We've added intelligent full-text search across both our pre-built Docker images for arXiv papers with ready-to-run code and papers straight from arXiv.

Natural language queries.
Semantic understanding.
One search to find both the paper AND the runnable code.

Try it today: https://engine.remyx.ai/resources/
Join us at Experiment 2025: https://experiment.remyx.ai

salma-remyx

posted an update 7 months ago

Post

5357

Rolling Benchmarks - Evaluating AI Agents on Unseen GitHub Repos

Static benchmarks are prone to leaderboard hacking and training data contamination, so how about a dynamic/rolling benchmark?

By limiting submissions to only freshly published code, we could evaluate agents on consistency over time with rolling averages, instead of rewarding agents that overfit to a static benchmark.
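
One way to picture the scoring, as a sketch only: each task is tagged with the date its repo was published, and an agent's score at any point is the mean over tasks from a trailing window. The 30-day window, the `rolling_average` helper, and the sample results are all hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def rolling_average(
    results: list[tuple[date, float]],
    as_of: date,
    window_days: int = 30,
) -> Optional[float]:
    """Mean score over tasks whose repos were published in the window ending at as_of."""
    start = as_of - timedelta(days=window_days)
    scores = [score for published, score in results if start < published <= as_of]
    return sum(scores) / len(scores) if scores else None


# Hypothetical results: (repo publication date, 1.0 = task solved, 0.0 = failed).
results = [
    (date(2025, 1, 5), 1.0),
    (date(2025, 1, 20), 0.0),
    (date(2025, 2, 10), 1.0),
    (date(2025, 2, 25), 1.0),
]
january = rolling_average(results, as_of=date(2025, 1, 31))
february = rolling_average(results, as_of=date(2025, 2, 28))
print(january, february)
```

Because older tasks fall out of the window, an agent cannot keep a high score by memorizing a fixed task set; it has to keep solving repos it has not seen.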

Can rolling benchmarks bring us closer to evaluating agents in a way more closely aligned with their real-world applications? Perhaps a new direction for agent evaluation?

Would love to hear what you think about this!
More on reddit: https://www.reddit.com/r/LocalLLaMA/comments/1nmvw7a/rolling_benchmarks_evaluating_ai_agents_on_unseen/

salma-remyx

posted an update 7 months ago

Post

3995

Trustworthy AI evals have been an industry challenge for the last few years, so what's missing?
Causal reasoning.

Model-based eval frameworks can't tell you if your changes actually improved user outcomes. You need to take a systems-level approach.

At Remyx, we’re building the intelligence layer for AI experimentation. Check out this example of how we lay the scaffolding for controlled experiments that turn your hypotheses into insights about what drives performance in your application.
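
To illustrate what a controlled comparison can look like in its simplest form, the sketch below runs a two-sample permutation test on an outcome metric for a control group and a treatment group. This is a textbook technique, not a description of how Remyx analyzes experiments, and the metric values are made up for the example.

```python
import random
from statistics import mean


def permutation_test(control, treatment, n_permutations=2000, seed=0):
    """Return (observed difference in means, two-sided permutation p-value)."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)


# Made-up per-user outcome rates for the current system and the changed one.
control = [0.60, 0.62, 0.58, 0.61, 0.59]
treatment = [0.70, 0.72, 0.69, 0.71, 0.68]
effect, p_value = permutation_test(control, treatment)
print(f"effect={effect:.3f}, p={p_value:.4f}")
```

The causal reading depends on users being randomly assigned to the two groups; without that, the same arithmetic only measures correlation.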

Check out the latest at Remyx in our docs: https://docs.remyx.ai
Try your first experiment today! https://engine.remyx.ai

salma-remyx

posted an update 7 months ago

Post

3238

Mark your calendars for Thursday Sept 25th at 9am PST 📆
We're joining the @ag2 team on Discord for their Community Talks to present a deep dive into how we've used the framework to build GitRank.

The GitRank pipeline is used to:
📰 power personalized paper recommendations
🐳 build environments as Docker Images
🎯 implement core methods as PRs for your target repo

Attached is a draft outlining what we plan to cover in the talk.
Would love to gather your feedback to make this insightful for all: https://docs.google.com/presentation/d/1S0q-wGCu2dliVWb9ykGKFz61jZKZI4ipxWBv73HOFBo/edit?usp=sharing

salma-remyx

posted an update 7 months ago

Post

3247

Reproducing research code shouldn't take longer than reading the paper.
For papers that include code, setting up the right environment often means hours of dependency hell and configuration debugging.

At Remyx AI, we built an agent that automatically creates and tests Docker images for research papers, then shares them publicly so anyone can reproduce results with a single command.

We just submitted PR #908 to integrate this directly into arXiv Labs.

If you believe in making reproducible research accessible to everyone, give it a bump!: https://github.com/arXiv/arxiv-browse/pull/908

3 replies

Team members 2