AI & ML interests

Machine learning, deep learning, generative AI, LLMs

Recent Activity

salma-remyx updated a Space about 1 month ago
remyxai/remyx-explorer
salma-remyx published a Space about 1 month ago
remyxai/remyx-explorer
salma-remyx updated a model 6 months ago
remyxai/SpaceQwen3-VL-2B-Thinking

salma-remyx posted an update 2 days ago
Some ask how we can recommend recent advances for improving your AI system.

We tell them "The code is the context."

Here's a demo showing how to get started with paper recommendations by connecting to a repo URL.

Your repo history describes what you've tried and what you're working on, so we ground suggested ideas in your actual development trajectory from day one.

What the loop looks like end to end:

* New ideas are sourced from arXiv papers and GitHub repos, ranked by relevance to your codebase.

* With a click, you get implementations scaffolded as feature branches.

* Validation jobs provision compute on Modal so you can measure the change against your baseline.

* Results are synced across the tools your team already uses.
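
To make "ranked by relevance to your codebase" concrete, here is a minimal sketch that scores candidate abstracts against text drawn from a repo using bag-of-words cosine similarity. This is an illustration only, not how Remyx ranks anything: the function names, the sample repo text, and the candidate titles are all hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_by_relevance(repo_text, candidates):
    """Return candidate titles ordered from most to least similar."""
    repo_vec = term_counts(repo_text)
    scored = [
        (cosine_similarity(repo_vec, term_counts(abstract)), title)
        for title, abstract in candidates.items()
    ]
    return [title for _, title in sorted(scored, reverse=True)]


# Hypothetical inputs: a README snippet and two made-up abstracts.
repo_text = "fine-tune a vision language model for spatial reasoning"
candidates = {
    "Spatial VLM": "spatial reasoning in vision language models",
    "Audio codec": "neural audio compression codec",
}
ranked = rank_by_relevance(repo_text, candidates)
```

A production ranker would more likely use learned embeddings and signals from commit history; plain term overlap is used here only because it runs with the standard library.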

Try it on your repo: https://engine.remyx.ai
salma-remyx posted an update 3 days ago
AI is a scientific discipline, so it doesn't help that you're context-switching between tools and wrangling scattered data just to run a single experiment. Tickets in one place, branches in another, evals on whatever infra you stood up last time, metrics somewhere else.

Remyx offers one unified experiment view, capturing everything from hypothesis to decision, with every step instrumented and every decision preserved. Every experiment becomes context for the next one.
salma-remyx posted an update 13 days ago
We've been building Remyx to help AI teams track what's actually working across their AI development efforts.

Every experiment you and your team run is tracked in one place, from where the approach came from, through implementation and testing, to whether it moved the metric you care about. Over time, Remyx spots patterns across your experiments and recommends new approaches worth testing based on what's proven to work.

It connects with the tools you already use (GitHub, Linear, Claude Code, HuggingFace) so experiment context doesn't get lost across six different places.

Full demo vid here: https://youtu.be/XscVmkxTACA
The free dev version is live at https://remyx.ai!
We're looking for feedback from teams actively developing AI applications. If you give it a look, would love to hear what's missing or what would make it more useful for your workflow.
salma-remyx posted an update 18 days ago
Every change you test in your AI system creates evidence about the space of possible improvements.

With these insights, we can match new methods to the problems you're facing in your application.

Check it out at Remyx: https://remyx.ai
salma-remyx posted an update 23 days ago
How do you find ideas to try next?
I'm tracking multiple topics tied to the projects we're building at Remyx. Every morning I get a feed of papers ranked by relevance to those topics.
No more good ideas lost because they didn't trend on X.

Build your own feed for free: https://engine.remyx.ai
Read more: https://docs.remyx.ai/resources/explore
salma-remyx posted an update 28 days ago
We built an OpenClaw 🦞 skill that sends daily ranked research recommendations to Slack using the Remyx AI CLI.

You define Research Interests (topics, HF models, GitHub repos, blogs, etc.), Remyx ranks new arXiv papers and repos to find the most relevant resources, and an OpenClaw cron job posts a formatted digest to your team's #research channel every weekday morning.

The tutorial covers the full setup end-to-end: installing the CLI, creating interests, connecting OpenClaw to Slack, installing the Remyx skill, and scheduling the cron. About 15 minutes start to finish.
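
For a feel of the final step, here is a minimal sketch of turning ranked results into a Slack message. It does not use the Remyx CLI or OpenClaw; it swaps in a plain Slack incoming webhook, and the paper entries, field names, and function names are hypothetical placeholders.

```python
import json
import urllib.request


def build_digest(papers, top_n=3):
    """Format the highest-scoring papers as a Slack message payload."""
    ranked = sorted(papers, key=lambda p: p["score"], reverse=True)[:top_n]
    lines = ["*Daily research digest*"]
    for i, paper in enumerate(ranked, start=1):
        # Slack renders <url|label> as a hyperlink.
        lines.append(
            f"{i}. <{paper['url']}|{paper['title']}> "
            f"(relevance {paper['score']:.2f})"
        )
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url, payload):
    """POST the payload to a Slack incoming webhook URL you supply."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Made-up entries standing in for ranked recommendations.
papers = [
    {"title": "Paper A", "url": "https://example.com/paper-a", "score": 0.42},
    {"title": "Paper B", "url": "https://example.com/paper-b", "score": 0.91},
    {"title": "Paper C", "url": "https://example.com/paper-c", "score": 0.10},
]
digest = build_digest(papers, top_n=2)
```

Calling `post_to_slack(your_webhook_url, digest)` from a weekday cron entry would approximate the schedule described above; the tutorial's OpenClaw setup handles that part for you.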

Tutorial: https://docs.remyx.ai/tutorials/daily-research-digest-slack

Would love to hear how folks are tracking research for their projects. If you give this a try, let us know what you think!
salma-remyx posted an update about 1 month ago
Looking to execute on your next great idea? 💡

Search for relevant papers and find pre-built Docker images to interactively explore the code with Remyx!

Check out the new Space 🔍
remyxai/remyx-explorer
salma-remyx posted an update 6 months ago
We've built over 10K containerized reproductions of papers from arXiv!

Instead of spending all day trying to build an environment to test that new idea, just pull the Docker container from the Remyx registry.

And with Remyx, you can start experimenting faster by generating a test PR in your codebase based on the ideas found in your paper of choice.

Hub: https://hub.docker.com/u/remyxai
Remyx docs: https://docs.remyx.ai/resources/ideate
Coming soon, explore reproduced papers with AG2 + Remyx: https://github.com/ag2ai/ag2/pull/2141
salma-remyx posted an update 6 months ago
The future is arriving too fast not to use programmatic discovery and replication.
Search arXiv → Execute in 30 seconds with pre-built Docker environments

Check out our latest integration with AG2 to accelerate your discovery loop.
As easy as:
from remyxai.client.search import SearchClient
from autogen.coding import RemyxCodeExecutor

# Search by topic
papers = SearchClient().search(
    "data synthesis strategies",
    has_docker=True,  # Only papers with pre-built environments
    limit=10
)

# Create an executor for the first search result
executor = RemyxCodeExecutor(arxiv_id=papers[0].arxiv_id)

executor.explore(
    goal="Run a test with my model remyxai/SpaceThinker-Qwen2.5VL-3B",
    interactive=False  # Automated exploration
)


Tutorial: https://github.com/ag2ai/ag2/blob/4c6954e3959fe672980191f264e30d451bc23554/notebook/agentchat_remyx_executor.ipynb
PR: https://github.com/ag2ai/ag2/pull/2141
salma-remyx posted an update 7 months ago
Thanks again to @ag2 for hosting us at their Community Talks!
@terry-remyx walked us through a technical deep dive into GitRank, our automated pipeline that converts research papers with code into containerized, executable environments and generates specialized tests tailored to users' specific codebases.

In case you missed it...
Full recording: https://www.youtube.com/watch?v=N_FNfZ71s2I
Deck: https://docs.google.com/presentation/d/1S0q-wGCu2dliVWb9ykGKFz61jZKZI4ipxWBv73HOFBo/edit?usp=sharing
salma-remyx posted an update 7 months ago
We're joining the @ag2 team on Discord for their Community Talks to present a deep dive into how we've used the framework to build GitRank.

The GitRank pipeline is used to:
📰 power personalized paper recommendations
🐳 build environments as Docker images
🎯 implement core methods as PRs for your target repo

Don't miss it! Tomorrow, Sept 25 at 9:00 am PST: https://calendar.app.google/3soCpuHupRr96UaF8
salma-remyx posted an update 7 months ago
We've added intelligent full-text search that covers both our pre-built Docker images for arXiv papers with ready-to-run code and papers straight from arXiv.

Natural language queries.
Semantic understanding.
One search to find both the paper AND the runnable code.

Try it today: https://engine.remyx.ai/resources/
Join us at Experiment 2025: https://experiment.remyx.ai
salma-remyx posted an update 7 months ago
Rolling Benchmarks - Evaluating AI Agents on Unseen GitHub Repos

Static benchmarks are prone to leaderboard hacking and training data contamination, so how about a dynamic/rolling benchmark?

By limiting submissions to only freshly published code, we could evaluate based on consistency over time with rolling averages instead of rewarding agents that overfit to a static benchmark.
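
As a rough sketch of the scoring side, the snippet below computes a rolling average over per-period results. The window size, the weekly cadence, and the pass rates are hypothetical values chosen for illustration, not data from any benchmark.

```python
from collections import deque


def rolling_average(scores, window):
    """Return the mean of the most recent `window` scores at each step."""
    recent = deque(maxlen=window)  # old scores fall out automatically
    averages = []
    for score in scores:
        recent.append(score)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical weekly pass rates for one agent, each measured only on
# repos published that week.
weekly_pass_rates = [0.50, 0.75, 0.25, 0.50, 1.00]
rolling = rolling_average(weekly_pass_rates, window=3)
```

Ranking agents on the rolling value rather than any single week dampens one-off spikes, which is the consistency-over-time idea described above.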

Can rolling benchmarks bring us closer to evaluating agents in a way more closely aligned with their real-world applications? Perhaps a new direction for agent evaluation?

Would love to hear what you think about this!
More on reddit: https://www.reddit.com/r/LocalLLaMA/comments/1nmvw7a/rolling_benchmarks_evaluating_ai_agents_on_unseen/
salma-remyx posted an update 7 months ago
Trustworthy AI evals have been an industry challenge for the last few years, so what's missing?
Causal reasoning.

Model-based eval frameworks can't tell you if your changes actually improved user outcomes; you need to take a systems-level approach.

At Remyx, we're building the intelligence layer for AI experimentation. Check out this example of how we lay the scaffolding for controlled experiments that turn your hypotheses into insights on what drives performance for your application.

Check out the latest at Remyx in our docs: https://docs.remyx.ai
Try your first experiment today! https://engine.remyx.ai
salma-remyx posted an update 7 months ago
Mark your calendars for Thursday, Sept 25th at 9am PST 📆
We're joining the @ag2 team on Discord for their Community Talks to present a deep dive into how we've used the framework to build GitRank.

The GitRank pipeline is used to:
📰 power personalized paper recommendations
🐳 build environments as Docker images
🎯 implement core methods as PRs for your target repo

Attached is a draft outlining what we plan to cover in the talk.
Would love to gather your feedback to make this insightful for all: https://docs.google.com/presentation/d/1S0q-wGCu2dliVWb9ykGKFz61jZKZI4ipxWBv73HOFBo/edit?usp=sharing
salma-remyx posted an update 7 months ago
Reproducing research code shouldn't take longer than reading the paper.
For papers that include code, setting up the right environment often means hours of dependency hell and configuration debugging.

At Remyx AI, we built an agent that automatically creates and tests Docker images for research papers, then shares them publicly so anyone can reproduce results with a single command.

We just submitted PR #908 to integrate this directly into arXiv Labs.

If you believe in making reproducible research accessible to everyone, give it a bump: https://github.com/arXiv/arxiv-browse/pull/908