Joshua Chris

KrisKale45

AI & ML interests

None yet

Recent Activity

upvoted an article 2 days ago

Using OCR models with llama.cpp

upvoted an article 3 days ago

Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs

liked a model 3 days ago

google/gemma-4-E4B-it

Organizations

None yet

upvoted an article 2 days ago

Article

Using OCR models with llama.cpp

2 days ago • 20

upvoted an article 3 days ago

Article

Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs

4 days ago • 19

upvoted a collection 11 days ago

Nemotron OCR and Object Detection

5 items • Updated 6 days ago • 13

upvoted an article 22 days ago

Article

Build a Domain-Specific Embedding Model in Under a Day

23 days ago • 68

upvoted a collection about 1 month ago

VibeVoice

Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Mar 2 • 226

upvoted a collection 2 months ago

LightOnOCR-2 🦉

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family • 12 items • Updated 5 days ago • 23

upvoted 2 articles 2 months ago

Article

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family

Jan 19 • 90

Article

We Got Claude to Build CUDA Kernels and teach open models!

Jan 28 • 154

upvoted a collection 4 months ago

Seamless Communication

A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16, 2024 • 158

upvoted an article 4 months ago

Article

LLM based Audio models

Dec 18, 2025 • 58

upvoted a paper 4 months ago

In-Video Instructions: Visual Signals as Generative Control

Paper • 2511.19401 • Published Nov 24, 2025 • 32

upvoted a paper 5 months ago

VisPlay: Self-Evolving Vision-Language Models from Images

Paper • 2511.15661 • Published Nov 19, 2025 • 44

upvoted 2 papers 7 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4, 2025 • 199

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28, 2025 • 118

upvoted an article 8 months ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

Aug 5, 2025 • 513

upvoted 5 papers 9 months ago

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Paper • 2507.13348 • Published Jul 17, 2025 • 79

Replacing thinking with tool usage enables reasoning in small language models

Paper • 2507.05065 • Published Jul 7, 2025 • 16

Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling

Paper • 2507.11061 • Published Jul 15, 2025 • 37

Coding Triangle: How Does Large Language Model Understand Code?

Paper • 2507.06138 • Published Jul 8, 2025 • 22

VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents

Paper • 2507.04590 • Published Jul 7, 2025 • 17