AI & ML interests

Open science and open source

tomaarsen 
posted an update 3 days ago
🌐 I've just published Sentence Transformers v5.4 to make the project fully multimodal for embeddings and reranking. The release also includes a modular CrossEncoder, and automatic Flash Attention 2 input flattening. Details:

You can now use SentenceTransformer and CrossEncoder with text, images, audio, and video, with the same familiar API. That means you can compute embeddings for an image and a text query using model.encode(), compare them with model.similarity(), and it just works. Models like Qwen3-VL-Embedding-2B and jinaai/jina-reranker-m0 are supported out of the box.
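To make the comparison step concrete, here is a minimal sketch of what happens after encoding. It assumes cosine similarity, which is the default similarity function in Sentence Transformers unless a model configures another one. The vectors are made-up stand-ins for what model.encode() would return for a text query and two images, and the `cosine_similarity` helper and file names are hypothetical, not part of the library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional stand-ins for real embeddings (assumption:
# a real model would return much higher-dimensional vectors).
query_embedding = [0.9, 0.1, 0.0, 0.4]
image_embeddings = {
    "cat.jpg": [0.8, 0.2, 0.1, 0.5],
    "invoice.png": [0.0, 0.9, 0.4, 0.1],
}

# Score every image against the text query and pick the closest one.
scores = {
    name: cosine_similarity(query_embedding, emb)
    for name, emb in image_embeddings.items()
}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```

Because text and images land in the same embedding space, the same scoring code works regardless of which modality produced each vector.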

Beyond multimodal, I also fully modularized the CrossEncoder class. It's now a torch.nn.Sequential of composable modules, just like SentenceTransformer has been. This unlocked support for generative rerankers (CausalLM-based models like mxbai-rerank-v2 and the Qwen3 rerankers) via a new LogitScore module, which wasn't possible before without custom code.
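As a rough illustration of how a CausalLM can act as a reranker, one common approach is to read the logits of a "yes" and a "no" token at the final position and normalize them into a relevance probability. The sketch below shows only that scoring step. Whether LogitScore works exactly this way is an assumption, and the function name, document names, and logit values are all hypothetical.

```python
import math

def yes_no_score(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax over the "yes" and "no" logits, returning P("yes")."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(yes_logit, no_logit)
    exp_yes = math.exp(yes_logit - m)
    exp_no = math.exp(no_logit - m)
    return exp_yes / (exp_yes + exp_no)

# Hypothetical (yes_logit, no_logit) pairs that a generative reranker
# might emit for three query-document pairs.
candidates = {
    "doc_a": (4.2, -1.3),
    "doc_b": (-0.5, 2.0),
    "doc_c": (1.0, 1.0),
}

# Rank documents from most to least relevant.
ranked = sorted(
    candidates,
    key=lambda doc: yes_no_score(*candidates[doc]),
    reverse=True,
)
print(ranked)
```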

Also, Flash Attention 2 now automatically skips padding for text-only inputs. If your batch has a mix of short and long texts, this gives you a nice speedup and lower VRAM usage for free.
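To see where the savings come from, here is a toy sketch comparing a padded batch with a flattened one. Variable-length attention kernels typically take a single flat token sequence plus cumulative sequence lengths that mark where each text starts and ends. This illustrates the general idea only, not the library's implementation; the helper names and token IDs are made up.

```python
from itertools import accumulate

def pad_batch(sequences, pad_id=0):
    """Pad every sequence to the length of the longest one."""
    max_len = max(len(s) for s in sequences)
    return [s + [pad_id] * (max_len - len(s)) for s in sequences]

def flatten_batch(sequences):
    """Concatenate sequences and record cumulative sequence boundaries."""
    flat = [tok for s in sequences for tok in s]
    cu_seqlens = [0] + list(accumulate(len(s) for s in sequences))
    return flat, cu_seqlens

# Made-up token IDs: one short, one long, one medium text.
batch = [[11, 12], [21, 22, 23, 24, 25, 26, 27, 28], [31, 32, 33]]

padded = pad_batch(batch)
flat, cu_seqlens = flatten_batch(batch)

padded_tokens = sum(len(row) for row in padded)  # tokens processed with padding
flat_tokens = len(flat)                          # tokens processed when flattened
print(padded_tokens, flat_tokens, cu_seqlens)
```

In this toy batch the padded layout processes almost twice as many token positions as the flattened one, and the gap grows as the length spread within a batch increases.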

I wrote a blog post walking through the multimodal features with practical examples. Check it out if you want to get started, or just point your Agent to the URL: https://huggingface.co/blog/multimodal-sentence-transformers

This release has set up the groundwork for more easily introducing late-interaction models (both text-only and multimodal) into Sentence Transformers in the next major release. I'm looking forward to it!
Shrijanagain 
posted an update 8 days ago
sKT-Ai-Labs


Join soon: we will shortly publish the tokens, so join and get started now. We will soon turn off the join-request button, so if you want to join, do it quickly.
Shrijanagain 
posted an update 13 days ago
🚀 Become part of the Bharat AI Revolution! 🇮🇳

Do you want to give India a new identity in the world of AI?

SKT AI Labs is not just a name, it is a mission: to give the country digital strength and to make the dream of "Viksit Bharat" (Developed India) come true.

Why join us?

1. The country's own AI: We are building models designed specifically for India's needs and languages.

2. Open Collaboration: Browse our work in our Hugging Face repository, test it, and contribute.

3. Technological Growth: Whether you are a student, a developer, or a tech enthusiast, this is a great opportunity to learn new things and grow with us.

Join here

🔗 sKT-Ai-Labs

Come, let's move the Bharat AI Revolution forward together! 💻🔥

#SKTAILabs #DigitalIndia #AIRevolution #ViksitBharat #TechInnovation #JoinTheMission
Ujjwal-Tyagi 
posted an update 13 days ago
I am sharing my study material for AI & ML. These books are a real "bible" and give a very strong foundation. I have also included guidance, an introduction, and my master notes in the dataset repo card. I hope you find them helpful. If you have any questions, just start a discussion and I will be there to help!
Ujjwal-Tyagi/ai-ml-foundations-book-collection
Shrijanagain 
posted an update 14 days ago
Shrijanagain 
posted an update 20 days ago

​We are thrilled to announce the launch of SKT-OMNI-CORPUS-146T-V1, a massive-scale, high-quality dataset designed to power the next generation of Foundation Models (LLMs) from scratch.
​Developed at SKT AI LABS, this corpus is not just a collection of data; it’s a mission to decentralize high-grade AI training for regional languages and global knowledge.

​💎 Key Highlights:

• Massive Scale: Targeting a multi-terabyte architecture for 146T-level tokenization.

• Pure Quality: Curated from 500+ Elite Sources.

• Structured for MoE: Perfectly sharded into 3.5GB standardized units (SKT-𝕻 series) for seamless distributed training.

​🤝 Open for Collaboration!

​We are looking for AI researchers, CUDA engineers, and data scientists to join us in this journey of building Project Surya and the ST-X Series models. Whether it's optimization, custom tokenization, or architecture design—let’s build the future together.

​Explore the Dataset on Hugging Face:

🔗 https://huggingface.co/datasets/Shrijanagain/SKT-OMNI-CORPUS-146T-V1

DSR -- 🔗 https://huggingface.co/datasets/Shrijanagain/SKT-DSRx10000

​#AI #MachineLearning #OpenSource #IndicAI #SKTAILABS #LLM #BigData #HuggingFace #InnovationIndia
Shrijanagain 
posted an update 25 days ago
Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training
Author: SKT AI LABS
Affiliation: SKT AI Labs / Project Surya
Model Architecture: Optimized Dense Transformer
Parameters: 1.1 Trillion
Training Tokens: 146 Trillion

Want to collaborate with us, friends? Let's start the journey. We have collected 146 trillion tokens and completed pre-training, but we need to make the model more powerful.

Whitepaper - https://github.com/SHRIJANAGAIN/PROFF
Nymbo 
posted an update 28 days ago
We should really have a release date range slider on the /models page. Tired of "trending/most downloaded" being the best way to sort and still seeing models from 2023 on the first page just because they're embedded in enterprise pipelines and get downloaded repeatedly. "Recently Created/Recently Updated" don't solve the discovery problem considering the amount of noise to sift through.

Slight caveat: Trending actually does have some recency bias, but it's not strong/precise enough.
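As a rough sketch of what such a slider would do, the snippet below filters a small catalog to a creation-date window and then sorts by downloads inside that window. The `ModelEntry` record, the model IDs, and the numbers are all hypothetical; this is not the Hub's API or data.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ModelEntry:
    model_id: str
    created: date
    downloads: int

def most_downloaded_in_range(models, start, end):
    """Keep models created within [start, end], sorted by downloads (descending)."""
    in_range = [m for m in models if start <= m.created <= end]
    return sorted(in_range, key=lambda m: m.downloads, reverse=True)

# Made-up catalog: one heavily downloaded older model and two recent ones.
catalog = [
    ModelEntry("org/legacy-encoder", date(2023, 3, 1), 9_000_000),
    ModelEntry("org/new-small", date(2025, 11, 20), 40_000),
    ModelEntry("org/new-large", date(2025, 12, 5), 120_000),
]

recent = most_downloaded_in_range(catalog, date(2025, 10, 1), date(2025, 12, 31))
print([m.model_id for m in recent])
```

With the date window applied first, the older model no longer crowds out newer releases even though its download count is far higher.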
Ujjwal-Tyagi 
posted an update about 1 month ago
LTX 2.3 is now available, with better visual quality and richer sound. Check it out! Lightricks/LTX-2.3
Ujjwal-Tyagi 
posted an update about 2 months ago
Public reports allege that Anthropic gobbled up trillions of tokens of copyrighted material and public data to build their castle. 🏰📄 Now that they're sitting on top, they're begging for special laws to protect their profits while pulling the ladder up behind them. 🪜🚫

But the hypocrisy meter just broke! 📉 They are accusing Chinese labs like DeepSeek, Minimax, and Kimi of "huge distillation attacks." The reality is that you can't just loot the entire internet's library, lock the door, and then sue everyone else for reading through the window. Stop trying to gatekeep tech you didn't own in the first place. Read the complete article: https://huggingface.co/blog/Ujjwal-Tyagi/the-dark-underbelly-of-anthropic
Ujjwal-Tyagi 
posted an update about 2 months ago
The Qwen 3.5 model is here! It supports a 1M context length by default and delivers strong performance, competitive with Claude Opus 4.6: Qwen/Qwen3.5-397B-A17B. Here is its GGUF: unsloth/Qwen3.5-397B-A17B-GGUF. Follow me and turn on notifications for the latest news!
Ujjwal-Tyagi 
posted an update about 2 months ago
GLM 5 is insane, it ranks #4 Globally!
Ujjwal-Tyagi 
posted an update 2 months ago
Ujjwal-Tyagi 
posted an update 3 months ago
There is a new open-source music generation model called HeartMuLa. It offers strong, competitive performance compared to Suno and supports English, Chinese, Japanese, Korean, and Spanish. It is optimized to run easily on RTX GPUs and other consumer-grade hardware. HeartMuLa/HeartMuLa-oss-3B
https://github.com/HeartMuLa/heartlib
Ujjwal-Tyagi 
posted an update 3 months ago
Korean labs are also making great progress, right behind the Chinese ones.
Here are two of their open-source AI models that are actually good at coding: upstage/Solar-Open-100B and skt/A.X-K1
Ujjwal-Tyagi 
posted an update 3 months ago
Ujjwal-Tyagi 
posted an update 3 months ago
I am very excited to see the release of nyuuzyou/gitee-code. This is exactly what I have been looking for. Thank you to @nyuuzyou for his hard work on this.
Ujjwal-Tyagi 
posted an update 3 months ago
I’m looking for AI engineers and researchers to join my company as part of the core team. We’ll be working on cutting-edge research and hands-on implementation across LLMs and related systems. I’m especially interested in founding engineers for my AI startup who want to build from the ground up and shape both the product and the research direction. If this sounds interesting to you, reply to this post and message me on Discord (my username is "ujjwal_tyagi.shirova"). Please also send your resume and details of your open-source projects (if any are related to LLMs) on Discord, and avoid sharing them here as a reply to this post.