kalinkrustev's Collections: AI learning
Attention Is All You Need
Paper
• 1706.03762
• Published • 121
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Paper
• 2005.11401
• Published • 14
LoRA: Low-Rank Adaptation of Large Language Models
Paper
• 2106.09685
• Published • 60
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Paper
• 2205.14135
• Published • 15
QLoRA: Efficient Finetuning of Quantized LLMs
Paper
• 2305.14314
• Published • 61
Textbooks Are All You Need
Paper
• 2306.11644
• Published • 154
Layer Normalization
Paper
• 1607.06450
• Published • 4
Segment Anything
Paper
• 2304.02643
• Published • 6
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper
• 2305.18290
• Published • 64
Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
Paper
• 2307.05695
• Published • 24
Visualizing and Understanding Convolutional Networks
Paper
• 1311.2901
• Published
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
Paper
• 2305.07759
• Published • 45
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Paper
• 2307.16789
• Published • 102
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Paper
• 2308.00675
• Published • 37
Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve
Paper
• 2309.13638
• Published
Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks
Paper
• 2307.02477
• Published
Viewer
• Updated • 64k • 5.79k
• 413
Viewer
• Updated • 949k • 9.23k
• 485
Textbooks Are All You Need II: phi-1.5 technical report
Paper
• 2309.05463
• Published • 90
LLaMA: Open and Efficient Foundation Language Models
Paper
• 2302.13971
• Published • 23
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper
• 2307.09288
• Published • 251
Code Llama: Open Foundation Models for Code
Paper
• 2308.12950
• Published • 29
DocPrompting: Generating Code by Retrieving the Docs
Paper
• 2207.05987
• Published • 1
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
Paper
• 1602.07360
• Published • 1
Physics of Language Models: Part 3.2, Knowledge Manipulation
Paper
• 2309.14402
• Published • 7
Wuerstchen: Efficient Pretraining of Text-to-Image Models
Paper
• 2306.00637
• Published • 13
Text Generation
• 1B • Updated • 81.5k
• 1.36k
replit/replit-code-v1_5-3b
Text Generation
• Updated • 1.19k
• 313
ybelkada/segment-anything
Updated • 123
HuggingFaceH4/zephyr-7b-alpha
Text Generation
• 7B • Updated • 5.37k
• 1.12k
fondant-ai/fondant-cc-25m
Viewer
• Updated • 25.9M • 284
• 54
Mistral 7B
Paper
• 2310.06825
• Published • 58
ostris/ikea-instructions-lora-sdxl
Text-to-Image
• Updated • 303
• 239
Viewer
• Updated • 110k • 18.1k
• 126
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Paper
• 2208.12242
• Published • 13
Adding Conditional Control to Text-to-Image Diffusion Models
Paper
• 2302.05543
• Published • 58
MusicLM: Generating Music From Text
Paper
• 2301.11325
• Published • 4
AudioLM: a Language Modeling Approach to Audio Generation
Paper
• 2209.03143
• Published
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper
• 2310.00704
• Published • 20
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Paper
• 2109.10282
• Published • 13
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper
• 2405.00732
• Published • 122