Collections
Collections including paper arxiv:2309.16583
-
GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond
Paper • 2309.16583 • Published • 13
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Paper • 2310.08491 • Published • 57
SO-Bench: A Structural Output Evaluation of Multimodal LLMs
Paper • 2511.21750 • Published • 6
LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics
Paper • 2512.21010 • Published • 4
-
tiiuae/falcon-180B
Text Generation • 180B • Updated • 507 • 1.15k
tiiuae/falcon-180B-chat
Text Generation • 180B • Updated • 660 • 547
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Paper • 2309.14509 • Published • 22
Effective Long-Context Scaling of Foundation Models
Paper • 2309.16039 • Published • 31
-
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
Paper • 2309.16414 • Published • 18
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Paper • 2309.13018 • Published • 9
Robust Speech Recognition via Large-Scale Weak Supervision
Paper • 2212.04356 • Published • 53
Language models in molecular discovery
Paper • 2309.16235 • Published • 10
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 26
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 18
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 12
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 12