Phi-1.5-RLLMv3 - a migueldeguzmandev Collection

migueldeguzmandev's Collections

Petertodd, a Paperclip Maximizer

RLLM trained, robust models

GPT2XL_RLLMv3-PPT

paperclip-GPT2XL_RLLMv2

GPT2XL_RLLMv3-Assist

GPT2XL_RLLM-HDI-2

GPT2XL_RLLM-HDI-1

GPT2XL_RLLMv10-wd-001,003,010

Falcon-1B-RW-RLLMv3

paperclip-Falcon-RW-1B_RLLMv2

paperclip-Phi-1.5_RLLMv2

GPT2XL_RLLMv1.21

Phi-1.5-RLLMv3

updated May 8, 2024

This collection presents the ten RLLM steps (training runs) intended to improve the coherence and politeness of Phi-1.5's outputs.