Upload 6 files
#1
by mmokoatle - opened
This repository contains the trained model for our manuscript, which is currently under review at BMC Bioinformatics. The model, called simcse-dna, is based on the original implementation of SimCSE. We adapted the original model for downstream DNA tasks by training it on a small sample of k-mer tokens generated from the human reference genome. It can be used to generate sentence embeddings for DNA tasks.
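For readers unfamiliar with k-mer tokens, here is a minimal sketch of how a DNA sequence might be split into overlapping k-mers and joined into a space-separated "sentence" before being passed to a sentence-embedding model. The function names `kmer_tokenize` and `to_sentence`, the default `k=6`, and the stride of 1 are illustrative assumptions, not the settings used to train simcse-dna.

```python
from typing import List


def kmer_tokenize(sequence: str, k: int = 6, stride: int = 1) -> List[str]:
    """Split a DNA sequence into k-mers using a sliding window.

    k and stride are hypothetical defaults chosen for illustration;
    they are not taken from the simcse-dna training setup.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive integers")
    seq = sequence.upper()
    # The range is empty when the sequence is shorter than k,
    # so short inputs yield an empty list rather than an error.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def to_sentence(sequence: str, k: int = 6, stride: int = 1) -> str:
    """Join k-mers with spaces so the sequence reads like a sentence of words."""
    return " ".join(kmer_tokenize(sequence, k=k, stride=stride))


if __name__ == "__main__":
    print(to_sentence("ACGTACGTAC", k=6))
```

The resulting string can then be treated like any other text input to a sentence encoder, with each k-mer playing the role of a word.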
mmokoatle changed pull request status to merged