not working with transformers.js on React

#16
by alien79 - opened

Hello,
I've tried to run the sample code in a React component but it fails with the error

[screenshot of the error message]

In transformers.js's model.js it says:
// This usually occurs when the inputs are of the wrong type.

I literally created a method containing your sample code and called it. (I first tried to implement it based on my own needs but failed with errors, so I tried your code to see if at least that runs, but... no.)

import { AutoTokenizer, AutoModel, matmul } from "@huggingface/transformers";

export const test = async () => {
    // Download from the πŸ€— Hub
    const model_id = "onnx-community/embeddinggemma-300m-ONNX";
    const tokenizer = await AutoTokenizer.from_pretrained(model_id);
    const model = await AutoModel.from_pretrained(model_id, {
        dtype: "fp32", // Options: "fp32" | "q8" | "q4".
    });

    // Run inference with queries and documents
    const prefixes = {
        query: "task: search result | query: ",
        document: "title: none | text: ",
    };
    const query = prefixes.query + "Which planet is known as the Red Planet?";
    const documents = [
        "Venus is often called Earth's twin because of its similar size and proximity.",
        "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
        "Jupiter, the largest planet in our solar system, has a prominent red spot.",
        "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
    ].map((x) => prefixes.document + x);

    const inputs = await tokenizer([query, ...documents], { padding: true });
    const { sentence_embedding } = await model(inputs);

    // Compute similarities to determine a ranking
    const scores = await matmul(sentence_embedding, sentence_embedding.transpose(1, 0));
    const similarities = scores.tolist()[0].slice(1);
    console.log(similarities);
    // [ 0.30109718441963196, 0.6358831524848938, 0.4930494725704193, 0.48887503147125244 ]

    // Convert similarities to a ranking
    const ranking = similarities.map((score, index) => ({ index, score })).sort((a, b) => b.score - a.score);
    console.log(ranking);
    // [
    //   { index: 1, score: 0.6358831524848938 },
    //   { index: 2, score: 0.4930494725704193 },
    //   { index: 3, score: 0.48887503147125244 },
    //   { index: 0, score: 0.30109718441963196 }
    // ]
}
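(Side note for anyone reading along: the matmul-based scoring above is just pairwise dot products, and since the sentence embeddings are normalized that is the same as cosine similarity. A minimal plain-JS sketch, no library needed; the function name is my own, not from transformers.js:)

```javascript
// Cosine similarity between two embedding vectors.
// For L2-normalized embeddings this reduces to a dot product,
// which is what the matmul ranking above computes.
function cosineSimilarity(a, b) {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```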

I tried to use netron.app to dig into the model (I'm quite clueless about model internals, to be honest) and it seems the input type is correct (is it?)

For the record, I'm using this npm package:
"@huggingface/transformers": "3.7.2"

Can you tell me what the problem is?
Am I doing something wrong?
This is the first time I've tried to integrate transformers.js in my frontend (usually I run this kind of code on the backend).

Thanks

I've tried with a different approach, this one:

import { FeatureExtractionPipeline, pipeline, ProgressCallback } from '@huggingface/transformers';

class EmbeddingPipeline {
    private static instance: Promise<FeatureExtractionPipeline> | null = null;
    private static model = 'onnx-community/embeddinggemma-300m-ONNX';
    private static readonly task = 'feature-extraction';

    static async getInstance(progress_callback?: ProgressCallback): Promise<FeatureExtractionPipeline> {
        if (this.instance === null) {
            this.instance = pipeline(this.task, this.model, { progress_callback }) as Promise<FeatureExtractionPipeline>;
        }
        return this.instance;
    }
}

export const getEmbedding = async (text: string): Promise<Float32Array> => {
    const extractor = await EmbeddingPipeline.getInstance();
    const result = await extractor(text, { pooling: 'mean', normalize: true });

    console.log("Embedding result:", result.data);

    return result.data;
};

Now I have a list of chunks to embed; the first one works, but the second crashes (same error as before):

dtype not specified for "model". Using the default dtype (q8) for this device (wasm).    models.js:198
Embedding result: Float32Array(768) [0.0432954765856266, 0.020053738728165627, -0.02574564330279827, -0.02556953951716423, -0.019744617864489555, …]    embedding.ts:20
An error occurred during model execution: "14745128".    models.js:466
Inputs given to model: {input_ids: {…}, attention_mask: {…}}    models.js:467
Indexing failed for doc: doc_1758273721377 14745128

Any clue?

My loop is as simple as

for (let i = 0; i < chunks.length; i++) {
  const embedding = await getEmbedding(chunks[i]);
  // I just add it to an array of embedded chunks
}

Never mind, it seems the problem was that I was feeding it a chunk that was too long for the context.
The library should probably be clearer about this kind of error.
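In case it helps anyone hitting the same error: a minimal sketch of pre-splitting over-long chunks before calling getEmbedding. The character limit below is my own rough assumption (character count only approximates token count, and the model's context window is limited), not a value from the model card; with the tokenizer loaded you could count tokens precisely instead.

```javascript
// Rough guard against over-long inputs: split a chunk into pieces of
// at most maxChars characters before embedding each piece separately.
// MAX_CHARS is an assumed, conservative limit, not an exact token budget.
const MAX_CHARS = 4000;

function splitChunk(text, maxChars = MAX_CHARS) {
    const pieces = [];
    for (let start = 0; start < text.length; start += maxChars) {
        pieces.push(text.slice(start, start + maxChars));
    }
    return pieces;
}
```

Then the loop embeds `splitChunk(chunks[i])` piece by piece instead of the raw chunk.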

alien79 changed discussion status to closed
