Liquid AI Introduces LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M: Dense Bi-Encoder and Late-Interaction Models for Fast Multilingual Search Across 11 Languages
This week, Liquid AI released two new retrieval models: LFM2.5-ColBERT-350M and LFM2.5-Embedding-350M. Each has 350M parameters, and both are the first bidirectional members of the LFM family. They build on LFM2.5-350M-Base, released in March. The pair targets fast multilingual and cross-lingual search across 11 languages, with a footprint small enough to run on CPUs, laptops, and edge devices. Both are available now on Hugging Face under the LFM Open License v1.0.
LFM2.5 Retrievers
The two models share one backbone but represent text differently. LFM2.5-Embedding-350M is a dense bi-encoder. It turns each document into a single vector. Pick it when you want the fastest search and the smallest, cheapest index.
LFM2.5-ColBERT-350M is a late-interaction model. It produces one vector per token rather than one vector per document. This lets it match queries token-by-token for higher accuracy and better generalization. The trade-off is a larger index. Pick it when accuracy matters more than storage. Its query length is capped at 32 tokens. It can also rerank a first-stage retriever’s results without building an index.
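To make the difference from a single-vector score concrete, here is a minimal sketch of late-interaction (MaxSim) scoring in plain Python. The 2-dim token vectors are hypothetical toy values, not model outputs, and the helper names `cosine` and `maxsim` are illustrative rather than part of any library.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take the best
    cosine match among the document token vectors, then sum those maxima.
    Assumes doc_vecs is non-empty."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)

# Toy 2-dim "token embeddings" (hypothetical values, for illustration only).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query directions
doc_b_tokens = [[1.0, 0.0], [1.0, 0.1]]              # covers only the first one

score_a = maxsim(query_tokens, doc_a_tokens)
score_b = maxsim(query_tokens, doc_b_tokens)
print(score_a, score_b)
```

Document A scores higher because every query token finds a close match in it, which is the behavior that lets a late-interaction model reward token-level coverage.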
Both target short-context search. Good fits include product catalogs, FAQ knowledge bases, and support docs. Liquid AI positions both as a drop-in replacement for an existing RAG pipeline.
The Architecture Change: Causal to Bidirectional
Both models start from LFM2.5-350M-Base, a mid-trained general-purpose checkpoint. Liquid AI applies a small set of bidirectional patches to the LFM2 architecture. These adapt it from a causal decoder to a bidirectional encoder.
In a causal setup, each token uses only itself and previous tokens. That suits left-to-right generation but is less natural for retrieval. The team replaces the causal attention mask with a bidirectional one. Now every token can attend to both left and right context. They also make the LFM2 short convolutions non-causal. These mix local information symmetrically around each token, not only from the past.
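The two changes described above can be illustrated with a small, framework-free sketch. This is not Liquid AI's implementation: the function names, the width-3 kernel, and the zero padding are assumptions chosen only to show how a causal mask and a past-only convolution differ from their bidirectional counterparts.

```python
def attention_mask(seq_len, causal):
    """mask[i][j] is True when token i may attend to token j."""
    return [[(j <= i) if causal else True for j in range(seq_len)]
            for i in range(seq_len)]

def short_conv(values, kernel, causal):
    """1-D convolution with zero padding.
    causal=True:  output i mixes positions i-(k-1) .. i (past only).
    causal=False: the window is centred on i (odd kernel length assumed)."""
    k = len(kernel)
    offset = k - 1 if causal else k // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        for t in range(k):
            j = i - offset + t
            if 0 <= j < len(values):
                acc += kernel[t] * values[j]
        out.append(acc)
    return out

causal_mask = attention_mask(4, causal=True)
bidir_mask = attention_mask(4, causal=False)

signal = [0.0, 0.0, 1.0, 0.0, 0.0]  # a single impulse at position 2
kernel = [1.0, 1.0, 1.0]            # hypothetical width-3 kernel
past_only = short_conv(signal, kernel, causal=True)
symmetric = short_conv(signal, kernel, causal=False)
print(past_only)
print(symmetric)
```

With the causal convolution the impulse only influences positions 2 to 4; with the symmetric one it influences positions 1 to 3, so each output sees neighbors on both sides.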
This preserves the LFM2 backbone’s efficiency while producing the full-context representations retrieval needs. Each model has 17 layers: 10 convolution, 6 attention, and 1 pooling or dense. Context length reaches 32,768 tokens, though documents are tuned to 512 tokens. From the shared encoder, the two models differ only in output. Embedding uses CLS-style pooling for one 1024-dim vector. ColBERT keeps 128-dim per-token embeddings for MaxSim late interaction.
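The output shapes above translate directly into index size. The sketch below is a back-of-the-envelope estimate that assumes raw FP16 storage (2 bytes per value) and no compression; the function names and the example corpus of 100,000 documents at 256 tokens each are illustrative assumptions.

```python
def dense_index_bytes(n_docs, dim=1024, bytes_per_value=2):
    """One vector per document."""
    return n_docs * dim * bytes_per_value

def colbert_index_bytes(n_docs, tokens_per_doc, dim=128, bytes_per_value=2):
    """One vector per token, so size grows with document length."""
    return n_docs * tokens_per_doc * dim * bytes_per_value

n_docs, tokens_per_doc = 100_000, 256   # hypothetical corpus
dense = dense_index_bytes(n_docs)
colbert = colbert_index_bytes(n_docs, tokens_per_doc)
ratio = colbert / dense

print(f"dense:   {dense / 1e9:.2f} GB")
print(f"ColBERT: {colbert / 1e9:.2f} GB")
print(f"ratio:   {ratio:.0f}x")
```

Under these assumptions the ratio is simply tokens_per_doc / 8. Compressed production indexes would shrink the ColBERT figure, so treat this as an upper bound rather than a deployment number.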
Training and Data
Both models follow the same three-stage recipe:
- Stage one is large-scale contrastive pretraining in English.
- Stage two is multilingual and cross-lingual distillation from a strong teacher across all 11 languages.
- Stage three is final fine-tuning on hard-mined negatives.
The Embedding model receives slightly more cross-lingual data than ColBERT. Cross-lingual retrieval emerges more naturally in the late-interaction setup. Training data combines curated internal data with open-source English retrieval datasets. LLM-based translation expands the multilingual and cross-lingual pairs.
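Liquid AI does not describe how its hard negatives are mined, so the following is only a generic sketch of the idea behind stage three: score candidates with a first-pass retriever, then keep the highest-scoring documents that are not labeled relevant. The function name, document ids, and scores are hypothetical.

```python
def mine_hard_negatives(scores, positive_ids, num_negatives=2):
    """Return the highest-scoring document ids that are NOT labeled positive.

    scores: dict mapping doc id -> similarity score from a first-pass retriever.
    positive_ids: set of doc ids known to be relevant to the query.
    """
    candidates = [(doc_id, s) for doc_id, s in scores.items()
                  if doc_id not in positive_ids]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in candidates[:num_negatives]]

# Hypothetical first-pass scores for one query; "d1" is the labeled positive.
first_pass_scores = {"d1": 0.91, "d2": 0.88, "d3": 0.35, "d4": 0.80, "d5": 0.10}
hard_negatives = mine_hard_negatives(first_pass_scores, positive_ids={"d1"})
print(hard_negatives)
```

Negatives chosen this way look similar to the query but are wrong, which makes them more informative for fine-tuning than randomly sampled documents.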
Benchmark
Liquid AI evaluated two capabilities. The first is multilingual retrieval with NanoBEIR. The second is cross-lingual open-domain QA with MKQA-11. Both report results across all 11 languages: Arabic, German, English, Spanish, French, Italian, Japanese, Korean, Norwegian, Portuguese, and Swedish.
On average, both models lead their class. Here are the comparison details:
| Model | Type | NanoBEIR ML (NDCG@10) | MKQA-11 (Recall@20) |
|---|---|---|---|
| LFM2.5-ColBERT-350M | late interaction | 0.605 | 0.694 |
| LFM2.5-Embedding-350M | dense | 0.577 | 0.691 |
| Qwen/Qwen3-Embedding-0.6B | dense | 0.556 | 0.638 |
| LFM2-ColBERT-350M | late interaction | 0.540 | 0.646 |
| Alibaba-NLP/gte-multilingual-base | dense | 0.528 | 0.675 |
| lightonai/GTE-ModernColBERT-v1 | late interaction | 0.489 | 0.459 |
| BAAI/bge-large-en-v1.5 | dense | 0.359 | 0.413 |
ColBERT leads on both averages. Embedding is close behind on MKQA-11 at 0.691. Both beat Qwen3-Embedding-0.6B, a larger model. The new ColBERT also improves on the earlier LFM2-ColBERT-350M, from 0.540 to 0.605 on NanoBEIR. Liquid AI also notes that NanoBEIR English tracks the more expensive full BEIR. The two stay highly correlated, with NanoBEIR scoring a near-constant ~15% higher. The research team therefore uses NanoBEIR as a practical proxy during training runs.
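For readers unfamiliar with the two metrics, here is a minimal sketch of how NDCG@k and Recall@k are commonly computed for one query. It assumes linear gain and a set-based recall; the actual benchmark harnesses may differ (for example, open-domain QA setups often count a hit when any top-k passage contains the answer), and the document ids are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_ids, relevance, k=10):
    """relevance: dict doc id -> graded relevance (missing ids count as 0)."""
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg else 0.0

def recall_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of the relevant documents that appear in the top k."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids) if relevant_ids else 0.0

ranked = ["a", "b", "c"]                 # system output, best first
ndcg = ndcg_at_k(ranked, {"a": 1, "c": 1})
recall = recall_at_k(ranked, {"a", "z"})
print(round(ndcg, 4), recall)
```

The benchmark numbers in the table are averages of such per-query scores over each dataset and language.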
Latency and Edge Deployment
Liquid AI released GGUF variants for llama.cpp. These let both models run on CPUs, laptops, and edge devices. The figures below use a MacBook Pro M4 Max at FP16. Queries are 32 tokens; documents are 256 tokens.
| Model | Stage | Docs cached | p50 |
|---|---|---|---|
| LFM2.5-Embedding-350M | Query embedding | yes | 7.3 ms |
| LFM2.5-ColBERT-350M | Query embedding + MaxSim | yes | 8.2 ms |
| LFM2.5-ColBERT-350M | Query + Doc embedding + MaxSim | no | 34.3 ms |
When document embeddings are pre-computed, median (p50) query latency stays under 10 ms. Encoding documents at query time pushes ColBERT to 34.3 ms. For enterprise scale, Liquid AI also built an internal GPU stack. On an H100 at FP16, it observes latencies as low as 1 ms. Embedding query latency there is 1.5 ms p50.
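The article does not describe Liquid AI's measurement harness. As a rough sketch of how a p50 figure like the ones above can be obtained, the code below times repeated calls and takes a nearest-rank percentile. The workload is a stand-in lambda; in practice you would replace it with your own query-encoding call.

```python
import math
import time

def measure_latencies_ms(fn, n_runs=50, warmup=5):
    """Call fn repeatedly and return per-call wall-clock latencies in ms."""
    for _ in range(warmup):          # warm caches before timing
        fn()
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

def percentile(samples, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty sample list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Stand-in workload (an assumption); swap in a real encode call to benchmark.
workload = lambda: sum(i * i for i in range(1000))

samples = measure_latencies_ms(workload)
print(f"p50 = {percentile(samples, 50):.3f} ms, p95 = {percentile(samples, 95):.3f} ms")
```

Reporting the median rather than the mean keeps a few slow outlier calls from dominating the figure.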
Use Cases With Examples
- E-commerce: Search a product catalog across many languages with one index. A shopper types a Korean query and the system surfaces an English product listing. Cross-lingual retrieval makes this work without per-language indexes.
- FAQ and support knowledge bases: Retrieve the right answer reliably across customer-facing surfaces. A French support question maps to an English help article.
- On-device semantic search: Search files, emails, and notes locally on consumer hardware. The GGUF build keeps data on the device at near-zero cost.
- Enterprise knowledge assistants: Retrieve internal legal, financial, and technical documents across languages. ColBERT suits this when answer accuracy outranks index size.
Code: Getting Started
The Embedding model runs through sentence-transformers. Always pass the asymmetric prompts, query: and document:. Omitting them silently degrades retrieval quality.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "LiquidAI/LFM2.5-Embedding-350M",
    trust_remote_code=True,
)

queries = ["What is the capital of France?"]
documents = ["Paris is the capital and largest city of France."]

q_emb = model.encode(queries, prompt_name="query", normalize_embeddings=True)  # applies the query prompt
d_emb = model.encode(documents, prompt_name="document", normalize_embeddings=True)  # applies the document prompt

scores = q_emb @ d_emb.T  # cosine similarities, shape: (n_queries, n_documents)
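Once you have a score matrix like the one above, ranking is a sort per query row. The sketch below uses hard-coded scores and hypothetical document ids so it runs without the model; with the real output you could pass `scores.tolist()`, assuming `encode` returned NumPy arrays.

```python
def top_k(score_row, doc_ids, k=3):
    """Return the k best (doc_id, score) pairs for one query, best first."""
    ranked = sorted(zip(doc_ids, score_row), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical similarity scores: 2 queries x 3 documents.
example_scores = [
    [0.12, 0.83, 0.40],
    [0.77, 0.05, 0.31],
]
doc_ids = ["doc-a", "doc-b", "doc-c"]

results = [top_k(row, doc_ids, k=2) for row in example_scores]
for query_index, hits in enumerate(results):
    print(query_index, hits)
```

For large corpora an approximate nearest-neighbor index would replace this exhaustive sort, but the ranking logic is the same.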
The ColBERT model runs through PyLate. Its PLAID index uses FastPLAID for efficient similarity search.
from pylate import indexes, models, retrieve

model = models.ColBERT(
    model_name_or_path="LiquidAI/LFM2.5-ColBERT-350M",
    trust_remote_code=True,
)
model.tokenizer.pad_token = model.tokenizer.eos_token  # reuse EOS as the padding token

index = indexes.PLAID(index_folder="pylate-index", index_name="index", override=True)  # override=True overwrites any existing index
docs_emb = model.encode(["document 1 text", "document 2 text"], is_query=False)
index.add_documents(documents_ids=["1", "2"], documents_embeddings=docs_emb)

retriever = retrieve.ColBERT(index=index)
q_emb = model.encode(["a search query"], is_query=True)
scores = retriever.retrieve(queries_embeddings=q_emb, k=10)  # top-10 matches per query
To rerank an existing first-stage pipeline instead, skip the index and use rank.rerank.
from pylate import models, rank

model = models.ColBERT(model_name_or_path="LiquidAI/LFM2.5-ColBERT-350M", trust_remote_code=True)

queries = ["query A"]
documents = [["candidate doc 1", "candidate doc 2"]]  # one candidate list per query
documents_ids = [[1, 2]]  # ids aligned with the candidate lists

q_emb = model.encode(queries, is_query=True)
d_emb = model.encode(documents, is_query=False)

reranked = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=q_emb,
    documents_embeddings=d_emb,
)
You can also fine-tune either model on your own data. The Embedding card provides snippets using sentence-transformers and MultipleNegativesRankingLoss.
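To show what an in-batch-negatives objective such as MultipleNegativesRankingLoss optimizes, here is a dependency-free sketch. It is not the library's code: it assumes cosine similarity, a scale factor of 20.0, and that the i-th document is the positive for the i-th query while all other documents in the batch act as negatives. The toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def in_batch_contrastive_loss(query_vecs, doc_vecs, scale=20.0):
    """Mean cross-entropy where doc_vecs[i] is the positive for query_vecs[i]
    and every other document in the batch is treated as a negative."""
    total = 0.0
    for i, q in enumerate(query_vecs):
        logits = [scale * cosine(q, d) for d in doc_vecs]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(query_vecs)

queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its query
swapped_docs = [[0.0, 1.0], [1.0, 0.0]]   # positives point the wrong way

loss_aligned = in_batch_contrastive_loss(queries, aligned_docs)
loss_swapped = in_batch_contrastive_loss(queries, swapped_docs)
print(loss_aligned, loss_swapped)
```

The loss is near zero when each query is closest to its own document and large when it is closer to another document in the batch, which is why larger batches give the model more negatives to learn from.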
Key Takeaways
- Liquid AI’s LFM2.5-ColBERT-350M and LFM2.5-Embedding-350M are the first bidirectional LFMs, built for multilingual search across 11 languages.
- Both 350M models lead their class on NanoBEIR and MKQA-11, beating the larger Qwen3-Embedding-0.6B.
- Embedding gives the smallest, cheapest index; ColBERT trades a larger index for higher accuracy from token-level matching.
- GGUF builds run on CPUs, laptops, and edge via llama.cpp, with cached p50 query latency under 10 ms.
- They drop into existing RAG pipelines through sentence-transformers and PyLate, under the LFM Open License v1.0.
Interactive Explainer
- 01 · Retrieval simulator: Dense vs ColBERT on the same query. Dense scores a single vector with cosine similarity. ColBERT scores per-token vectors with MaxSim, so it can match across languages word-by-word. The simulator is illustrative: its similarity is a lightweight token/concept heuristic, not the real 350M model weights.
- 02 · Late interaction: How MaxSim scores. ColBERT keeps one 128-dim vector per token. For each query token it takes the maximum similarity over all document tokens, then sums those maxima.
- 03 · Cost & speed: Index footprint and query latency. Dense stores one 1024-dim vector per document. ColBERT stores one 128-dim vector per token, so its index grows with document length. Raw FP16 index estimates are N·1024·2 bytes for dense and N·T·128·2 bytes for ColBERT, for N documents of T tokens each. Production ColBERT indexes are quantized and compressed, so real footprints are smaller. Published p50 latencies (32-token query, 256-token document, FP16): 7.3 ms dense and 8.2 ms ColBERT query + MaxSim on a MacBook M4 Max with llama.cpp; 1.5 ms and 2.5 ms on the H100 GPU stack.
- 04 · Benchmarks: Multilingual and cross-lingual scores across all 11 languages. NanoBEIR Multilingual Extended reports NDCG@10. MKQA-11 reports Recall@20. Higher is better.
Data from Liquid AI and Hugging Face model cards. LFM Open License v1.0 · arXiv:2511.23404
The post Liquid AI Introduces LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M: Dense Bi-Encoder and Late-Interaction Models for Fast Multilingual Search Across 11 Languages appeared first on MarkTechPost.
