Model Introduction
LFM2 is Liquid AI's new generation of hybrid models, designed for edge AI and on-device deployment, and it sets a new standard for quality, speed, and memory efficiency.
This release open-sources the weights of four post-trained models with 350M, 700M, 1.2B, and 2.6B parameters. The models offer the following key features, helping developers build powerful on-device AI applications:
- Faster training and inference: LFM2 trains 3x faster than the previous generation, and its decode and prefill speeds on CPU are 2x faster than Qwen3's.
- Strong benchmark performance: LFM2 outperforms models of similar size across benchmarks covering knowledge, mathematical reasoning, instruction following, and multilingual ability.
- New architecture: LFM2 uses Liquid AI's new-generation hybrid architecture, which combines multiplicative gates with short convolutions (see the sketch after this list).
- Flexible deployment: LFM2 runs efficiently on CPUs, GPUs, and NPUs, and deploys easily to edge devices such as smartphones, laptops, and in-vehicle systems.
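The shapes in the architecture printout below hint at how these pieces fit together: each Lfm2ShortConv block projects the 2048-dim hidden state up to 6144 features (three streams), applies a depthwise causal Conv1d with kernel size 3, and projects back down. Below is a minimal PyTorch sketch of a gated short-convolution operator written to match those printed shapes; it is an illustration of the idea, not Liquid AI's reference implementation, and the class and variable names are our own.

import torch
from torch import nn

class GatedShortConv(nn.Module):
    """Illustrative gated short-convolution block.

    Matches the Lfm2ShortConv shapes printed below: in_proj 2048 -> 6144
    (split into gates b, c and values x), a depthwise causal Conv1d with
    kernel size 3, and out_proj 2048 -> 2048.
    """

    def __init__(self, dim: int = 2048, kernel_size: int = 3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 3 * dim, bias=False)
        # groups=dim makes the convolution depthwise; padding=kernel_size-1
        # plus trimming in forward() keeps it causal.
        self.conv = nn.Conv1d(dim, dim, kernel_size,
                              padding=kernel_size - 1, groups=dim, bias=False)
        self.out_proj = nn.Linear(dim, dim, bias=False)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, dim)
        seq_len = hidden.shape[1]
        b, c, x = self.in_proj(hidden).chunk(3, dim=-1)  # multiplicative gates
        u = (b * x).transpose(1, 2)          # (batch, dim, seq_len)
        u = self.conv(u)[..., :seq_len]      # trim the causal padding
        y = c * u.transpose(1, 2)            # output gate
        return self.out_proj(y)

print(GatedShortConv()(torch.randn(1, 8, 2048)).shape)  # torch.Size([1, 8, 2048])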
Model Loading
from modelscope import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer; the modelscope classes mirror the
# transformers API and fetch the weights from the ModelScope hub.
model_id = "LiquidAI/LFM2-1.2B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
Downloading Model from https://www.modelscope.cn to directory: /home/wangxu/.cache/modelscope/hub/models/LiquidAI/LFM2-1.2B
2025-10-11 16:32:25,472 - modelscope - INFO - Target directory already exists, skipping creation.
Downloading Model from https://www.modelscope.cn to directory: /home/wangxu/.cache/modelscope/hub/models/LiquidAI/LFM2-1.2B
2025-10-11 16:32:30,953 - modelscope - INFO - Target directory already exists, skipping creation.
Model Structure
model
Lfm2ForCausalLM(
(model): Lfm2Model(
(embed_tokens): Embedding(65536, 2048, padding_idx=0)
(layers): ModuleList(
(0-1): 2 x Lfm2DecoderLayer(
(conv): Lfm2ShortConv(
(conv): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=(2,), groups=2048, bias=False)
(in_proj): Linear(in_features=2048, out_features=6144, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(2): Lfm2DecoderLayer(
(self_attn): Lfm2Attention(
(q_proj): Linear(in_features=2048, out_features=2048, bias=False)
(k_proj): Linear(in_features=2048, out_features=512, bias=False)
(v_proj): Linear(in_features=2048, out_features=512, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
(q_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
(k_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(3-4): 2 x Lfm2DecoderLayer(
(conv): Lfm2ShortConv(
(conv): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=(2,), groups=2048, bias=False)
(in_proj): Linear(in_features=2048, out_features=6144, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(5): Lfm2DecoderLayer(
(self_attn): Lfm2Attention(
(q_proj): Linear(in_features=2048, out_features=2048, bias=False)
(k_proj): Linear(in_features=2048, out_features=512, bias=False)
(v_proj): Linear(in_features=2048, out_features=512, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
(q_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
(k_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(6-7): 2 x Lfm2DecoderLayer(
(conv): Lfm2ShortConv(
(conv): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=(2,), groups=2048, bias=False)
(in_proj): Linear(in_features=2048, out_features=6144, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(8): Lfm2DecoderLayer(
(self_attn): Lfm2Attention(
(q_proj): Linear(in_features=2048, out_features=2048, bias=False)
(k_proj): Linear(in_features=2048, out_features=512, bias=False)
(v_proj): Linear(in_features=2048, out_features=512, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
(q_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
(k_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(9): Lfm2DecoderLayer(
(conv): Lfm2ShortConv(
(conv): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=(2,), groups=2048, bias=False)
(in_proj): Linear(in_features=2048, out_features=6144, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(10): Lfm2DecoderLayer(
(self_attn): Lfm2Attention(
(q_proj): Linear(in_features=2048, out_features=2048, bias=False)
(k_proj): Linear(in_features=2048, out_features=512, bias=False)
(v_proj): Linear(in_features=2048, out_features=512, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
(q_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
(k_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(11): Lfm2DecoderLayer(
(conv): Lfm2ShortConv(
(conv): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=(2,), groups=2048, bias=False)
(in_proj): Linear(in_features=2048, out_features=6144, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(12): Lfm2DecoderLayer(
(self_attn): Lfm2Attention(
(q_proj): Linear(in_features=2048, out_features=2048, bias=False)
(k_proj): Linear(in_features=2048, out_features=512, bias=False)
(v_proj): Linear(in_features=2048, out_features=512, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
(q_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
(k_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(13): Lfm2DecoderLayer(
(conv): Lfm2ShortConv(
(conv): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=(2,), groups=2048, bias=False)
(in_proj): Linear(in_features=2048, out_features=6144, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(14): Lfm2DecoderLayer(
(self_attn): Lfm2Attention(
(q_proj): Linear(in_features=2048, out_features=2048, bias=False)
(k_proj): Linear(in_features=2048, out_features=512, bias=False)
(v_proj): Linear(in_features=2048, out_features=512, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
(q_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
(k_layernorm): Lfm2RMSNorm((64,), eps=1e-05)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(15): Lfm2DecoderLayer(
(conv): Lfm2ShortConv(
(conv): Conv1d(2048, 2048, kernel_size=(3,), stride=(1,), padding=(2,), groups=2048, bias=False)
(in_proj): Linear(in_features=2048, out_features=6144, bias=False)
(out_proj): Linear(in_features=2048, out_features=2048, bias=False)
)
(feed_forward): Lfm2MLP(
(w1): Linear(in_features=2048, out_features=8192, bias=False)
(w3): Linear(in_features=2048, out_features=8192, bias=False)
(w2): Linear(in_features=8192, out_features=2048, bias=False)
)
(operator_norm): Lfm2RMSNorm((2048,), eps=1e-05)
(ffn_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
)
(pos_emb): Lfm2RotaryEmbedding()
(embedding_norm): Lfm2RMSNorm((2048,), eps=1e-05)
)
(lm_head): Linear(in_features=2048, out_features=65536, bias=False)
)
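The printout makes the interleaving concrete: layers 0-1, 3-4, 6-7, 9, 11, 13 and 15 are Lfm2ShortConv blocks, while layers 2, 5, 8, 10, 12 and 14 are Lfm2Attention blocks. A quick way to confirm this from the loaded model is to check which submodule each decoder layer instantiates:

# Print the operator type of each of the 16 decoder layers.
for i, layer in enumerate(model.model.layers):
    print(i, "conv" if hasattr(layer, "conv") else "self_attn")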
Model Configuration
model.config
Lfm2Config {
"architectures": [
"Lfm2ForCausalLM"
],
"block_auto_adjust_ff_dim": true,
"block_dim": 2048,
"block_ff_dim": 12288,
"block_ffn_dim_multiplier": 1.0,
"block_mlp_init_scale": 1.0,
"block_multiple_of": 256,
"block_norm_eps": 1e-05,
"block_out_init_scale": 1.0,
"block_use_swiglu": true,
"block_use_xavier_init": true,
"bos_token_id": 1,
"conv_L_cache": 3,
"conv_bias": false,
"conv_dim": 2048,
"conv_dim_out": 2048,
"conv_use_xavier_init": true,
"dtype": "bfloat16",
"eos_token_id": 7,
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 12288,
"layer_types": [
"conv",
"conv",
"full_attention",
"conv",
"conv",
"full_attention",
"conv",
"conv",
"full_attention",
"conv",
"full_attention",
"conv",
"full_attention",
"conv",
"full_attention",
"conv"
],
"max_position_embeddings": 128000,
"model_type": "lfm2",
"norm_eps": 1e-05,
"num_attention_heads": 32,
"num_heads": 32,
"num_hidden_layers": 16,
"num_key_value_heads": 8,
"pad_token_id": 0,
"rope_theta": 1000000.0,
"transformers_version": "4.57.0.dev0",
"use_cache": true,
"use_pos_enc": true,
"vocab_size": 65536
}
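The layer_types field makes the hybrid layout explicit: 10 of the 16 layers are convolutional and 6 use full attention. The attention layers use grouped-query attention, with num_attention_heads=32 query heads sharing num_key_value_heads=8 key/value heads of size 64, which is why k_proj and v_proj output 8 x 64 = 512 features in the printout above. A quick sanity check against the loaded config:

from collections import Counter

# Tally the per-layer operator types declared in the config.
print(Counter(model.config.layer_types))
# Counter({'conv': 10, 'full_attention': 6})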
Model Usage
prompt = "What is C. elegans? Reply in Chinese."

# Wrap the prompt in the chat template and tokenize it.
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)

# Sample a response (sampling settings as in the original example).
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
)
print(tokenizer.decode(output[0], skip_special_tokens=False))
<|startoftext|><|im_start|>user
What is C. elegans? Reply in Chinese.<|im_end|>
<|im_start|>assistant
“C. elegans” 是线虫(线虫目,Phylum Nematoda)中的一种单细胞动物,是研究生物学、遗传学和发育生物学的经典模式生物。它是一种非常小的生物,大约长约2毫米(0.08英寸),属于地球上最小的多细胞动物之一。
C. elegans 的特点包括:
1. **简单的身体结构**:它有一对双眼、几个触角、两对腹部节段,每个节段都有三个神经节(neurons),形成一个简单的神经系统。
2. **透明体**:成虫和幼虫都是透明的,这使得科学家能够直接观察其内部结构和发育过程。
3. **寿命短**:成虫的寿命通常在2-4周,但通过实验可以延长到数十周。
4. **易于培养**:C. elegans 可以在实验室中方便地培养和维护,使其成为研究遗传和发育生物学的理想模型生物。
5. **基因组信息丰富**:它的基因组序列已完全测序,为研究基因功能提供了宝贵的资源。
6. **行为研究**:尽管体型微小,但 C. elegans 的行为复杂,包括运动、摄食和感知环境等,研究这些行为有助于理解神经系统和行为的基本机制。
由于其简单的生命周期、透明的体貌和易于操作的特性,C. elegans 在科学研究中被广泛应用于遗传学、发育生物学、神经科学和衰老研究等领域。此外,它还被用作药物筛选和毒理学研究的模型生物。<|im_end|>
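For interactive on-device scenarios you typically want tokens to appear as they are generated rather than all at once. Below is a minimal streaming variant of the call above, using transformers' TextStreamer with the same model, tokenizer, and input_ids (sampling settings unchanged):

from transformers import TextStreamer

# Print decoded tokens to stdout as they are generated, skipping the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)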