Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization

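This paper proposes Bootstrapped Preference Optimization (BPO) to address the pretraining bias that multimodal large language models (MLLMs) show when responding to visual inputs. By generating a dataset that contains erroneous responses, BPO significantly improves model performance on multiple benchmarks and strengthens the model's use of visual information.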

This post is part of an LLM series and is a translation of "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization".

Abstract

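Multimodal large language models (MLLMs) excel at generating responses based on visual inputs. However, they tend to generate responses that resemble their pretraining corpus, which overshadows the importance of the visual information. We treat this bias as a "preference" for pretraining statistics, one that hinders the model's grounding in the visual input. To mitigate this issue, we propose Bootstrapped Preference Optimization (BPO), which performs preference learning on datasets containing negative responses bootstrapped from the model itself. Specifically, we propose two strategies: 1) feeding distorted image inputs to the MLLM to elicit responses that exhibit pronounced pretraining bias; 2) using a text-based LLM to explicitly inject erroneous but common elements into the original response. These undesirable responses are paired with the original annotated responses from the datasets to construct a preference dataset, which is then used for preference learning. Our approach effectively suppresses the bias of the pretrained LLM and strengthens grounding in visual inputs. Extensive experiments show significant performance improvements across multiple benchmarks, advancing multimodal conversational systems.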
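The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `distort_image` uses simple Gaussian noise as a stand-in distortion, `generate_fn` is a hypothetical callable wrapping an MLLM, and `dpo_loss` is the standard Direct Preference Optimization objective for a single pair (the post lists Direct Preference Optimization as Section 4). The second strategy, error injection by a text-based LLM, is not shown.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # original annotated response from the dataset
    rejected: str  # negative response bootstrapped from the model


def distort_image(pixels, noise_std=0.5, seed=0):
    """Add Gaussian noise to a flat list of pixel values in [0, 1].

    Assumption: the kind and strength of distortion are illustrative only.
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, noise_std))) for p in pixels]


def build_pair(prompt, annotated_response, pixels, generate_fn):
    """Strategy 1: query the model on a distorted image to elicit a biased answer.

    `generate_fn(pixels, prompt) -> str` is a hypothetical MLLM wrapper.
    """
    rejected = generate_fn(distort_image(pixels), prompt)
    return PreferencePair(prompt, annotated_response, rejected)


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

With a stub such as `generate_fn = lambda pixels, prompt: "a dog"`, `build_pair` returns a pair whose `rejected` field holds the stub's answer. The loss falls below `log(2)` once the policy favors the chosen response more strongly than the reference model does.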

1 Introduction

2 Related Work

3 Scalable Preference Dataset Generation

4 Direct Preference Optimization

5 Experiments

6 Conclusion

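In conclusion, our paper introduces Bootstrapped Preference Optimization (BPO) as a solution for mitigating the bias of multimodal large language models (MLLMs) when generating responses based on visual input…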
