★ [Leftist Tree] Financial Fraud

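This post looks at the technical side of the Bernard Madoff fraud case as framed by the problem below: two programmers generated false trading records to deceive investors, and the task is to compute the minimal possible risk value. A leftist tree is used to compute that minimum efficiently.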


Bernard Madoff is an American former stock broker, investment adviser, non-executive chairman of the
NASDAQ stock market, and the admitted operator of what has been described as the largest Ponzi scheme
in history.

Two programmers, Jerome O'Hara and George Perez, who helped Bernard Madoff write programs that
produced false records, have been indicted. They were accused of conspiracy, falsifying records of
a broker dealer and falsifying records of an investment adviser.

In every quarter of these years, the two programmers would receive a data sheet, which provided a
series of profit records a1 a2 ... aN in that period. Due to the economic crisis, the awful records
might scare investors away. Therefore, the programmers were asked to falsify these records into
b1 b2 ... bN. In order to deceive investors, any record must be no less than all the records before
it (that is, for any 1 ≤ i < j ≤ N, bi ≤ bj).

On the other hand, they defined a risk value for their illegal work: risk value = | a1 - b1 | +
| a2 - b2 | + ... + | aN - bN | . For example, the original profit records are 300 400 200 100.
If they choose 300 400 400 400 as the fake records, the risk value is 0 + 0 + 200 + 300 = 500.
But if they choose 250 250 250 250 as the fake records, the risk value is 50 + 150 + 50 + 150 =
400. To avoid raising suspicion, they need to minimize the risk value.
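
As a quick check of the arithmetic in the example above, here is a minimal sketch that evaluates the risk value for a given pair of sequences. The function name `risk_value` is a hypothetical helper and is not part of the solution further down.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not from the original solution):
// returns |a1 - b1| + |a2 - b2| + ... + |aN - bN| as a 64-bit sum.
long long risk_value(const std::vector<long long>& a,
                     const std::vector<long long>& b)
{
    assert(a.size() == b.size()); // both sequences must have N entries
    long long total = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        total += std::llabs(a[i] - b[i]);
    return total;
}
```

Calling it with the original records 300 400 200 100 and each of the two fake sequences from the example reproduces the values 500 and 400.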

Given some copies of the original profit records, you need to find the minimal possible risk
value.

Input
There are multiple test cases (no more than 20).
For each test case, the first line contains one integer N (1 ≤ N ≤ 50000), the next line contains
N integers a1 a2 ... aN (-10^9 ≤ ai ≤ 10^9). The input will end with N = 0.

Output
For each test case, print a single line that contains the minimal possible risk value.

Sample Input
4
300 400 200 100
0

Sample Output
400

This problem is an application of mergeable priority queues.

Here it is implemented with a leftist tree.

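First, consider two extreme cases:
 1): If a1 <= a2 <= ... <= an, the solution is clearly bi = ai (i = 1, 2, ..., n);
 2): If a1 > a2 > ... > an, the solution is bi = x, where x is the median of a1, a2, ..., an.

So for any sequence, the optimal answer is never worse than setting every bi to the median.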
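
Case 2 relies on a standard fact: for a constant x, the sum of |ai - x| is smallest when x is a median of the ai. The sketch below is only an illustration of that fact; `constant_cost` and `median_of` are hypothetical helper names, and `median_of` picks the lower median when the length is even.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: risk value when every record is replaced by x.
long long constant_cost(const std::vector<long long>& a, long long x)
{
    long long total = 0;
    for (long long v : a) total += std::llabs(v - x);
    return total;
}

// Hypothetical helper: a median of a (the lower one for even length).
// The vector is taken by value because nth_element reorders it.
long long median_of(std::vector<long long> a)
{
    assert(!a.empty());
    std::size_t mid = (a.size() - 1) / 2;
    std::nth_element(a.begin(), a.begin() + mid, a.end());
    return a[mid];
}
```

For the strictly decreasing sequence 400 300 200 100, the lower median is 200 and the cost is 400; any constant between 200 and 300 gives the same cost, and constants outside that range cost more.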

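In the end, the whole sequence can always be partitioned into several intervals such that all b values within an interval equal that interval's median. A leftist tree maintains the median of each interval; whenever the median of the previous interval is greater than the median of the current interval, the two leftist trees are merged.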
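
To make the interval merging concrete, here is a deliberately naive sketch of the same idea. It keeps every interval as a sorted vector instead of a leftist tree, so it runs in O(n^2) in the worst case and is meant only for illustration and cross-checking, not for the full input size. The names `block_median` and `min_risk_naive` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper: lower median of a sorted, non-empty block.
static long long block_median(const std::vector<long long>& sorted_block)
{
    assert(!sorted_block.empty());
    return sorted_block[(sorted_block.size() - 1) / 2];
}

// Naive illustration of the interval merging described above.
long long min_risk_naive(const std::vector<long long>& a)
{
    std::vector<std::vector<long long>> blocks; // each block stays sorted
    for (long long v : a) {
        blocks.push_back(std::vector<long long>(1, v)); // new one-element interval
        // Merge while the previous interval's median exceeds the current one's.
        while (blocks.size() > 1 &&
               block_median(blocks[blocks.size() - 2]) > block_median(blocks.back())) {
            const std::vector<long long>& prev = blocks[blocks.size() - 2];
            const std::vector<long long>& cur = blocks.back();
            std::vector<long long> merged;
            std::merge(prev.begin(), prev.end(), cur.begin(), cur.end(),
                       std::back_inserter(merged));
            blocks.pop_back();
            blocks.back() = std::move(merged);
        }
    }
    // Every b value in an interval equals that interval's median.
    long long total = 0;
    for (const std::vector<long long>& block : blocks) {
        long long m = block_median(block);
        for (long long v : block) total += std::llabs(v - m);
    }
    return total;
}
```

On the sample 300 400 200 100 all four records end up in one interval with median 200, which gives the expected answer 400. The leftist tree replaces the sorted vectors so that each merge is cheap.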

PS: ZOJ's judge runs on Linux, so 64-bit integers must be printed with "%lld".
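
A 64-bit result is needed because, with N up to 50000 and |ai| up to 10^9, the sum of absolute differences can reach roughly 10^14. As a hedged aside, the sketch below shows one platform-independent way to format such a value, using the standard `PRId64` macro from `<cinttypes>` instead of hard-coding "%lld"; `format_i64` is a hypothetical helper, not something the solution uses.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit integer into a string
// using the format macro that matches std::int64_t on this platform.
std::string format_i64(std::int64_t value)
{
    char buf[32]; // enough for any 64-bit decimal value plus sign
    int written = std::snprintf(buf, sizeof buf, "%" PRId64, value);
    assert(written > 0 && written < static_cast<int>(sizeof buf));
    return std::string(buf);
}
```

For a `long long` variable, "%lld" as used in the code below is the matching format specifier.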

Accepted code:

#include <cstdio>
#include <cstdlib>
#include <cctype>   // isdigit, used by getint()
#include <algorithm>
#include <string>
#define abs(x) ((x) > 0 ? (x) : -(x))

const int maxN = 50010;

struct LeftList
{
    int key, dist;
    LeftList *lc, *rc;
    LeftList() {}
    LeftList(int key):
        key(key), dist(0),
        lc(NULL), rc(NULL) {}
};

LeftList *tr[maxN];
int L[maxN], R[maxN], sz[maxN], a[maxN];

inline int getint()
{
    int res = 0; char tmp; bool sgn = 1;
    do tmp = getchar();
    while (!isdigit(tmp) && tmp != '-');
    if (tmp == '-')
    {
        sgn = 0;
        tmp = getchar();
    }
    do res = (res << 3) + (res << 1) + tmp - '0';
    while (isdigit(tmp = getchar()));
    return sgn ? res : -res;
}

inline bool cmp(const LeftList *a,
                const LeftList *b)
{return a -> key > b -> key;}
// A leftist tree is a min-heap by default; this comparison makes it a max-heap.

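LeftList *Merge(LeftList *a, LeftList *b)
{
    if (!a) return b;
    if (!b) return a;
    // If either of the two trees to merge is empty,
    // return the other one directly.
    if (cmp(b, a)) std::swap(a, b);
    // For convenience, make the tree with the larger root (max-heap)
    // or the smaller root (min-heap) the root of the merged tree.
    a -> rc = Merge(a -> rc, b);
    if (!(a -> lc) || a -> rc &&
        a -> rc -> dist > a -> lc -> dist)
        std::swap(a -> lc, a -> rc);
    // Swap the children if needed to restore the leftist property
    // (the left child's distance is no less than the right child's).
    if (!(a -> rc)) a -> dist = 0;
    else a -> dist = a -> rc -> dist + 1;
    // Update the distance of the merged tree.
    return a;
}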

inline void Del(LeftList *&a) // The reference "&" here must not be omitted.
{
    a = Merge(a -> lc, a -> rc);
    return;
}

int main()
{
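    // Local testing only: these two lines redirect stdin/stdout to files.
    // An online judge normally expects standard input and output.
    freopen("Financial_Fraud.in", "r", stdin);
    freopen("Financial_Fraud.out", "w", stdout);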
    int n;
    while (n = getint())
    {
        int top = 0;
        for (int i = 0; i < n; ++i)
        {
            tr[++top] = new LeftList(a[i] = getint());
            sz[top] = 1; L[top] = R[top] = i;
            // L and R record the interval represented by the current leftist tree.
            while (top > 1 && tr[top - 1] -> key
                   > tr[top] -> key)
            {
                tr[top - 1] = Merge(tr[top - 1], tr[top]);
                sz[top - 1] += sz[top];
                R[top - 1] = R[top];
                for (--top; sz[top] > ((R[top] - L[top])
                     >> 1) + 1; --sz[top])
                    Del(tr[top]);
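                // Pop until the root of the merged leftist tree (its maximum)
                // is the median of the interval, i.e. the tree keeps no more
                // than half (rounded up) of the elements in that interval.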
            }
        }
        long long ans = 0;
        for (int i = 1; i < top + 1; ++i)
        for (int j = L[i]; j < R[i] + 1; ++j)
            ans += abs(a[j] - tr[i] -> key);
        printf("%lld\n", ans);
    }
    return 0;
}

#undef abs
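
For cross-checking the leftist-tree solution on small inputs, here is a sketch of a different, well-known technique (often called the slope trick) that needs only a single `std::priority_queue`. This is not the method used in the code above; it is an alternative under the same problem statement, and `min_risk_heap` is a hypothetical name.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Alternative technique, not the leftist-tree method from this post:
// keep a max-heap of values seen so far; whenever the largest one exceeds
// the new record, pay the difference and replace it with the new record.
long long min_risk_heap(const std::vector<long long>& a)
{
    std::priority_queue<long long> heap; // max-heap by default
    long long total = 0;
    for (long long v : a) {
        heap.push(v);
        if (heap.top() > v) {
            total += heap.top() - v; // cost of pulling the two values together
            heap.pop();
            heap.push(v);
        }
    }
    assert(total >= 0); // a sum of absolute differences is never negative
    return total;
}
```

On the sample input 300 400 200 100 this returns 400, matching the sample output. It only yields the minimal risk value, not the intervals, so it works as an independent check rather than as a replacement for the explanation above.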
