PAT A1053 Path of Equal Weight (30 points) (DFS + pruning)

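This article walks through PAT Advanced Level problem A1053, Path of Equal Weight. It explains how to use depth-first search (DFS) to find every root-to-leaf path in a tree whose weights sum to a given value, and presents two solutions: one prunes and backtracks while printing paths as they are found, the other records all matching paths first and prints them afterwards.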

PAT Advanced Level: A1053 Path of Equal Weight (30 points)

Given a non-empty tree with root R, and with weight Wi assigned to each tree node Ti. The weight of a path from R to L is defined to be the sum of the weights of all the nodes along the path from R to any leaf node L.

Now given any weighted tree, you are supposed to find all the paths with their weights equal to a given number. For example, let’s consider the tree showed in the following figure: for each node, the upper number is the node ID which is a two-digit number, and the lower number is the weight of that node. Suppose that the given number is 24, then there exists 4 different paths which have the same given weight: {10 5 2 7}, {10 4 10}, {10 3 3 6 2} and {10 3 3 6 2}, which correspond to the red edges in the figure.

Input Specification:

Each input file contains one test case. Each case starts with a line containing 0<N≤100, the number of nodes in a tree, M (<N), the number of non-leaf nodes, and 0<S<2^30, the given weight number. The next line contains N positive numbers where Wi (<1000) corresponds to the tree node Ti. Then M lines follow, each in the format:

ID K ID[1] ID[2] ... ID[K]

where ID is a two-digit number representing a given non-leaf node, K is the number of its children, followed by a sequence of two-digit ID's of its children. For the sake of simplicity, let us fix the root ID to be 00.
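
The IDs in the input are zero-padded (00, 01, ...), which matters when choosing a conversion. The sketch below uses a hypothetical helper, parseNodeId (not part of the original solutions), to show that the "%d" conversion reads such tokens as plain decimal numbers, so "08" becomes index 8. A conversion such as "%i" would instead treat the leading 0 as an octal prefix and misread IDs like 08 and 09.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper for illustration: converts a zero-padded ID such as "08"
// into the array index 8. "%d" always reads decimal, so the leading zero is harmless.
int parseNodeId(const char* text) {
    int id = -1;
    int matched = std::sscanf(text, "%d", &id);
    assert(matched == 1);  // assumption: the token is a valid two-digit ID
    return id;
}
```

Both solutions below rely on the same behaviour when they call scanf("%d", ...) directly on the input.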

Output Specification:

For each test case, print all the paths with weight S in non-increasing order. Each path occupies a line with printed weights from the root to the leaf in order. All the numbers must be separated by a space with no extra space at the end of the line.

Note: sequence {A1,A2,⋯,An} is said to be greater than sequence {B1,B2,⋯,Bm} if there exists 1≤k<min{n,m} such that Ai=Bi for i=1,⋯,k, and Ak+1>Bk+1.
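
The ordering in the note is ordinary lexicographic comparison of the weight sequences. As a minimal sketch, assuming each path is stored as a std::vector<int> of weights, the built-in vector comparison already expresses it; the function name seqGreater is made up for this illustration.

```cpp
#include <cassert>
#include <vector>

// Returns true when sequence a is "greater" than sequence b in the sense of the note:
// at the first position where the two differ, a holds the larger value.
bool seqGreater(const std::vector<int>& a, const std::vector<int>& b) {
    assert(!a.empty() && !b.empty());  // every path contains at least the root
    return a > b;                      // std::vector compares lexicographically
}
```

For the sample, seqGreater({10, 5, 2, 7}, {10, 4, 10}) is true because the sequences first differ at the second element and 5 > 4.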

Sample Input:

20 9 24
10 2 4 3 5 10 2 18 9 7 2 2 1 3 12 1 8 6 2 2
00 4 01 02 03 04
02 1 05
04 2 06 07
03 3 11 12 13
06 1 09
07 2 08 10
16 1 15
13 3 14 16 17
17 2 18 19

Sample Output:

10 5 2 7
10 4 10
10 3 3 6 2
10 3 3 6 2
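  • Problem: a tree is given in which every node has an ID and a weight; IDs run from 00 to N−1. Output the node weights along every root-to-leaf path whose weights sum to the given value S (stored in the variable w in the code below). If several paths qualify, print them in descending order. For example, with a = [10, 5, 2] and b = [10, 4, 9], a > b because the sequences first differ at the second element and 5 > 4.
  • Analysis: this is essentially a DFS problem. Each node has two attributes, so define an array of structs in which each element holds the node's weight and the IDs of its children. Because the output must be in descending order, each node's children can be sorted by weight in descending order as soon as they are read. The DFS then takes two arguments, the node ID u and the running weight total sum. It returns early (pruning) when sum exceeds S, or when sum equals S but the current node is not a leaf; since all weights are positive, descending further can only increase the total. When sum equals S at a leaf, the weights on the current path are printed immediately. The alternative is to record every qualifying path first and print them all at the end.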

Method 1: (pruning with backtracking)

#include <bits/stdc++.h>
using namespace std;
const int N = 110;
struct node {
    int weight;
    vector<int> child;
} Node[N];
int n, m, w;
vector<int> path;
bool cmp(int a, int b) { return Node[a].weight > Node[b].weight; }
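// u: current node; sum: total weight of the path from the root to u, inclusive.
void dfs(int u, int sum) {
    if (sum > w) return;  // prune: weights are positive, so the total can only grow
    if (sum == w) {
        if (Node[u].child.size() != 0) return;  // target reached at a non-leaf: not a valid path
        for (size_t i = 0; i < path.size(); i++) {
            if (i != 0) printf(" ");
            printf("%d", Node[path[i]].weight);
        }
        printf("\n");
        return;
    }
    for (auto it : Node[u].child) {
        path.push_back(it);  // extend the path with this child
        dfs(it, sum + Node[it].weight);
        path.pop_back();     // backtrack before trying the next child
    }
}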
int main() {
    scanf("%d%d%d", &n, &m, &w);
    for (int i = 0; i < n; i++) scanf("%d", &Node[i].weight);
    int parent, child, k;
    while (m--) {
        scanf("%d%d", &parent, &k);
        for (int i = 0; i < k; i++) {
            scanf("%d", &child);
            Node[parent].child.push_back(child);
        }
        sort(Node[parent].child.begin(), Node[parent].child.end(), cmp);
    }
    path.push_back(0);
    dfs(0, Node[0].weight);
    return 0;
}

Method 2: (record all results, then print)

#include <bits/stdc++.h>
using namespace std;
const int N = 110;
struct node {
    int weight;
    vector<int> child;
} Node[N];
int n, m, w;
vector<int> path;
vector<vector<int> > ans;
bool cmp(int a, int b) { return Node[a].weight > Node[b].weight; }
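// Visits every root-to-leaf path; no pruning, the total is checked only at leaves.
void dfs(int u) {
    path.push_back(u);
    if (Node[u].child.size() == 0) {
        int tempw = 0;  // total weight of the current root-to-leaf path
        for (auto it : path) tempw += Node[it].weight;
        if (tempw == w) ans.push_back(path);  // keep a copy of the matching path
        path.pop_back();
        return;
    }
    for (auto it : Node[u].child) dfs(it);
    path.pop_back();  // backtrack once all children are explored
}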
int main() {
    scanf("%d%d%d", &n, &m, &w);
    for (int i = 0; i < n; i++) scanf("%d", &Node[i].weight);
    int parent, child, k;
    while (m--) {
        scanf("%d%d", &parent, &k);
        for (int i = 0; i < k; i++) {
            scanf("%d", &child);
            Node[parent].child.push_back(child);
        }
        sort(Node[parent].child.begin(), Node[parent].child.end(), cmp);
    }
    dfs(0);
    for (int i = 0; i < ans.size(); i++) {
        for (int j = 0; j < ans[i].size(); j++) {
            if (j != 0) printf(" ");
            printf("%d", Node[ans[i][j]].weight);
        }
        printf("\n");
    }
    return 0;
}
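
Sorting each node's children by weight fixes the output order only when siblings have distinct weights. If two siblings share a weight, the order of the paths through them depends on how std::sort happens to arrange the tie. One conservative option for Method 2, sketched here under the assumption that each recorded path has already been converted from node IDs to a sequence of weights, is to sort the collected sequences directly before printing. The function name sortPathsDescending is made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Sorts weight sequences into non-increasing lexicographic order.
// Takes the paths by value so the caller's copy is left untouched.
std::vector<std::vector<int>> sortPathsDescending(std::vector<std::vector<int>> paths) {
    std::sort(paths.begin(), paths.end(), std::greater<std::vector<int>>());
    // Sanity check: the result is ordered from greatest to smallest sequence.
    assert(std::is_sorted(paths.begin(), paths.end(), std::greater<std::vector<int>>()));
    return paths;
}
```

With this step the per-node sort of children becomes optional, at the cost of holding every matching path in memory until the end.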