Ceph RGW: Bucket Policy Implementation

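This article examines how Bucket Policy is implemented in Ceph RGW and where its limits lie: how a policy is attached to a bucket, how a policy is evaluated, and how the behavior differs from AWS S3. Practical test cases show how to grant access to specific users or to all users.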


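Compared with AWS, RGW's bucket policy implementation is still incomplete. Many details are unsupported, and the features that are supported differ from S3 in a number of ways, chiefly because RGW has no S3-style account/user structure and uses tenants as a substitute.

The documentation also mentions that, to remove these differences and support more features, the RGW Authentication/Authorization subsystem will be rewritten in the near future. That rewrite could cause some compatibility problems.

The main differences are: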

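  • As the name suggests, policies can only be set on buckets, not on users or other resources.

  • A Principal is specified as "Principal":{"AWS":"arn:aws:iam::<tenant>:user/<username>"}, because RGW currently uses the 'tenant' identifier in place of the Amazon twelve-digit account ID.

  • Variables such as ${aws:username} are not supported in the policy JSON.

  • The Actions and Conditions that RGW supports are a subset of those in AWS; the documentation lists them. The supported Condition keys can also be seen in RGWListBucket::verify_permission() and rgw_build_iam_environment(...) in the appendix.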

  • Under AWS, all tenants share a single namespace. RGW gives every tenant its own namespace of buckets. There may be an option to enable an AWS-like ‘flat’ bucket namespace in future versions. At present, to access a bucket belonging to another tenant, address it as “tenant:bucket” in the S3 request.

  • In AWS, a bucket policy can grant access to another account, and that account owner can then grant access to individual users with user permissions. Since we do not yet support user, role, and group permissions, account owners will currently need to grant access directly to individual users, and granting an entire account access to a bucket grants access to all users in that account.
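The Principal format and the tenant-qualified bucket name from the list above can be sketched with two small helpers. This is only an illustration: the function names rgw_principal_arn and rgw_bucket_ref are made up for this example and are not part of RGW or of any SDK.

```python
def rgw_principal_arn(tenant: str, user: str) -> str:
    """Build a Principal ARN in the form RGW expects.

    The tenant takes the place of the AWS account ID. An empty
    tenant (the default tenant) simply leaves that field blank.
    """
    return f"arn:aws:iam::{tenant}:user/{user}"


def rgw_bucket_ref(tenant: str, bucket: str) -> str:
    """Name a bucket as seen from another tenant ("tenant:bucket")."""
    return f"{tenant}:{bucket}" if tenant else bucket


print(rgw_principal_arn("tenanttwo", "userthree"))
print(rgw_principal_arn("", "testid2"))
print(rgw_bucket_ref("tenantone", "bucketone"))
```

The helpers do nothing beyond string formatting; they only make the placement of the tenant explicit.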

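Implementation

Setting a bucket policy on a bucket means storing the uploaded policy JSON text in the xattrs of the bucket's bucket.instance object under the key user.rgw.iam-policy. When the policy is needed later, it is read back from the xattr and parsed.

Policy enforcement starts in process_request(...) in rgw_process.cc. This is the function the RGW frontend callbacks end up in to validate and execute a request. It belongs to the generic REST API processing layer, for which process_request is the entry point, and its main steps are roughly: user authentication, bucket/object ACL and policy checks, user/bucket quota checks, and executing the operation.

Bucket policy checking itself happens in rgw_process_authenticated, which process_request calls. That function calls init_permissions and then read_permissions, and both contain statements that load the bucket policy into req_state.iam_policy.

Finally, op->verify_permission checks permissions according to the operation, and that includes the policy check. The check proceeds roughly as follows: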

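  1. Convert the requesting principal and the requested resource into the policy's Principal and Resource formats and store them in the corresponding objects, named ida and res.
  2. Check whether ida matches the Principal in the bucket policy. If nothing matches, return Effect::Pass, meaning that no policy statement applies and other authorization mechanisms must decide whether the request runs. (The other two states are Effect::Allow and Effect::Deny, meaning permit and block.)
  3. Check whether res matches the policy's Resource (and does not match its NotResource). If not, return Effect::Pass.
  4. Check whether the requested operation matches the policy's Action. If not, return Effect::Pass.
  5. Check whether the request satisfies every Condition of the statement. If it does, return the statement's own Effect (Effect::Allow or Effect::Deny); if it does not, return Effect::Pass.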
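The numbered steps can be sketched as a small evaluator. This is a simplified model, not the RGW code: it assumes that a statement whose conditions all hold yields its own Effect and that every other case yields Pass, it matches principals by plain string comparison, and it uses shell-style wildcards for actions and resources. The Statement class and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from fnmatch import fnmatchcase
from typing import Callable, Dict, List

Env = Dict[str, str]


class Effect(Enum):
    ALLOW = "Allow"
    DENY = "Deny"
    PASS = "Pass"


@dataclass
class Statement:
    effect: Effect                      # Effect.ALLOW or Effect.DENY
    principals: List[str]               # "*" matches every principal
    actions: List[str]                  # e.g. "s3:ListBucket" or "s3:*"
    resources: List[str]                # ARN patterns, "*" is a wildcard
    not_resources: List[str] = field(default_factory=list)
    conditions: List[Callable[[Env], bool]] = field(default_factory=list)

    def eval(self, env: Env, ida: str, action: str, res: str) -> Effect:
        # Step 2: the requesting principal must match.
        if not any(p == "*" or p == ida for p in self.principals):
            return Effect.PASS
        # Step 3: Resource must match and NotResource must not.
        if not any(fnmatchcase(res, r) for r in self.resources):
            return Effect.PASS
        if any(fnmatchcase(res, r) for r in self.not_resources):
            return Effect.PASS
        # Step 4: the requested action must match.
        if not any(fnmatchcase(action, a) for a in self.actions):
            return Effect.PASS
        # Step 5: every condition must hold for the effect to apply.
        if all(cond(env) for cond in self.conditions):
            return self.effect
        return Effect.PASS


stmt = Statement(
    effect=Effect.ALLOW,
    principals=["arn:aws:iam::tenantone:user/usertwo"],
    actions=["s3:ListBucket"],
    resources=["arn:aws:s3:::bucketone"],
    conditions=[lambda env: env.get("aws:UserAgent") == "cmh-test"],
)
print(stmt.eval({"aws:UserAgent": "cmh-test"},
                "arn:aws:iam::tenantone:user/usertwo",
                "s3:ListBucket",
                "arn:aws:s3:::bucketone"))
```

Returning Pass rather than Deny on a mismatch is what lets ACLs and other mechanisms still decide the request when no statement applies.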

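A Condition can constrain two kinds of input: it can require that the request carry a given header, or that it carry a given request (query) parameter. When a request is checked, header values are stored in req_state::env by rgw_build_iam_environment. Request parameters are first stored in member variables of RGWListBucket (or of another op that needs them, such as RGWListBucketMultiparts) and are copied into req_state::env only when RGWListBucket::verify_permission() is called. Condition.eval(...) then compares against req_state::env.

The code that prepares the parameters a Condition needs is in the last few functions of the appendix.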
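The two sources of Condition input can be modeled as a single dictionary. A minimal sketch, assuming only a handful of keys: build_env, string_equals, and numeric_equals are hypothetical names for this example, and the real RGW code covers more keys and records max-keys unconditionally.

```python
from typing import Dict, Mapping


def build_env(headers: Mapping[str, str], query: Mapping[str, str]) -> Dict[str, str]:
    """Collect the values a Condition may reference into one dict."""
    env: Dict[str, str] = {}
    # Header-derived keys (the rgw_build_iam_environment side).
    if "User-Agent" in headers:
        env["aws:UserAgent"] = headers["User-Agent"]
    if "Referer" in headers:
        env["aws:Referer"] = headers["Referer"]
    # Request-parameter keys (the RGWListBucket::verify_permission side).
    for param in ("prefix", "delimiter", "max-keys"):
        if query.get(param):
            env[f"s3:{param}"] = query[param]
    return env


def string_equals(env: Mapping[str, str], key: str, expected: str) -> bool:
    """StringEquals: the key must be present and equal to the expected text."""
    return env.get(key) == expected


def numeric_equals(env: Mapping[str, str], key: str, expected: str) -> bool:
    """NumericEquals: compare as numbers; a missing or bad value fails."""
    try:
        return float(env[key]) == float(expected)
    except (KeyError, ValueError):
        return False


env = build_env({"User-Agent": "cmh-test"}, {"max-keys": "10"})
print(env)
print(string_equals(env, "aws:UserAgent", "cmh-test"))
print(numeric_equals(env, "s3:max-keys", "10"))
```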

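A quick check:
Create a bucket named testbucket and set a policy on it with s3cmd. The xattrs of the bucket's bucket.instance object now include the attribute user.rgw.iam-policy. The following command lists all xattrs of the object.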

$ ./bin/rados -p default.rgw.meta --namespace=root listxattr .bucket.meta.testbucket:f52fe9ac-581e-432f-a8d2-363748a54fa8.4167.1

The next command fetches the value of a given key. The stored value is exactly the JSON text that was uploaded.

$ ./bin/rados -p default.rgw.meta --namespace=root getxattr .bucket.meta.testbucket:f52fe9ac-581e-432f-a8d2-363748a54fa8.4167.1 user.rgw.iam-policy

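Functional tests

Basic PUT Policy and DELETE Policy work without problems when tested through s3cmd.

Several common scenarios are tested below. Before testing, create a few users:
testid and testid2, in the default (empty) tenant
userone and usertwo, in tenantone
userthree, in tenanttwo

The tests below use s3cmd. Only the first case lists the full commands; later cases omit them.

Grant a specific permission to all users

cmh@ubuntu:~/code/files$ cat policy.json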
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:ListBucket",
            "Resource": [
                "arn:aws:s3:::bucketone",
                "arn:aws:s3:::bucketone/*"
            ],
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            }
        }
    ]
}

cmh@ubuntu:~$ cp .s3cfg_userone .s3cfg
cmh@ubuntu:~/code/files$ s3cmd setpolicy policy.json s3://bucketone

cmh@ubuntu:~$ cp .s3cfg_usertwo .s3cfg
cmh@ubuntu:~/code/files$ s3cmd ls s3://bucketone
2017-12-07 07:53       977   s3://bucketone/objone
2017-12-07 07:53       977   s3://bucketone/objtwo

cmh@ubuntu:~$ cp .s3cfg_userthree .s3cfg
cmh@ubuntu:~/code/files$ s3cmd ls s3://bucketone
ERROR: Bucket 'bucketone' does not exist
ERROR: S3 error: 404 (NoSuchBucket)
cmh@ubuntu:~/code/files$ s3cmd ls s3://tenantone:bucketone
2017-12-07 07:53       977   s3://tenantone:bucketone/objone
2017-12-07 07:53       977   s3://tenantone:bucketone/objtwo

Grant specific permissions to specific users

cmh@ubuntu:~/code/files$ cat policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": ["s3:ListBucket","s3:GetObject"],
            "Resource": [
                "arn:aws:s3:::bucketone",
                "arn:aws:s3:::bucketone/*"
            ],
            "Effect": "Allow",
            "Principal": {
                "AWS": ["arn:aws:iam:::user/testid2","arn:aws:iam::tenanttwo:user/userthree"]
            }
        }
    ]
}
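Policies of this shape differ only in the principals and actions, so they can be generated instead of edited by hand. A minimal sketch using only the standard library; build_bucket_policy is a hypothetical helper name, and the single Allow statement mirrors the structure of the policy file above.

```python
import json
from typing import List


def build_bucket_policy(bucket: str, principals: List[str], actions: List[str]) -> str:
    """Return policy JSON granting `actions` on `bucket` to `principals`."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": actions,
                # Cover both the bucket itself and the objects inside it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }
    return json.dumps(policy, indent=4)


text = build_bucket_policy(
    "bucketone",
    ["arn:aws:iam:::user/testid2", "arn:aws:iam::tenanttwo:user/userthree"],
    ["s3:ListBucket", "s3:GetObject"],
)
print(text)
```

The output can be saved as policy.json and applied with s3cmd setpolicy in the same way as in the first case.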

Grant all permissions to specific users

cmh@ubuntu:~/code/files$ cat policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::bucketone",
                "arn:aws:s3:::bucketone/*"
            ],
            "Effect": "Allow",
            "Principal": {
                "AWS": ["arn:aws:iam:::user/testid2","arn:aws:iam::tenanttwo:user/userthree"]
            }
        }
    ]
}

Grant all permissions to all users

cmh@ubuntu:~/code/files$ cat policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::bucketone",
                "arn:aws:s3:::bucketone/*"
            ],
            "Effect": "Allow",
            "Principal": {
                "AWS":"*" 
            }
        }
    ]
}

With a Condition: grant a specific permission to a specific user and require a given header in the request

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "statement1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::tenantone:user/usertwo"
            },
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::bucketone"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:UserAgent": "cmh-test"
                }
            }
        },
        {
            "Sid": "statement2",
            "Effect": "Deny",
            "Principal": {
                "AWS": "arn:aws:iam::tenantone:user/usertwo"
            },
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::bucketone"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:UserAgent": "cmh-test"
                }
            }
        }
    ]
}
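How the two statements in this policy combine can be sketched as follows. The sketch assumes the request already comes from usertwo and is a ListBucket on bucketone, so only the Condition and the Effect are modeled. It also assumes that a Deny from any applicable statement wins over an Allow, and that a condition on a key missing from the environment does not hold. The tuple layout and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# (effect, condition operator, condition key, condition value)
Statement = Tuple[str, str, str, str]

POLICY: List[Statement] = [
    ("Allow", "StringEquals", "aws:UserAgent", "cmh-test"),
    ("Deny", "StringNotEquals", "aws:UserAgent", "cmh-test"),
]


def condition_holds(op: str, key: str, value: str, env: Dict[str, str]) -> bool:
    """Evaluate one string condition against the request environment."""
    if key not in env:
        return False  # assumption of this sketch: a missing key never matches
    if op == "StringEquals":
        return env[key] == value
    if op == "StringNotEquals":
        return env[key] != value
    raise ValueError(f"unsupported operator: {op}")


def evaluate(policy: List[Statement], env: Dict[str, str]) -> str:
    """Return "Deny", "Allow", or "Pass" for the whole policy."""
    allowed = False
    for effect, op, key, value in policy:
        if not condition_holds(op, key, value, env):
            continue  # this statement does not apply (Pass)
        if effect == "Deny":
            return "Deny"  # an explicit Deny ends the evaluation
        allowed = True
    return "Allow" if allowed else "Pass"


print(evaluate(POLICY, {"aws:UserAgent": "cmh-test"}))
print(evaluate(POLICY, {"aws:UserAgent": "curl/7.47.0"}))
```

With the expected User-Agent only the Allow statement applies; with any other User-Agent only the Deny statement applies, so the request is rejected outright instead of falling back to other checks.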

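With a Condition: grant a specific permission to a specific user and require a given request parameter

Currently only s3:prefix, s3:delimiter, and s3:max-keys of ListBucket are supported.

Verification fails on the L release (Luminous) and passes on master branch code.

User 1 sets the policy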

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "s3:ListBucket",
            "Resource": [
                "arn:aws:s3:::bucketone",
                "arn:aws:s3:::bucketone/*"
            ],
            "Effect": "Allow",
            "Principal": {
                "AWS":"*" 
            },
            "Condition":{
                "NumericEquals": {
                    "s3:max-keys": "10"
                }
            }
        }
    ]
}

User 2 sends the request

#!/bin/bash

access_key="usertwo123"
secret_key="usertwo123"
date=$(date -R -u)
string_to_sign="GET\n\n\n${date}\n/bucketone/"
signature=$(echo -en "${string_to_sign}" | openssl sha1 -hmac "${secret_key}" -binary | base64)

curl "http://127.0.0.1:8000/bucketone/?max-keys=10"              \
    -H "Date: ${date}"                                              \
    -H "User-Agent: cmh-test"                                       \
    -H "Authorization: AWS ${access_key}:${signature}"              \
    -X GET  -v
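The signature in the script is an HMAC-SHA1 over a newline-joined string-to-sign, as used by AWS Signature Version 2. The same computation can be sketched in Python with the standard library only. The function names sign_v2 and auth_header are hypothetical, and the keys are the test credentials from the script.

```python
import base64
import hashlib
import hmac
from email.utils import formatdate


def sign_v2(secret_key: str, method: str, date: str, resource: str,
            content_md5: str = "", content_type: str = "") -> str:
    """Return the base64 HMAC-SHA1 signature for a simple request."""
    # Same layout as the shell script: METHOD, MD5, type, date, resource.
    string_to_sign = "\n".join([method, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def auth_header(access_key: str, signature: str) -> str:
    """Format the value of the Authorization header."""
    return f"AWS {access_key}:{signature}"


date = formatdate(usegmt=True)  # RFC 1123 date in GMT for the Date header
signature = sign_v2("usertwo123", "GET", date, "/bucketone/")
print("Date:", date)
print("Authorization:", auth_header("usertwo123", signature))
```

The max-keys query parameter is left out of the signed resource here, matching the script, which signs /bucketone/ only.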

Appendix: code snippets

Note: the code below is from the master branch, not the L release.

RGWPutBucketPolicy::execute()

The function that executes a policy upload request.

void RGWPutBucketPolicy::execute() {
  op_ret = get_params();
  if (op_ret < 0) {
    return;
  }

  bufferlist in_data = bufferlist::static_from_mem(data, len);

  if (!store->is_meta_master()) {
    op_ret = forward_request_to_master(s, NULL, store, in_data, nullptr);
    if (op_ret < 0) {
      ldout(s->cct, 20) << "forward_request_to_master returned ret=" << op_ret << dendl;
      return;
    }
  }

  try {
    Policy p(s->cct, s->bucket_tenant, in_data);
    // Drop the bucket's existing policy and store the new one
    auto attrs = s->bucket_attrs;
    attrs[RGW_ATTR_IAM_POLICY].clear();
    attrs[RGW_ATTR_IAM_POLICY].append(p.text);
    op_ret = rgw_bucket_set_attrs(store, s->bucket_info, attrs,
                  &s->bucket_info.objv_tracker);
    if (op_ret == -ECANCELED) {
      op_ret = 0; /* lost a race, but it's ok because policies are immutable */
    }
  } catch (rgw::IAM::PolicyParseException& e) {
    ldout(s->cct, 20) << "failed to parse policy: " << e.what() << dendl;
    op_ret = -EINVAL;
  }
}

rgw_process_authenticated(...)

The entry point that takes a request from permission checks through to execution.

int rgw_process_authenticated(RGWHandler_REST * const handler,
                              RGWOp *& op,
                              RGWRequest * const req,
                              req_state * const s,
                              const bool skip_retarget)
{
  req->log(s, "init permissions");
  
  // init_permissions loads ACLs, policy, etc. from xattrs into memory.
  // It calls do_init_permissions,
  // which in turn calls rgw_build_bucket_policies.
  // Near its end, rgw_build_bucket_policies calls get_iam_policy_from_attr
  // and stores the bucket policy in req_state.iam_policy.
  int ret = handler->init_permissions(op);
  if (ret < 0) {
    return ret;
  }

  /**
   * Only some accesses support website mode, and website mode does NOT apply
   * if you are using the REST endpoint either (ergo, no authenticated access)
   */
  if (! skip_retarget) {
    req->log(s, "recalculating target");
    ret = handler->retarget(op, &op);
    if (ret < 0) {
      return ret;
    }
    req->op = op;
  } else {
    req->log(s, "retargeting skipped because of SubOp mode");
  }

  /* If necessary extract object ACL and put them into req_state. */
  req->log(s, "reading permissions");
  // This function also calls get_iam_policy_from_attr
  // and stores the bucket policy in req_state.iam_policy.
  ret = handler->read_permissions(op);
  if (ret < 0) {
    return ret;
  }

  req->log(s, "init op");
  ret = op->init_processing();
  if (ret < 0) {
    return ret;
  }

  req->log(s, "verifying op mask");
  ret = op->verify_op_mask();
  if (ret < 0) {
    return ret;
  }

  req->log(s, "verifying op permissions");
  // The final check
  ret = op->verify_permission();
  if (ret < 0) {
    if (s->system_request) {
      dout(2) << "overriding permissions due to system operation" << dendl;
    } else if (s->auth.identity->is_admin_of(s->user->user_id)) {
      dout(2) << "overriding permissions due to admin operation" << dendl;
    } else {
      return ret;
    }
  }

  req->log(s, "verifying op params");
  ret = op->verify_params();
  if (ret < 0) {
    return ret;
  }
  // Execute the request and return the result to the client
  req->log(s, "pre-executing");
  op->pre_exec();

  req->log(s, "executing");
  op->execute();

  req->log(s, "completing");
  op->complete();

  return 0;
}

rgw_build_iam_environment(...)

Based on the request headers, stores the header values that Conditions can reference into req_state::env.

rgw::IAM::Environment rgw_build_iam_environment(RGWRados* store,
                        struct req_state* s)
{
  rgw::IAM::Environment e;
  const auto& m = s->info.env->get_map();
  auto t = ceph::real_clock::now();
  e.emplace("aws:CurrentTime", std::to_string(ceph::real_clock::to_time_t(t)));
  e.emplace("aws:EpochTime", ceph::to_iso_8601(t));
  // TODO: This is fine for now, but once we have STS we'll need to
  // look and see. Also this won't work with the IdentityApplier
  // model, since we need to know the actual credential.
  e.emplace("aws:PrincipalType", "User");

  auto i = m.find("HTTP_REFERER");
  if (i != m.end()) {
    e.emplace("aws:Referer", i->second);
  }

  // These seem to be the semantics, judging from rest_rgw_s3.cc
  i = m.find("SERVER_PORT_SECURE");
  if (i != m.end()) {
    e.emplace("aws:SecureTransport", "true");
  }

  i = m.find("HTTP_HOST");
  if (i != m.end()) {
    e.emplace("aws:SourceIp", i->second);
  }

  i = m.find("HTTP_USER_AGENT");
  if (i != m.end()) {
    e.emplace("aws:UserAgent", i->second);
  }

  if (s->user) {
    // What to do about aws::userid? One can have multiple access
    // keys so that isn't really suitable. Do we have a durable
    // identifier that can persist through name changes?
    e.emplace("aws:username", s->user->user_id.id);
  }
  return e;
}

RGWListBucket_ObjStore_S3::get_params()

Parses prefix, marker, max_keys, delimiter, and other parameters from the request and stores them in RGWListBucket member variables.

int RGWListBucket_ObjStore_S3::get_params()
{
  list_versions = s->info.args.exists("versions");
  prefix = s->info.args.get("prefix");
  if (!list_versions) {
    marker = s->info.args.get("marker");
  } else {
    marker.name = s->info.args.get("key-marker");
    marker.instance = s->info.args.get("version-id-marker");
  }
  max_keys = s->info.args.get("max-keys");
  op_ret = parse_max_keys();
  if (op_ret < 0) {
    return op_ret;
  }
  delimiter = s->info.args.get("delimiter");
  encoding_type = s->info.args.get("encoding-type");
  if (s->system_request) {
    s->info.args.get_bool("objs-container", &objs_container, false);
    const char *shard_id_str = s->info.env->get("HTTP_RGWX_SHARD_ID");
    if (shard_id_str) {
      string err;
      shard_id = strict_strtol(shard_id_str, 10, &err);
      if (!err.empty()) {
        ldout(s->cct, 5) << "bad shard id specified: " << shard_id_str << dendl;
        return -EINVAL;
      }
    } else {
      shard_id = s->bucket_instance_shard_id;
    }
  }
  return 0;
}

RGWListBucket::verify_permission()

Copies prefix, delimiter, and max-keys, the three parameters that a Condition can reference, from the RGWListBucket member variables into req_state::env for later use by Condition::eval().

int RGWListBucket::verify_permission()
{
  op_ret = get_params();
  if (op_ret < 0) {
    return op_ret;
  }
  if (!prefix.empty())
    s->env.emplace("s3:prefix", prefix);

  if (!delimiter.empty())
    s->env.emplace("s3:delimiter", delimiter);

  s->env.emplace("s3:max-keys", std::to_string(max));

  if (!verify_bucket_permission(s,
                list_versions ?
                rgw::IAM::s3ListBucketVersions :
                rgw::IAM::s3ListBucket)) {
    return -EACCES;
  }

  return 0;
}
