No prefix! operator is Ok!

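This article examines an improved syntax for private properties in ECMAScript. It compares the original proposal with the new suggestion, focusing on the choice of symbols, how private properties are accessed, and syntactic consistency.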

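In all of my changes to the original proposal, the core idea is not "avoid the # character" but to turn it from a prefix character into an operator. This separates declaration syntax from expression operators, and it also keeps the changes syntactically consistent with the ECMAScript specification.
This article discusses a tc39 proposal. Original proposal: https://github.com/tc39/proposal-class-fields
Suggested changes: #issuecomment-429533532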

This article is part of a series; see:

  • No prefix! operator is Ok! (this article)
  • Implementing private properties - here
  • (to be continued)

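1. Why ":" rather than "="?

In every declarative syntax for classes and objects, ":" marks a key/value pair. Since a private field is still a key/value pair, I suggest keeping this symbol. The original proposal suggests = for consistency with TypeScript, but overlooks that in TypeScript's full syntax x: number = 100 the : indicates the type, not the value, which does not match ECMAScript's general rule.

The ECMAScript specification in fact carries over the property declaration and initial value syntax of the old-style object literal, namely: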

obj = {
    x: 100,
    y: 100
}

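Note that in this syntax a property declaration is accepted with or without a following comma. When commas are present the declarations form a List, and the whole declaration ends with a single property declaration that has no comma. This matches the traditional object literal syntax.

In ECMAScript, class declarations reuse and extend this syntax to some degree. On one hand, method declarations were given to object literals; on the other, getter/setter declarations were taken over from object literals. The TypeScript-like = xxxx syntax is also a grammar component, called Initializer, and it does appear in the object initializer grammar, but there it is recognized as erroneous syntax (CoverInitializedName):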

PropertyDefinition: CoverInitializedName
     Always throw a Syntax Error if code matches this production.
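The effect of this rule can be observed in a current engine. The sketch below is only an illustration: the helper name parses is hypothetical, and it relies on the Function constructor to compile source text without running it. It suggests that { x = 100 } is rejected as an object literal, yet accepted when the same shape turns out to be a destructuring assignment target, which is the reason the cover grammar exists.

```javascript
// Hypothetical helper: returns true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the text; the body is never executed
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// "= value" inside an object literal matches CoverInitializedName: Syntax Error.
const asLiteral = parses("var obj = { x = 100 };");

// The same shape is fine as a destructuring assignment pattern.
const asPattern = parses("var x; ({ x = 100 } = {});");

// The ":" form is the ordinary key/value pair.
const asPair = parses("var obj = { x: 100 };");

console.log(asLiteral, asPattern, asPair); // false true true
```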

So, returning to the point made at the start, the recommended syntax design is:

// Because
obj = {
    data: 100,
    foo() {
        ....
    }
}

// YES
class f {
    data: 100,
    foo() {
        ...
    }
}

// NO!!!
class f {
    data = 100;
    ...
}

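Object instances constructed from class f this way are similar to object literals declared with similar syntax. There is nothing unexpected in the syntax design.

2. Why "private x" rather than "#"?

Existing declaration syntax for classes and objects uses qualifier words such as static and get to indicate the nature of a member. Apart from generator functions, it does not use qualifier symbols or prefixes. I therefore suggest extending class declaration syntax with the qualifier word private, and I oppose the original proposal's use of # as a qualifier symbol for declaring private fields of a class. For example: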

// YES
class f {
    private data: 100  
}

// NO!
class f {
    #data = 100;
}

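3. Why "," rather than ";"?

In every List, ECMAScript uses , as the separator: parameter lists, object and array member lists, import and export name lists, the name list of a var declaration, and so on. By convention ; is a statement terminator or separator, as in any ordinary statement or the clauses of for(;;). Since we are declaring the member list of a class or object, it should follow the List rule and use ,. Why would ; be used here?

TypeScript declares it like this in that position: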

class f {
    data: string = 'in typescript';
    ...
}

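Note the characteristics of this syntax: : indicates the type, so the initial value is declared with =. The declaration resembles a var statement, so it is understandable that TypeScript uses ; here. But why should ECMAScript bypass the readily available , and use ;, which carries no meaning in this position?

4. The last ","

Since ECMAScript already accepts trailing commas in object literal declarations, both of the following declarations can be legal: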

// Recommended
class f {
    data: 100,
    foo() {
        ...
    }
}

// (acceptable)
class f {
    ...,  // more, a list with ','
    data: 100
    foo() {
        ...
    }
}
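As a point of reference, the contrast with current syntax can be checked directly. This is a small sketch under the assumption of a present-day engine; parses is a hypothetical helper, not part of any proposal. It illustrates that object literals already tolerate a trailing comma, while a comma between class members is not accepted by today's grammar, so the class forms above remain a suggested design.

```javascript
// An object literal with a trailing comma after the last member.
const literal = {
  data: 100,
  foo() {
    return this.data;
  },
};
console.log(Object.keys(literal)); // [ 'data', 'foo' ]

// Hypothetical helper: returns true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// In the current class grammar a comma between members is a Syntax Error.
const classWithComma = parses("class f { foo() {}, bar() {} }");
console.log(classWithComma); // false
```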

5. Other declaration examples

These include:

var data = 'outer';

class f {
  // reference
  data,  // outer reference, no computed

  // public
  data: 100,  // for object, equ: f.prototype.data = 100
  static data: 100, // for class, equ: f.data = 100
  ["da"+"ta"]: 100, // computed
  static ["da"+"ta"]: 100, // computed

  // private
  private data: 100,  // normal private object properties
  private ["da"+"ta"]: 100,  // computed private object properties, and symbol keys
  private static data: 100, // for class static field
  private static ["da" + "ta"]: 100, // computed

  // get&setter, etc
  private get data() { ... }
  ...
}

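6. Accessing private properties

Accessing private properties is a big issue and the focus of the controversy around the # syntax. The first thing to settle is: what are the semantics of private property access?

In the original proposal, private property access treats # as a prefix, while the access operation is still the . or [] operator. So in essence the access operation itself is unchanged, but during access it must check whether the property name is prefixed with #. For example: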

// Original proposal
class f {
    #data = 100;
    foo() {
        console.log(this.#data);  // "#" is a prefix, "." is the access operator
    }
}

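The root cause is that it is too costly for the . operator to check whether data is a private member. For example:

// Example: a not very workable scheme
class f{
    data = 100; // assume this is a private member
    foo() {
        console.log(this.data);  // assume this.data refers to the private member
    }
}
x = new f;
x.data = '200';
x.foo(); // should this print 200 or 100?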

In the example above, once x.data adds a public property, the foo() method cannot tell what the user code intends.
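For comparison, the # prefix syntax of the original proposal has since shipped in JavaScript engines, so the question can be answered for that design. The sketch below assumes an engine with class private fields (for instance a recent Node.js); the class name F is only illustrative. Because #data lives in a namespace separate from property keys, adding a public data property does not disturb it.

```javascript
class F {
  #data = 100; // private name, not a property key
  foo() {
    return this.#data; // always reads the private field
  }
}

const x = new F();
x.data = "200"; // creates an unrelated public property

console.log(x.foo());        // 100
console.log(x.data);         // 200
console.log(Object.keys(x)); // [ 'data' ]
```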

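That is why the old proposal needs the # prefix. But think carefully about this question:

  • Must the private field list and the own member list be one and the same?

Of course not. So why not give the private field list a dedicated operator? It is enough to restrict the context in which it can be used, "just like super". The new syntax design is therefore: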

class f{
    data: 100,
    foo() {
        console.log(this#data);
    }
}

In other words, # is now used as an operator rather than a prefix.
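The idea of a private field list kept apart from the own member list can be approximated in today's JavaScript. The following is a rough sketch, not the proposed implementation: it assumes a module-level WeakMap as the separate list, and the helper priv(obj) is a hypothetical stand-in for the obj#name operator.

```javascript
// Assumption: one side table per module holds each object's private list.
const privateLists = new WeakMap();

// Hypothetical stand-in for the "#" operator: priv(obj).name ~ obj#name
function priv(obj) {
  let list = privateLists.get(obj);
  if (!list) {
    list = Object.create(null); // a plain list with no inherited keys
    privateLists.set(obj, list);
  }
  return list;
}

class F {
  constructor() {
    priv(this).data = 100; // stands in for a private data declaration
  }
  foo() {
    return priv(this).data; // would be written this#data
  }
}

const x = new F();
x.data = "200"; // public own property, stored on the object itself

console.log(x.foo()); // 100
console.log(x.data);  // 200
```

Unlike the suggested operator, which would be limited to the class context in the way super is, nothing here stops other code in the same module from calling priv.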

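But why #?

The answer: honestly, I cannot think of anything better. If you can find one that satisfies everyone, I will accept it.

NOTE:

An alternative operator could be ->, but honestly I think it is worse than #.

7. Other notes

  • This scheme is by default implemented with the own member list of the "object or class", which means a syntax like this#xxx is always needed to access it. This is not the only possible scheme, though.

  • As for implementing private members by something like "using the lexical context", I will discuss that in a separate article.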
