421. Simplify Path (Medium)

This post presents an algorithm for simplifying Unix-style file paths. It correctly handles relative references, absolute paths, redundant slashes, and parent-directory ("..") components, and includes a Java implementation.

Simplify Path

  1. Problem

    Given an absolute path for a file (Unix-style), simplify it.

  2. Example

    "/home/" => "/home"
    "/a/./b/../../c/" => "/c"

  3. Challenge

    Have you considered the case where path = "/../"?
    In that case, you should return "/".
    The path may also contain doubled slashes, as in "/home//foo/".
    In that case, the redundant slashes can be ignored and "/home/foo" returned.

  4. Solution

    First split the string on "/" and traverse the resulting array. A ".." component means "go up one level", so pop the stack; a "." component means "stay in the current directory", so do nothing; any other non-empty component is a directory or file name, so push it. Finally, join the components remaining on the stack into the simplified path.
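The empty-string filter in the solution below is needed because of how Java's String.split treats leading, doubled, and trailing slashes. Here is a minimal standalone sketch of the split output (the SplitDemo class is illustrative only, not part of the original solution):

import java.util.Arrays;

public class SplitDemo {
    public static void main(String[] args) {
        // A leading slash produces an empty first element;
        // trailing empty strings are dropped by split.
        System.out.println(Arrays.toString("/../".split("/")));        // [, ..]
        // A doubled slash produces an empty element in the middle.
        System.out.println(Arrays.toString("/home//foo/".split("/"))); // [, home, , foo]
    }
}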

import java.util.ArrayList;

public class Solution {
    /**
     * @param path the original path
     * @return the simplified path
     */
    public String simplifyPath(String path) {
        String[] arr = path.split("/");
        // The list is used as a stack of path components.
        ArrayList<String> paths = new ArrayList<String>();
        for (String s : arr) {
            if (s.equals("..")) {
                // Go up one level: pop the last component, if any.
                if (paths.size() > 0) {
                    paths.remove(paths.size() - 1);
                }
            } else if (!s.equals(".") && !s.equals("")) {
                // A real component name: push it. "." (current directory)
                // and "" (from leading/doubled slashes) are skipped.
                paths.add(s);
            }
        }
        // Join the remaining components back into an absolute path.
        StringBuilder result = new StringBuilder("/");
        for (String s : paths) {
            result.append(s).append("/");
        }
        // Drop the trailing slash unless the result is just the root "/".
        if (result.length() > 1) {
            result.setLength(result.length() - 1);
        }
        return result.toString();
    }
}
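To sanity-check the solution against the samples and the challenge cases, a minimal test harness might look like the following (the Main class is illustrative and assumes the Solution class above is on the classpath):

public class Main {
    public static void main(String[] args) {
        Solution sol = new Solution();
        System.out.println(sol.simplifyPath("/home/"));          // /home
        System.out.println(sol.simplifyPath("/a/./b/../../c/")); // /c
        System.out.println(sol.simplifyPath("/../"));            // /
        System.out.println(sol.simplifyPath("/home//foo/"));     // /home/foo
    }
}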

Last Update 2016.11.17
