228. Summary Ranges

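This post describes an algorithm that takes a sorted integer array with no duplicates and returns a summary of its consecutive ranges. The examples show how a run of consecutive numbers collapses into a compact range string, e.g. [0,1,2] becomes 0->2. The problem is rated medium difficulty.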

Given a sorted integer array without duplicates, return the summary of its ranges.
Example 1:

Input:  [0,1,2,4,5,7]
Output: ["0->2","4->5","7"]
Explanation: 0,1,2 form a continuous range; 4,5 form a continuous range.

Example 2:

Input:  [0,2,3,4,6,8,9]
Output: ["0","2->4","6","8->9"]
Explanation: 2,3,4 form a continuous range; 8,9 form a continuous range.

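Difficulty: Medium

Problem: Given a sorted integer array with no duplicates, return the ranges formed by its consecutive elements.

Approach: Use one variable to record the index where the current run of consecutive elements starts. When the run breaks, or the array ends, emit the range from that start to the previous element.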

Runtime: 5 ms, faster than 7.57% of Java online submissions for Summary Ranges.
Memory Usage: 37.5 MB, less than 5.02% of Java online submissions for Summary Ranges.

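import java.util.ArrayList;
import java.util.List;

class Solution {
    public List<String> summaryRanges(int[] nums) {
        List<String> result = new ArrayList<>();
        if (null == nums || nums.length < 1) {
            return result;
        }

        // start is the index of the first element of the current run.
        // i runs one past the end so the final run is flushed as well.
        for (int i = 1, start = 0; i <= nums.length; i++) {
            // The run ends at i - 1 if the array ends or the next value is not consecutive.
            if (i == nums.length || nums[i] - nums[i - 1] != 1) {
                result.add((i - 1 == start) ? String.format("%s", nums[start])
                    : String.format("%s->%s", nums[start], nums[i - 1]));
                start = i; // the next run begins here
            }
        }

        return result;
    }
}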
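
The same idea can also be written with an explicit inner loop that extends each run, using plain string concatenation instead of String.format. This is only a sketch, not the submitted solution: the class name SummaryRangesSketch and the main method are hypothetical additions for illustration. String.format is generally slower than concatenation, which may partly explain the runtime reported above, but that was not measured here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
public class SummaryRangesSketch {

    static List<String> summaryRanges(int[] nums) {
        List<String> result = new ArrayList<>();
        if (nums == null) {
            return result;
        }
        int i = 0;
        while (i < nums.length) {
            int start = i; // index where the current run begins
            // Extend the run while the next value is exactly one larger.
            while (i + 1 < nums.length && nums[i + 1] - nums[i] == 1) {
                i++;
            }
            // A single element is printed alone; longer runs as "a->b".
            result.add(start == i
                    ? String.valueOf(nums[start])
                    : nums[start] + "->" + nums[i]);
            i++; // move to the first element after the run
        }
        return result;
    }

    public static void main(String[] args) {
        // The two examples from the problem statement.
        System.out.println(summaryRanges(new int[]{0, 1, 2, 4, 5, 7}));    // [0->2, 4->5, 7]
        System.out.println(summaryRanges(new int[]{0, 2, 3, 4, 6, 8, 9})); // [0, 2->4, 6, 8->9]
    }
}
```

Both versions visit each element once, so the running time is linear in the length of the array.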