B. Filling the Grid

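This problem asks how to fill the cells of a grid of width w and height h so that each row i begins with exactly r_i consecutive full cells counted from the left, and each column j begins with exactly c_j consecutive full cells counted from the top. The task is to count the fillings that satisfy these conditions, modulo 10^9 + 7. The input consists of h, w, the array r, and the array c.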

input: standard input
output: standard output

Suppose there is an h × w grid consisting of empty or full cells. Let's make some definitions:

  • r_i is the number of consecutive full cells connected to the left side in the i-th row (1 ≤ i ≤ h). In particular, r_i = 0 if the leftmost cell of the i-th row is empty.
  • c_j is the number of consecutive full cells connected to the top end in the j-th column (1 ≤ j ≤ w). In particular, c_j = 0 if the topmost cell of the j-th column is empty.

In other words, the i-th row starts exactly with r_i full cells. Similarly, the j-th column starts exactly with c_j full cells.
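The two definitions above can be checked on a concrete grid. The following is a minimal sketch, assuming the grid is given as a list of strings in which `#` marks a full cell and `.` an empty one; the helper names `leading_full` and `row_col_values` are illustrative and do not come from the problem statement.

```python
def leading_full(cells):
    """Count the full cells ('#') at the start of a sequence of cells."""
    count = 0
    for cell in cells:
        if cell != "#":
            break
        count += 1
    return count


def row_col_values(grid):
    """Return (r, c) for a grid given as a list of equal-length strings."""
    r = [leading_full(row) for row in grid]
    # zip(*grid) walks the grid column by column, top to bottom.
    c = [leading_full(col) for col in zip(*grid)]
    return r, c


# A 3 x 4 grid matching the first example of the statement.
grid = [
    ".##.",
    "###.",
    "#.#.",
]
print(row_col_values(grid))  # ([0, 3, 1], [0, 2, 3, 0])
```

Changing the bottom-right cell to `#` leaves both lists unchanged, which is why the first example has two valid fillings.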

These are the r and c values of some 3 × 4 grid. Black cells are full and white cells are empty.

You have values of r and c. Initially, all cells are empty. Find the number of ways to fill grid cells to satisfy values of r and c. Since the answer can be very large, find the answer modulo 1000000007 (10^9 + 7). In other words, find the remainder after division of the answer by 1000000007 (10^9 + 7).

Input

The first line contains two integers h and w (1 ≤ h, w ≤ 10^3): the height and width of the grid.

The second line contains h integers r_1, r_2, …, r_h (0 ≤ r_i ≤ w): the values of r.

The third line contains w integers c_1, c_2, …, c_w (0 ≤ c_j ≤ h): the values of c.

Output

Print the answer modulo 1000000007 (10^9 + 7).

Examples

input

3 4
0 3 1
0 2 3 0

output

2

input

1 1
0
1

output

0

input

19 16
16 16 16 16 15 15 0 5 0 4 9 9 1 4 4 0 8 16 12
6 12 19 15 8 6 19 19 14 6 9 16 10 11 15 4

output

797922655

Note

In the first example, this is the other possible case.

In the second example, it's impossible to make a grid to satisfy such r, c values.

In the third example, make sure to print the answer modulo (10^9 + 7).

Code:

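#include <bits/stdc++.h>
using namespace std;
const int mod = 1e9 + 7;
long long r[1005];
long long c[1005], h, w;
// a[i][j]: 1 = forced full, -1 = forced empty, 0 = not constrained
int a[1005][1005] = {0};
long long s = 0;
int main()
{
    cin >> h >> w;
    for (int i = 1; i <= h; i++)
        cin >> r[i];
    for (int j = 1; j <= w; j++)
        cin >> c[j];
    int fl = 1; // set to 0 as soon as two constraints contradict each other
    // Row constraints: cells 1..r[i] are full, cell r[i]+1 (if it exists) is empty.
    for (int i = 1; i <= h; i++)
    {
        if (fl == 0)
            break;
        for (int j = 1; j < r[i] + 2 && j <= w; j++)
        {
            if (j <= r[i])
            {
                if (a[i][j] >= 0)
                    a[i][j] = 1;
                else
                {
                    fl = 0;
                    break;
                }
            }
            if (j == r[i] + 1)
            {
                if (a[i][j] == 1)
                {
                    fl = 0;
                    break;
                }
                else
                    a[i][j] = -1;
            }
        }
    }
    // Column constraints: cells 1..c[i] are full, cell c[i]+1 (if it exists) is empty.
    // A clash with a value already forced by a row means there is no valid grid.
    for (int i = 1; i <= w; i++)
    {
        if (fl == 0)
            break;
        for (int j = 1; j < c[i] + 2 && j <= h; j++)
        {
            if (j <= c[i])
            {
                if (a[j][i] >= 0)
                    a[j][i] = 1;
                else
                {
                    fl = 0;
                    break;
                }
            }
            if (j == c[i] + 1)
            {
                if (a[j][i] == 1)
                {
                    fl = 0;
                    break;
                }
                else
                    a[j][i] = -1;
            }
        }
    }
    if (fl == 0)
    {
        cout << 0;
    }
    else
    {
        // Every unconstrained cell can be full or empty independently,
        // so the answer is 2^(number of unconstrained cells) mod 1e9+7.
        s = 1;
        for (int i = 1; i <= h; i++)
        {
            for (int j = 1; j <= w; j++)
            {
                if (a[i][j] == 0)
                {
                    s = s * 2;
                    s = s % mod;
                }
            }
        }
        cout << s % mod;
    }
    return 0;
}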
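The same counting idea can be written as a small function, which makes it easy to check against the three examples from the statement. This is a sketch rather than a port of the code above: the function name `count_fillings`, the 0-based indexing, and the helper `force` are choices made here for illustration.

```python
MOD = 10**9 + 7


def count_fillings(h, w, r, c):
    """Count grids consistent with r and c, modulo 1e9+7 (0-based indices)."""
    FULL, EMPTY = 1, -1
    state = [[0] * w for _ in range(h)]  # 0 means "not constrained"

    def force(i, j, value):
        # Fails only when the cell was already forced to the opposite value.
        if state[i][j] == -value:
            return False
        state[i][j] = value
        return True

    for i in range(h):
        for j in range(r[i]):
            if not force(i, j, FULL):
                return 0
        # The cell right after the run must be empty, if it exists.
        if r[i] < w and not force(i, r[i], EMPTY):
            return 0

    for j in range(w):
        for i in range(c[j]):
            if not force(i, j, FULL):
                return 0
        if c[j] < h and not force(c[j], j, EMPTY):
            return 0

    free = sum(row.count(0) for row in state)
    return pow(2, free, MOD)  # each free cell doubles the number of grids


print(count_fillings(3, 4, [0, 3, 1], [0, 2, 3, 0]))  # 2
print(count_fillings(1, 1, [0], [1]))                 # 0
```

Using the three-argument form of `pow` keeps the intermediate values small, which plays the same role as the `s = s % mod` step in the loop above.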

 

**Statement of approach (5 marks / 5%)**
* Marks are awarded according to whether the stated approach is objectively coherent;
* Improper use of terminology will be penalised;
* Inconsistencies between the approach claimed in the report and the actual behaviour of the solver will be penalised.

**Core algorithm description (5 marks / 5%)**
* How well is the implemented algorithm understood and explained?
* Is the explanation suitably generalised at a conceptual level, or is it too low-level, describing the algorithm only in terms of what the implementation does?
* Is the core algorithm explained separately from the optimisations?

**Optimisations (5 marks / 5%)**
* How well are the optimisations explained?
* Are the optimisations sophisticated and substantial in number?
* How successful are the optimisations? With reference to empirical solver performance results, explain qualitatively and quantitatively how each optimisation contributes to reducing the overall search complexity. If you attempt to gain extra credit through a formal theoretical analysis of the complexity reduction, make sure the report contains fully formalised and derived mathematical statements: any quantitative statement about the complexity of the optimised solution that is cited without a full derivation will receive no credit.

**Reflections and suggestions for further work (5 marks / 5%)**
* Is the success of the approach adequately evaluated?
* Are its strengths and limitations clearly identified?
* Is sensible and interesting future work proposed to address specific limitations?
* Are the proposed future-work methods defined conceptually?

Statement of approach: The Sudoku problem can be regarded as a Constraint Satisfaction Problem. The 81 cells of the Sudoku are the variables, each with a domain of 1 to 9. The constraints are that each row, each column, and each sub-grid must contain the numbers 1 to 9 with no repetition. The solver uses backtracking search based on constraint propagation, and always branches on the cell with the fewest possible values. It identifies the possible values for the empty cells, then selects the cell with the fewest possible values, which benefits the constraint propagation process, and proceeds to solve the cells one by one. If an assignment fails during the process, the solver backtracks immediately.

Core algorithm description: The solver employs a backtracking search based on constraint propagation. The set of variables X contains one variable per cell, identified by its row number and column number. For each variable, the possible values are 1 to 9. First, the solver traverses the same row, column, and sub-grid to remove the values that violate the constraints from each variable's domain, which leaves the possible values for that variable. This reduces the search space. During the search, if a variable is found to have no possible values left or an assignment violates a constraint, the solver immediately backtracks and tries other values.

Optimisations: The solver uses a selection rule that finds the cell with the minimum number of possible values (commonly called the minimum-remaining-values heuristic). At each step, it selects the variable with the fewest possible values for assignment. This helps to determine the value of that variable as quickly as possible: when a cell has only a few possible values, an assignment that violates the constraints is detected quickly, so the solver backtracks early and makes fewer invalid assignments. As a result, the average time taken on hard puzzles is less than one second.

Reflections and suggestions for further work: The average solving time of the Sudoku solver is 0.24 seconds, with an accuracy rate of approximately 98%. The performance of the solver is acceptable: it quickly narrows down the search space and improves search efficiency. However, the solver's constraint propagation is weak. Future work will need to strengthen the propagation step. This can be achieved by immediately checking the variables related to a given variable whenever a value is assigned to it, and eliminating the values that are no longer possible.

Does this meet the requirements?
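The selection rule described in the report (always branch on the empty cell with the fewest remaining candidates, and backtrack as soon as some cell has none) can be sketched in a few lines. This is a minimal illustration and not the report's solver: it assumes the board is a 9×9 list of lists with 0 for an empty cell, and the names `candidates` and `solve` are hypothetical.

```python
def candidates(board, r, c):
    """Values 1..9 not already used in the row, column, or 3x3 box of (r, c)."""
    used = set(board[r]) | {board[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {board[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [v for v in range(1, 10) if v not in used]


def solve(board):
    """Fill the board in place; return True if a solution was found."""
    best = None  # (row, col, options) of the most constrained empty cell
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                options = candidates(board, r, c)
                if not options:
                    return False  # dead end: some cell has no candidates
                if best is None or len(options) < len(best[2]):
                    best = (r, c, options)
    if best is None:
        return True  # no empty cells left
    r, c, options = best
    for v in options:
        board[r][c] = v
        if solve(board):
            return True
    board[r][c] = 0  # undo the assignment before backtracking
    return False


puzzle = [
    "530070000", "600195000", "098000060",
    "800060003", "400803001", "700020006",
    "060000280", "000419005", "000080079",
]
board = [[int(ch) for ch in row] for row in puzzle]
if solve(board):
    for row in board:
        print("".join(map(str, row)))
```

Recomputing the candidates on every call is the simplest option; the stronger propagation suggested as future work would instead update the candidate sets of the affected row, column, and sub-grid incrementally after each assignment.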
11-25
from __future__ import annotations

import datetime
import functools
import pytz
import io
import math
import os
from collections import namedtuple
import re

import numpy as np
import piexif
import piexif.helper
from PIL import Image, ImageFont, ImageDraw, ImageColor, PngImagePlugin, ImageOps
# pillow_avif needs to be imported somewhere in code for it to work
import pillow_avif # noqa: F401
import string
import json
import hashlib

from modules import sd_samplers, shared, script_callbacks, errors
from modules.paths_internal import roboto_ttf_file
from modules.shared import opts

LANCZOS = (Image.Resampling.LANCZOS if hasattr(Image, 'Resampling') else Image.LANCZOS)


def get_font(fontsize: int):
    try:
        return ImageFont.truetype(opts.font or roboto_ttf_file, fontsize)
    except Exception:
        return ImageFont.truetype(roboto_ttf_file, fontsize)


def image_grid(imgs, batch_size=1, rows=None):
    if rows is None:
        if opts.n_rows > 0:
            rows = opts.n_rows
        elif opts.n_rows == 0:
            rows = batch_size
        elif opts.grid_prevent_empty_spots:
            rows = math.floor(math.sqrt(len(imgs)))
            while len(imgs) % rows != 0:
                rows -= 1
        else:
            rows = math.sqrt(len(imgs))
            rows = round(rows)
    if rows > len(imgs):
        rows = len(imgs)

    cols = math.ceil(len(imgs) / rows)

    params = script_callbacks.ImageGridLoopParams(imgs, cols, rows)
    script_callbacks.image_grid_callback(params)

    w, h = map(max, zip(*(img.size for img in imgs)))
    grid_background_color = ImageColor.getcolor(opts.grid_background_color, 'RGB')
    grid = Image.new('RGB', size=(params.cols * w, params.rows * h), color=grid_background_color)

    for i, img in enumerate(params.imgs):
        img_w, img_h = img.size
        w_offset, h_offset = 0 if img_w == w else (w - img_w) // 2, 0 if img_h == h else (h - img_h) // 2
        grid.paste(img, box=(i % params.cols * w + w_offset, i // params.cols * h + h_offset))

    return grid


class Grid(namedtuple("_Grid", ["tiles", "tile_w", "tile_h", "image_w", "image_h", "overlap"])):
    @property
    def tile_count(self) -> int:
        """
        The total number of tiles in the grid.
        """
        return sum(len(row[2]) for row in self.tiles)


def split_grid(image: Image.Image, tile_w: int = 512, tile_h: int = 512, overlap: int = 64) -> Grid:
    w, h = image.size

    non_overlap_width = tile_w - overlap
    non_overlap_height = tile_h - overlap

    cols = math.ceil((w - overlap) / non_overlap_width)
    rows = math.ceil((h - overlap) / non_overlap_height)

    dx = (w - tile_w) / (cols - 1) if cols > 1 else 0
    dy = (h - tile_h) / (rows - 1) if rows > 1 else 0

    grid = Grid([], tile_w, tile_h, w, h, overlap)
    for row in range(rows):
        row_images = []

        y = int(row * dy)

        if y + tile_h >= h:
            y = h - tile_h

        for col in range(cols):
            x = int(col * dx)

            if x + tile_w >= w:
                x = w - tile_w

            tile = image.crop((x, y, x + tile_w, y + tile_h))

            row_images.append([x, tile_w, tile])

        grid.tiles.append([y, tile_h, row_images])

    return grid


def combine_grid(grid):
    def make_mask_image(r):
        r = r * 255 / grid.overlap
        r = r.astype(np.uint8)
        return Image.fromarray(r, 'L')

    mask_w = make_mask_image(np.arange(grid.overlap, dtype=np.float32).reshape((1, grid.overlap)).repeat(grid.tile_h, axis=0))
    mask_h = make_mask_image(np.arange(grid.overlap, dtype=np.float32).reshape((grid.overlap, 1)).repeat(grid.image_w, axis=1))

    combined_image = Image.new("RGB", (grid.image_w, grid.image_h))
    for y, h, row in grid.tiles:
        combined_row = Image.new("RGB", (grid.image_w, h))
        for x, w, tile in row:
            if x == 0:
                combined_row.paste(tile, (0, 0))
                continue

            combined_row.paste(tile.crop((0, 0, grid.overlap, h)), (x, 0), mask=mask_w)
            combined_row.paste(tile.crop((grid.overlap, 0, w, h)), (x + grid.overlap, 0))

        if y == 0:
            combined_image.paste(combined_row, (0, 0))
            continue

        combined_image.paste(combined_row.crop((0, 0, combined_row.width, grid.overlap)), (0, y), mask=mask_h)
        combined_image.paste(combined_row.crop((0, grid.overlap, combined_row.width, h)), (0, y + grid.overlap))

    return combined_image


class GridAnnotation:
    def __init__(self, text='', is_active=True):
        self.text = text
        self.is_active = is_active
        self.size = None


def draw_grid_annotations(im, width, height, hor_texts, ver_texts, margin=0):

    color_active = ImageColor.getcolor(opts.grid_text_active_color, 'RGB')
    color_inactive = ImageColor.getcolor(opts.grid_text_inactive_color, 'RGB')
    color_background = ImageColor.getcolor(opts.grid_background_color, 'RGB')

    def wrap(drawing, text, font, line_length):
        lines = ['']
        for word in text.split():
            line = f'{lines[-1]} {word}'.strip()
            if drawing.textlength(line, font=font) <= line_length:
                lines[-1] = line
            else:
                lines.append(word)
        return lines

    def draw_texts(drawing, draw_x, draw_y, lines, initial_fnt, initial_fontsize):
        for line in lines:
            fnt = initial_fnt
            fontsize = initial_fontsize
            while drawing.multiline_textsize(line.text, font=fnt)[0] > line.allowed_width and fontsize > 0:
                fontsize -= 1
                fnt = get_font(fontsize)
            drawing.multiline_text((draw_x, draw_y + line.size[1] / 2), line.text, font=fnt, fill=color_active if line.is_active else color_inactive, anchor="mm", align="center")

            if not line.is_active:
                drawing.line((draw_x - line.size[0] // 2, draw_y + line.size[1] // 2, draw_x + line.size[0] // 2, draw_y + line.size[1] // 2), fill=color_inactive, width=4)

            draw_y += line.size[1] + line_spacing

    fontsize = (width + height) // 25
    line_spacing = fontsize // 2

    fnt = get_font(fontsize)

    pad_left = 0 if sum([sum([len(line.text) for line in lines]) for lines in ver_texts]) == 0 else width * 3 // 4

    cols = im.width // width
    rows = im.height // height

    assert cols == len(hor_texts), f'bad number of horizontal texts: {len(hor_texts)}; must be {cols}'
    assert rows == len(ver_texts), f'bad number of vertical texts: {len(ver_texts)}; must be {rows}'

    calc_img = Image.new("RGB", (1, 1), color_background)
    calc_d = ImageDraw.Draw(calc_img)

    for texts, allowed_width in zip(hor_texts + ver_texts, [width] * len(hor_texts) + [pad_left] * len(ver_texts)):
        items = [] + texts
        texts.clear()

        for line in items:
            wrapped = wrap(calc_d, line.text, fnt, allowed_width)
            texts += [GridAnnotation(x, line.is_active) for x in wrapped]

        for line in texts:
            bbox = calc_d.multiline_textbbox((0, 0), line.text, font=fnt)
            line.size = (bbox[2] - bbox[0], bbox[3] - bbox[1])
            line.allowed_width = allowed_width

    hor_text_heights = [sum([line.size[1] + line_spacing for line in lines]) - line_spacing for lines in hor_texts]
    ver_text_heights = [sum([line.size[1] + line_spacing for line in lines]) - line_spacing * len(lines) for lines in ver_texts]

    pad_top = 0 if sum(hor_text_heights) == 0 else max(hor_text_heights) + line_spacing * 2

    result = Image.new("RGB", (im.width + pad_left + margin * (cols-1), im.height + pad_top + margin * (rows-1)), color_background)

    for row in range(rows):
        for col in range(cols):
            cell = im.crop((width * col, height * row, width * (col+1), height * (row+1)))
            result.paste(cell, (pad_left + (width + margin) * col, pad_top + (height + margin) * row))

    d = ImageDraw.Draw(result)

    for col in range(cols):
        x = pad_left + (width + margin) * col + width / 2
        y = pad_top / 2 - hor_text_heights[col] / 2

        draw_texts(d, x, y, hor_texts[col], fnt, fontsize)

    for row in range(rows):
        x = pad_left / 2
        y = pad_top + (height + margin) * row + height / 2 - ver_text_heights[row] / 2

        draw_texts(d, x, y, ver_texts[row], fnt, fontsize)

    return result


def draw_prompt_matrix(im, width, height, all_prompts, margin=0):
    prompts = all_prompts[1:]
    boundary = math.ceil(len(prompts) / 2)

    prompts_horiz = prompts[:boundary]
    prompts_vert = prompts[boundary:]

    hor_texts = [[GridAnnotation(x, is_active=pos & (1 << i) != 0) for i, x in enumerate(prompts_horiz)] for pos in range(1 << len(prompts_horiz))]
    ver_texts = [[GridAnnotation(x, is_active=pos & (1 << i) != 0) for i, x in enumerate(prompts_vert)] for pos in range(1 << len(prompts_vert))]

    return draw_grid_annotations(im, width, height, hor_texts, ver_texts, margin)


def resize_image(resize_mode, im, width, height, upscaler_name=None):
    """
    Resizes an image with the specified resize_mode, width, and height.

    Args:
        resize_mode: The mode to use when resizing the image.
            0: Resize the image to the specified width and height.
            1: Resize the image to fill the specified width and height, maintaining the aspect ratio, and then center the image within the dimensions, cropping the excess.
            2: Resize the image to fit within the specified width and height, maintaining the aspect ratio, and then center the image within the dimensions, filling empty with data from image.
        im: The image to resize.
        width: The width to resize the image to.
        height: The height to resize the image to.
        upscaler_name: The name of the upscaler to use. If not provided, defaults to opts.upscaler_for_img2img.
    """

    upscaler_name = upscaler_name or opts.upscaler_for_img2img

    def resize(im, w, h):
        if upscaler_name is None or upscaler_name == "None" or im.mode == 'L':
            return im.resize((w, h), resample=LANCZOS)

        scale = max(w / im.width, h / im.height)

        if scale > 1.0:
            upscalers = [x for x in shared.sd_upscalers if x.name == upscaler_name]
            if len(upscalers) == 0:
                upscaler = shared.sd_upscalers[0]
                print(f"could not find upscaler named {upscaler_name or '<empty string>'}, using {upscaler.name} as a fallback")
            else:
                upscaler = upscalers[0]

            im = upscaler.scaler.upscale(im, scale, upscaler.data_path)

        if im.width != w or im.height != h:
            im = im.resize((w, h), resample=LANCZOS)

        return im

    if resize_mode == 0:
        res = resize(im, width, height)

    elif resize_mode == 1:
        ratio = width / height
        src_ratio = im.width / im.height

        src_w = width if ratio > src_ratio else im.width * height // im.height
        src_h = height if ratio <= src_ratio else im.height * width // im.width

        resized = resize(im, src_w, src_h)
        res = Image.new("RGB", (width, height))
        res.paste(resized, box=(width // 2 - src_w // 2, height // 2 - src_h // 2))

    else:
        ratio = width / height
        src_ratio = im.width / im.height

        src_w = width if ratio < src_ratio else im.width * height // im.height
        src_h = height if ratio >= src_ratio else im.height * width // im.width

        resized = resize(im, src_w, src_h)
        res = Image.new("RGB", (width, height))
        res.paste(resized, box=(width // 2 - src_w // 2, height // 2 - src_h // 2))

        if ratio < src_ratio:
            fill_height = height // 2 - src_h // 2
            if fill_height > 0:
                res.paste(resized.resize((width, fill_height), box=(0, 0, width, 0)), box=(0, 0))
                res.paste(resized.resize((width, fill_height), box=(0, resized.height, width, resized.height)), box=(0, fill_height + src_h))
        elif ratio > src_ratio:
            fill_width = width // 2 - src_w // 2
            if fill_width > 0:
                res.paste(resized.resize((fill_width, height), box=(0, 0, 0, height)), box=(0, 0))
                res.paste(resized.resize((fill_width, height), box=(resized.width, 0, resized.width, height)), box=(fill_width + src_w, 0))

    return res


if not shared.cmd_opts.unix_filenames_sanitization:
    invalid_filename_chars = '#<>:"/\\|?*\n\r\t'
else:
    invalid_filename_chars = '/'
invalid_filename_prefix = ' '
invalid_filename_postfix = ' .'
re_nonletters = re.compile(r'[\s' + string.punctuation + ']+')
re_pattern = re.compile(r"(.*?)(?:\[([^\[\]]+)\]|$)")
re_pattern_arg = re.compile(r"(.*)<([^>]*)>$")
max_filename_part_length = shared.cmd_opts.filenames_max_length
NOTHING_AND_SKIP_PREVIOUS_TEXT = object()


def sanitize_filename_part(text, replace_spaces=True):
    if text is None:
        return None

    if replace_spaces:
        text = text.replace(' ', '_')

    text = text.translate({ord(x): '_' for x in invalid_filename_chars})
    text = text.lstrip(invalid_filename_prefix)[:max_filename_part_length]
    text = text.rstrip(invalid_filename_postfix)
    return text


@functools.cache
def get_scheduler_str(sampler_name, scheduler_name):
    """Returns {Scheduler} if the scheduler is applicable to the sampler"""
    if scheduler_name == 'Automatic':
        config = sd_samplers.find_sampler_config(sampler_name)
        scheduler_name = config.options.get('scheduler', 'Automatic')
    return scheduler_name.capitalize()


@functools.cache
def get_sampler_scheduler_str(sampler_name, scheduler_name):
    """Returns the '{Sampler} {Scheduler}' if the scheduler is applicable to the sampler"""
    return f'{sampler_name} {get_scheduler_str(sampler_name, scheduler_name)}'


def get_sampler_scheduler(p, sampler):
    """Returns '{Sampler} {Scheduler}' / '{Scheduler}' / 'NOTHING_AND_SKIP_PREVIOUS_TEXT'"""
    if hasattr(p, 'scheduler') and hasattr(p, 'sampler_name'):
        if sampler:
            sampler_scheduler = get_sampler_scheduler_str(p.sampler_name, p.scheduler)
        else:
            sampler_scheduler = get_scheduler_str(p.sampler_name, p.scheduler)
        return sanitize_filename_part(sampler_scheduler, replace_spaces=False)
    return NOTHING_AND_SKIP_PREVIOUS_TEXT


class FilenameGenerator:
    replacements = {
        'basename': lambda self: self.basename or 'img',
        'seed': lambda self: self.seed if self.seed is not None else '',
        'seed_first': lambda self: self.seed if self.p.batch_size == 1 else self.p.all_seeds[0],
        'seed_last': lambda self: NOTHING_AND_SKIP_PREVIOUS_TEXT if self.p.batch_size == 1 else self.p.all_seeds[-1],
        'steps': lambda self: self.p and self.p.steps,
        'cfg': lambda self: self.p and self.p.cfg_scale,
        'width': lambda self: self.image.width,
        'height': lambda self: self.image.height,
        'styles': lambda self: self.p and sanitize_filename_part(", ".join([style for style in self.p.styles if not style == "None"]) or "None", replace_spaces=False),
        'sampler': lambda self: self.p and sanitize_filename_part(self.p.sampler_name, replace_spaces=False),
        'sampler_scheduler': lambda self: self.p and get_sampler_scheduler(self.p, True),
        'scheduler': lambda self: self.p and get_sampler_scheduler(self.p, False),
        'model_hash': lambda self: getattr(self.p, "sd_model_hash", shared.sd_model.sd_model_hash),
        'model_name': lambda self: sanitize_filename_part(shared.sd_model.sd_checkpoint_info.name_for_extra, replace_spaces=False),
        'date': lambda self: datetime.datetime.now().strftime('%Y-%m-%d'),
        'datetime': lambda self, *args: self.datetime(*args),  # accepts formats: [datetime], [datetime<Format>], [datetime<Format><Time Zone>]
        'job_timestamp': lambda self: getattr(self.p, "job_timestamp", shared.state.job_timestamp),
        'prompt_hash': lambda self, *args: self.string_hash(self.prompt, *args),
        'negative_prompt_hash': lambda self, *args: self.string_hash(self.p.negative_prompt, *args),
        'full_prompt_hash': lambda self, *args: self.string_hash(f"{self.p.prompt} {self.p.negative_prompt}", *args),  # a space in between to create a unique string
        'prompt': lambda self: sanitize_filename_part(self.prompt),
        'prompt_no_styles': lambda self: self.prompt_no_style(),
        'prompt_spaces': lambda self: sanitize_filename_part(self.prompt, replace_spaces=False),
        'prompt_words': lambda self: self.prompt_words(),
        'batch_number': lambda self: NOTHING_AND_SKIP_PREVIOUS_TEXT if self.p.batch_size == 1 or self.zip else self.p.batch_index + 1,
        'batch_size': lambda self: self.p.batch_size,
        'generation_number': lambda self: NOTHING_AND_SKIP_PREVIOUS_TEXT if (self.p.n_iter == 1 and self.p.batch_size == 1) or self.zip else self.p.iteration * self.p.batch_size + self.p.batch_index + 1,
        'hasprompt': lambda self, *args: self.hasprompt(*args),  # accepts formats:[hasprompt<prompt1|default><prompt2>..]
        'clip_skip': lambda self: opts.data["CLIP_stop_at_last_layers"],
        'denoising': lambda self: self.p.denoising_strength if self.p and self.p.denoising_strength else NOTHING_AND_SKIP_PREVIOUS_TEXT,
        'user': lambda self: self.p.user,
        'vae_filename': lambda self: self.get_vae_filename(),
        'none': lambda self: '',  # Overrides the default, so you can get just the sequence number
        'image_hash': lambda self, *args: self.image_hash(*args)  # accepts formats: [image_hash<length>] default full hash
    }
    default_time_format = '%Y%m%d%H%M%S'

    def __init__(self, p, seed, prompt, image, zip=False, basename=""):
        self.p = p
        self.seed = seed
        self.prompt = prompt
        self.image = image
        self.zip = zip
        self.basename = basename

    def get_vae_filename(self):
        """Get the name of the VAE file."""

        import modules.sd_vae as sd_vae

        if sd_vae.loaded_vae_file is None:
            return "NoneType"

        file_name = os.path.basename(sd_vae.loaded_vae_file)
        split_file_name = file_name.split('.')
        if len(split_file_name) > 1 and split_file_name[0] == '':
            return split_file_name[1]  # if the first character of the filename is "." then [1] is obtained.
        else:
            return split_file_name[0]

    def hasprompt(self, *args):
        lower = self.prompt.lower()
        if self.p is None or self.prompt is None:
            return None
        outres = ""
        for arg in args:
            if arg != "":
                division = arg.split("|")
                expected = division[0].lower()
                default = division[1] if len(division) > 1 else ""
                if lower.find(expected) >= 0:
                    outres = f'{outres}{expected}'
                else:
                    outres = outres if default == "" else f'{outres}{default}'
        return sanitize_filename_part(outres)

    def prompt_no_style(self):
        if self.p is None or self.prompt is None:
            return None

        prompt_no_style = self.prompt
        for style in shared.prompt_styles.get_style_prompts(self.p.styles):
            if style:
                for part in style.split("{prompt}"):
                    prompt_no_style = prompt_no_style.replace(part, "").replace(", ,", ",").strip().strip(',')

                prompt_no_style = prompt_no_style.replace(style, "").strip().strip(',').strip()

        return sanitize_filename_part(prompt_no_style, replace_spaces=False)

    def prompt_words(self):
        words = [x for x in re_nonletters.split(self.prompt or "") if x]
        if len(words) == 0:
            words = ["empty"]
        return sanitize_filename_part(" ".join(words[0:opts.directories_max_prompt_words]), replace_spaces=False)

    def datetime(self, *args):
        time_datetime = datetime.datetime.now()

        time_format = args[0] if (args and args[0] != "") else self.default_time_format
        try:
            time_zone = pytz.timezone(args[1]) if len(args) > 1 else None
        except pytz.exceptions.UnknownTimeZoneError:
            time_zone = None

        time_zone_time = time_datetime.astimezone(time_zone)
        try:
            formatted_time = time_zone_time.strftime(time_format)
        except (ValueError, TypeError):
            formatted_time = time_zone_time.strftime(self.default_time_format)

        return sanitize_filename_part(formatted_time, replace_spaces=False)

    def image_hash(self, *args):
        length = int(args[0]) if (args and args[0] != "") else None
        return hashlib.sha256(self.image.tobytes()).hexdigest()[0:length]

    def string_hash(self, text, *args):
        length = int(args[0]) if (args and args[0] != "") else 8
        return hashlib.sha256(text.encode()).hexdigest()[0:length]

    def apply(self, x):
        res = ''

        for m in re_pattern.finditer(x):
            text, pattern = m.groups()

            if pattern is None:
                res += text
                continue

            pattern_args = []
            while True:
                m = re_pattern_arg.match(pattern)
                if m is None:
                    break

                pattern, arg = m.groups()
                pattern_args.insert(0, arg)

            fun = self.replacements.get(pattern.lower())
            if fun is not None:
                try:
                    replacement = fun(self, *pattern_args)
                except Exception:
                    replacement = None
                    errors.report(f"Error adding [{pattern}] to filename", exc_info=True)

                if replacement == NOTHING_AND_SKIP_PREVIOUS_TEXT:
                    continue
                elif replacement is not None:
                    res += text + str(replacement)
                    continue

            res += f'{text}[{pattern}]'

        return res


def get_next_sequence_number(path, basename):
    """
    Determines and returns the next sequence number to use when saving an image in the specified directory.

    The sequence starts at 0.
    """
    result = -1
    if basename != '':
        basename = f"{basename}-"

    prefix_length = len(basename)
    for p in os.listdir(path):
        if p.startswith(basename):
            parts = os.path.splitext(p[prefix_length:])[0].split('-')  # splits the filename (removing the basename first if one is defined, so the sequence number is always the first element)
            try:
                result = max(int(parts[0]), result)
            except ValueError:
                pass

    return result + 1


def save_image_with_geninfo(image, geninfo, filename, extension=None, existing_pnginfo=None, pnginfo_section_name='parameters'):
    """
    Saves image to filename, including geninfo as text information for generation info.
    For PNG images, geninfo is added to existing pnginfo dictionary using the pnginfo_section_name argument as key.
    For JPG images, there's no dictionary and geninfo just replaces the EXIF description.
    """

    if extension is None:
        extension = os.path.splitext(filename)[1]

    image_format = Image.registered_extensions()[extension]

    if extension.lower() == '.png':
        existing_pnginfo = existing_pnginfo or {}
        if opts.enable_pnginfo:
            existing_pnginfo[pnginfo_section_name] = geninfo

        if opts.enable_pnginfo:
            pnginfo_data = PngImagePlugin.PngInfo()
            for k, v in (existing_pnginfo or {}).items():
                pnginfo_data.add_text(k, str(v))
        else:
            pnginfo_data = None

        image.save(filename, format=image_format, quality=opts.jpeg_quality, pnginfo=pnginfo_data)

    elif extension.lower() in (".jpg", ".jpeg", ".webp"):
        if image.mode == 'RGBA':
            image = image.convert("RGB")
        elif image.mode == 'I;16':
            image = image.point(lambda p: p * 0.0038910505836576).convert("RGB" if extension.lower() == ".webp" else "L")

        image.save(filename, format=image_format, quality=opts.jpeg_quality, lossless=opts.webp_lossless)

        if opts.enable_pnginfo and geninfo is not None:
            exif_bytes = piexif.dump({
                "Exif": {
                    piexif.ExifIFD.UserComment: piexif.helper.UserComment.dump(geninfo or "", encoding="unicode")
                },
            })

            piexif.insert(exif_bytes, filename)
    elif extension.lower() == '.avif':
        if opts.enable_pnginfo and geninfo is not None:
            exif_bytes = piexif.dump({
                "Exif": {
                    piexif.ExifIFD.UserComment: piexif.helper.UserComment.dump(geninfo or "", encoding="unicode")
                },
            })
        else:
            exif_bytes = None

        image.save(filename,format=image_format, quality=opts.jpeg_quality, exif=exif_bytes)
    elif extension.lower() == ".gif":
        image.save(filename, format=image_format, comment=geninfo)
    else:
        image.save(filename, format=image_format, quality=opts.jpeg_quality)


def save_image(image, path, basename, seed=None, prompt=None, extension='png', info=None, short_filename=False, no_prompt=False, grid=False, pnginfo_section_name='parameters', p=None, existing_info=None, forced_filename=None, suffix="", save_to_dirs=None):
    """Save an image.

    Args:
        image (`PIL.Image`):
            The image to be saved.
        path (`str`):
            The directory to save the image. Note, the option `save_to_dirs` will make the image to be saved into a sub directory.
        basename (`str`):
            The base filename which will be applied to `filename pattern`.
        seed, prompt, short_filename,
        extension (`str`):
            Image file extension, default is `png`.
        pngsectionname (`str`):
            Specify the name of the section which `info` will be saved in.
        info (`str` or `PngImagePlugin.iTXt`):
            PNG info chunks.
        existing_info (`dict`):
            Additional PNG info. `existing_info == {pngsectionname: info, ...}`
        no_prompt:
            TODO I don't know its meaning.
        p (`StableDiffusionProcessing`)
        forced_filename (`str`):
            If specified, `basename` and filename pattern will be ignored.
        save_to_dirs (bool):
            If true, the image will be saved into a subdirectory of `path`.

    Returns: (fullfn, txt_fullfn)
        fullfn (`str`):
            The full path of the saved imaged.
        txt_fullfn (`str` or None):
            If a text file is saved for this image, this will be its full path. Otherwise None.
    """
    namegen = FilenameGenerator(p, seed, prompt, image, basename=basename)

    # WebP and JPG formats have maximum dimension limits of 16383 and 65535 respectively. switch to PNG which has a much higher limit
    if (image.height > 65535 or image.width > 65535) and extension.lower() in ("jpg", "jpeg") or (image.height > 16383 or image.width > 16383) and extension.lower() == "webp":
        print('Image dimensions too large; saving as PNG')
        extension = "png"

    if save_to_dirs is None:
        save_to_dirs = (grid and opts.grid_save_to_dirs) or (not grid and opts.save_to_dirs and not no_prompt)

    if save_to_dirs:
        dirname = namegen.apply(opts.directories_filename_pattern or "[prompt_words]").lstrip(' ').rstrip('\\ /')
        path = os.path.join(path, dirname)

    os.makedirs(path, exist_ok=True)

    if forced_filename is None:
        if short_filename or seed is None:
            file_decoration = ""
        elif opts.save_to_dirs:
            file_decoration = opts.samples_filename_pattern or "[seed]"
        else:
            file_decoration = opts.samples_filename_pattern or "[seed]-[prompt_spaces]"

        file_decoration = namegen.apply(file_decoration) + suffix

        add_number = opts.save_images_add_number or file_decoration == ''

        if file_decoration != "" and add_number:
            file_decoration = f"-{file_decoration}"

        if add_number:
            basecount = get_next_sequence_number(path, basename)
            fullfn = None
            for i in range(500):
                fn = f"{basecount + i:05}" if basename == '' else f"{basename}-{basecount + i:04}"
                fullfn = os.path.join(path, f"{fn}{file_decoration}.{extension}")
                if not os.path.exists(fullfn):
                    break
        else:
            fullfn = os.path.join(path, f"{file_decoration}.{extension}")
    else:
        fullfn = os.path.join(path, f"{forced_filename}.{extension}")

    pnginfo = existing_info or {}
    if info is not None:
        pnginfo[pnginfo_section_name] = info

    params = script_callbacks.ImageSaveParams(image, p, fullfn, pnginfo)
    script_callbacks.before_image_saved_callback(params)

    image = params.image
    fullfn = params.filename
    info = params.pnginfo.get(pnginfo_section_name, None)

    def _atomically_save_image(image_to_save, filename_without_extension, extension):
        """
        save image with .tmp extension to avoid race condition when another process detects new image in the directory
        """
        temp_file_path = f"{filename_without_extension}.tmp"

        save_image_with_geninfo(image_to_save, info, temp_file_path, extension, existing_pnginfo=params.pnginfo, pnginfo_section_name=pnginfo_section_name)

        filename = filename_without_extension + extension
        if shared.opts.save_images_replace_action != "Replace":
            n = 0
            while os.path.exists(filename):
                n += 1
                filename = f"{filename_without_extension}-{n}{extension}"
        os.replace(temp_file_path, filename)

    fullfn_without_extension, extension = os.path.splitext(params.filename)
    if hasattr(os, 'statvfs'):
        max_name_len = os.statvfs(path).f_namemax
        fullfn_without_extension = fullfn_without_extension[:max_name_len - max(4, len(extension))]
        params.filename = fullfn_without_extension + extension
        fullfn = params.filename
    _atomically_save_image(image, fullfn_without_extension, extension)

    image.already_saved_as = fullfn

    oversize = image.width > opts.target_side_length or image.height > opts.target_side_length
    if opts.export_for_4chan and (oversize or os.stat(fullfn).st_size > opts.img_downscale_threshold * 1024 * 1024):
        ratio = image.width / image.height
        resize_to = None
        if oversize and ratio > 1:
            resize_to = round(opts.target_side_length), round(image.height * opts.target_side_length / image.width)
        elif oversize:
            resize_to = round(image.width * opts.target_side_length / image.height), round(opts.target_side_length)

        if resize_to is not None:
            try:
                # Resizing image with LANCZOS could throw an exception if e.g. image mode is I;16
                image = image.resize(resize_to, LANCZOS)
            except Exception:
                image = image.resize(resize_to)
        try:
            _atomically_save_image(image, fullfn_without_extension, ".jpg")
        except Exception as e:
            errors.display(e, "saving image as downscaled JPG")

    if opts.save_txt and info is not None:
        txt_fullfn = f"{fullfn_without_extension}.txt"
        with open(txt_fullfn, "w", encoding="utf8") as file:
            file.write(f"{info}\n")
    else:
        txt_fullfn = None

    script_callbacks.image_saved_callback(params)

    return fullfn, txt_fullfn


IGNORED_INFO_KEYS = {
    'jfif', 'jfif_version', 'jfif_unit', 'jfif_density', 'dpi', 'exif',
    'loop', 'background', 'timestamp', 'duration', 'progressive', 'progression',
    'icc_profile', 'chromaticity', 'photoshop',
}


def read_info_from_image(image: Image.Image) -> tuple[str | None, dict]:
    items = (image.info or {}).copy()

    geninfo = items.pop('parameters', None)

    if "exif" in items:
        exif_data = items["exif"]
        try:
            exif = piexif.load(exif_data)
        except OSError:
            # memory / exif was not valid so piexif tried to read from a file
            exif = None
        exif_comment = (exif or {}).get("Exif", {}).get(piexif.ExifIFD.UserComment, b'')
        try:
            exif_comment = piexif.helper.UserComment.load(exif_comment)
        except ValueError:
            exif_comment = exif_comment.decode('utf8', errors="ignore")

        if exif_comment:
            geninfo = exif_comment
    elif "comment" in items: # for gif
        if isinstance(items["comment"], bytes):
            geninfo = items["comment"].decode('utf8', errors="ignore")
        else:
            geninfo = items["comment"]

    for field in IGNORED_INFO_KEYS:
        items.pop(field, None)

    if items.get("Software", None) == "NovelAI":
        try:
            json_info = json.loads(items["Comment"])
            sampler = sd_samplers.samplers_map.get(json_info["sampler"], "Euler a")

            geninfo = f"""{items["Description"]}
Negative prompt: {json_info["uc"]}
Steps: {json_info["steps"]}, Sampler: {sampler}, CFG scale: {json_info["scale"]}, Seed: {json_info["seed"]}, Size: {image.width}x{image.height}, Clip skip: 2, ENSD: 31337"""
        except Exception:
            errors.report("Error parsing NovelAI
image generation parameters", exc_info=True) return geninfo, items def image_data(data): import gradio as gr try: image = read(io.BytesIO(data)) textinfo, _ = read_info_from_image(image) return textinfo, None except Exception: pass try: text = data.decode('utf8') assert len(text) < 10000 return text, None except Exception: pass return gr.update(), None def flatten(img, bgcolor): """replaces transparency with bgcolor (example: "#ffffff"), returning an RGB mode image with no transparency""" if img.mode == "RGBA": background = Image.new('RGBA', img.size, bgcolor) background.paste(img, mask=img) img = background return img.convert('RGB') def read(fp, **kwargs): image = Image.open(fp, **kwargs) image = fix_image(image) return image def fix_image(image: Image.Image): if image is None: return None try: image = ImageOps.exif_transpose(image) image = fix_png_transparency(image) except Exception: pass return image def fix_png_transparency(image: Image.Image): if image.mode not in ("RGB", "P") or not isinstance(image.info.get("transparency"), bytes): return image image = image.convert("RGBA") return image详细解释一下
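The `_atomically_save_image` helper in the pasted code writes to a `.tmp` path first and only then renames the file, so a process watching the directory never sees a half-written image. A minimal standalone sketch of that pattern follows. The function name `atomically_write`, the `replace` flag, and the use of raw bytes instead of an image object are assumptions made for illustration, not part of the original module.

```python
import os
import tempfile


def atomically_write(data: bytes, filename_without_extension: str, extension: str, replace: bool = False) -> str:
    """Write data to a .tmp file, then rename it to its final name (hypothetical helper)."""
    temp_file_path = f"{filename_without_extension}.tmp"
    with open(temp_file_path, "wb") as f:
        f.write(data)

    filename = filename_without_extension + extension
    if not replace:
        # Mirror the non-"Replace" branch: append -1, -2, ... until the name is free.
        n = 0
        while os.path.exists(filename):
            n += 1
            filename = f"{filename_without_extension}-{n}{extension}"

    # os.replace renames in one step and overwrites an existing target.
    os.replace(temp_file_path, filename)
    return filename


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        base = os.path.join(d, "00001-seed")
        print(os.path.basename(atomically_write(b"a", base, ".png")))  # 00001-seed.png
        print(os.path.basename(atomically_write(b"b", base, ".png")))  # 00001-seed-1.png
```

The existence check and the rename are separate steps, so two writers racing on the same base name could still collide; the sketch shares that limitation with the pattern it illustrates.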
function [B,FB] = fillmissing(A,fillMethod,varargin)
%FILLMISSING   Fill missing entries
%   First argument must be numeric, datetime, duration, calendarDuration,
%   string, categorical, character array, cell array of character vectors,
%   a table, or a timetable.
%   Standard missing data is defined as:
%      NaN                   - for double and single floating-point arrays
%      NaN                   - for duration and calendarDuration arrays
%      NaT                   - for datetime arrays
%      <missing>             - for string arrays
%      <undefined>           - for categorical arrays
%      empty character {''}  - for cell arrays of character vectors
%
%   B = FILLMISSING(A,'constant',C) fills missing entries in A with the
%   constant scalar value C. You can also use a vector C to specify
%   different fill constants for each column (or table variable) in A: C(i)
%   represents the fill constant used for the i-th column of A. For tables
%   A, C can also be a cell containing fill constants of different types.
%
%   B = FILLMISSING(A,INTERP) fills standard missing entries using the
%   interpolation method specified by INTERP, which must be:
%      'previous'  - Previous non-missing entry.
%      'next'      - Next non-missing entry.
%      'nearest'   - Nearest non-missing entry.
%      'linear'    - Linear interpolation of non-missing entries.
%      'spline'    - Piecewise cubic spline interpolation.
%      'pchip'     - Shape-preserving piecewise cubic spline interpolation.
%      'makima'    - modified Akima cubic interpolation.
%
%   B = FILLMISSING(A,MOV,K) fills standard missing entries using a
%   centered moving window formed from neighboring non-missing entries.
%   K specifies the window length and must be a positive integer scalar.
%   MOV specifies the moving window method, which must be:
%      'movmean'   - Moving average of neighboring non-missing entries.
%      'movmedian' - Moving median of neighboring non-missing entries.
%
%   B = FILLMISSING(A,MOV,[NB NF]) uses a moving window defined by the
%   previous NB elements, the current element, and the next NF elements.
%
%   B = FILLMISSING(A,'knn') fills standard missing entries with the
%   corresponding element from the nearest neighbor row, calculated based
%   on the Euclidean distance between rows.
%
%   B = FILLMISSING(A,'knn',k) fills standard missing entries with the
%   mean of the corresponding entries in the k-nearest neighbor rows,
%   calculated based on the Euclidean distance between rows.
%
%   B = FILLMISSING(A,fillfun,K) fills standard missing entries using the
%   function handle fillfun and a centered fixed window formed from
%   neighboring non-missing entries. K specifies the window length and must
%   be a positive scalar. The function handle fillfun requires
%   three input arguments, (xs, ts, tq), which are vectors containing the
%   sample data xs of length K, the sample data locations ts of length K,
%   and the missing data locations tq. The vectors ts and tq are subsets of
%   the 'SamplePoints' vector. The output of fillfun must be either a
%   scalar or a vector with the same length as tq.
%
%   B = FILLMISSING(A,fillfun,[NB NF]) uses a fixed window defined by the
%   NB elements before a gap of missing values and the NF elements after
%   the gap when specifying a function handle fillfun.
%
%   Optional arguments:
%
%   B = FILLMISSING(A,METHOD,...,'MissingLocations',M) specifies the
%   missing data locations. Elements of M that are true indicate missing
%   data in the corresponding elements of A.
%
%   B = FILLMISSING(A,METHOD,...,'EndValues',E) also specifies how to
%   extrapolate leading and trailing missing values. E must be:
%      'extrap'    - (default) Use METHOD to also extrapolate missing data.
%      'previous'  - Previous non-missing entry.
%      'next'      - Next non-missing entry.
%      'nearest'   - Nearest non-missing entry.
%      'none'      - No extrapolation of missing values.
%      VALUE       - Use an extrapolation constant. VALUE must be a scalar
%                    or a vector of type numeric, duration, or datetime.
%   'EndValues' is not supported for the 'knn' method.
%
%   B = FILLMISSING(A,METHOD,...,'SamplePoints',X) also specifies the
%   sample points X used by the fill method. X must be a floating-point,
%   duration, or datetime vector. If the first input A is a table, X can
%   also specify a table variable in A. X must be sorted and contain unique
%   points. You can use X to specify time stamps for the data. By default,
%   FILLMISSING uses data sampled uniformly at points X = [1 2 3 ... ]. Not
%   supported for the 'knn' method.
%
%   B = FILLMISSING(A,...,'MaxGap',G) specifies a maximum gap size to fill.
%   Gaps larger than G will not be filled. A gap is a set of consecutive
%   missing data points whose size is the distance between the known values
%   at the ends of the gap. Here, distance is relative to the Sample
%   Points. Not supported for the 'knn' method.
%
%   B = FILLMISSING(A,'knn',...,'Distance',D) specifies the distance metric
%   used to calculate the nearest neighbors. D must be:
%      'euclidean'      - (default) Euclidean distance
%      'seuclidean'     - Scaled Euclidean distance
%      function handle  - A distance function
%   A distance function must accept two inputs: a 2xn matrix, table, or
%   timetable containing two vectors to be compared, and a 2xn logical
%   matrix indicating the locations of missing values in the vectors. It
%   must return the distance as a real, scalar double.
%
%   B = FILLMISSING(A,METHOD,DIM,...) also specifies a dimension DIM to
%   operate along. A must be an array.
%
%   [B,FB] = FILLMISSING(A,...) also returns a logical array FB indicating
%   the filled entries in B that were previously missing. FB has the same
%   size as B.
%
%   Arguments supported only for table inputs:
%
%   B = FILLMISSING(A,...,'DataVariables',DV) fills missing data only in
%   the table variables specified by DV. The default is all table variables
%   in A. DV must be a table variable name, a cell array of table variable
%   names, a vector of table variable indices, a logical vector, a function
%   handle that returns a logical scalar (such as @isnumeric), or a table
%   vartype subscript. Output table B has the same size as input table A.
%
%   B = FILLMISSING(...,'ReplaceValues',TF) specifies how the filled data
%   is returned. TF must be one of the following:
%      true    - (default) replace table variables with the filled data
%      false   - append the filled data as additional table variables
%
%   Examples:
%
%     % Linear interpolation of NaN entries
%       a = [NaN 1 2 NaN 4 NaN]
%       b = fillmissing(a,'linear')
%
%     % Quadratic fitting using a custom function handle
%       t = linspace(0,1,10);
%       a = sin(2*pi*t); a(a > 0.7 | a < -0.7) = NaN
%       fn = @(xs, ts, tq) polyval(polyfit(ts, xs, 2), tq)
%       b = fillmissing(a, fn, [2 2])
%
%     % Fill leading and trailing NaN entries with their nearest neighbors
%       a = [NaN 1 2 NaN 4 NaN]
%       b = fillmissing(a,'linear','EndValues','nearest')
%
%     % Fill NaN entries with their previous neighbors (zero-order-hold)
%       A = [1000 1 -10; NaN 1 NaN; NaN 1 NaN; -1 77 5; NaN(1,3)]
%       B = fillmissing(A,'previous')
%
%     % Fill NaN entries with the mean of each column
%       A = [NaN(1,3); 13 1 -20; NaN(4,1) (1:4)' NaN(4,1); -1 7 -10; NaN(1,3)]
%       C = mean(A,'omitnan');
%       B = fillmissing(A,'constant',C)
%
%     % Linear interpolation of NaN entries for non-uniformly spaced data
%       x = [linspace(-3,1,120) linspace(1.1,7,30)];
%       a = exp(-0.1*x).*sin(2*x); a(a > -0.2 & a < 0.2) = NaN;
%       [b,id] = fillmissing(a,'linear','SamplePoints',x);
%       plot(x,a,'.', x(id),b(id),'o')
%       title('''linear'' fill')
%       xlabel('Sample points x');
%       legend('original data','filled missing data')
%
%     % Fill missing entries in tables with their previous neighbors
%       temperature = [21.1 21.5 NaN 23.1 25.7 24.1 25.3 NaN 24.1 25.5]';
%       windSpeed = [12.9 13.3 12.1 13.5 10.9 NaN NaN 12.2 10.8 17.1]';
%       windDirection = categorical({'W' 'SW' 'SW' '' 'SW' 'S' ...
%                           'S' 'SW' 'SW' 'SW'})';
%       conditions = {'PTCLDY' '' '' 'PTCLDY' 'FAIR' 'CLEAR' ...
%                           'CLEAR' 'FAIR' 'PTCLDY' 'MOSUNNY'}';
%       T = table(temperature,windSpeed,windDirection,conditions)
%       U = fillmissing(T,'previous')
%
%     % Fill NaN entries with the corresponding entry from the most similar
%     % row (based on the Euclidean distance between rows):
%       A = [1 NaN 3 2; 7 2 3 2; NaN 1 3 2; 1 3 2 2]
%       B = fillmissing(A,'knn')
%
%     % Fill NaN entries with the corresponding entry from the most similar
%     % row (based on the city block distance, ignoring all NaNs):
%       A = [1 NaN 3 2; 7 2 3 2; NaN 1 3 2; 1 3 2 2]
%       cityBlockDist = @(x,~) sum(abs(diff(x)),'omitmissing');
%       B = fillmissing(A,'knn','Distance',cityBlockDist)
%
%     % Fill only the entries specified by the logical mask
%       a = [1 NaN 3 4 5]
%       mask = [false false false true false]
%       fillmissing(a,'constant',10,'MissingLocations',mask)
%
%     % Fill missing entries only in gaps less than or equal to 3
%       a = [20 NaN NaN NaN NaN 10 8 NaN NaN 2]
%       b = fillmissing(a,'linear','MaxGap',3)
%
%   See also ISMISSING, STANDARDIZEMISSING, RMMISSING, ISNAN, ISNAT
%            ISOUTLIER, FILLMISSING2, FILLOUTLIERS, RMOUTLIERS, SMOOTHDATA

%   Copyright 2015-2023 The MathWorks, Inc.
[A,AisTable,intM,intConstOrWindowSizeOrK,extM,x,dim,dataVars,ma,maxgap,replace,distance] = parseInputs(A,fillMethod,varargin{:});

if strcmp(intM,'knn')
    [B,FB] = knnFill(A,AisTable,intConstOrWindowSizeOrK,dataVars,ma,distance,dim,replace);
    return
end

if ~AisTable
    [intConstOrWindowSizeOrK,extM] = checkArrayType(A,intM,intConstOrWindowSizeOrK,extM,x,false,ma);
    if nargout < 2
        B = fillArray(A,intM,intConstOrWindowSizeOrK,extM,x,dim,false,ma,maxgap);
    else
        [B,FB] = fillArray(A,intM,intConstOrWindowSizeOrK,extM,x,dim,false,ma,maxgap);
    end
else
    if nargout < 2
        B = fillTable(A,intM,intConstOrWindowSizeOrK,extM,x,dataVars,ma,maxgap,replace);
    else
        [B,FB] = fillTable(A,intM,intConstOrWindowSizeOrK,extM,x,dataVars,ma,maxgap,replace);
    end
end
end
%--------------------------------------------------------------------------
function [B,FA] = fillTable(A,intMethod,intConst,extMethod,x,dataVars,ma,maxgap,replace)
% Fill table according to DataVariables
if replace
    B = A;
else
    B = A(:,dataVars);
    dataVars = 1:width(B);
end
if nargout > 1
    FA = false(size(B));
end
useJthFillConstant = strcmp(intMethod,'constant') && ~isscalar(intConst) && ~ischar(intConst);
useJthExtrapConstant = ~ischar(extMethod) && ~isscalar(extMethod);
indVj = 1;
if istabular(ma)
    % Convert names in cell arrays to string to allow direct dot index
    % Need to get the actual names to allow random order of
    % variable names in tabular MissingLocations
    tnames = string(A.Properties.VariableNames);
end
for vj = dataVars
    if isempty(ma)
        mavj = ma; % Need to call ismissing
    else % 'MissingLocations' provided
        if istabular(ma)
            name = tnames(vj);
            mavj = ma.(name);
        else
            mavj = ma(:,vj);
        end
    end
    if nargout < 2
        B.(vj) = fillTableVar(indVj,B.(vj),intMethod,intConst,extMethod,x,useJthFillConstant,useJthExtrapConstant,mavj,maxgap,B,vj);
    else
        [B.(vj),FA(:,vj)] = fillTableVar(indVj,B.(vj),intMethod,intConst,extMethod,x,useJthFillConstant,useJthExtrapConstant,mavj,maxgap,B,vj);
    end
    indVj = indVj+1;
end
if ~replace
    % alternate FA output:
    % FA = array2table(FA);
    % FA.Properties.VariableNames = B.Properties.VariableNames;
    B = matlab.internal.math.appendDataVariables(A,B,"filled");
    if nargout > 1
        FA = [false(size(A)) FA];
    end
end
end % fillTable
%--------------------------------------------------------------------------
function [Bvj,FAvj] = fillTableVar(indVj,Avj,intMethod,intConst,extMethod,x,useJthFillConstant,useJthExtrapConstant,ma,maxgap,A,vj)
% Fill each table variable
intConstVj = intConst;
extMethodVj = extMethod;
if useJthFillConstant
    intConstVj = intConst(indVj);
end
if iscell(intConstVj)
    intConstVj = checkConstantsSize(Avj,false,true,intConstVj{1},1,[],'');
end
if useJthExtrapConstant
    extMethodVj = extMethod(indVj);
end
% Validate types of array and fill constants
[intConstVj,extMethodVj] = checkArrayType(Avj,intMethod,intConstVj,extMethodVj,x,true,ma,A,vj);
% Treat row in a char table variable as a string
AisCharTableVar = ischar(Avj);
if AisCharTableVar
    AvjCharInit = Avj;
    Avj = matlab.internal.math.charRows2string(Avj);
    if strcmp(intMethod,'constant')
        intConstVj = matlab.internal.math.charRows2string(intConstVj,true);
    end
end
% Fill
if nargout < 2
    Bvj = fillArray(Avj,intMethod,intConstVj,extMethodVj,x,1,true,ma,maxgap);
else
    [Bvj,FAvj] = fillArray(Avj,intMethod,intConstVj,extMethodVj,x,1,true,ma,maxgap);
end
% Convert back to char table variable
if AisCharTableVar
    if all(ismissing(Avj(:)))
        % For completely blank char table variables, force B to equal A
        Bvj = AvjCharInit;
    else
        Bvj = matlab.internal.math.string2charRows(Bvj);
    end
end
end % fillTableVar
%--------------------------------------------------------------------------
function [B,FA] = fillArray(A,intMethod,intConstOrWindowSizeOrK,extMethod,x,dim,AisTableVar,ma,maxgap)
% Perform FILLMISSING of standard missing entries in an array A
B = A;
didIsmissing = isempty(ma);
if didIsmissing
    FA = ismissing(A);
else % 'MissingLocations' provided
    if AisTableVar
        FA = repmat(ma,1,prod(size(A,2:ndims(A))));
    else
        FA = ma;
    end
end
ndimsBin = ndims(B);

% Quick return
if ~AisTableVar && dim > ndimsBin && ~isa(intMethod,'function_handle')
    if isnumeric(B) && ~isreal(B)
        B(true(size(B))) = B;
    end
    if ~isfinite(maxgap)
        if nargout < 2
            B = extrapolateWithConstant(B,intMethod,intConstOrWindowSizeOrK,extMethod,FA,FA);
        else
            [B,FA] = extrapolateWithConstant(B,intMethod,intConstOrWindowSizeOrK,extMethod,FA,FA);
        end
    end
    % else consider the gap too large, don't fill
    return
end

% Permute and reshape into a matrix
permNeeded = dim ~= 1 || ndimsBin > 2;
if permNeeded
    dim = min(ndimsBin + 1, dim); % all dim > ndimsBin behave the same way, this avoids errors for arbitrarily large dim
    perm = [dim, 1:(dim-1), (dim+1):ndimsBin];
    sizeBperm = size(B, perm);
    ncolsB = prod(sizeBperm(2:end));
    nrowsB = sizeBperm(1);
    B = reshape(permute(B, perm),[nrowsB, ncolsB]); % permute errors expectedly for ND sparse matrix
    FA = reshape(permute(FA, perm),[nrowsB, ncolsB]);
else
    ncolsB = size(B,2);
end

% Fill each column
if didIsmissing || nargout < 2
    % For ismissing, compute the filled mask at the very end. This ensures
    % that tall/fillmissing takes 2 passes instead of 3 for two outputs.
    for jj = 1:ncolsB
        B(:,jj) = fillArrayColumn(jj,B(:,jj),FA(:,jj),intMethod,intConstOrWindowSizeOrK,extMethod,x,maxgap,didIsmissing);
    end
else
    % For 'MissingLocations', also compute the filled mask
    for jj = 1:ncolsB
        [B(:,jj),FA(:,jj)] = fillArrayColumn(jj,B(:,jj),FA(:,jj),intMethod,intConstOrWindowSizeOrK,extMethod,x,maxgap,didIsmissing);
    end
end

% Reshape and permute back to original size
if AisTableVar && nargout > 1
    FA = any(FA,2);
    if didIsmissing
        FA = xor(FA,any(ismissing(B),2)); % Compute the filled mask
    end
end
if permNeeded
    B = ipermute(reshape(B,sizeBperm), perm);
end
if ~AisTableVar && nargout > 1
    if permNeeded
        FA = ipermute(reshape(FA,sizeBperm), perm);
    end
    if didIsmissing
        FA(FA) = xor(FA(FA),ismissing(B(FA))); % Compute the filled mask
    end
end
end % fillArray
%--------------------------------------------------------------------------
function [b,ma] = fillArrayColumn(jj,a,ma,intMethod,intConstOrWindowSizeOrK,extMethod,x,maxgap,didIsmissing)
% Fill one column. Do not error if we cannot fill all missing entries.
% jj = j-th column numeric index. Used to select the j-th fill constant.
% a = the j-th column itself. Can be numeric, logical, duration, datetime,
%     calendarDuration, char, string, cellstr, or categorical.
% ma = logical mask of missing entries found in a.
% intMethod = interpolation method.
% intConstOrWindowSizeOrK = interpolation constant for 'constant' or window size
%     for 'movmean'. [] if intMethod is not 'constant'/'mov*'.
% extMethod = extrap method. If not a char, it holds the extrap constant.
% x = the abscissa ('SamplePoints'). Can be float, duration, or datetime.
b = a;
% Quick return
nma = ~ma;
numNonMissing = nnz(nma);
useDefaultX = isempty(x);
spFlag = ~useDefaultX; % whether sample points are used

% Default sample points only need to be generated when MaxGap is used, the
% input data is non-numeric, or the method is a function handle or an
% interpolation method that uses sample points. Note that "knn" also needs
% default sample points, but does not use this function.
if useDefaultX && (~isnumeric(a) || isfinite(maxgap) || ...
        isa(intMethod,'function_handle') || ...
        ~matches(intMethod,["constant","next","previous","movmean","movmedian"]))
    x = (1:size(a,1)).';
end

if numNonMissing == 0
    % Column is full of missing data:
    if ~isfinite(maxgap)
        % Fill with constant
        if nargout > 1
            if isa(intMethod,'function_handle') && strcmp(extMethod,'extrap')
                b = handlefill(b,ma,intMethod,intConstOrWindowSizeOrK,spFlag,x);
                ma = ~ismissing(b);
            else
                [b,ma] = extrapolateWithConstant(b,intMethod,intConstOrWindowSizeOrK,extMethod,ma,jj);
            end
        else
            if isa(intMethod,'function_handle') && strcmp(extMethod,'extrap')
                b = handlefill(b,ma,intMethod,intConstOrWindowSizeOrK,spFlag,x);
            else
                b = extrapolateWithConstant(b,intMethod,intConstOrWindowSizeOrK,extMethod,ma,jj);
            end
        end
    end
    % else, column is a "large gap": do not fill
    return
end

% Ignore gaps of missing data bigger than maxgap
ma = removeLargeGaps(ma,maxgap,x);
maBeforeInterp = ma;

% (1) Interpolate
if issparse(b)
    b = full(b);
end
if strcmp(intMethod,'constant')
    b = assignConstant(b,intConstOrWindowSizeOrK,ma,jj);
elseif strcmp(intMethod,'movmean')
    if didIsmissing
        if useDefaultX
            newb = movmean(b,intConstOrWindowSizeOrK,'omitnan');
        else
            newb = movmean(b,intConstOrWindowSizeOrK,'omitnan','SamplePoints',x);
        end
        b(ma) = newb(ma);
    else % 'MissingLocations' case
        b(ma) = missing;
        if useDefaultX
            newb = movmean(b,intConstOrWindowSizeOrK,'omitnan');
        else
            newb = movmean(b,intConstOrWindowSizeOrK,'omitnan','SamplePoints',x);
        end
        b(ma) = newb(ma);
        ma(ma) = xor(ma(ma),ismissing(b(ma)));
    end
elseif strcmp(intMethod,'movmedian')
    if didIsmissing
        if useDefaultX
            newb = movmedian(b,intConstOrWindowSizeOrK,'omitnan');
        else
            newb = movmedian(b,intConstOrWindowSizeOrK,'omitnan','SamplePoints',x);
        end
        b(ma) = newb(ma);
    else % 'MissingLocations' case
        b(ma) = missing;
        if useDefaultX
            newb = movmedian(b,intConstOrWindowSizeOrK,'omitnan');
        else
            newb = movmedian(b,intConstOrWindowSizeOrK,'omitnan','SamplePoints',x);
        end
        b(ma) = newb(ma);
        ma(ma) = xor(ma(ma),ismissing(b(ma)));
    end
elseif isnumeric(b) && strcmp(intMethod,'next')
    if numNonMissing > 1
        b = fillWithNext(b,ma);
    end
elseif isnumeric(b) && strcmp(intMethod,'previous')
    if numNonMissing > 1
        b = fillWithPrevious(b,ma);
    end
elseif ~isa(intMethod,'function_handle') % function handle case handled below
    % griddedInterpolant/interp1 require at least 2 grid points.
    % Do not error if we cannot fill. Instead, return the original array.
    % For example, fillmissing([NaN 1 NaN],'linear') returns [NaN 1 NaN].
    if numNonMissing > 1
        isfloatb = isfloat(b);
        if isfloatb && isfloat(x)
            G = griddedInterpolant(x(nma),b(nma),intMethod);
            b(ma) = G(x(ma)); % faster than interp1
        elseif isfloatb || isduration(b) || isdatetime(b)
            b(ma) = interp1(x(nma),b(nma),x(ma),intMethod,'extrap');
        else
            % calendarDuration, char, string, cellstr, or categorical:
            % No griddedInterpolant because x may be datetime/duration
            vq = interp1(x(nma),find(nma),x(ma),intMethod,'extrap');
            indvq = ~isnan(vq); % vq may have leading or trailing NaN
            iatmp = find(ma);
            b(iatmp(indvq)) = b(vq(indvq)); % copy non-missing to missing
        end
    end
end

% (2) Correct for EndValues, including the logical mask of what got filled
% use ma to find non-missing for correct maxgap behavior
% ma has at least one false, all-missing case was quick returned
if maBeforeInterp(1)
    indBeg = find(~maBeforeInterp,1);
else
    indBeg = 1;
end
if maBeforeInterp(end)
    indEnd = find(~maBeforeInterp,1,'last');
else
    indEnd = numel(a);
end
if indBeg > 1 || indEnd < numel(a)
    if ischar(extMethod) || (isstring(extMethod) && isscalar(extMethod))
        if strcmp(extMethod,'none')
            b(1:indBeg-1) = a(1:indBeg-1);
            b(indEnd+1:end) = a(indEnd+1:end);
            if nargout > 1 % 'MissingLocations' case
                ma(1:indBeg-1) = false;
                ma(indEnd+1:end) = false;
            end
        elseif strcmp(extMethod,'nearest') || (strcmp(extMethod,'extrap') && strcmp(intMethod,'nearest'))
            b(1:indBeg-1) = a(indBeg);
            b(indEnd+1:end) = a(indEnd);
            if nargout > 1 % 'MissingLocations' case
                ma(1:indBeg-1) = true;
                ma(indEnd+1:end) = true;
            end
        elseif strcmp(extMethod,'previous') || (strcmp(extMethod,'extrap') && strcmp(intMethod,'previous'))
            b(1:indBeg-1) = a(1:indBeg-1);
            b(indEnd+1:end) = a(indEnd);
            if nargout > 1 % 'MissingLocations' case
                ma(1:indBeg-1) = false;
                ma(indEnd+1:end) = true;
            end
        elseif strcmp(extMethod,'next') || (strcmp(extMethod,'extrap') && strcmp(intMethod,'next'))
            b(1:indBeg-1) = a(indBeg);
            b(indEnd+1:end) = a(indEnd+1:end);
            if nargout > 1 % 'MissingLocations' case
                ma(1:indBeg-1) = true;
                ma(indEnd+1:end) = false;
            end
        end
    else
        % Extrapolate with given value(s)
        if isscalar(extMethod)
            b([1:indBeg-1, indEnd+1:end]) = extMethod;
        elseif ~isa(intMethod,'function_handle') % function handle has separate implementation (directly below)
            b([1:indBeg-1, indEnd+1:end]) = extMethod(jj);
        end
        if nargout > 1
            ma([1:indBeg-1, indEnd+1:end]) = true;
        end
    end
end

if isa(intMethod,'function_handle')
    isExtrap = strcmp(extMethod,'extrap');
    if ~isExtrap
        ma([1:indBeg-1, indEnd+1:end]) = false;
    end
    if nargout < 2 % one output case
        newb = handlefill(b,ma,intMethod,intConstOrWindowSizeOrK,spFlag,x);
        b(ma) = newb(ma);
    else % two output case
        b(ma) = missing;
        newb = handlefill(b,ma,intMethod,intConstOrWindowSizeOrK,spFlag,x);
        b(ma) = newb(ma);
        ma(ma) = xor(ma(ma),ismissing(b(ma)));
        if ~isExtrap
            ma([1:indBeg-1, indEnd+1:end]) = true;
        end
    end
end
end % fillArrayColumn
%--------------------------------------------------------------------------
function MItoBeFilled = removeLargeGaps(MI,maxgap,x)
% set elements in the given missing indicator within large gaps to false
% MI is a vector, maxgap is either numeric or duration scalar
MItoBeFilled = MI;
% x has at least 1 element, empties are already special cased
if ~isfinite(maxgap)
    % no gaps will be too large to fill, don't change MI
    return
end
% find all segments in the missing indicator vector
segmentLengths = diff([0; find(diff(MI(:))); numel(MI)]);
% gaps span x_j to x_k
k = cumsum(segmentLengths);
j = k - segmentLengths + 1;
% The gap size is defined as x_(k+1)-x_(j-1)
% If the segment is at the end of a vector, we use the nearest sample point
x = [x(1); x(:); x(end)];
% for this x, the size of a gap is x_(k+2)-x_(j)
for idx =1:numel(segmentLengths)
    % only act on segments of missing data
    if MI(j(idx))
        % check to see if the segment is small enough to fill
        doFill = x(j(idx)) + maxgap >= x(k(idx)+2);
        if ~doFill
            % if it is too large, don't fill, i.e. treat as nonmissing
            MItoBeFilled(j(idx):k(idx)) = false;
        end
    end
end
end % removeLargeGaps
%--------------------------------------------------------------------------
function [B,FA] = extrapolateWithConstant(B,intMethod,intConst,extMethod,lhsIndex,rhsIndex)
% Fill all missings with a constant. Used if B is full of missing data, or
% for array B with dim > ndims(B). rhsIndex may be logical or numeric.
% Fill only when we have specified an extrapolation constant:
if nargout > 1
    FA = lhsIndex;
end
if ~ischar(extMethod) && ~(isstring(extMethod) && isscalar(extMethod))
    % Either through EndValues:
    %   fillmissing(A,METHOD,'EndValues',ConstVals)
    B = assignConstant(B,extMethod,lhsIndex,rhsIndex);
elseif strcmp(intMethod,'constant') && strcmp(extMethod,'extrap')
    % Or through the 'constant' fill method:
    %   fillmissing(A,'constant',ConstVals)
    %   fillmissing(A,'constant',ConstVals,'EndValues','extrap')
    B = assignConstant(B,intConst,lhsIndex,rhsIndex);
elseif nargout > 1
    FA(:) = false;
end
end % extrapolateWithConstant
%--------------------------------------------------------------------------
function B = assignConstant(B,ConstVals,lhsIndex,rhsIndex)
if isscalar(ConstVals)
    B(lhsIndex) = ConstVals;
else
    B(lhsIndex) = ConstVals(rhsIndex);
end
end
%--------------------------------------------------------------------------
function [A,AisTable,intMethod,intConstOrWindowSizeOrK,extMethod,x,dim,dataVars,ma,maxgap,replace,distance] = ...
    parseInputs(A,fillMethod,varargin)
% Parse FILLMISSING inputs
AisTable = istabular(A);
if ~AisTable && ~isSupportedArray(A)
    error(message('MATLAB:fillmissing:FirstInputInvalid'));
end
% Parse fill method. Empty '' or [] fill method is not allowed.
validIntMethods = {'constant','previous','next','nearest','linear',...
    'spline','pchip','movmean','movmedian','makima','knn'};
if ischar(fillMethod) || isstring(fillMethod)
    indIntMethod = matlab.internal.math.checkInputName(fillMethod,validIntMethods);
    if sum(indIntMethod) ~= 1
        % Also catch ambiguities for fillmissing(A,'ne') and fillmissing(A,'p')
        error(message('MATLAB:fillmissing:MethodInvalid'));
    end
    intMethod = validIntMethods{indIntMethod};
    indIntMethod = find(indIntMethod);
    if indIntMethod == 11 && ~ismatrix(A)
        % tables and timetables return TRUE for ismatrix
        error(message('MATLAB:fillmissing:knnMustBeMatrixTableOrTimetable'))
    end
    intConstOrWindowSizeOrK = [];
    % Parse fillmissing(A,'constant',c) and fillmissing(A,MOVFUN,windowSize)
    intConstOffset = 0;
    if any(indIntMethod == [1 8 9])
        if nargin > 2
            intConstOrWindowSizeOrK = varargin{1};
        else
            error(message(['MATLAB:fillmissing:',intMethod,'Input']));
        end
        intConstOffset = 1;
    elseif indIntMethod == 11
        if nargin > 2 && isnumeric(varargin{1})
            intConstOrWindowSizeOrK = varargin{1};
            intConstOffset = 1;
        else
            intConstOrWindowSizeOrK = 1;
        end
    end
elseif isa(fillMethod,'function_handle')
    if nargin(fillMethod) < 3
        error(message('MATLAB:fillmissing:FunctionHandleNumberOfArguments'));
    end
    intMethod = fillMethod;
    intConstOffset = 1;
    indIntMethod = [];
    if nargin < 3
        error(message('MATLAB:fillmissing:FunctionHandleInput'));
    end
    intConstOrWindowSizeOrK = varargin{1};
else
    error(message('MATLAB:fillmissing:MethodInvalid'));
end
% Parse optional inputs
extMethod = 'extrap';
x = [];
ma = [];
maxgap = [];
dataVarsProvided = false;
missingLocationProvided = false;
replace = true;
distance = 'euclidean';
if ~AisTable
    dim = matlab.internal.math.firstNonSingletonDim(A);
    dataVars = []; % not supported for arrays
else
    dim = 1; % Fill each table variable separately
    dataVars = 1:width(A);
end
if nargin > 2+intConstOffset
    % Third input can be a constant, a window size, the dimension, or an
    % argument Name from a Name-Value pair:
    %   fillmissing(A,'constant',C,...) and C may be a char itself
    %   fillmissing(A,'movmean',K,...) with K numeric, numel(K) == 1 or 2
    %   fillmissing(A,'linear',DIM,...)
    %   fillmissing(A,'linear','EndValues',...)
    firstOptionalInput = varargin{1+intConstOffset};
    % The dimension
    dimOffset = 0;
    if isnumeric(firstOptionalInput) || islogical(firstOptionalInput)
        if AisTable
            error(message('MATLAB:fillmissing:DimensionTable'));
        end
        dimOffset = 1;
        dim = firstOptionalInput;
        if ~isscalar(dim) || ~isreal(dim) || fix(dim) ~= dim || dim < 1 || ~isfinite(dim)
            error(message('MATLAB:fillmissing:DimensionInvalid'));
        end
    end
    % Trailing N-V pairs
    indNV = (1+intConstOffset+dimOffset):numel(varargin);
    if rem(length(indNV),2) ~= 0
        error(message('MATLAB:fillmissing:NameValuePairs'));
    end
    spvar = [];
    for i = indNV(1:2:end)
        if matlab.internal.math.checkInputName(varargin{i},'EndValues')
            if indIntMethod == 11
                error(message('MATLAB:fillmissing:unsupportedNVPair','EndValues','''knn'''))
            end
            extMethod = varargin{i+1};
            if ischar(extMethod) || (isstring(extMethod) && isscalar(extMethod))
                validExtMethods = {'extrap','previous','next','nearest','none'};
                indExtMethod = matlab.internal.math.checkInputName(extMethod,validExtMethods);
                if sum(indExtMethod) ~= 1
                    % Also catch ambiguities between nearest and next
                    error(message('MATLAB:fillmissing:EndValuesInvalidMethod'));
                end
                extMethod = validExtMethods{indExtMethod};
            end
        elseif matlab.internal.math.checkInputName(varargin{i},'DataVariables')
            if AisTable
                dataVars = matlab.internal.math.checkDataVariables(A,varargin{i+1},'fillmissing');
                dataVarsProvided = true;
            else
                error(message('MATLAB:fillmissing:DataVariablesArray'));
            end
        elseif matlab.internal.math.checkInputName(varargin{i},'ReplaceValues')
            if AisTable
                replace = matlab.internal.datatypes.validateLogical(varargin{i+1},'ReplaceValues');
            else
                error(message('MATLAB:fillmissing:ReplaceValuesArray'));
            end
        elseif matlab.internal.math.checkInputName(varargin{i},'SamplePoints')
            if indIntMethod == 11
                error(message('MATLAB:fillmissing:unsupportedNVPair','SamplePoints','''knn'''));
            end
            if istimetable(A)
                error(message('MATLAB:samplePoints:SamplePointsTimeTable'));
            end
            [x,spvar] = matlab.internal.math.checkSamplePoints(varargin{i+1},A,AisTable,false,dim);
        elseif matlab.internal.math.checkInputName(varargin{i},'MissingLocations',2)
            ma = varargin{i+1};
            missingLocationProvided = true;
        elseif matlab.internal.math.checkInputName(varargin{i},'MaxGap',2)
            if indIntMethod == 11
                error(message('MATLAB:fillmissing:unsupportedNVPair','MaxGap','knn'))
            end
            maxgap = varargin{i+1};
            if ~isscalar(maxgap) || ~(isnumeric(maxgap) || isduration(maxgap) || iscalendarduration(maxgap)) ||...
                    ~isreal(maxgap) || isnan(maxgap) || (~iscalendarduration(maxgap) && maxgap <= 0)
                error(message('MATLAB:fillmissing:MaxGapInvalid'))
            end
        elseif matlab.internal.math.checkInputName(varargin{i},'Distance')
            if indIntMethod ~= 11
                error(message('MATLAB:fillmissing:DistanceNonKNNMethod'))
            end
            distance = varargin{i+1};
            if ischar(distance) || (isstring(distance) && isscalar(distance))
                validDistanceMetrics = {'euclidean','seuclidean'};
                distMask = matlab.internal.math.checkInputName(distance,validDistanceMetrics);
                if ~any(distMask)
                    error(message('MATLAB:fillmissing:InvalidDistance'));
                end
                distance = validDistanceMetrics{distMask};
            elseif ~isa(distance,'function_handle')
                error(message('MATLAB:fillmissing:InvalidDistance'))
            end
        else
            error(message('MATLAB:fillmissing:NameValueNames'));
        end
    end
    if ~isempty(spvar)
        dataVars(dataVars == spvar) = []; % remove sample points var from data vars
    end
    if missingLocationProvided
        if istabular(ma) && AisTable
            dataVars = validateTabularMissingLocations(A,ma,dataVars,dataVarsProvided);
        else
            if AisTable
                sizedv = size(A(:,dataVars));
                sizev = size(ma);
                if ~islogical(ma) || (~isequal(sizedv(2),sizev(2)) && ~isequal(size(A),size(ma)))
                    error(message('MATLAB:fillmissing:MissingLocationsInvalid'));
                end
            else
                if ~islogical(ma) || ~isequal(size(A),size(ma))
                    error(message('MATLAB:fillmissing:MissingLocationsInvalid'));
                end
            end
        end
    end
    % Ensure not both MaxGap and MissingLocations specified
    if ~isempty(ma) && ~isempty(maxgap)
        error(message('MATLAB:fillmissing:MaxGapMissingLocations'))
    end
end
% Validate fill constants size
if indIntMethod == 1 % 'constant' fill method
    intConstOrWindowSizeOrK = checkConstantsSize(A,AisTable,false,intConstOrWindowSizeOrK,dim,dataVars,'');
elseif indIntMethod == 11
    % Validate number of nearest neighbors
    if ~isnumeric(intConstOrWindowSizeOrK) || ~isscalar(intConstOrWindowSizeOrK) || ...
            fix(intConstOrWindowSizeOrK) ~= intConstOrWindowSizeOrK || ...
            intConstOrWindowSizeOrK < 1 || ~isreal(intConstOrWindowSizeOrK)
        error(message('MATLAB:fillmissing:InvalidK'));
    end
end
if ~ischar(extMethod) && ~(isstring(extMethod) && isscalar(extMethod))
    extMethod = checkConstantsSize(A,AisTable,false,extMethod,dim,dataVars,'Extrap');
end
% Default abscissa
if isempty(x) && istimetable(A)
    x = matlab.internal.math.checkSamplePoints(A.Properties.RowTimes,A,false,true,dim);
end
% Default Sample Points
if isa(intMethod,'function_handle')
    if isempty(x)
        checkHandleWindow(A,intConstOrWindowSizeOrK,false,1:numel(A));
    else
        checkHandleWindow(A,intConstOrWindowSizeOrK,true,x);
    end
end
% Default maxgap/check datatype against abscissa
if isempty(maxgap)
    maxgap = inf;
elseif (isnumeric(x) && ~isnumeric(maxgap)) || (~isnumeric(x) && isnumeric(maxgap)) ||...
        (isduration(x) && iscalendarduration(maxgap))
    error(message('MATLAB:fillmissing:MaxGapDurationInvalid'))
end
end % parseInputs
%--------------------------------------------------------------------------
function tf = isSupportedArray(A)
% Check if array type is supported
tf = isnumeric(A) || islogical(A) || ...
    isstring(A) || iscategorical(A) || iscellstr(A) || ischar(A) || ...
    isdatetime(A) || isduration(A) || iscalendarduration(A);
end % isSupportedArray
%--------------------------------------------------------------------------
function C = checkConstantsSize(A,AisTable,AisTableVar,C,dim,dataVars,eid)
% Validate the size of the fill constant. We can fill all columns with the
% same scalar, or use a different scalar for each column.
if ischar(C) && (~ischar(A) || AisTableVar)
    % A char fill constant is treated as a scalar for string, categorical
    % and cellstr (arrays or table variables), and char table variables
    if ~isrow(C) && ~isempty(C) % '' is not a row
        error(message('MATLAB:fillmissing:CharRowVector'));
    end
elseif ~isscalar(C)
    sizeA = size(A);
    if AisTable
        % numel(constant) must equal numel 'DataVariables' value
        sizeA(2) = length(dataVars);
    end
    if dim <= ndims(A)
        sizeA(dim) = [];
        nVects = prod(sizeA);
    else
        % fillmissing(A,'constant',c) supported
        % fillmissing(A,METHOD,'EndValues',constant_value) supported
        nVects = numel(A);
    end
    if (numel(C) ~= nVects)
        if nVects <= 1
            error(message(['MATLAB:fillmissing:SizeConstantScalar',eid]));
        else
            error(message(['MATLAB:fillmissing:SizeConstant',eid],nVects));
        end
    end
    C = C(:);
end
end % checkConstantsSize
%--------------------------------------------------------------------------
function [intConst,extMethod] = checkArrayType(A,intMethod,intConst,extMethod,x,AisTableVar,ma,T,vj)
% Check if array types match
if AisTableVar && ~isSupportedArray(A)
    error(message('MATLAB:fillmissing:UnsupportedTableVariable',class(A)));
end
if ~(isnumeric(A) || islogical(A) || isduration(A) || isdatetime(A)) && ...
        ~any(strcmp(intMethod,{'nearest','next','previous','constant'})) && ...
        ~isa(intMethod,'function_handle')
    if AisTableVar
        error(message('MATLAB:fillmissing:InterpolationInvalidTableVariable',intMethod));
    else
        error(message('MATLAB:fillmissing:InterpolationInvalidArray',intMethod,class(A)));
    end
end
% 'MissingLocations' doesn't work with all methods for integer and logical
if ~isempty(ma) && (isinteger(A) || islogical(A)) && ...
        ~(any(strcmp(intMethod,{'nearest','next','previous','constant','knn'})) || ...
        isa(intMethod,'function_handle'))
    error(message('MATLAB:fillmissing:MissingLocationsInteger'));
end
try
    if strcmp(intMethod,'constant')
        intConst = checkConstantType(A,intConst,'');
    end
    if ~ischar(extMethod) && ~(isstring(extMethod) && isscalar(extMethod))
        extMethod = checkConstantType(A,extMethod,'Extrap');
    end
catch ME
    if AisTableVar && matlab.internal.math.checkInputName('MATLAB:fillmissing:Constant',ME.identifier)
        % Generic error message for tables
        varNames = T.Properties.VariableNames;
        error(message('MATLAB:fillmissing:ConstantInvalidTypeForTableVariable',varNames{vj}));
    else
        % Specific error message for arrays
        throw(ME);
    end
end
if isa(x,'single') && (isduration(A) || isdatetime(A))
    error(message('MATLAB:samplePoints:SamplePointsSingle'));
end
end % checkArrayType
%--------------------------------------------------------------------------
function C = checkConstantType(A,C,eid)
% Check if constant type matches the array type
if ~isempty(eid) && ~isnumeric(C) && ~islogical(C) && ...
        ~isdatetime(C) && ~isduration(C) && ~iscalendarduration(C)
    error(message('MATLAB:fillmissing:ConstantInvalidTypeExtrap'));
end
if isnumeric(A) && ~isnumeric(C) && ~islogical(C)
    error(message(['MATLAB:fillmissing:ConstantNumeric',eid]));
elseif isdatetime(A) && ~isdatetime(C)
    error(message(['MATLAB:fillmissing:ConstantDatetime',eid]));
elseif isduration(A) && ~isduration(C)
    error(message(['MATLAB:fillmissing:ConstantDuration',eid]));
elseif iscalendarduration(A) && ~iscalendarduration(C)
    error(message(['MATLAB:fillmissing:ConstantCalendarDuration',eid]));
elseif iscategorical(A)
    if ischar(C)
        C = string(C); % make char a scalar string
    elseif iscategorical(C) && (isordinal(A) ~= isordinal(C))
        error(message('MATLAB:fillmissing:ConstantCategoricalOrdMismatch'));
    elseif iscategorical(C) && isordinal(C) && ~isequal(categories(C),categories(A))
        error(message('MATLAB:fillmissing:ConstantCategoricalCatMismatch'));
    elseif (~iscellstr(C) && ~isstring(C) && ~iscategorical(C))
        error(message(['MATLAB:fillmissing:ConstantCategorical',eid]));
    end
elseif ischar(A) && ~ischar(C)
    error(message(['MATLAB:fillmissing:ConstantChar',eid]));
elseif iscellstr(A)
    if ischar(C)
        C = {C}; % make char a scalar cellstr
    elseif ~iscellstr(C) %#ok<ISCLSTR>
        % string constants not supported
        error(message(['MATLAB:fillmissing:ConstantCellstr',eid]));
    end
elseif isstring(A) && ~isstring(C)
    % char and cellstr constants not supported
    error(message(['MATLAB:fillmissing:ConstantString',eid]));
end
end % checkConstantType
function datavariables = validateTabularMissingLocations(a,loc,datavariables,dataVarsProvided)
vnames = loc.Properties.VariableNames;
tnames = a.Properties.VariableNames;
if dataVarsProvided
    if ~all(ismember(tnames(datavariables),vnames))
        % DataVariable names must be present in loc table
        error(message('MATLAB:fillmissing:InvalidLocationsWithDataVars'));
    end
else
    try
        datavariables = matlab.internal.math.checkDataVariables(a, vnames, 'fillmissing');
    catch
        error(message('MATLAB:fillmissing:InvalidTabularLocationsFirstInput'));
    end
end
vnames = string(vnames);
for ii=vnames
    if ~islogical(loc.(ii)) || ~isequal(size(a.(ii)),size(loc.(ii)))
        error(message('MATLAB:fillmissing:LogicalVarsRequired'));
    end
end
end
%--------------------------------------------------------------------------
function checkHandleWindow(A,intConstOrWindowSizeOrK,spFlag,t)
needDuration = (~spFlag && istimetable(A)) || ...
    (spFlag && (isduration(t) || isdatetime(t)));
if (isduration(intConstOrWindowSizeOrK) || isnumeric(intConstOrWindowSizeOrK)) && ...
        isreal(intConstOrWindowSizeOrK) && any(numel(intConstOrWindowSizeOrK) == [1 2]) && ...
        allfinite(intConstOrWindowSizeOrK) && any(intConstOrWindowSizeOrK > 0) && ...
        all(intConstOrWindowSizeOrK >= 0)
    if needDuration && ~isduration(intConstOrWindowSizeOrK)
        error(message('MATLAB:fillmissing:FunctionHandleInvalidWindowDuration'));
    elseif ~needDuration && isduration(intConstOrWindowSizeOrK)
        error(message('MATLAB:fillmissing:FunctionHandleInvalidWindow'));
    end
else
    error(message('MATLAB:fillmissing:FunctionHandleInvalidWindow'));
end
end % checkHandleWindow
%--------------------------------------------------------------------------
function ide = getMissingIntervals(MI)
% get the 2-column array of first and last indices of each gap
segmentLengths = diff([0; find(diff(MI(:))); numel(MI)]);
% This assumes MI is not empty, which cannot happen when this is called
k = cumsum(segmentLengths); % last index of each interval
alt = MI(k); % which intervals are missing vs non-missing
ide = zeros(sum(alt),2);
if alt(1) % if the first interval is missing
    ide(:,2) = k(1:2:end);
    ide(:,1) = k(1:2:end) - segmentLengths(1:2:end) + 1;
elseif numel(alt) >= 2 && alt(2) % if the second interval is missing
    ide(:,2) = k(2:2:end);
    ide(:,1) = k(2:2:end) - segmentLengths(2:2:end) + 1;
end
end % getMissingIntervals
%--------------------------------------------------------------------------
function Y = handlefill(A,MI,fillfun,intConstOrWindowSizeOrK,spFlag,t)
A = A(:);
t = t(:);
tidx = 1:numel(t);
% Initialize the output
Y = A;
% Quick return
if isempty(MI)
    return
end
ide = getMissingIntervals(MI); % array of first and last indices of each gap
% Split into left and right window values
if numel(intConstOrWindowSizeOrK) == 2
    a = intConstOrWindowSizeOrK(1);
    b = intConstOrWindowSizeOrK(2);
elseif ~spFlag
    a = floor(intConstOrWindowSizeOrK/2);
    b = a;
else
    a = intConstOrWindowSizeOrK/2;
    b = a;
end
% Call fillfun on each interval of missing data skip filling gaps when xin
% is empty
nide = size(ide,1);
for i = 1:nide
    if spFlag
        ind = find(((t <= t(ide(i+nide)) + b) & (t > t(ide(i+nide)))) | ...
            ((t >= t(ide(i)) - a) & (t < t(ide(i)))));
    else
        ind = [max(1, ceil(ide(i) - a)):ide(i)-1, ide(i+nide)+1:min(numel(A), floor(ide(i+nide) + b))];
    end
    xin = A(ind);
    tin = t(ind);
    toutidx = tidx(ide(i,1):ide(i,2));
    tout = t(toutidx);
    try
        ytmp = fillfun(xin, tin, tout);
    catch ME
        if isempty(xin)
            m = message('MATLAB:fillmissing:FunctionHandleEmptyInput');
            throw(addCause(MException(m.Identifier,'%s',getString(m)),ME));
        else
            throw(ME);
        end
    end
    if isscalar(ytmp) || (isvector(ytmp) && isequal(numel(ytmp),numel(toutidx)))
        Y(toutidx) = ytmp;
    else
        % Bad output size
        error(message('MATLAB:fillmissing:FunctionHandleInvalidOutputSize'));
    end
end
end % handlefill
%--------------------------------------------------------------------------
function [B,FB] = knnFill(A,AisTable,k,dataVars,ma,distance,dim,replace)
if isempty(ma)
    ma = ismissing(A);
end
% Quick return when no filling is needed
if isempty(A) || ~any(ma,'all') || dim > 2
    if replace
        B = A;
    else
        B = matlab.internal.math.appendDataVariables(A,A(:,dataVars),"filled");
    end
    FB = false(size(B));
    return
end
if ~isa(distance,'function_handle')
    [B,FB] = knnFillBuiltInDistances(A,AisTable,k,dataVars,ma,distance,dim,replace);
else
    [B,FB] = knnFillCustomDistances(A,AisTable,k,dataVars,ma,distance,dim,replace);
end
end % knnFill
%--------------------------------------------------------------------------
function [B,FB] = knnFillBuiltInDistances(A,AisTable,k,dataVars,ma,distance,dim,replace)
if istimetable(A)
    error(message('MATLAB:fillmissing:DistanceTimetableNotSupported'));
end
% matlab.internal.math.fillmissingKNN fills using the k-nearest columns, so
% A is transposed unless dim == 2
transposeData = dim == 1;
if AisTable
    AT = checkAndExtractTableVars(A,dataVars);
    ma = ma(:,dataVars).';
elseif transposeData
    AT = A.';
    ma = ma.';
else % dim == 2
    AT = A;
end
if ~isfloat(AT)
    error(message('MATLAB:fillmissing:DistanceNonFloatsNotSupported'));
end
if strcmp(distance,'seuclidean')
    if transposeData && ~AisTable
        scalingVector = double(std(A,'omitnan'));
    else
        scalingVector = double(std(AT,[],2,'omitnan'));
    end
    [BOutTmp,FBTmp] = matlab.internal.math.fillmissingKNN(full(AT),full(ma),k,full(scalingVector));
else
    [BOutTmp,FBTmp] = matlab.internal.math.fillmissingKNN(full(AT),full(ma),k);
end
if issparse(A)
    BOutTmp = sparse(BOutTmp);
    FBTmp = sparse(FBTmp);
end
if AisTable
    if replace
        B = A;
        BDataVars = dataVars;
    else
        B = A(:,dataVars);
        BDataVars = 1:width(B);
    end
    FB = false(size(B));
    if ~any(FBTmp,'all')
        return
    end
    % Note that FBTmp and BOutTmp are based on the transpose of the data
    varsToReplace = find(any(FBTmp,2)).';
    for ii = varsToReplace
        B.(BDataVars(ii)) = cast(BOutTmp(ii,:)',class(B.(ii)));
    end
    if replace
        FB(:,dataVars) = FBTmp';
    else
        B = matlab.internal.math.appendDataVariables(A,B,"filled");
        FB = [false(size(A)),FBTmp'];
    end
else
    if transposeData
        B = BOutTmp';
        FB = FBTmp';
    else
        B = BOutTmp;
        FB = FBTmp;
    end
end
end % knnFillBuiltInDistances
%--------------------------------------------------------------------------
function [B,FB] = knnFillCustomDistances(A,AisTable,k,dataVars,ma,distance,dim,replace)
if dim == 2
    A = A.';
    ma = ma';
end
if AisTable
    ADataVars = A(:,dataVars);
    ma = ma(:,dataVars);
else
    ADataVars = A;
    dataVars = 1:size(A,2);
end
if replace
    B = A;
    BDataVars = dataVars;
else % can only be hit for tables and timetables
    B = A(:,dataVars);
    BDataVars = 1:width(B);
end
if issparse(A)
    FB = logical(sparse(size(B,1),size(B,2)));
else
    FB = false(size(B));
end
numOfVectors = size(A,1);
vectorDistances = zeros(numOfVectors,1);
% Walk through the rows (vectors) of BIn, calculating nearest neighbors as needed
for ii = 1:numOfVectors
    maCurrentVector = ma(ii,:);
    if any(maCurrentVector) && ~all(maCurrentVector) % all-missing vectors cannot be filled
        % Calculate nearest neighbors
        for jj = 1:numOfVectors
            if ii ~= jj
                try
                    vectorDistances(jj) = double(distance(ADataVars([ii,jj],:),ma([ii,jj],:)));
                catch ME
                    if strcmp(ME.identifier,'MATLAB:invalidConversion')
                        baseException = MException(message('MATLAB:fillmissing:InvalidCustomDistance'));
                    else
                        baseException = MException(message('MATLAB:fillmissing:DistanceCalculationFailed'));
                    end
                    baseException = addCause(baseException, ME);
                    throw(baseException);
                end
            else
                vectorDistances(jj) = NaN;
            end
        end
        if ~isreal(vectorDistances)
            error(message('MATLAB:fillmissing:InvalidCustomDistance'));
        end
        % vectorDistances is sorted before NaNs are removed to preserve indices
        [sortedDistances,sortedIndices] = sort(vectorDistances);
        sortedIndices = sortedIndices(~isnan(sortedDistances)); % Don't use NaN distances
        if ~isempty(sortedIndices) % Only fill if there are non-NaN values to fill with
            for jj = find(maCurrentVector)
                kIndices = sortedIndices(~ma(sortedIndices,jj)); % Don't try to fill with a missing value
                if ~isempty(kIndices)
                    kIndices = kIndices(1:min(numel(kIndices),k)); % Keep the k lowest distances (the k nearest neighbors)
                    % Note that if n > k vectors are the same distance from the
                    % current vector, the k vectors with the smallest index are
                    % used.
                    if isscalar(kIndices)
                        fillValue = ADataVars(kIndices,jj);
                        B(ii,BDataVars(jj)) = fillValue;
                    else
                        if AisTable
                            try
                                fillValue = mean(ADataVars{kIndices,jj},1,'native');
                            catch ME
                                baseException = MException(message('MATLAB:fillmissing:AggregationFailed'));
                                baseException = addCause(baseException, ME);
                                throw(baseException);
                            end
                            B{ii,BDataVars(jj)} = fillValue;
                        else
                            try
                                fillValue = mean(ADataVars(kIndices,jj),1,'native');
                            catch ME
                                baseException = MException(message('MATLAB:fillmissing:AggregationFailed'));
                                baseException = addCause(baseException, ME);
                                throw(baseException);
                            end
                            B(ii,jj) = fillValue; % dataVars are unnecessary for the non-tabular case
                        end
                    end
                    FB(ii,BDataVars(jj)) = ~ismissing(fillValue);
                end
            end
        end
    end
end
if ~replace
    B = matlab.internal.math.appendDataVariables(A,B,"filled");
    FB = [false(size(A)),FB];
end
if dim == 2
    B = B.';
    FB = FB.';
end
end % knnFillCustomDistances
%--------------------------------------------------------------------------
function AMatrixT = checkAndExtractTableVars(ATable,dataVars)
AMatrixT = zeros(numel(dataVars),height(ATable)); % This will hold the transpose of the data in ATable
for ii = 1:numel(dataVars)
    tmp = ATable.(dataVars(ii));
    if ~isfloat(tmp)
        error(message('MATLAB:fillmissing:DistanceNonFloatsNotSupported'));
    end
    if size(tmp,2) ~= 1
        error(message('MATLAB:fillmissing:DistanceMultiColumnTableVars'))
    end
    AMatrixT(ii,:) = tmp';
end
end % checkAndExtractTableVars
%--------------------------------------------------------------------------
function b = fillWithNext(b,ma)
maInds = flip(find(ma));
if ma(end) % Last value is missing
    firstFillableElement = 2;
    % Walk through maInds to find the first non-consecutive element
    % This will correspond to the last missing value that is followed by a
    % non-missing value
    while(firstFillableElement <= numel(maInds) && ...
            maInds(firstFillableElement) + 1 == maInds(firstFillableElement - 1))
        firstFillableElement = firstFillableElement + 1;
    end
    maInds = maInds(firstFillableElement:end);
end
for ii = maInds.'
    b(ii) = b(ii + 1);
end
end % fillWithNext
%--------------------------------------------------------------------------
function b = fillWithPrevious(b,ma)
maInds = find(ma);
if ma(1) % First value is missing
    firstFillableElement = 2;
    % Walk through maInds to find the first non-consecutive element
    % This will correspond to the first missing value that is preceded by a
    % non-missing value
    while(firstFillableElement <= numel(maInds) && ...
            maInds(firstFillableElement) - 1 == maInds(firstFillableElement - 1))
        firstFillableElement = firstFillableElement + 1;
    end
    maInds = maInds(firstFillableElement:end);
end
for ii = maInds.'
    b(ii) = b(ii - 1);
end
end % Fill with previous

My dataset uses the variable names I gave you earlier. Please help me modify the code above.
05-11