A Must-Read Document: Solaris Containers - Dynamic Resource Pools

This article looks at how Resource Pools and dynamic resource allocation can be used to manage and optimize system resources on large servers. It describes how Resource Pools create flexible, persistent collections of resources, and how Dynamic Resource Pools automatically adjust resource allocation to meet changing demands.

Some situations may be best served by partitioning available system resources, such as processors, into a number of discrete resource partitions. There are several reasons why this may be useful:

• Enforcing hard limits on the use of a resource. For instance, by creating a processor set and binding a process, project or zone to it, the CPU usage of the bound processes is effectively limited to the CPUs in the processor set. These processes cannot use processors outside of their set.

• Providing a guaranteed quantity of a resource. If an application requires a certain amount of CPU resources at all times, a processor set can be created for use by the application, thereby reserving the CPUs for application processes. Processes not bound to the set are unable to run on the processors in that set.

• Setting expectations. When deploying applications on a large server in phases, users may become accustomed to fast response times while all resources are still available to the application. As more applications are deployed, users may perceive a performance degradation. By partitioning the system so that the application receives only the resources it needs, expectations can be set correctly from the start.

• Partitioning by function, such as creating a partition for interactive users and a partition for batch jobs.

Processor Sets

The ability to partition a server using processor sets has been available since version 2.6 of the Solaris Operating System. Every system has at least one processor set: the system or default processor set that contains all of the processors in the system. Additional processor sets can be dynamically created and removed on a running system using the psrset(1M) command, provided that at least one CPU remains for the system processor set. Processes are bound to the default processor set by default, and can be bound to other processor sets on the fly. It is important to note that partitioning a system using processor sets may lead to underutilization of the server, since only processes bound to a processor set may use the processors in that set. If these processes do not use all of the available CPU resources, the remaining CPU capacity in the set remains unused.
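
A minimal psrset(1M) session might look like the following sketch; the CPU IDs, the set ID "1", and the PID are placeholders (psrset -c prints the ID of the newly created set).

```sh
# Create a processor set from physical CPUs 4-7; the new set ID is printed.
psrset -c 4 5 6 7

# Bind an existing process (PID 1234) to processor set 1; it may now run
# only on the CPUs in that set.
psrset -b 1 1234

# Query the processor set binding of the process.
psrset -q 1234

# Destroy the set, returning its CPUs to the default processor set.
psrset -d 1
```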

While processor sets are very useful, managing them can be a little cumbersome. System administrators must specify the physical CPU ID of each processor to add to a processor set. Since the physical ID of a CPU is hardware dependent, it varies between hardware platforms, creating a close coupling between the processor set definition and the underlying hardware. Also, on systems that support Dynamic Reconfiguration, processors can be added and removed while the system is online. If a processor to be removed is used in a processor set, the system administrator must manually remove that processor from the set before the processor can be removed from the system. This requires the system administrator to have intimate knowledge of the configured processor sets and the hardware. Processor sets are referenced by a system-generated ID, making it hard to remember what a specific set is used for, especially when multiple processor sets are present.

Resource Pools

The introduction of Resource Pools in the Solaris 9 OS significantly enhanced the ability to partition the system. Resource Pools provide a mechanism to create a persistent configuration of resource sets such as processor sets. The Resource Pools framework removes the link between the intention of the system administrator and the underlying hardware. Instead of creating a processor set by specifying physical CPU IDs, system administrators can now create a processor set with a chosen name by specifying the number of processors required, rather than their physical IDs. As a result, the definition of the processor set is no longer tied to a particular type of hardware.

System administrators can also specify a minimum and maximum number of processors for a set. The system assigns a number of processors between these values when creating the processor set on a specific system. This allows for more generic definitions that can be shared between systems. A configuration defining a set with at least one CPU and a maximum of three CPUs could be instantiated on a two-way system as well as on a larger server with more processors. Moving the definition to the larger server does not require any adjustment by the system administrator. The number of processors in the set on the larger server could be higher, depending on other processor sets defined in the system. The Resource Pools framework balances the number of processors in the set within the constraints set by the administrator.
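
As a hedged sketch of that workflow (the set name pset-web is invented), such a definition could be added to the persistent configuration with poolcfg(1M) and then instantiated with pooladm(1M):

```sh
# Enable the Resource Pools facility and save the current (default)
# configuration to /etc/pooladm.conf so it can be edited.
pooladm -e
pooladm -s

# Define a processor set with at least 1 and at most 3 CPUs; the system
# chooses the actual size within these constraints at instantiation time.
poolcfg -c 'create pset pset-web (uint pset.min = 1; uint pset.max = 3)'

# Commit the edited configuration to the running system.
pooladm -c
```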

On systems that support Dynamic Reconfiguration, the framework ensures that constraints are still met when removing processors from the system. If the total number of processors drops below the minimum number required for the active configuration, the Dynamic Reconfiguration operation is denied. If one of the processors being removed is part of a processor set, the system reconfigures all processor sets in the system so that the processor is no longer in a set. Adding CPUs to a running system also causes a reconfiguration of processor sets, depending on the constraints set by the administrator.

Multiple configurations can be defined to adapt to changing resource requirements such as seasonal workloads or different daily and nightly workloads. The appropriate configuration can be instantiated by invoking the pooladm(1M) command manually or from a cron(1M) job.
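
For instance (file names and times are invented, and this assumes pooladm -c accepts an alternate configuration file as described in pooladm(1M)), root's crontab could switch between a daytime and a nighttime configuration:

```sh
# crontab fragment: activate the daytime pools configuration at 08:00
# and the nighttime configuration at 20:00.
0 8  * * * /usr/sbin/pooladm -c /etc/pooladm.day.conf
0 20 * * * /usr/sbin/pooladm -c /etc/pooladm.night.conf
```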

Binding Processes To Pools

Instead of binding a process to a processor set directly, a process is bound to a Resource Pool using the poolbind(1M) command. A Resource Pool (or pool) is a logical collection of resource sets such as processor sets. While the processor set is the only type of resource set available in the Solaris OS, the resource pool abstraction allows other types of resource sets, such as memory sets, to be added in later Solaris OS versions.

A pool can optionally be associated with a scheduling class such as the Fair Share Scheduler (FSS) or the Real Time (RT) scheduling class. Processes bound to the pool are subject to that pool's scheduler, allowing the system to use different schedulers for different types of workloads. A server can be partitioned into two pools, one pool using the Fair Share Scheduler for applications, and a second pool using the Time Sharing (TS) scheduler for interactive users.
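
A hedged sketch of that two-pool layout (pool and set names are invented); the scheduling class is set through the pool.scheduler property:

```sh
# Applications run under the Fair Share Scheduler.
poolcfg -c 'create pool pool-apps (string pool.scheduler = "FSS")'
poolcfg -c 'associate pool pool-apps (pset pset-web)'

# Interactive users run under the Time Sharing class.
poolcfg -c 'create pool pool-users (string pool.scheduler = "TS")'

# Activate the updated configuration.
pooladm -c
```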

Multiple pools can be linked to the same resource set. As a result, it is possible to have a system with one processor set and several pools associated with the same processor set. This may not seem useful in a world with only processor sets. However, when other types of resource sets become available, it will be possible to let pools share a common processor set while giving each pool its own memory set, for instance.

The poolbind(1M) command allows administrators to bind processes, tasks, projects and zones to pools. A default pool binding for projects can be established by adding the project.pool attribute to the project entry in the project database. All processes started in the project are bound to the pool automatically. While the project.pool attribute designates only the default pool to bind to, specific processes in a project can still be bound to other pools if desired.
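
For example (pool name, project name, and project ID are invented), an individual process can be bound with poolbind(1M), and a project can be given a default pool through the project.pool attribute in /etc/project:

```sh
# Bind a running process (PID 1234) to the pool "pool-apps" and
# verify the binding.
poolbind -p pool-apps 1234
poolbind -q 1234
```

A corresponding /etc/project entry (fields: name, ID, comment, users, groups, attributes) might look like this:

```
webproj:1001:web:::project.pool=pool-apps
```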

Fair Share Scheduler and Processor Sets

The previous discussion of the Fair Share Scheduler assumed all processors reside in the same processor set. When processor sets are present, the Fair Share Scheduler treats every processor set as a separate partition. CPU entitlement for a project is based on CPU usage in that processor set only. The CPU usage of a project in a processor set does not influence its entitlement in a different processor set. The Fair Share Scheduler calculates the proportion of CPU resources allocated to a project in a processor set by dividing the shares of the project by the number of shares of active projects in the processor set.
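
Expressed as a formula (notation introduced here for clarity), if $S_p$ is the number of shares of project $p$ and $\text{active}(c)$ is the set of projects with runnable processes in processor set $c$, the entitlement of $p$ in $c$ is:

$$
\text{entitlement}(p, c) = \frac{S_p}{\sum_{q \in \text{active}(c)} S_q}
$$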

For example, consider a system with two processor sets, each containing one processor. Project A has two shares, and project B has one share. Both projects have enough processes to use all available CPU resources. Project B is the only one running in the first processor set. Since it is the only project in this set, project B is entitled to all CPU resources in the set. Both projects run in the second processor set. The number of active shares in this processor set is three (two from project A and one from project B). As a result, project A is entitled to two-thirds of the second processor set and project B is entitled to one-third. Project B's CPU use in the first processor set does not influence its entitlement in the second processor set.

Dynamic Resource Pools

In the Solaris 10 OS the Resource Pools facility has been further extended to provide automated resource allocation based on resource demands in the system and usage objectives set by the system administrator. This relieves system administrators from deciding how to optimally partition available resources for the current workload. Previously system administrators had to manually reassign resources to adapt to changing workloads. While fairly easy for relatively static workloads, this task may be challenging in an environment with highly variable resource demands.

Automated Resource Allocation

The Dynamic Resource Pools resource controller daemon poold(1M) is responsible for maintaining the resource allocation objectives set by system administrators. Toward this end, it creates an inventory of all available resources in the system. It continually monitors the active workloads in the system to determine if usage objectives can be met. If the resource controller detects that an objective is no longer being met, it evaluates possible alternative resource configurations to see if they can meet the objectives. If a viable alternative configuration exists, the resource controller reconfigures the resources accordingly. For processor sets, this is accomplished by moving processors between processor sets. If no alternative configuration exists that can meet objectives, no reconfiguration occurs. An appropriate message is logged, and the resource controller resumes workload monitoring.
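
In Solaris 10 the controller is managed through the Service Management Facility; a hedged sketch follows (the service FMRI below is an assumption and should be verified with svcs(1)):

```sh
# Enable the dynamic resource pools controller (assumed FMRI; confirm
# with `svcs -a | grep pool`).
svcadm enable svc:/system/pools/dynamic:default

# Check that poold is online.
svcs svc:/system/pools/dynamic:default

# Watch the controller's decisions as they are logged.
tail -f /var/log/pool/poold
```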

Adding or removing resources using Dynamic Reconfiguration can also trigger a reconfiguration by the resource controller as the amount of available resource changes. Adding CPU capacity to a constrained system may create the opportunity for the resource controller to create a configuration that can meet objectives. Likewise, removing CPU capacity from the system may lead to the objectives no longer being met by the configuration.

Changes made to the objectives themselves by system administrators can also cause the resource controller to re-evaluate the configuration. The resource controller keeps a history of decisions made in the past, enabling it to rule out configuration changes that did not lead to improvement.

Even though reconfiguration is automatic, system administrators can still directly manipulate the active configuration by transferring processors from one set to another. Note that doing so may or may not trigger actions by the resource controller.
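
A hedged sketch of such a manual transfer (the set name is invented; the -d option makes poolcfg operate on the running kernel configuration rather than on /etc/pooladm.conf):

```sh
# Move one CPU from the default processor set into "pset-web" on the
# running system.
poolcfg -dc 'transfer 1 from pset pset_default to pset-web'
```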

Configuration Objectives

The resource controller offers several configuration objectives that influence its decisions about possible resource configurations. Objectives can be combined, and each objective can be assigned a precedence over the others. System administrators can choose from the following configuration objectives; a configuration sketch follows the list:

• wt-load

This objective favors configurations that match resource allocations to resource demands. When this objective is in effect, a resource set that uses more resources is given more resources (within the minimum and maximum properties for the set).

• locality

This objective instructs the resource controller to take resource locality into consideration when allocating resources. On large servers such as the Sun Fire 15K, the latency between resources on the same board and resources on different boards can differ. Depending on the application, this latency may or may not be important. The locality objective allows the administrator to express the need for resource locality.

• utilization

This objective favors configurations that allocate resources to partitions that are not meeting their utilization objective. System administrators can set target utilizations on the set using the less than (<), greater than (>) and about (~) operators. The less than and greater than objectives can be combined to specify a target utilization range, such as between 50% and 80% utilization.

The configuration objectives are detailed in the libpool(3LIB) manual page.
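
As a hedged sketch (the set name is invented, and the property names and objective syntax should be checked against libpool(3LIB) and poold(1M)), objectives are expressed as string properties on the system element and on individual processor sets:

```sh
# System-wide objective: bias allocation toward heavily loaded sets.
# Replace "myhost" with the name of the system element in the configuration.
poolcfg -c 'modify system myhost (string system.poold.objectives = "wt-load")'

# Per-set objectives: keep utilization of pset-web between 50% and 80%.
poolcfg -c 'modify pset pset-web (string pset.poold.objectives = "utilization > 50; utilization < 80")'

# Commit the changes.
pooladm -c
```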

Monitoring Resource Pools

System resource utilization can be monitored using the poolstat(1M) utility. This utility shows statistical data for every pool in the system. Data displayed includes the minimum, maximum and current size of the resource set, a measure of how much of the resource set is currently in use, as well as the load on the resource set. The decisions made by the resource controller can be observed by consulting the /var/log/pool/poold log file.
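
For example (the interval is chosen arbitrarily):

```sh
# Show statistics for all pools once.
poolstat

# Show per-processor-set statistics, refreshed every 5 seconds.
poolstat -r pset 5
```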

Commands

The following commands are available to administer resource pools:

Command         Description
pooladm(1M)     Activate and deactivate the pools facility
poolcfg(1M)     Create and modify resource pool configuration files
poold(1M)       Monitors resource usage and adjusts resource allocation
poolbind(1M)    Bind processes, tasks, projects and zones to a pool
poolstat(1M)    Report active pool statistics

 

 