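Problem
In OpenStack it is easy for the database to drift out of sync with the real state of the system. Most operations are carried out in several steps: a request goes from the API to the scheduler and then to the node that does the actual work, and each step may update the database. If any step fails, an exception is raised and the rest of the chain is abandoned, leaving the database in whatever state the last successful step wrote. Deleting an instance is a typical example: if nova-compute fails partway through, the instance may stay in the deleting state indefinitely.
Here is the problem I ran into. I had a Volume attached to an instance, but for reasons I could not determine, the connection between the Volume and the instance was lost. Checking with tgtadm on the nova-volume node showed that no client was connected to the Volume any more, yet the database records were still in place. This had two consequences:
1) The instance could not be deleted; the attempt failed with "Stderr: 'iscsiadm: No records found'".
2) The Volume could not be detached from the instance.
So the only option left was to edit the database directly.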
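Tables related to Volumes
The tables directly related to Volumes include the ones queried in the dumps below: volumes, volume_metadata, iscsi_targets, block_device_mapping, and sm_volume.
How the database changes during Volume operations
After creating a Volume: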
select * from volumes where id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 07:00:23
updated_at: 2012-10-29 07:00:25
deleted_at: NULL
deleted: 0
id: 40
ec2_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
host: store2.sigsit.org
size: 20
availability_zone: nova
instance_id: NULL
mountpoint: NULL
attach_time: NULL
status: available
attach_status: detached
scheduled_at: 2012-10-29 07:00:23
launched_at: 2012-10-29 07:00:25
terminated_at: NULL
display_name: test
display_description:
provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
provider_auth: NULL
snapshot_id: NULL
volume_type_id: NULL
select * from volume_metadata where volume_id = 40\G
select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
deleted: 0
id: 205
target_num: 5
host: store2.sigsit.org
volume_id: 40
select * from block_device_mapping where volume_id = 40\G
select * from sm_volume where id = 40\G
After attaching the Volume to an instance:
select * from volumes where id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 07:00:23
updated_at: 2012-10-29 11:55:36
deleted_at: NULL
deleted: 0
id: 40
ec2_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
host: store2.sigsit.org
size: 20
availability_zone: nova
instance_id: 70
mountpoint: /dev/vdc
attach_time: NULL
status: in-use
attach_status: attached
scheduled_at: 2012-10-29 07:00:23
launched_at: 2012-10-29 07:00:25
terminated_at: NULL
display_name: test
display_description:
provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
provider_auth: NULL
snapshot_id: NULL
volume_type_id: NULL
select * from volume_metadata where volume_id = 40\G
select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
deleted: 0
id: 205
target_num: 5
host: store2.sigsit.org
volume_id: 40
select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 11:55:36
updated_at: NULL
deleted_at: NULL
deleted: 0
id: 49
instance_id: 70
device_name: /dev/vdc
delete_on_termination: 0
virtual_name: NULL
snapshot_id: NULL
volume_id: 40
volume_size: NULL
no_device: NULL
connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
select * from sm_volume where id = 40\G
select * from instances where id = 70\G
*************************** 1. row ***************************
created_at: 2012-09-10 02:32:36
updated_at: 2012-09-12 10:43:48
deleted_at: NULL
deleted: 0
id: 70
internal_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
image_ref: 6c239063-9d2a-41ce-9612-bfe3564cc203
kernel_id:
ramdisk_id:
server_name: NULL
launch_index: 0
key_name: NULL
key_data: NULL
power_state: 1
vm_state: active
memory_mb: 1024
vcpus: 1
hostname: jiangyong-win7
host: stack6.sigsit.org
user_data:
reservation_id: r-h1yqckm4
scheduled_at: 2012-09-10 02:32:37
launched_at: 2012-09-10 02:32:48
terminated_at: NULL
display_name: jiangyong-win7
display_description: jiangyong-win7
availability_zone: NULL
locked: 0
os_type: NULL
launched_on: stack6.sigsit.org
instance_type_id: 19
vm_mode: NULL
uuid: 333f6afa-9009-40f7-a493-20b2382628b1
architecture: NULL
root_device_name: /dev/vda
access_ip_v4: NULL
access_ip_v6: NULL
config_drive:
task_state: NULL
default_ephemeral_device: NULL
default_swap_device: NULL
progress: 0
auto_disk_config: NULL
shutdown_terminate: 1
disable_terminate: 0
root_gb: 0
ephemeral_gb: 0
cell_name: NULL
After detaching the Volume from the instance:
select * from volumes where id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 07:00:23
updated_at: 2012-10-29 11:58:36
deleted_at: NULL
deleted: 0
id: 40
ec2_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
host: store2.sigsit.org
size: 20
availability_zone: nova
instance_id: NULL
mountpoint: NULL
attach_time: NULL
status: available
attach_status: detached
scheduled_at: 2012-10-29 07:00:23
launched_at: 2012-10-29 07:00:25
terminated_at: NULL
display_name: test
display_description:
provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
provider_auth: NULL
snapshot_id: NULL
volume_type_id: NULL
select * from volume_metadata where volume_id = 40\G
select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
deleted: 0
id: 205
target_num: 5
host: store2.sigsit.org
volume_id: 40
select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 11:55:36
updated_at: NULL
deleted_at: 2012-10-29 11:58:36
deleted: 1
id: 49
instance_id: 70
device_name: /dev/vdc
delete_on_termination: 0
virtual_name: NULL
snapshot_id: NULL
volume_id: 40
volume_size: NULL
no_device: NULL
connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
select * from sm_volume where id = 40\G
select * from instances where id = 70\G
*************************** 1. row ***************************
created_at: 2012-09-10 02:32:36
updated_at: 2012-09-12 10:43:48
deleted_at: NULL
deleted: 0
id: 70
internal_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
image_ref: 6c239063-9d2a-41ce-9612-bfe3564cc203
kernel_id:
ramdisk_id:
server_name: NULL
launch_index: 0
key_name: NULL
key_data: NULL
power_state: 1
vm_state: active
memory_mb: 1024
vcpus: 1
hostname: jiangyong-win7
host: stack6.sigsit.org
user_data:
reservation_id: r-h1yqckm4
scheduled_at: 2012-09-10 02:32:37
launched_at: 2012-09-10 02:32:48
terminated_at: NULL
display_name: jiangyong-win7
display_description: jiangyong-win7
availability_zone: NULL
locked: 0
os_type: NULL
launched_on: stack6.sigsit.org
instance_type_id: 19
vm_mode: NULL
uuid: 333f6afa-9009-40f7-a493-20b2382628b1
architecture: NULL
root_device_name: /dev/vda
access_ip_v4: NULL
access_ip_v6: NULL
config_drive:
task_state: NULL
default_ephemeral_device: NULL
default_swap_device: NULL
progress: 0
auto_disk_config: NULL
shutdown_terminate: 1
disable_terminate: 0
root_gb: 0
ephemeral_gb: 0
cell_name: NULL
After attaching the Volume to the same instance again:
select * from volumes where id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 07:00:23
updated_at: 2012-10-29 12:00:32
deleted_at: NULL
deleted: 0
id: 40
ec2_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
host: store2.sigsit.org
size: 20
availability_zone: nova
instance_id: 70
mountpoint: /dev/vdc
attach_time: NULL
status: in-use
attach_status: attached
scheduled_at: 2012-10-29 07:00:23
launched_at: 2012-10-29 07:00:25
terminated_at: NULL
display_name: test
display_description:
provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
provider_auth: NULL
snapshot_id: NULL
volume_type_id: NULL
select * from volume_metadata where volume_id = 40\G
select * from iscsi_targets where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-09-24 09:00:36
updated_at: 2012-10-29 07:00:24
deleted_at: NULL
deleted: 0
id: 205
target_num: 5
host: store2.sigsit.org
volume_id: 40
select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 11:55:36
updated_at: NULL
deleted_at: 2012-10-29 11:58:36
deleted: 1
id: 49
instance_id: 70
device_name: /dev/vdc
delete_on_termination: 0
virtual_name: NULL
snapshot_id: NULL
volume_id: 40
volume_size: NULL
no_device: NULL
connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
*************************** 2. row ***************************
created_at: 2012-10-29 12:00:32
updated_at: NULL
deleted_at: NULL
deleted: 0
id: 50
instance_id: 70
device_name: /dev/vdc
delete_on_termination: 0
virtual_name: NULL
snapshot_id: NULL
volume_id: 40
volume_size: NULL
no_device: NULL
connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
select * from sm_volume where id = 40\G
select * from instances where id = 70\G
*************************** 1. row ***************************
created_at: 2012-09-10 02:32:36
updated_at: 2012-09-12 10:43:48
deleted_at: NULL
deleted: 0
id: 70
internal_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
image_ref: 6c239063-9d2a-41ce-9612-bfe3564cc203
kernel_id:
ramdisk_id:
server_name: NULL
launch_index: 0
key_name: NULL
key_data: NULL
power_state: 1
vm_state: active
memory_mb: 1024
vcpus: 1
hostname: jiangyong-win7
host: stack6.sigsit.org
user_data:
reservation_id: r-h1yqckm4
scheduled_at: 2012-09-10 02:32:37
launched_at: 2012-09-10 02:32:48
terminated_at: NULL
display_name: jiangyong-win7
display_description: jiangyong-win7
availability_zone: NULL
locked: 0
os_type: NULL
launched_on: stack6.sigsit.org
instance_type_id: 19
vm_mode: NULL
uuid: 333f6afa-9009-40f7-a493-20b2382628b1
architecture: NULL
root_device_name: /dev/vda
access_ip_v4: NULL
access_ip_v6: NULL
config_drive:
task_state: NULL
default_ephemeral_device: NULL
default_swap_device: NULL
progress: 0
auto_disk_config: NULL
shutdown_terminate: 1
disable_terminate: 0
root_gb: 0
ephemeral_gb: 0
cell_name: NULL
After detaching the Volume again and deleting it:
select * from volumes where id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 07:00:23
updated_at: 2012-10-29 13:41:56
deleted_at: 2012-10-29 13:44:36
deleted: 1
id: 40
ec2_id: NULL
user_id: 397dd3be88b6492caa88521502b07617
project_id: c6159a4f3dd34a2b83527499a40dbd2b
host: store2.sigsit.org
size: 20
availability_zone: nova
instance_id: NULL
mountpoint: NULL
attach_time: NULL
status: deleting
attach_status: detached
scheduled_at: 2012-10-29 07:00:23
launched_at: 2012-10-29 07:00:25
terminated_at: 2012-10-29 13:41:55
display_name: test
display_description:
provider_location: 10.61.2.14:3260,5 iqn.2010-10.org.openstack:volume-00000028 1
provider_auth: NULL
snapshot_id: NULL
volume_type_id: NULL
select * from volume_metadata where volume_id = 40\G
select * from iscsi_targets where volume_id = 40\G
select * from block_device_mapping where volume_id = 40\G
*************************** 1. row ***************************
created_at: 2012-10-29 11:55:36
updated_at: NULL
deleted_at: 2012-10-29 11:58:36
deleted: 1
id: 49
instance_id: 70
device_name: /dev/vdc
delete_on_termination: 0
virtual_name: NULL
snapshot_id: NULL
volume_id: 40
volume_size: NULL
no_device: NULL
connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
*************************** 2. row ***************************
created_at: 2012-10-29 12:00:32
updated_at: NULL
deleted_at: 2012-10-29 13:10:51
deleted: 1
id: 50
instance_id: 70
device_name: /dev/vdc
delete_on_termination: 0
virtual_name: NULL
snapshot_id: NULL
volume_id: 40
volume_size: NULL
no_device: NULL
connection_info: {"driver_volume_type": "iscsi", "data": {"device_path": "/dev/disk/by-path/ip-10.61.2.14:3260-iscsi-iqn.2010-10.org.openstack:volume-00000028-lun-1", "target_discovered": false, "target_iqn": "iqn.2010-10.org.openstack:volume-00000028", "target_portal": "10.61.2.14:3260", "volume_id": 40, "target_lun": 1}}
select * from sm_volume where id = 40\G
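Conclusions
Database changes when creating a Volume
- volumes table: a record for the new Volume is added.
- iscsi_targets table: an available target record is located and its volume_id is set to the id of the new Volume. The target record carries target_num and host; nova then uses these two values to create the Volume with that target_num on the corresponding host.
Database changes when attaching a Volume
- volumes table: instance_id and mountpoint are set to the instance id and the device name, and status and attach_status are changed to in-use and attached.
- block_device_mapping table: a mapping record is added that holds the instance and Volume information, in particular the Volume's connection info. Attaching the same Volume again adds a new record rather than reusing the old one.
Database changes when detaching a Volume
- volumes table: instance_id and mountpoint are set to NULL, and status and attach_status are changed to available and detached.
- block_device_mapping table: the corresponding mapping record gets its deleted_at timestamp set and deleted set to 1.
Database changes when deleting a Volume
- volumes table: the deleted_at timestamp is set and deleted is set to 1.
- iscsi_targets table: the record whose volume_id matches the Volume has its volume_id set to NULL.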
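The target bookkeeping described for create and delete can be sketched as follows. This is only an illustration under stated assumptions, not nova's actual code: it uses an in-memory SQLite database as a stand-in for the nova MySQL database, the tables contain only the columns needed here, the helper names make_db, create_volume, and delete_volume are hypothetical, and picking the free target with the lowest target_num is an assumption (the dumps only show that some free target gets claimed).

```python
import sqlite3


def make_db():
    """Build an in-memory stand-in holding only the columns used below."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE volumes (
            id INTEGER PRIMARY KEY, host TEXT, status TEXT,
            deleted INTEGER DEFAULT 0, deleted_at TEXT);
        CREATE TABLE iscsi_targets (
            id INTEGER PRIMARY KEY, target_num INTEGER,
            host TEXT, volume_id INTEGER);
    """)
    # A small pre-populated pool of unclaimed targets (sample values).
    conn.executemany(
        "INSERT INTO iscsi_targets (target_num, host, volume_id) "
        "VALUES (?, ?, NULL)",
        [(n, "store2.sigsit.org") for n in range(1, 4)])
    conn.commit()
    return conn


def create_volume(conn, volume_id, host):
    """Add the volumes row, then claim a free target on the same host."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "INSERT INTO volumes (id, host, status) "
            "VALUES (?, ?, 'available')", (volume_id, host))
        row = conn.execute(
            "SELECT id, target_num FROM iscsi_targets "
            "WHERE host = ? AND volume_id IS NULL "
            "ORDER BY target_num LIMIT 1", (host,)).fetchone()
        if row is None:
            raise RuntimeError("no free iSCSI target on " + host)
        conn.execute(
            "UPDATE iscsi_targets SET volume_id = ? WHERE id = ?",
            (volume_id, row[0]))
    return row[1]  # the target_num now bound to this volume


def delete_volume(conn, volume_id):
    """Soft-delete the volumes row and release its target."""
    with conn:
        conn.execute(
            "UPDATE volumes SET deleted = 1, deleted_at = datetime('now') "
            "WHERE id = ?", (volume_id,))
        conn.execute(
            "UPDATE iscsi_targets SET volume_id = NULL WHERE volume_id = ?",
            (volume_id,))
```

Because the target row is released rather than removed, a later create_volume call can claim the same target_num again, which matches the empty iscsi_targets result for volume_id = 40 in the last dump.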
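After I changed the four values above in the database (the instance_id, mountpoint, status, and attach_status columns of the volumes table, as reset by a detach), the Volume could be attached to and detached from other instances, and the problem instance could be deleted normally. Of course, every case has to be analyzed on its own terms. I had already checked the Volume and knew that it was effectively detached, with no client connected to it, so restoring the Volume to the unattached state in the database was all that was needed.
Therefore, when a Volume is in an inconsistent state, first log in to the storage node that hosts the Volume and check its state with the command tgtadm --lld iscsi --mode target --op show: does the Volume still exist? Are any clients connected? Then update the relevant status values in the database accordingly.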
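The database side of that repair might look roughly like the sketch below. It is a hedged illustration, not a tested procedure for a production nova database: it runs against an in-memory SQLite stand-in with only the relevant columns, the names demo_db and reset_stale_attachment are hypothetical, and soft-deleting the live block_device_mapping row is an assumption that mirrors what a normal detach does in the dumps (the fix described here only mentions the four values in volumes). On MySQL the same UPDATE statements would use NOW() in place of datetime('now').

```python
import sqlite3


def demo_db():
    """In-memory stand-in seeded with the stuck state of Volume 40."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE volumes (
            id INTEGER PRIMARY KEY, instance_id INTEGER, mountpoint TEXT,
            status TEXT, attach_status TEXT, deleted INTEGER DEFAULT 0);
        CREATE TABLE block_device_mapping (
            id INTEGER PRIMARY KEY, instance_id INTEGER, device_name TEXT,
            volume_id INTEGER, deleted INTEGER DEFAULT 0, deleted_at TEXT);
        INSERT INTO volumes
            VALUES (40, 70, '/dev/vdc', 'in-use', 'attached', 0);
        INSERT INTO block_device_mapping
            VALUES (49, 70, '/dev/vdc', 40, 1, '2012-10-29 11:58:36');
        INSERT INTO block_device_mapping
            VALUES (50, 70, '/dev/vdc', 40, 0, NULL);
    """)
    return conn


def reset_stale_attachment(conn, volume_id):
    """Record a volume as detached; return how many mappings were closed.

    Only sensible after confirming on the storage node that no client
    is still connected to the volume.
    """
    with conn:  # one transaction: both tables change, or neither does
        cur = conn.execute(
            "UPDATE volumes SET instance_id = NULL, mountpoint = NULL, "
            "status = 'available', attach_status = 'detached' "
            "WHERE id = ? AND deleted = 0 AND attach_status = 'attached'",
            (volume_id,))
        if cur.rowcount != 1:
            raise LookupError(
                "volume %s is not recorded as attached" % volume_id)
        cur = conn.execute(
            "UPDATE block_device_mapping "
            "SET deleted = 1, deleted_at = datetime('now') "
            "WHERE volume_id = ? AND deleted = 0", (volume_id,))
        return cur.rowcount
```

The WHERE clause on attach_status makes the function refuse to touch a volume that is not recorded as attached, and wrapping both updates in one transaction avoids creating yet another half-updated state if the second statement fails.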