CMap 使用注意说明

CMap深入解析
本文深入讲解了MFC中CMap的使用方法,包括其内部结构、如何为自定义类提供哈希函数及比较函数,以及如何正确使用CString作为键。

Introduction

Programmers like me, learnt STL::map before CMap always think CMap is difficult to use, and always try to use CMap in the way as a STL::map. In this article, I will explain more information about CMap and what should you do to use it for your own custom class. And as the end of this article, I will show an example of how to use CMap correctly with CString* (note I mean CString pointer and not CString :> )

CMap Internal

The first thing to be noted is that CMap is actually a hash map, and not a tree map (and usually a Red-black tree) as STL::map. Below shows the internal structure of a CMap.

 

How to declare a CMap

Many people get confused about CMap's declaration CMap<KEY, ARG_KEY, VALUE, ARG_VALUE>, why not just CMap<KEY, VALUE>

In fact, the ultimate data container in CMap is CPair, and the internal of CPair is {KEY, VALUE}. Therefore, CMap will really store a KEY, and not ARG_KEY. However, if you check with the MFC source code, almost all the internal parameters passing within CMap itself is called with ARG_KEY and ARG_VALUE, therefore, using KEY& as ARG_KEY seems always be a correct answer, except

  1. You are using primitive date types like int, char, where pass-by-value makes no difference (may be even faster) with pass-by-reference
  2. If you use CString as KEY, you should use LPCTSTR as ARG_KEY and not CString&, we will talk more about this later

So what should I do to make CMap works with my ClassX

Well, as I mentioned earlier, CMap is a hash map, a hash map will try to get the "hash value" -- an UINT -- from the key and use that hash value as the index in the hash table (well, actually it is hash value % hash table size). If more then one key have the same hash value, they will be link in a link list. Therefore, the first thing you have to do is to provide a hash function.

CMap will call a templated function HashKey() to do the hashing. The default implementation and specialized version for LPCSTR and LPCWSTR are listed as follows:

// inside <afxtemp.h>template<class ARG_KEY>AFX_INLINE UINT AFXAPI HashKey(ARG_KEY key){	// default identity hash - works for most primitive values	return (DWORD)(((DWORD_PTR)key)>>4);}// inside <strcore.cpp>// specialized implementation for LPCWSTR#if _MSC_VER >= 1100template<> UINT AFXAPI HashKey<LPCWSTR> (LPCWSTR key)#elseUINT AFXAPI HashKey(LPCWSTR key)#endif{	UINT nHash = 0;	while (*key)		nHash = (nHash<<5) + nHash + *key++;	return nHash;}// specialized implementation for LPCSTR#if _MSC_VER >= 1100template<> UINT AFXAPI HashKey<LPCSTR> (LPCSTR key)#elseUINT AFXAPI HashKey(LPCSTR key)#endif{	UINT nHash = 0;	while (*key)		nHash = (nHash<<5) + nHash + *key++;	return nHash;}

As you can see, the default behavior is to "assume" key is a pointer, and convert it to DWORD, and that's why you will get "error C2440: 'type cast': cannot convert from 'ClassXXX' to 'DWORD_PTR'" if you don't provide a specialized HashKey() for your ClassX.

And because MFC only have specialized implementation for the LPCSTR and LPCWSTR, and not for CStringA nor CStringW, this is why if you want to use CString in CMap, you have to declare CMap<CString, LPCTSTR....>

Ok, now you know how CMap calculate the hash value, but since more than one key may have the same hash value, CMap needs to traverse the whole link list to find the one with exactly the same key "content", not only with the same "hash value". And when CMap do the matching, it will call CompareElements(), another templated function.

// inside <afxtemp.h>// noted: when called from CMap, TYPE=KEY, ARG_TYPE=ARG_TYPE// and note pElement1 is TYPE*, not TYPEtemplate<class TYPE, class ARG_TYPE>BOOL AFXAPI CompareElements(const TYPE* pElement1, const ARG_TYPE* pElement2){	ASSERT(AfxIsValidAddress(pElement1, sizeof(TYPE), FALSE));	ASSERT(AfxIsValidAddress(pElement2, sizeof(ARG_TYPE), FALSE));	// for CMap<CString, LPCTSTR...>	// we are comparing CString == LPCTSTR	return *pElement1 == *pElement2;}

Therefore, if you want to use CMap with you own custom ClassX, you will have to provide a specialized implementation for HashKey() and CompareElements().

Example: CMap with CString*

Provided as an example, below is what you need to do to make CMap works with CString*, and of course, using the string content as the key, and not the address of the pointer.

template<> UINT AFXAPI HashKey<CString*> (CString* key){	return (NULL == key) ? 0 : HashKey((LPCTSTR)(*key));}// I don't know why, but CompareElements can't work with CString*// have to define thistypedef CString* LPCString;template<>BOOL AFXAPI CompareElements<LPCString, LPCString> (const LPCString* pElement1, const LPCString* pElement2){	if ( *pElement1 == *pElement2 ) {		// true even if pE1==pE2==NULL		return true;	} else if ( NULL != *pElement1 && NULL != *pElement2 ) {		// both are not NULL		return **pElement1 == **pElement2;	} else {		// either one is NULL		return false;	}}

And the main program is as simple as the following:

int _tmain(int argc, TCHAR* argv[], TCHAR* envp[]){	CMap<CString*, CString*, int, int> map;	CString name1 = "Microsoft";	CString name2 = "Microsoft";	map[&name1] = 100;	int x = map[&name2];	printf("%s = %d/n", (LPCTSTR)name1, x);*/	return 0;}--------- console output ---------Microsoft = 100

Please note that the program can compile without error even without the specialized HashKey() and CompareElements(), but of course, the out output will then be 0, probably not what you want.

My final note about CMap

  1. CMap is a hash map and STL::map is a tree map, it is no meaning to compare the two performance (compare apple to orange!). But if you will retrieve the keys in sorted order, then you will have to use STL::map.
  2. The design of HashKey() is critical to the overall performance. You should provide a HashKey() that has low collision rate (different key generally would have different hash value) AND is easy to calculate (not a MD5 of the string, etc..). And we have to note that -- at least for some of the classes -- this is not easy.
  3. When using CMap (as well as STL::hash_map), always beware of the hash table size. As quoted in MSDN "the hash table size should be a prime number. To minimize collisions, the size should be roughly 20 percent larger than the largest anticipated data set". By default, CMap hash table size is 17, which should be ok for around 10 keys. You can change the hash table size with InitHashTable(UINT nHashSize), and only can do so before the first element is added to the map. You can find more prime numbers at here. (And don't mix-up with CMap(UINT nBlockSize), nBlockSize is to acquire more than one CAssoc to speed up the creation of a new node.)

Reference

History

  • 17th March 2006: Initial version uploaded.


Trackback: http://tb.blog.youkuaiyun.com/TrackBack.aspx?PostId=649293

 
<think>我们提供的是一个融合光学影像和SAR数据进行滑坡识别的端到端系统。下面将详细说明代码的使用步骤、参数配置和注意事项。 ### 使用说明 #### 1. 环境准备 确保安装以下Python库: - TensorFlow 2.x - NumPy - scikit-learn 安装命令: ```bash pip install tensorflow numpy scikit-learn ``` #### 2. 数据准备 - 输入数据应为Numpy数组格式,存储为`.npy`文件 - 数据维度要求:`[样本数, 高度, 宽度, 时间步长, 通道数]` - **通道顺序**:前4个通道为光学影像(建议使用Sentinel-2的波段2、3、4、8),后2个通道为SAR数据(VV和VH极化) - 示例数据加载代码: ```python # 加载光学数据 (示例:4个波段,8个时间点) optical_data = np.load('optical.npy') # shape: [num_samples, 256, 256, 8, 4] # 加载SAR数据 sar_data = np.load('sar.npy') # shape: [num_samples, 256, 256, 8, 2] # 合并数据 data = np.concatenate([optical_data, sar_data], axis=-1) # [num_samples, 256, 256, 8, 6] ``` #### 3. 模型初始化 ```python from landslide_detector import LandslideDetector # 初始化模型 detector = LandslideDetector(input_shape=(256, 256, 8, 6)) ``` 参数说明: - `input_shape`: 输入数据的形状(高度、宽度、时间步长、通道数) - `latent_dim`: GAN生成器的潜在向量维度(默认100) #### 4. 非监督预训练(可选) 当有大量无标签数据时,使用GAN进行特征提取器的预训练: ```python # 假设data是一个包含多个样本的numpy数组 detector.train_unsupervised(data, epochs=50, batch_size=16) ``` 参数: - `epochs`: 训练轮数 - `batch_size`: 批大小 #### 5. 滑坡检测 ```python # 使用聚类方法(非监督) landslide_prob = detector.detect_landslides(data[0:1], use_clustering=True) # 使用监督分割(需要预先训练分割网络) # detector.fusion_net.fit(x_train, y_train) # 如果有标签,先训练分割网络 # landslide_prob = detector.detect_landslides(data[0:1], use_clustering=False) ``` 参数: - `data`: 输入数据,形状为`[batch_size, H, W, T, C]` - `use_clustering`: - `True`: 使用非监督聚类方法(无需标签) - `False`: 使用监督分割网络(需要预先训练) #### 6. 结果可视化 滑坡概率图输出为与输入图像空间尺寸相同的单通道概率图: ```python import matplotlib.pyplot as plt plt.imshow(landslide_prob[0, ..., 0], cmap='jet') # 可视化第一个样本的结果 plt.colorbar() plt.savefig('result.png') ``` ### 关键参数调整 1. **非监督训练参数**: - `contamination` (在`unsupervised_detection`方法中): 调整异常检测的污染率(滑坡比例) - `n_clusters` (在`unsupervised_detection`方法中): K-means聚类数量 2. **监督训练**: 如果需要使用监督分割网络,需额外训练: ```python # 加载标签数据 (二值掩膜) labels = np.load('landslide_labels.npy') # [num_samples, 256, 256, 1] # 训练分割网络 history = detector.fusion_net.fit( x=training_data, y=labels, batch_size=8, epochs=100, validation_split=0.2 ) ``` ### 注意事项 1. **数据预处理**: - 光学和SAR数据必须进行精确的几何配准 - 建议对SAR数据做辐射定标和地形校正 - 光学数据需进行大气校正 2. **硬件要求**: - 3D卷积需要较大显存,建议使用GPU - 若显存不足,可减小批次大小或空间尺寸 3. **自定义模型**: 如果输入数据的时间步长或通道数变化,需调整`input_shape`参数: ```python # 例如仅使用5个时间步 detector = LandslideDetector(input_shape=(256, 256, 5, 6)) ``` 4. **结果后处理**: 输出概率图可通过阈值二值化: ```python binary_mask = (landslide_prob > 0.5).astype(np.uint8) ``` §§相关问题§§: 1. 如何对Sentinel-1 SAR数据进行预处理以用于本模型? 2. 当只有单时相数据时如何修改模型? 3. 滑坡检测中常见的误检有哪些?如何通过后处理减少? 4. 如何将模型部署到GIS平台进行实际应用? </think>### 光学-SAR滑坡识别系统使用说明 以下是融合优化后的滑坡识别系统的详细使用指南,包含数据准备、模型训练、预测和可视化全流程: ```python # ====================== # 安装依赖 (Python 3.8+) # ====================== # 必需库文件 !pip install tensorflow==2.10.0 numpy scikit-learn matplotlib rasterio gdal # ====================== # 1. 数据准备脚本 # ====================== import numpy as np import rasterio from osgeo import gdal def prepare_data(optical_paths, sar_paths, output_dir): """ 准备多时相光学和SAR数据 :param optical_paths: 光学影像路径列表 [时序] :param sar_paths: SAR影像路径列表 [时序] :param output_dir: 输出目录 """ # 创建输出目录 import os os.makedirs(output_dir, exist_ok=True) # 读取参考影像获取地理信息 with rasterio.open(optical_paths[0]) as src: profile = src.profile transform = src.transform # 处理每个时间步 all_data = [] for i, (opt_path, sar_path) in enumerate(zip(optical_paths, sar_paths)): # 读取光学数据 (Sentinel-2: B2,B3,B4,B8) with rasterio.open(opt_path) as src: optical = src.read([2,3,4,8]) # 读取4个波段 optical = np.moveaxis(optical, 0, -1) # [通道, H, W] -> [H, W, 通道] # 读取SAR数据 (Sentinel-1: VV,VH) with rasterio.open(sar_path) as src: sar = src.read([1,2]) # VV和VH波段 sar = np.moveaxis(sar, 0, -1) # 数据归一化 optical = (optical - np.mean(optical, axis=(0,1))) / np.std(optical, axis=(0,1)) sar = (sar - np.mean(sar, axis=(0,1))) / np.std(sar, axis=(0,1)) # 合并为单时间步数据 [H, W, 6] timestep_data = np.concatenate([optical, sar], axis=-1) all_data.append(timestep_data) # 保存单时间步数据 np.save(f"{output_dir}/timestep_{i:02d}.npy", timestep_data) # 创建时空数据立方体 [时间步, H, W, 6] data_cube = np.stack(all_data, axis=0) # [T, H, W, C] np.save(f"{output_dir}/data_cube.npy", data_cube) print(f"数据准备完成. 形状: {data_cube.shape} [时间, 高度, 宽度, 通道]") return data_cube # 示例使用 if __name__ == "__main__": optical_paths = [ "data/S2_20230101.tif", "data/S2_20230110.tif", "data/S2_20230120.tif" ] sar_paths = [ "data/S1_20230102.tif", "data/S1_20230111.tif", "data/S1_20230121.tif" ] data_cube = prepare_data(optical_paths, sar_paths, "prepared_data") # ====================== # 2. 模型训练与预测 # ====================== from landslide_detector import LandslideDetector import matplotlib.pyplot as plt def train_and_predict(data_path, epochs=50): """ 训练模型并生成滑坡概率图 :param data_path: 数据立方体路径 :param epochs: 训练轮数 """ # 加载数据 data_cube = np.load(data_path) # [T, H, W, C] # 重塑为模型输入格式 [批, H, W, T, C] # 本例假设单个场景,批量大小为1 input_data = np.expand_dims(np.moveaxis(data_cube, 0, 2), 0) # [1, H, W, T, C] # 初始化检测器 detector = LandslideDetector(input_shape=input_data.shape[1:]) # 非监督预训练 print("开始非监督预训练...") detector.train_unsupervised(input_data, epochs=epochs, batch_size=1) # 滑坡检测 print("进行滑坡检测...") landslide_map = detector.detect_landslides(input_data) # 可视化结果 plt.figure(figsize=(15, 5)) # 显示光学预览 (第一时相) plt.subplot(131) plt.title("光学影像 (RGB)") plt.imshow(input_data[0, :, :, 0, 1:4]) # 显示RGB波段 # 显示SAR预览 (第一时相) plt.subplot(132) plt.title("SAR (VV极化)") plt.imshow(input_data[0, :, :, 0, 4], cmap='gray') # 显示滑坡概率图 plt.subplot(133) plt.title("滑坡概率图") prob_map = plt.imshow(landslide_map[0,...,0], cmap='jet', vmin=0, vmax=1) plt.colorbar(prob_map, fraction=0.046, pad=0.04) plt.tight_layout() plt.savefig("landslide_result.png") plt.show() return landslide_map # 执行训练和预测 if __name__ == "__main__": result = train_and_predict("prepared_data/data_cube.npy", epochs=30) # ====================== # 3. 后处理与输出 # ====================== from skimage import filters def postprocess_results(landslide_map, output_path, threshold=0.5): """ 滑坡结果后处理 :param landslide_map: 原始概率图 [H, W] :param output_path: 输出文件路径 :param threshold: 二值化阈值 """ # 高斯平滑 smoothed = filters.gaussian(landslide_map, sigma=1) # 二值化 binary_mask = (smoothed > threshold).astype(np.uint8) # 形态学开运算去除小噪点 from skimage.morphology import opening, disk binary_mask = opening(binary_mask, disk(3)) # 保存为GeoTIFF from osgeo import gdal driver = gdal.GetDriverByName('GTiff') out_ds = driver.Create(output_path, binary_mask.shape[1], binary_mask.shape[0], 1, gdal.GDT_Byte) # 设置地理信息(需要实际坐标) # out_ds.SetGeoTransform(transform) # out_ds.SetProjection(projection) out_band = out_ds.GetRasterBand(1) out_band.WriteArray(binary_mask) out_band.FlushCache() return binary_mask # 使用示例 if __name__ == "__main__": # 假设landslide_map是从前一步获得的结果 final_mask = postprocess_results(result[0,:,:,0], "final_landslide.tif") ``` ### 使用流程指南 #### 1. 数据准备阶段 1. **数据要求**: - 光学影像:Sentinel-2 L2A级产品(包含波段2,3,4,8) - SAR数据:Sentinel-1 GRD产品(VV和VH极化) - 时间步数:至少3个时相(推荐6-8个时相效果最佳) 2. **目录结构**: ``` project/ ├── data/ │ ├── S2_20230101.tif │ ├── S1_20230102.tif │ └── ...(其他时相) ├── prepared_data/(自动生成) └── landslide_detector.py(模型代码) ``` #### 2. 运行步骤 1. **准备时序数据**: ```python # 在prepare_data.py中运行 prepare_data(optical_paths, sar_paths, "prepared_data") ``` 2. **训练和预测**: ```python # 在train_predict.py中运行 result = train_and_predict("prepared_data/data_cube.npy") ``` 3. **后处理输出**: ```python # 在postprocess.py中运行 postprocess_results(result[0,:,:,0], "output/final_landslide.tif") ``` #### 3. 关键参数调整 | 参数 | 位置 | 推荐值 | 说明 | |------|------|--------|------| | `epochs` | `train_unsupervised()` | 30-100 | 非监督训练轮数 | | `contamination` | `unsupervised_detection()` | 0.01-0.1 | 异常检测比例 | | `n_clusters` | `unsupervised_detection()` | 3-5 | 聚类数量 | | `threshold` | `postprocess_results()` | 0.3-0.7 | 二值化阈值 | | `sigma` | `gaussian()` | 0.5-2.0 | 平滑强度 | #### 4. 硬件建议 - **GPU加速**:推荐使用NVIDIA GPU(≥8GB显存) - **内存需求**:至少16GB RAM(处理512×512影像) - **处理时间**: - 数据准备:每时相约1-2分钟 - 模型训练:每epoch约3-5分钟(取决于图像大小) - 预测:单场景约2-5分钟 ### 注意事项 1. **数据对齐**:确保光学和SAR数据已精确配准 2. **时间一致性**:光学和SAR采集时间差不超过3天 3. **内存优化**:大区域处理时需分块处理 ```python # 分块处理示例 for i in range(0, height, 256): for j in range(0, width, 256): patch = full_image[i:i+256, j:j+256] # 处理每个分块... ``` 4. **无监督模式**:当缺少标签数据时使用聚类检测 ```python # 强制使用聚类方法 landslide_map = detector.detect_landslides(input_data, use_clustering=True) ``` ### 结果解释 输出结果包含三个主要产品: 1. **滑坡概率图**:0-1值表示滑坡可能性 2. **二值掩膜**:滑坡/非滑坡分类结果 3. **可视化对比图**:包含光学、SAR和滑坡概率的对比
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值