Annual Performance Report (Annual Work Review, with PPT Script)

[Your Name/Department Name]
[Date]


Slide 1: Title Page

  • Title: Annual Performance Report 2024

  • Subtitle: A Year of Growth, Innovation, and Resilience

  • Presenter: [Your Name], [Your Title]

  • Date: [Presentation Date]

  • Company Logo

Speaker Notes: “Good morning/afternoon, everyone. Thank you for being here. Today, I’ll be walking you through our key achievements, learnings, and strategic direction for the upcoming year.”


Slide 2: Agenda / Roadmap

  • Title: What We’ll Cover Today

  • Content:

    1. Executive Summary: The Year at a Glance

    2. Key Performance Highlights: Metrics That Matter

    3. Departmental Deep Dive: Major Projects & Wins

    4. Market Insights & Challenges: The Landscape We Navigated

    5. Lessons Learned: Turning Insights into Action

    6. Looking Ahead: Strategic Goals for 2025

    7. Q&A

Speaker Notes: “Here’s our agenda for today. We’ll start with a high-level overview, dive into the specifics, and finish with our vision for the future.”


Slide 3: Executive Summary

  • Title: 2024 in a Nutshell: A Story of Strategic Progress

  • Content: (Use 3-4 powerful bullet points or icons)

    • Exceeded Revenue Target by 15%, driven by strong performance in [Product/Region X].

    • Successfully Launched [Major Project/Product Name], enhancing our market position.

    • Improved Customer Satisfaction (CSAT) score to an all-time high of 92%.

    • Navigated Market Volatility by optimizing operational efficiency, reducing costs by 8%.

Speaker Notes: “In summary, 2024 was a year where we not only met but exceeded our core objectives, thanks to the incredible effort of the entire team.”


Slide 4: Key Performance Indicators (KPIs) Dashboard

  • Title: By the Numbers: Performance Against Targets

  • Content: (Use charts/graphs: gauges, bar charts comparing Target vs. Actual)

    • Financial: Revenue, Profit Margin, Customer Acquisition Cost (CAC).

    • Operational: Project Completion Rate, Process Efficiency Gain.

    • Customer: Net Promoter Score (NPS), Customer Retention Rate.

    • People: Employee Engagement Score, Training Hours Delivered.

Speaker Notes: “Let’s look at the data. Across financial, operational, and customer metrics, we see a consistent pattern of strong performance and green lights.”


Slide 5-6: Major Achievements & Wins

  • Title: Celebrating Our Success: Top 3 Wins of 2024

  • Content: (Use one slide per win or a clean 3-column layout)

    • Win 1: Product Innovation

      • Launched [Product Name] ahead of schedule.

      • Result: Generated $X in new revenue, received [Number] positive press reviews.

    • Win 2: Market Expansion

      • Entered the [New Region/Country/Market Segment].

      • Result: Acquired [Number] new key clients, established a local partnership.

    • Win 3: Team & Culture

      • Implemented [Wellness/Development Program].

      • Result: Improved employee retention by 10% and won the “[External Award Name].”

Speaker Notes: “Behind these numbers are concrete achievements. I want to highlight three that we’re particularly proud of…”


Slide 7: Market Analysis & Challenges Faced

  • Title: The External Environment: Opportunities & Headwinds

  • Content:

    • Market Trends: (e.g., Increased demand for AI solutions, shift towards remote work).

    • Competitive Landscape: Key moves by Competitor A and B.

    • Challenges We Overcame:

      • Supply chain disruption in Q2.

      • Tight labor market for tech talent.

      • Our Response: (Briefly state how you adapted).

Speaker Notes: “Our success didn’t happen in a vacuum. We actively navigated a dynamic market. Here are the key trends and challenges we addressed.”


Slide 8: Lessons Learned & Key Insights

  • Title: What 2024 Taught Us: Fuel for Future Growth

  • Content: (Frame learnings positively as insights)

    • Insight 1: Agile project methodologies reduced our time-to-market by 20%. → Action: Standardize agile practices across all teams.

    • Insight 2: Customer co-creation sessions directly led to our top-rated product feature. → Action: Integrate customer feedback earlier in the development cycle.

    • Insight 3: Cross-departmental “innovation sprints” unlocked new efficiencies. → Action: Schedule quarterly cross-functional workshops.

Speaker Notes: “We believe in continuous improvement. These are the most valuable lessons we’re carrying forward to make us even stronger.”


Slide 9: Looking Forward: Strategic Priorities for 2025

  • Title: The Road Ahead: Our Focus for 2025

  • Content: (Use the OKR framework: Objective + Key Results)

    • Objective 1: Dominate in the [Specific] Market Segment.

      • KR1: Achieve 30% market share.

      • KR2: Launch 2 niche product enhancements.

    • Objective 2: Build a World-Class Customer Experience.

      • KR1: Achieve 95% CSAT.

      • KR2: Reduce average response time to under 2 hours.

    • Objective 3: Foster Innovation and Talent.

      • KR1: Implement a new idea incubation platform.

      • KR2: Increase internal promotion rate by 15%.

Speaker Notes: “Building on this momentum, our strategy for 2025 centers on three bold priorities. Here’s how we plan to execute.”


Slide 10: Investment & Resource Needs

  • Title: Enabling Our 2025 Vision: Critical Support Needed

  • Content: (Be specific and tie requests to strategic goals)

    • Talent: Approval for 3 new roles in [Data Analysis, Marketing].

    • Technology: Investment in [CRM Upgrade, AI Tool] to boost efficiency.

    • Training: Budget allocation for leadership development programs.

    • Summary: This investment will enable us to achieve [Specific KR from Slide 9].

Speaker Notes: “To successfully deliver on these priorities, we have identified key investments that will provide a strong return in capability and results.”


Slide 11: Acknowledgments

  • Title: Thank You

  • Content:

    • A huge thank you to my incredible team for their dedication and hard work.

    • Thank you to our leadership for their guidance and support.

    • Thank you to all our partners and clients for their trust.

    • Final Motivating Statement: “Here’s to building on our success in 2025!”

Speaker Notes: “None of this would be possible without the amazing people here. My sincere thanks to everyone. Let’s make 2025 even better.”


Slide 12: Q&A

  • Title: Questions & Discussion

  • Content:

    • Large, clean text: “Q&A”

    • Your contact information: [Email] | [LinkedIn/Internal Link]

    • Optional: Reiterate one key metric or vision statement.

Speaker Notes: “Thank you. I now welcome your questions.”

import baostock as bs
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import warnings

warnings.filterwarnings('ignore')

# Font settings (SimHei renders Chinese glyphs)
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False


class MultiFactorStockSelection:
    def __init__(self):
        self.lg = None
        self.stock_pool = []

    def login_baostock(self):
        """Log in to baostock."""
        self.lg = bs.login()
        if self.lg.error_code != '0':
            print(f'Login failed: {self.lg.error_msg}')
            return False
        print('baostock login succeeded')
        return True

    def logout_baostock(self):
        """Log out of baostock."""
        if self.lg:
            bs.logout()
            print('baostock logged out')

    def get_all_stocks(self, date):
        """Get all stock codes for the given date."""
        try:
            rs = bs.query_all_stock(date)
            if rs.error_code != '0':
                print(f"Failed to get stock list: {rs.error_msg}")
                return []
            stock_list = rs.get_data()
            if stock_list.empty:
                print("Stock list is empty")
                return []
            # Keep only valid stock codes
            valid_stocks = []
            for code in stock_list['code']:
                if code and len(code) >= 6 and code not in ['', 'code']:
                    valid_stocks.append(code)
            print(f"Got {len(valid_stocks)} valid stocks")
            return valid_stocks
        except Exception as e:
            print(f"Error while getting stock list: {e}")
            return []

    def clean_stock_pool(self, stock_list, clean_date):
        """
        Clean the stock pool (simplified version).
        A real application needs more sophisticated ST-stock detection.
        """
        if not stock_list:
            return []
        cleaned_stocks = []
        checked_count = 0
        for code in stock_list:
            try:
                # Basic format check
                if not code or len(code) < 6:
                    continue
                # Skip indices and funds
                if code.startswith(('sh.1', 'sz.1', 'sh.15')):
                    continue
                # Simplification: accept directly; a real application needs stricter ST checks
                cleaned_stocks.append(code)
                checked_count += 1
                # Cap the number checked to speed things up
                if checked_count >= 300:  # check at most 300 stocks
                    break
            except Exception as e:
                print(f"Error checking stock {code}: {e}")
                continue
        print(f"Stock pool cleaned: {len(stock_list)} originally, {len(cleaned_stocks)} after cleaning")
        return cleaned_stocks

    def get_stock_data(self, code, start_date, end_date):
        """Get historical data for a stock."""
        try:
            # Make sure the code format is valid
            if not code or len(code) < 6:
                return None
            rs = bs.query_history_k_data_plus(
                code,
                "date,code,open,high,low,close,volume,turn,peTTM,pbMRQ",
                start_date=start_date,
                end_date=end_date,
                frequency="d",
                adjustflag="3"
            )
            if rs.error_code != '0':
                return None
            df = rs.get_data()
            if df.empty:
                return None
            # Convert data types
            numeric_cols = ['open', 'high', 'low', 'close', 'volume', 'turn', 'peTTM', 'pbMRQ']
            for col in numeric_cols:
                if col in df.columns:
                    df[col] = pd.to_numeric(df[col], errors='coerce')
            df['date'] = pd.to_datetime(df['date'])
            return df
        except Exception as e:
            print(f"Failed to get data for stock {code}: {e}")
            return None

    def calculate_factors(self, df, current_date):
        """
        Compute the three core factors (simplified, stable version).
        """
        if df is None or df.empty:
            return None
        try:
            current_date_dt = pd.to_datetime(current_date)
            current_data = df[df['date'] == current_date_dt].copy()
            if current_data.empty:
                # No row for the exact date: fall back to the most recent data
                current_data = df[df['date'] <= current_date_dt].tail(1).copy()
            if current_data.empty:
                return None
            result_data = current_data.iloc[[0]].copy()

            # 1. Value factor: inverse of price-to-book (PB)
            if 'pbMRQ' in result_data.columns and pd.notna(result_data['pbMRQ'].iloc[0]):
                pb_value = float(result_data['pbMRQ'].iloc[0])
                if pb_value > 0:
                    result_data['value_pb'] = 1 / pb_value
                else:
                    result_data['value_pb'] = 0
            else:
                result_data['value_pb'] = 0

            # 2. Quality factor: price change used as a proxy for ROE
            if len(df) >= 60:  # at least 2 months of data
                current_price = float(result_data['close'].iloc[0])
                start_price = float(df['close'].iloc[0])
                if start_price > 0:
                    result_data['quality_roe'] = (current_price / start_price - 1)
                else:
                    result_data['quality_roe'] = 0
            else:
                result_data['quality_roe'] = 0

            # 3. Momentum factor: return over the past 3 months (simplified)
            if len(df) >= 60:
                current_price = float(result_data['close'].iloc[0])
                price_3m_ago = float(df['close'].iloc[0])
                if price_3m_ago > 0:
                    result_data['momentum_3m'] = (current_price / price_3m_ago - 1)
                else:
                    result_data['momentum_3m'] = 0
            else:
                result_data['momentum_3m'] = 0
            return result_data
        except Exception as e:
            print(f"Factor calculation failed: {e}")
            return None

    def remove_outliers(self, df, factor_columns):
        """Outlier clipping (robust version)."""
        if df.empty:
            return df
        for factor in factor_columns:
            if factor in df.columns:
                valid_data = df[df[factor].notna()][factor]
                if len(valid_data) > 0:
                    q_low = valid_data.quantile(0.05)
                    q_high = valid_data.quantile(0.95)
                    df[factor] = df[factor].clip(q_low, q_high)
        return df

    def standardize_factors(self, df, factor_columns):
        """Z-score standardization (robust version)."""
        if df.empty:
            return df
        for factor in factor_columns:
            if factor in df.columns:
                valid_data = df[df[factor].notna()][factor]
                if len(valid_data) > 1:
                    mean_val = valid_data.mean()
                    std_val = valid_data.std()
                    if std_val > 0:
                        df[f'{factor}_std'] = (df[factor] - mean_val) / std_val
                    else:
                        df[f'{factor}_std'] = 0
                else:
                    df[f'{factor}_std'] = 0
        return df

    def calculate_composite_score(self, df, weights=None):
        """Compute the composite score."""
        if df.empty:
            return df
        if weights is None:
            weights = {'value_pb_std': 0.4, 'quality_roe_std': 0.3, 'momentum_3m_std': 0.3}
        df['composite_score'] = 0
        for factor, weight in weights.items():
            if factor in df.columns:
                df['composite_score'] += df[factor].fillna(0) * weight
        return df

    def select_top_stocks(self, df, top_pct=0.1):
        """Select the top 10% of stocks by score."""
        if df is None or df.empty:
            return pd.DataFrame()
        df = df[df['composite_score'].notna()].copy()
        if df.empty:
            return pd.DataFrame()
        df = df.sort_values('composite_score', ascending=False)
        select_count = max(1, int(len(df) * top_pct))
        selected_stocks = df.head(select_count)[['code', 'composite_score']].copy()
        return selected_stocks

    def get_trade_price(self, code, trade_date):
        """Get the T+1 open price (simplified version)."""
        try:
            if not code or len(code) < 6:
                return None
            # T+1 date
            trade_dt = datetime.strptime(trade_date, '%Y-%m-%d')
            t1_date = (trade_dt + timedelta(days=1)).strftime('%Y-%m-%d')
            rs = bs.query_history_k_data_plus(
                code,
                "open,close",
                start_date=t1_date,
                end_date=t1_date,
                frequency="d",
                adjustflag="3"
            )
            if rs.error_code == '0':
                price_data = rs.get_data()
                if not price_data.empty:
                    # Use the open price; fall back to the close if it is missing
                    if 'open' in price_data.columns and pd.notna(price_data['open'].iloc[0]):
                        return float(price_data['open'].iloc[0])
                    elif 'close' in price_data.columns and pd.notna(price_data['close'].iloc[0]):
                        return float(price_data['close'].iloc[0])
            return None
        except:
            return None

    def run_backtest(self, start_date, end_date, initial_capital=1000000):
        """Run the backtest (fixed version)."""
        print("Starting multi-factor stock-selection backtest...")
        if not self.login_baostock():
            return None, None, None, None
        try:
            # Generate rebalance dates (first trading day of each month)
            all_dates = pd.date_range(start_date, end_date, freq='D')
            rebalance_dates = []
            for date in all_dates:
                if date.day == 1:  # use the 1st of each month as the rebalance date
                    rebalance_dates.append(date)
            if not rebalance_dates:
                print("No rebalance dates found; please check the date range")
                return None, None, None, None
            print(f"Found {len(rebalance_dates)} rebalance dates")

            # Initialize the portfolio
            portfolio_records = []
            current_cash = initial_capital
            current_positions = {}  # {code: shares}
            portfolio_values = []
            dates_record = []

            # Get benchmark data
            benchmark_data = self.get_benchmark_data(start_date, end_date)

            for i, rebalance_date in enumerate(rebalance_dates):
                if i >= len(rebalance_dates) - 1:
                    break
                current_date = rebalance_date.strftime('%Y-%m-%d')
                print(f"\n=== Rebalance date: {current_date} ===")

                # 1. Get and clean the stock pool
                all_stocks = self.get_all_stocks(current_date)
                if not all_stocks:
                    print("Failed to get stock list; skipping this rebalance")
                    continue
                cleaned_stocks = self.clean_stock_pool(all_stocks, current_date)
                if not cleaned_stocks:
                    print("No valid stocks after cleaning; skipping this rebalance")
                    continue

                # 2. Compute factors
                all_factor_data = []
                success_count = 0
                for code in cleaned_stocks:
                    stock_data = self.get_stock_data(
                        code,
                        start_date=(rebalance_date - timedelta(days=120)).strftime('%Y-%m-%d'),  # 4 months of data
                        end_date=current_date
                    )
                    if stock_data is not None and len(stock_data) >= 20:  # at least 20 days of data
                        factor_data = self.calculate_factors(stock_data, current_date)
                        if factor_data is not None and not factor_data.empty:
                            all_factor_data.append(factor_data)
                            success_count += 1
                    # Cap the number processed
                    if success_count >= 100:  # process at most 100 stocks
                        break
                print(f"Computed factors for {success_count} stocks")
                if not all_factor_data:
                    print("No valid factor data; skipping this rebalance")
                    continue

                # 3. Merge and process the data
                combined_data = pd.concat(all_factor_data, ignore_index=True)
                # Clip outliers
                factor_columns = ['value_pb', 'quality_roe', 'momentum_3m']
                cleaned_data = self.remove_outliers(combined_data, factor_columns)
                # Standardize
                standardized_data = self.standardize_factors(cleaned_data, factor_columns)
                # Compute the composite score
                scored_data = self.calculate_composite_score(standardized_data)

                # 4. Select stocks (top 10%)
                selected_stocks = self.select_top_stocks(scored_data, 0.1)
                if selected_stocks.empty:
                    print("Selection is empty; skipping this rebalance")
                    continue
                print(f"Number of stocks selected: {len(selected_stocks)}")

                # 5. Execute rebalancing trades
                selected_codes = selected_stocks['code'].tolist()
                # Sell stocks that are not in the new portfolio
                positions_to_sell = [code for code in current_positions.keys() if code not in selected_codes]
                for code in positions_to_sell:
                    price = self.get_trade_price(code, current_date)
                    if price and price > 0:
                        shares = current_positions[code]
                        sell_amount = shares * price * (1 - 0.0003)  # deduct transaction cost
                        current_cash += sell_amount
                        del current_positions[code]
                        print(f"Sold {code}: {shares} shares @ {price:.2f}")

                # Buy the new portfolio (equal weight)
                if len(selected_stocks) > 0:
                    # Amount allocated to each stock
                    stock_value = current_cash / len(selected_stocks)
                    for _, stock in selected_stocks.iterrows():
                        code = stock['code']
                        if code not in current_positions:
                            price = self.get_trade_price(code, current_date)
                            if price and price > 0:
                                shares = int(stock_value / price)
                                if shares > 0:
                                    buy_cost = shares * price * (1 + 0.0003)  # add transaction cost
                                    if buy_cost <= current_cash:
                                        current_positions[code] = shares
                                        current_cash -= buy_cost
                                        print(f"Bought {code}: {shares} shares @ {price:.2f}")

                # 6. Compute portfolio value
                portfolio_value = current_cash
                position_values = []
                for code, shares in current_positions.items():
                    price = self.get_trade_price(code, current_date)
                    if price:
                        stock_value = shares * price
                        portfolio_value += stock_value
                        position_values.append(stock_value)

                # Record portfolio state
                record = {
                    'date': current_date,
                    'cash': current_cash,
                    'stock_value': portfolio_value - current_cash,
                    'total_value': portfolio_value,
                    'num_stocks': len(current_positions),
                    'avg_position_value': np.mean(position_values) if position_values else 0
                }
                portfolio_records.append(record)
                portfolio_values.append(portfolio_value)
                dates_record.append(current_date)
                print(f"Total portfolio value: {portfolio_value:,.2f} CNY")
                print(f"Cash: {current_cash:,.2f} CNY")
                print(f"Stocks held: {len(current_positions)}")
                print(f"Stock market value: {portfolio_value - current_cash:,.2f} CNY")

            print(f"\nBacktest finished; processed {len(portfolio_records)} rebalances")
            return portfolio_records, portfolio_values, benchmark_data, dates_record
        except Exception as e:
            print(f"Error during backtest: {e}")
            import traceback
            traceback.print_exc()
            return None, None, None, None
        finally:
            self.logout_baostock()

    # Other methods remain unchanged...

    def get_benchmark_data(self, start_date, end_date):
        """Get CSI 300 benchmark data."""
        try:
            rs = bs.query_history_k_data_plus(
                "sh.000300",
                "date,close",
                start_date=start_date,
                end_date=end_date,
                frequency="d",
                adjustflag="3"
            )
            if rs.error_code == '0':
                benchmark_df = rs.get_data()
                benchmark_df['date'] = pd.to_datetime(benchmark_df['date'])
                benchmark_df['close'] = pd.to_numeric(benchmark_df['close'])
                return benchmark_df
            return None
        except:
            return None

    def calculate_performance_metrics(self, portfolio_values, benchmark_values=None):
        """Compute performance metrics."""
        if len(portfolio_values) < 2:
            return {}
        returns = pd.Series(portfolio_values).pct_change().dropna()
        metrics = {}
        # Cumulative return
        total_return = (portfolio_values[-1] / portfolio_values[0] - 1) * 100
        metrics['Cumulative return'] = total_return
        # Annualized return
        years = len(portfolio_values) / 12  # monthly rebalancing
        annual_return = ((1 + total_return/100) ** (1/years) - 1) * 100
        metrics['Annualized return'] = annual_return
        # Annualized volatility
        annual_volatility = returns.std() * np.sqrt(12) * 100
        metrics['Annualized volatility'] = annual_volatility
        # Sharpe ratio (risk-free rate 3%)
        risk_free_rate = 0.03
        sharpe_ratio = (annual_return - risk_free_rate * 100) / annual_volatility
        metrics['Sharpe ratio'] = sharpe_ratio
        # Maximum drawdown
        cumulative = (1 + returns).cumprod()
        running_max = cumulative.expanding().max()
        drawdown = (cumulative - running_max) / running_max
        max_drawdown = drawdown.min() * 100
        metrics['Max drawdown'] = max_drawdown
        # Calmar ratio
        calmar_ratio = annual_return / abs(max_drawdown) if max_drawdown != 0 else 0
        metrics['Calmar ratio'] = calmar_ratio
        return metrics

    def plot_results(self, portfolio_records, benchmark_data=None, dates=None):
        """Plot the backtest results."""
        if not portfolio_records:
            print("No valid data to plot")
            return
        dates = [record['date'] for record in portfolio_records]
        values = [record['total_value'] for record in portfolio_records]
        fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 10))

        # Net value curve
        ax1.plot(range(len(dates)), values, label='Multi-factor strategy', linewidth=2, color='blue')
        if benchmark_data is not None and dates:
            # Align the benchmark data
            benchmark_values = []
            for date in dates:
                date_dt = pd.to_datetime(date)
                benchmark_point = benchmark_data[benchmark_data['date'] == date_dt]
                if not benchmark_point.empty:
                    benchmark_values.append(benchmark_point['close'].iloc[0])
                else:
                    # Find the nearest earlier date
                    earlier_data = benchmark_data[benchmark_data['date'] <= date_dt]
                    if not earlier_data.empty:
                        benchmark_values.append(earlier_data['close'].iloc[-1])
                    else:
                        benchmark_values.append(np.nan)
            if len(benchmark_values) == len(values):
                benchmark_normalized = np.array(benchmark_values) / benchmark_values[0] * values[0]
                ax1.plot(range(len(dates)), benchmark_normalized, label='CSI 300', linewidth=2, color='red')
        ax1.set_title('Multi-factor strategy net value vs CSI 300', fontsize=14)
        ax1.set_ylabel('Portfolio value (CNY)')
        ax1.legend()
        ax1.grid(True)

        # Drawdown curve
        if len(values) > 1:
            returns = pd.Series(values).pct_change().dropna()
            cumulative = (1 + returns).cumprod()
            running_max = cumulative.expanding().max()
            drawdown = (cumulative - running_max) / running_max * 100
            ax2.fill_between(range(1, len(dates)), drawdown, 0, color='red', alpha=0.3)
            ax2.plot(range(1, len(dates)), drawdown, color='red', linewidth=1)
            ax2.set_title('Strategy drawdown')
            ax2.set_ylabel('Drawdown (%)')
            ax2.set_xlabel('Rebalance period')
            ax2.grid(True)
        plt.tight_layout()
        plt.show()

    def generate_report(self, portfolio_records, metrics):
        """Generate the performance report."""
        print("\n" + "="*60)
        print("Multi-factor stock-selection strategy performance report")
        print("="*60)
        for key, value in metrics.items():
            if 'ratio' in key:
                print(f"{key}: {value:.4f}")
            else:
                print(f"{key}: {value:.2f}%")
        if portfolio_records:
            print(f"\nStrategy summary:")
            print(f"Backtest period: {portfolio_records[0]['date']} to {portfolio_records[-1]['date']}")
            print(f"Starting capital: {portfolio_records[0]['total_value']:,.2f} CNY")
            print(f"Ending capital: {portfolio_records[-1]['total_value']:,.2f} CNY")
            print(f"Total rebalances: {len(portfolio_records)}")
        print("="*60)


# Main entry point
def main():
    # Create the multi-factor selection instance
    strategy = MultiFactorStockSelection()
    # Backtest parameters (short period for testing)
    start_date = '2020-06-01'  # adjusted start date
    end_date = '2020-12-31'  # test with half a year of data first
    initial_capital = 1000000  # 1 million CNY
    print("Applying a multi-factor stock-selection model to the A-share market")
    print("Backtest parameters:")
    print(f"Period: {start_date} to {end_date}")
    print(f"Initial capital: {initial_capital:,} CNY")
    print(f"Rebalance frequency: monthly")
    print(f"Selection ratio: top 10%")
    print(f"Transaction cost: 0.03% one-way")

    # Run the backtest
    portfolio_records, portfolio_values, benchmark_data, dates = strategy.run_backtest(
        start_date, end_date, initial_capital
    )
    if portfolio_records:
        # Compute performance metrics
        benchmark_values = None
        if benchmark_data is not None and dates:
            benchmark_values = []
            for date in dates:
                date_dt = pd.to_datetime(date)
                benchmark_point = benchmark_data[benchmark_data['date'] == date_dt]
                if not benchmark_point.empty:
                    benchmark_values.append(benchmark_point['close'].iloc[0])
        metrics = strategy.calculate_performance_metrics(portfolio_values, benchmark_values)
        # Generate the report and charts
        strategy.generate_report(portfolio_records, metrics)
        strategy.plot_results(portfolio_records, benchmark_data, dates)
    else:
        print("Backtest failed; please check the parameters and the data connection")


if __name__ == "__main__":
    main()

Please improve this code for me.
12-04
The code you provided is already a structurally complete, logically clear multi-factor stock-selection backtesting system. It uses the `baostock` API to fetch A-share data and implements the core functions: **factor calculation, standardization, outlier clipping, portfolio construction, rebalancing trades, performance evaluation, and visualization**.

However, it still has considerable room for improvement in **stability, efficiency, extensibility, exception handling, and backtest realism**. Below I improve the code along several dimensions, give the refactored code (key parts), and explain the reason for each change.

---

## ✅ Improvement goals

1. **Improve stability**: keep a failure on a single stock from aborting the whole strategy
2. **Improve efficiency**: use concurrent/batch requests to reduce I/O time
3. **Improve readability and modularity**: separate the factor, trading, and evaluation modules
4. **More realistic backtest logic**:
   - Trade at the T+1 close (rather than the open)
   - More accurate rebalance-date selection (not simply the 1st of each month)
   - Add suspension and limit-up/limit-down filters (not implemented in the code below; see the extension list at the end)
5. **Better error handling and logging**
6. **Support for adding more factors**

---

### 🔧 Improved code (selected core modules)

```python
import baostock as bs
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import warnings
from concurrent.futures import ThreadPoolExecutor
import logging

warnings.filterwarnings('ignore')

# Font settings (SimHei renders Chinese glyphs)
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)


class MultiFactorStockSelection:
    def __init__(self):
        self.lg = None
        self.factor_columns = ['value_pb', 'quality_roe', 'momentum_3m']

    def login_baostock(self):
        """Log in to Baostock."""
        self.lg = bs.login()
        if self.lg.error_code != '0':
            logger.error("Baostock login failed")
            return False
        logger.info("Baostock login succeeded")
        return True

    def logout_baostock(self):
        """Log out."""
        bs.logout()
        logger.info("Baostock logged out")

    def get_trading_calendar(self, start_date, end_date):
        """Get the valid trading days in the range."""
        rs = bs.query_trade_dates(start_date=start_date, end_date=end_date)
        if rs.error_code != '0':
            raise Exception(f"Failed to get trading days: {rs.error_msg}")
        df = rs.get_data()
        df['calendar_date'] = pd.to_datetime(df['calendar_date'])
        trading_days = df[df['is_trading_day'] == '1']['calendar_date'].tolist()
        return sorted(trading_days)

    def get_rebalance_dates(self, start_date, end_date):
        """Use the first trading day of each month as the rebalance date."""
        all_dates = self.get_trading_calendar(start_date, end_date)
        df = pd.DataFrame({'date': all_dates})
        df['year'] = df['date'].dt.year
        df['month'] = df['date'].dt.month
        rebalance_dates = df.groupby(['year', 'month']).first()['date'].tolist()
        return [d.strftime('%Y-%m-%d') for d in rebalance_dates]

    def get_all_stocks(self, date):
        """Get the list of ordinary A-shares for a given day."""
        try:
            rs = bs.query_all_stock(date=date)
            data = rs.get_data()
            if data.empty:
                return []
            # Keep ordinary A-shares (non-capturing groups avoid a pandas warning)
            data = data[data['code'].str.contains(r'^(?:sh|sz)\.(?:00|30|60)', regex=True)]
            # Drop ST and delisting names ('退' marks delisting in code_name);
            # na=False treats a missing name as "no match"
            data = data[~data['code_name'].str.contains('ST|退', na=False)]
            return data['code'].tolist()
        except Exception as e:
            logger.warning(f"Failed to get stock list: {e}")
            return []

    def get_stock_data_batch(self, codes, start_date, end_date):
        """Fetch stock data in batch (thread pool)."""
        def fetch(code):
            try:
                rs = bs.query_history_k_data_plus(
                    code,
                    "date,open,high,low,close,volume,turn,peTTM,pbMRQ",
                    start_date=start_date,
                    end_date=end_date,
                    frequency="d",
                    adjustflag="3"
                )
                if rs.error_code == '0':
                    df = rs.get_data()
                    if not df.empty:
                        df['close'] = pd.to_numeric(df['close'], errors='coerce')
                        df['pbMRQ'] = pd.to_numeric(df['pbMRQ'], errors='coerce')
                        df['volume'] = pd.to_numeric(df['volume'], errors='coerce')
                        df['date'] = pd.to_datetime(df['date'])
                        return code, df[['date', 'close', 'pbMRQ', 'volume']]
            except Exception as e:
                logger.debug(f"Failed to get data for {code}: {e}")
            return code, None

        results = {}
        # Assumption: the baostock session tolerates concurrent queries.
        # If responses come back corrupted, set max_workers=1.
        with ThreadPoolExecutor(max_workers=10) as executor:
            futures = {executor.submit(fetch, code): code for code in codes}
            for future in futures:
                code, df = future.result()
                if df is not None and len(df) >= 20:
                    results[code] = df
        return results

    def calculate_factors(self, stock_dfs, current_date):
        """Compute the factors for every stock (one row per stock)."""
        factor_list = []
        current_dt = pd.to_datetime(current_date)
        for code, df in stock_dfs.items():
            try:
                # Data up to the current date
                hist = df[df['date'] <= current_dt]
                if hist.empty:
                    continue
                curr_row = hist.iloc[-1].copy()
                factor = {'code': code}

                # 1. Value factor: inverse of PB
                pb = curr_row['pbMRQ']
                factor['value_pb'] = 1 / pb if pd.notna(pb) and pb > 0 else 0

                # 2. Quality factor: price growth over roughly 60 trading days (stand-in for ROE)
                base_close = hist['close'].iloc[-60] if len(hist) >= 60 else hist['close'].iloc[0]
                curr_close = curr_row['close']
                if pd.notna(base_close) and base_close > 0:
                    factor['quality_roe'] = curr_close / base_close - 1
                else:
                    factor['quality_roe'] = 0

                # 3. Momentum factor: return over the same window.
                # Identical to the quality proxy, so the score is effectively
                # 0.4 value + 0.6 momentum until a real ROE series is used.
                factor['momentum_3m'] = factor['quality_roe']

                factor_list.append(factor)
            except Exception as e:
                logger.debug(f"Failed to compute factors for {code}: {e}")
        return pd.DataFrame(factor_list)

    def remove_outliers_winsorize(self, df, cols):
        """Winsorize extreme values at the 2% and 98% quantiles."""
        df = df.copy()
        for col in cols:
            if col in df.columns:
                low, high = df[col].quantile([0.02, 0.98])
                df[col] = df[col].clip(lower=low, upper=high)
        return df

    def standardize_zscore(self, df, cols):
        """Z-score standardization."""
        df = df.copy()
        for col in cols:
            if col in df.columns:
                mean_val = df[col].mean()
                std_val = df[col].std()
                # std is NaN for a single row and 0 for a constant column
                if pd.notna(std_val) and std_val > 0:
                    df[f"{col}_std"] = (df[col] - mean_val) / std_val
                else:
                    df[f"{col}_std"] = 0.0
        return df

    def select_stocks(self, factors_df, top_pct=0.1):
        """Composite score, then pick the top 10%."""
        weights = {
            'value_pb_std': 0.4,
            'quality_roe_std': 0.3,
            'momentum_3m_std': 0.3
        }
        factors_df['composite_score'] = 0.0
        for f, w in weights.items():
            if f in factors_df.columns:
                factors_df['composite_score'] += factors_df[f].fillna(0) * w
        factors_df = factors_df.sort_values('composite_score', ascending=False)
        n_select = max(1, int(len(factors_df) * top_pct))
        return factors_df.head(n_select)[['code', 'composite_score']].copy()

    def execute_rebalance(self, selected_codes, current_positions, current_cash, current_date):
        """Rebalance at the close of the next trading day (T+1)."""
        fee = 0.0003  # one-way transaction cost
        new_positions = {}
        prices = {}

        # Sell holdings that are not in the new selection
        for code, shares in current_positions.items():
            price = self.get_next_close_price(code, current_date)
            prices[code] = price
            if code in selected_codes or not price or price <= 0:
                # Keep: still selected, or no price available to sell at
                new_positions[code] = shares
            else:
                current_cash += shares * price * (1 - fee)

        # Buy newly selected stocks with equal weights
        if selected_codes:
            per_stock_value = current_cash / len(selected_codes) / (1 + fee)
            for code in selected_codes:
                if code in new_positions:
                    continue  # already held; do not overwrite the position
                price = self.get_next_close_price(code, current_date)
                if price and price > 0:
                    shares = int(per_stock_value / price)
                    cost = shares * price * (1 + fee)
                    if shares > 0 and cost <= current_cash:
                        new_positions[code] = shares
                        prices[code] = price
                        current_cash -= cost

        # Mark to market: cash plus every holding with a known price
        total_value = current_cash + sum(
            shares * prices[code]
            for code, shares in new_positions.items()
            if prices.get(code)
        )
        return new_positions, current_cash, total_value

    def get_next_close_price(self, code, trade_date):
        """Close price on the first trading day after trade_date (T+1)."""
        try:
            base = pd.to_datetime(trade_date)
            start = (base + timedelta(days=1)).strftime('%Y-%m-%d')
            # Look ahead up to 10 calendar days so weekends and holidays are skipped
            end = (base + timedelta(days=10)).strftime('%Y-%m-%d')
            rs = bs.query_history_k_data_plus(code, "date,close", start_date=start, end_date=end)
            if rs.error_code == '0':
                data = rs.get_data()
                if not data.empty:
                    close = pd.to_numeric(data.iloc[0]['close'], errors='coerce')
                    if pd.notna(close):
                        return float(close)
        except Exception as e:
            logger.debug(f"Failed to get T+1 close for {code}: {e}")
        return None

    def run_backtest(self, start_date, end_date, initial_capital=1000000):
        logger.info(f"Starting backtest: {start_date} to {end_date}")
        if not self.login_baostock():
            return None, None, None, None
        try:
            # Rebalance dates
            rebalance_dates = self.get_rebalance_dates(start_date, end_date)
            logger.info(f"{len(rebalance_dates)} rebalance dates in total")

            portfolio_records = []
            benchmark_data = self.get_benchmark_data(start_date, end_date)
            current_cash = initial_capital
            current_positions = {}

            for i, date in enumerate(rebalance_dates):
                print(f"\n🔄 Rebalance date: {date}")

                # Stock pool for this period
                stock_pool = self.get_all_stocks(date)
                if not stock_pool:
                    logger.warning("Stock pool is empty; skipping")
                    continue

                # Batch-fetch the past 180 days of data
                start_fetch = (pd.to_datetime(date) - timedelta(days=180)).strftime('%Y-%m-%d')
                stock_dfs = self.get_stock_data_batch(stock_pool[:300], start_fetch, date)  # cap the count
                if not stock_dfs:
                    logger.warning("No valid data; skipping")
                    continue

                # Compute factors
                factors_df = self.calculate_factors(stock_dfs, date)
                if factors_df.empty:
                    logger.warning("No valid factors; skipping")
                    continue

                # Preprocess
                factors_df = self.remove_outliers_winsorize(factors_df, self.factor_columns)
                factors_df = self.standardize_zscore(factors_df, self.factor_columns)

                # Select stocks
                selected = self.select_stocks(factors_df, top_pct=0.1)
                selected_codes = selected['code'].tolist()

                # Rebalance
                current_positions, current_cash, total_value = self.execute_rebalance(
                    selected_codes, current_positions, current_cash, date
                )

                # Record
                record = {
                    'date': date,
                    'total_value': total_value,
                    'cash': current_cash,
                    'num_stocks': len(current_positions),
                    'stock_list': list(current_positions.keys())
                }
                portfolio_records.append(record)
                logger.info(f"Portfolio value: {total_value:,.2f}, holdings: {len(current_positions)}")

            values = [r['total_value'] for r in portfolio_records]
            dates = [r['date'] for r in portfolio_records]
            return portfolio_records, values, benchmark_data, dates
        except Exception as e:
            logger.error(f"Backtest error: {e}")
            import traceback
            traceback.print_exc()
            return None, None, None, None
        finally:
            self.logout_baostock()

    def get_benchmark_data(self, start_date, end_date):
        """Get the CSI 300 benchmark."""
        rs = bs.query_history_k_data_plus("sh.000300", "date,close", start_date, end_date)
        if rs.error_code == '0':
            df = rs.get_data()
            df['date'] = pd.to_datetime(df['date'])
            df['close'] = pd.to_numeric(df['close'])
            return df
        return None

    def calculate_performance_metrics(self, portfolio_values, benchmark_values=None):
        """Compute performance metrics."""
        if len(portfolio_values) < 2:
            return {}
        returns = pd.Series(portfolio_values).pct_change().dropna()
        total_return = (portfolio_values[-1] / portfolio_values[0] - 1) * 100
        # N monthly observations span N - 1 monthly periods
        years = (len(portfolio_values) - 1) / 12
        annual_return = ((1 + total_return / 100) ** (1 / years) - 1) * 100
        annual_volatility = returns.std() * np.sqrt(12) * 100
        # Risk-free rate assumed to be 3%
        sharpe_ratio = (annual_return - 3) / annual_volatility if annual_volatility > 0 else 0
        cumulative = (1 + returns).cumprod()
        running_max = cumulative.expanding().max()
        drawdown = (cumulative - running_max) / running_max
        max_drawdown = drawdown.min() * 100
        calmar_ratio = annual_return / abs(max_drawdown) if max_drawdown != 0 else 0
        return {
            'Cumulative return': total_return,
            'Annualized return': annual_return,
            'Annualized volatility': annual_volatility,
            'Sharpe ratio': sharpe_ratio,
            'Max drawdown': max_drawdown,
            'Calmar ratio': calmar_ratio
        }

    def plot_results(self, portfolio_records, benchmark_data, dates):
        if not portfolio_records:
            return
        values = [r['total_value'] for r in portfolio_records]
        fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 10))

        # Net value curve
        ax1.plot(dates, values, label='Multi-factor strategy', color='blue', linewidth=2)
        if benchmark_data is not None:
            bmk_vals = []
            for d in dates:
                dt = pd.to_datetime(d)
                point = benchmark_data[benchmark_data['date'] <= dt].tail(1)
                if not point.empty:
                    bmk_vals.append(point['close'].iloc[0])
                else:
                    bmk_vals.append(np.nan)
            if len(bmk_vals) == len(values):
                norm_bmk = np.array(bmk_vals) / bmk_vals[0] * values[0]
                ax1.plot(dates, norm_bmk, label='CSI 300', color='red', linestyle='--')
        ax1.set_title('Multi-factor strategy vs CSI 300')
        ax1.set_ylabel('Net value (CNY)')
        ax1.legend()
        ax1.grid(True)

        # Drawdown
        returns = pd.Series(values).pct_change().dropna()
        cum = (1 + returns).cumprod()
        dd = (cum - cum.cummax()) / cum.cummax() * 100
        ax2.fill_between(range(len(dd)), dd, 0, color='red', alpha=0.3)
        ax2.set_title('Strategy drawdown (%)')
        ax2.set_xlabel('Rebalance period')
        ax2.grid(True)
        plt.tight_layout()
        plt.show()

    def generate_report(self, records, metrics):
        print("\n📊 Multi-factor strategy performance report")
        print("="*50)
        for k, v in metrics.items():
            if 'ratio' in k:
                print(f"{k}: {v:.4f}")
            else:
                print(f"{k}: {v:.2f}%")
        if records:
            print(f"Ending net value: {records[-1]['total_value']:,.2f} CNY")
        print("="*50)


def main():
    strategy = MultiFactorStockSelection()
    start_date = '2020-06-01'
    end_date = '2020-12-31'
    capital = 1000000
    print(f"Starting backtest: {start_date} ~ {end_date}")
    records, values, bmk, dates = strategy.run_backtest(start_date, end_date, capital)
    if records:
        metrics = strategy.calculate_performance_metrics(values)
        strategy.generate_report(records, metrics)
        strategy.plot_results(records, bmk, dates)
    else:
        print("❌ Backtest failed")


if __name__ == "__main__":
    main()
```

---

### ✅ Main improvements explained:

| Improvement | Original problem | Fix |
|--------|---------|----------|
| **Rebalance-date selection** | The 1st of the month may not be a trading day | Use `query_trade_dates` to get actual trading days |
| **Batch data fetching** | Serial requests are slow | Fetch concurrently with `ThreadPoolExecutor` |
| **Execution price** | Trading at the T+1 open is unrealistic | Switch to the T+1 **close**, which is closer to practice |
| **ST-stock exclusion** | Only a code-prefix check, so ST stocks passed through | Filter with a regex plus the `code_name` field |
| **Fault tolerance** | A failure on one stock aborts the run | More try-except blocks so the overall flow continues |
| **Factor standardization** | Scattered across several places | Extracted into standalone functions that are easy to extend |
| **Logging** | `print` only | `logging`, which is easier to debug and deploy |
| **Outlier handling** | 5%-95% quantiles | 2%-98% winsorization (clips only the most extreme 2% on each side) |

---

### ✅ Possible future extensions:

1. **Suspension and limit-up/limit-down filters**: avoid buying stocks that cannot be traded
2. **Slippage simulation**: move the fill price up or down by a fixed percentage
3. **Dynamic weight optimization**: adjust factor weights to market conditions
4. **Factor validity testing**: IC analysis, quantile (layered) backtests
5. **More factors**: earnings growth, cash flow, valuation changes, and so on
6. **Persisted results**: save to CSV or a database
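
The IC analysis mentioned under factor validity testing could look roughly like the sketch below. It is not part of the code above: `rank_ic` is a hypothetical helper, and the two series are made-up numbers for a single rebalance date. The rank IC is computed here as the Pearson correlation of the ranks, which equals the Spearman correlation and needs only pandas.

```python
import pandas as pd


def rank_ic(factor: pd.Series, forward_return: pd.Series) -> float:
    """Spearman rank correlation between factor values and next-period returns."""
    # Align on the shared index (stock codes) and drop stocks missing either value
    joined = pd.concat([factor, forward_return], axis=1, keys=["factor", "ret"]).dropna()
    if len(joined) < 3:
        return float("nan")  # too few stocks for a meaningful correlation
    # Pearson correlation of the ranks equals the Spearman correlation
    return float(joined["factor"].rank().corr(joined["ret"].rank()))


# Hypothetical cross-section for one rebalance date, indexed by stock code
factor = pd.Series({"sh.600000": 0.9, "sh.600004": 0.4, "sz.000001": 0.1, "sz.000002": -0.3})
fwd_ret = pd.Series({"sh.600000": 0.05, "sh.600004": 0.02, "sz.000001": 0.01, "sz.000002": -0.04})

ic = rank_ic(factor, fwd_ret)
print(f"Rank IC: {ic:.2f}")
```

In a backtest, one would call this once per rebalance date with a standardized factor column and the return realized over the following period, then look at the mean and stability of the resulting IC series before trusting a factor's weight.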