112-Fama-French Three-Factor Model Strategy
Created by bq2qbou2, last edited by small_q
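Strategy Introduction
In the 1990s, economists Eugene Fama and Kenneth French proposed the well-known Fama-French three-factor model, which extends the classic CAPM.
The Fama-French three-factor model uses three factors to explain stock returns:
- Market factor (MKT): the return of the overall market
- Size factor (SMB): the return spread between small-cap and large-cap companies
- Value factor (HML): the return spread between companies with high and low book-to-market ratios
These three factors vary only over time, not across stocks: on any given day, every stock has the same value for each of the three factors.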
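Written as a regression for stock i on day t, the model takes the form below. The coefficient names (alpha, beta, s, h) and the residual symbol are notation introduced here for illustration. The original Fama-French specification uses returns in excess of the risk-free rate, whereas the implementation in this article regresses the raw daily return. The factors carry only the time subscript t, which reflects that they are identical across stocks on a given day.

```latex
r_{i,t} = \alpha_i + \beta_i \,\mathrm{MKT}_t + s_i \,\mathrm{SMB}_t + h_i \,\mathrm{HML}_t + \varepsilon_{i,t}
```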
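The three factors are constructed as follows:
- The market factor is the market-cap-weighted average return of all stocks in the market.
- The size and value factors are constructed as follows:
  - Split all stocks into six groups along two dimensions:
    - By market cap: small (S) and big (B)
    - By book-to-market ratio: high (H), medium (M), and low (L)
  - This yields six groups: SH, SM, SL, BH, BM, and BL. For each group, compute the market-cap-weighted average return within the group.
  - Then use these six values to compute the size factor and the value factor: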
SMB = (1/3) * (SL + SM + SH) - (1/3) * (BL + BM + BH)
HML = (1/2) * (SH + BH) - (1/2) * (SL + BL)
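The construction above can be sketched for a single trading day in plain Python. This is a minimal illustration, not the article's implementation: the dictionary keys `ret`, `cap`, and `bm`, the helper names, and the `pct_rank` definition are all assumptions made here, and the breakpoints (median for size, 30%/70% for book-to-market) follow the thresholds used in the SQL later in the article.

```python
from bisect import bisect_right

def pct_rank(values):
    """Share of observations less than or equal to each value (assumed definition)."""
    ordered = sorted(values)
    return [bisect_right(ordered, v) / len(ordered) for v in values]

def cap_weighted_return(stocks):
    """Market-cap-weighted average return of a group of stocks."""
    total_cap = sum(s["cap"] for s in stocks)
    return sum(s["cap"] * s["ret"] for s in stocks) / total_cap

def ff3_factors(stocks):
    """Return (MKT, SMB, HML) for one day's cross-section.

    Each stock is a dict with hypothetical keys: 'ret' (daily return),
    'cap' (market cap) and 'bm' (book-to-market ratio).
    """
    size_rank = pct_rank([s["cap"] for s in stocks])
    bm_rank = pct_rank([s["bm"] for s in stocks])
    groups = {}
    for stock, rs, rb in zip(stocks, size_rank, bm_rank):
        size = "S" if rs <= 0.5 else "B"                        # median split
        value = "L" if rb <= 0.3 else "H" if rb > 0.7 else "M"  # 30/40/30 split
        groups.setdefault(size + value, []).append(stock)
    # A KeyError below means one of the six groups is empty on this day.
    r = {name: cap_weighted_return(members) for name, members in groups.items()}
    mkt = cap_weighted_return(stocks)
    smb = (r["SL"] + r["SM"] + r["SH"]) / 3 - (r["BL"] + r["BM"] + r["BH"]) / 3
    hml = (r["SH"] + r["BH"]) / 2 - (r["SL"] + r["BL"]) / 2
    return mkt, smb, hml

# Toy cross-section: (return, market cap, book-to-market) for ten stocks.
sample = [
    (0.01, 1, 1), (0.02, 2, 4), (0.03, 3, 8), (0.02, 4, 5), (0.01, 5, 2),
    (0.00, 6, 3), (0.01, 7, 6), (0.02, 8, 9), (0.01, 9, 7), (0.02, 10, 10),
]
stocks = [{"ret": r, "cap": c, "bm": b} for r, c, b in sample]
mkt, smb, hml = ff3_factors(stocks)
print(mkt, smb, hml)
```

Running the same function on every trading day gives one (MKT, SMB, HML) triple per day, which is the time series the regression step needs.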
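Strategy Workflow
We take each stock's return as Y and the three factors as X, run a rolling 120-day time-series regression, and take the residual. In the SQL implementation, the residual is squared, smoothed with a 20-day exponential moving average (EMA), and multiplied by -1; the result serves as the factor of a single-factor strategy. The factor must also go through factor data processing before it is used.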
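The rationale is that the three factors explain most of the sources of stock returns, so the residual that they cannot explain is the stock's excess return, and the strategy aims to capture that part. Because the factor is the negated EMA of the squared residual, a higher factor value corresponds to a stock whose recent unexplained returns were smaller in magnitude.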
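The regression-and-smoothing step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the BigQuant implementation of `m_ols3d_last_resid` or `m_ta_ema`: it assumes the regression includes an intercept, that only the residual of the last day in each window is kept, and that the EMA uses a smoothing weight of 2 / (span + 1) seeded with the first value. All function names here are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def last_residual(y, xs):
    """OLS of y on the series in xs plus an intercept; residual of the last row."""
    rows = [[1.0] + list(vals) for vals in zip(*xs)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return y[-1] - sum(b * v for b, v in zip(beta, rows[-1]))

def rolling_last_residual(y, xs, window):
    """Last-row residual of a regression over each trailing window (None until full)."""
    out = [None] * len(y)
    for t in range(window - 1, len(y)):
        lo = t - window + 1
        out[t] = last_residual(y[lo:t + 1], [x[lo:t + 1] for x in xs])
    return out

def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1), seeded with values[0]."""
    alpha = 2.0 / (span + 1)
    out, prev = [], None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out

def residual_factor(y, mkt, smb, hml, window=120, span=20):
    """Negated EMA of squared residuals, for the days where a residual exists."""
    resid = rolling_last_residual(y, [mkt, smb, hml], window)
    squared = [r * r for r in resid if r is not None]
    return [-v for v in ema(squared, span)]
```

Here `y` is one stock's daily return series and `mkt`, `smb`, `hml` are the factor series over the same dates. The returned list is shorter than the input by `window - 1` entries, because the first full window ends on day `window`.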
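Strategy Implementation
The SQL code that builds these three factors is as follows.
First, we create a temporary table, data_base, that holds all the required data, and then a second temporary table, data1, that holds the return Y.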
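WITH
-- Base table: returns, float market cap, and book equity for every stock
data_base AS (
    SELECT
        date,
        instrument,
        change_ratio,
        float_market_cap,
        total_owner_equity,
    FROM cn_stock_factors
),
-- Dependent variable Y: the stock return
data1 AS (
    SELECT
        date,
        instrument,
        change_ratio,
    FROM data_base
),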
Next, we create a temporary table, data2, to build the MKT factor as the market-cap-weighted average return of all stocks.
data2 AS (
    SELECT
        date,
        instrument,
        -- cross-sectional, float-cap-weighted average return
        c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS MKT
    FROM data_base
),
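Next, we split all stocks into six groups along two dimensions, market cap and the market-to-book ratio, compute each group's market-cap-weighted average return, and then compute the SMB and HML factors. Note that the SQL ranks stocks by float_market_cap / total_owner_equity, which is market-to-book (the inverse of book-to-market), so its H group holds high market-to-book stocks and the resulting HML is roughly the negative of the textbook definition. Flipping the sign of a regressor does not change the regression residual, so the final factor is not affected.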
data3 AS (
    WITH
    -- Assign every stock to a size group and a market-to-book group
    data3_0 AS (
        SELECT
            date,
            instrument,
            change_ratio,
            float_market_cap,
            c_pct_rank(float_market_cap) AS rank_sb,
            -- market cap / book equity, i.e. market-to-book
            c_pct_rank(float_market_cap / total_owner_equity) AS rank_lmh,
            CASE
                WHEN rank_sb < 0.5 THEN 1
                ELSE 2
            END AS group_sb,
            CASE
                WHEN rank_lmh < 0.3 THEN 1
                WHEN rank_lmh > 0.7 THEN 3
                ELSE 2
            END AS group_lmh,
        FROM data_base
    ),
    -- One cap-weighted return per date for each of the six groups
    data3_sl AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS SL
        FROM data3_0
        WHERE group_sb = 1 AND group_lmh = 1
    ),
    data3_sm AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS SM
        FROM data3_0
        WHERE group_sb = 1 AND group_lmh = 2
    ),
    data3_sh AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS SH
        FROM data3_0
        WHERE group_sb = 1 AND group_lmh = 3
    ),
    data3_bl AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS BL
        FROM data3_0
        WHERE group_sb = 2 AND group_lmh = 1
    ),
    data3_bm AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS BM
        FROM data3_0
        WHERE group_sb = 2 AND group_lmh = 2
    ),
    data3_bh AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS BH
        FROM data3_0
        WHERE group_sb = 2 AND group_lmh = 3
    ),
    -- Attach the per-date factor values to every (date, instrument) row
    data3_merge AS (
        SELECT
            data3_0.date,
            data3_0.instrument,
            (1/3) * (SL + SM + SH) - (1/3) * (BL + BM + BH) AS SMB,
            (1/2) * (SH + BH) - (1/2) * (SL + BL) AS HML,
        FROM data3_0
        JOIN data3_sl USING (date)
        JOIN data3_sm USING (date)
        JOIN data3_sh USING (date)
        JOIN data3_bl USING (date)
        JOIN data3_bm USING (date)
        JOIN data3_bh USING (date)
    )
    SELECT * FROM data3_merge
),
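Finally, we join all the data so that Y and X sit in one table, use the m_ols3d_last_resid function to run the time-series regression, and then use the m_ta_ema function to take the exponential moving average, which gives the factor value.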
data_merge AS (
    SELECT
        date,
        instrument,
        change_ratio,
        MKT,
        SMB,
        HML,
        -- negated 20-day EMA of the squared residual from the 120-day regression
        -1 * m_ta_ema(m_ols3d_last_resid(change_ratio, MKT, SMB, HML, 120) ^ 2, 20) AS factor,
    FROM data1 JOIN data2 USING (date, instrument) JOIN data3 USING (date, instrument)
    QUALIFY COLUMNS(*) IS NOT NULL
),
data AS (
    SELECT
        *
    FROM data_merge
)
SELECT *
FROM data
ORDER BY date, instrument
Treat the long SQL above as a string named alpha_sql and pass it into the following factor-processing SQL:
sql = f"""
WITH
data_alpha AS (
    {alpha_sql}
),
-- Drop rows with NULL or infinite factor values
data_alpha_origin AS (
    SELECT *
    FROM data_alpha
    QUALIFY COLUMNS(*) IS NOT NULL AND factor != 'Infinity' AND factor != '-Infinity'
),
data_alpha_process AS (
    SELECT
        date,
        instrument,
        factor,
        -- clip to mean +/- 3 standard deviations, then standardize, then neutralize
        clip(factor, c_avg(factor) - 3 * c_std(factor), c_avg(factor) + 3 * c_std(factor)) AS clipped_factor,
        c_normalize(clipped_factor) AS normalized_factor,
        c_neutralize(normalized_factor, sw2021_level1, LOG(total_market_cap)) AS neutralized_factor,
    FROM data_alpha_origin JOIN cn_stock_factors_base USING (date, instrument)
    WHERE 1=1
    AND amount > 0
    AND st_status = 0
    AND trading_days > 252
    AND (instrument LIKE '%SH' OR instrument LIKE '%SZ')
    QUALIFY COLUMNS(*) IS NOT NULL
    ORDER BY date, instrument
)
SELECT
    date,
    instrument,
    neutralized_factor AS factor
FROM data_alpha_process
ORDER BY date, factor DESC
"""
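Finally, we use the processed factor values to build a single-factor strategy that holds 10 stocks with a 5-day holding period; the backtest runs from 2020-01-01 to the present.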
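To make the path from raw factor to holdings concrete, the processing steps in the SQL (clipping at three standard deviations, standardization, neutralization) and the stock selection can be sketched for one day's cross-section in plain Python. Everything here is an assumption for illustration: the sample standard deviation is used, neutralization is taken to mean the residual of a regression on industry dummies and log market cap (the actual c_normalize and c_neutralize implementations may differ), and choosing the highest factor values is only suggested by the `ORDER BY date, factor DESC` in the SQL.

```python
import math
from statistics import mean, stdev

def clip_3sigma(values):
    """Clip each value to the range mean +/- 3 sample standard deviations."""
    mu, sd = mean(values), stdev(values)
    lo, hi = mu - 3 * sd, mu + 3 * sd
    return [min(max(v, lo), hi) for v in values]

def zscore(values):
    """Standardize to zero mean and unit sample standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def neutralize(values, industries, market_caps):
    """Residual of values regressed on industry dummies and log market cap.

    Needs log market cap to vary within at least one industry, otherwise
    the regression is rank-deficient.
    """
    names = sorted(set(industries))
    rows = [[1.0 if ind == name else 0.0 for name in names] + [math.log(cap)]
            for ind, cap in zip(industries, market_caps)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    beta = solve(xtx, xty)
    return [v - sum(b * x for b, x in zip(beta, r)) for v, r in zip(values, rows)]

def process_factor(values, industries, market_caps):
    """Clip, standardize, then neutralize one day's factor values."""
    return neutralize(zscore(clip_3sigma(values)), industries, market_caps)

def select_top(instruments, factor, n=10):
    """Instruments with the n highest factor values (assumed selection rule)."""
    ranked = sorted(zip(factor, instruments), reverse=True)
    return [name for _, name in ranked[:n]]
```

With a 5-day holding period, one simple reading is to call `select_top` on every fifth trading day and hold the result until the next call; the linked strategy source may schedule rebalancing differently.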
Strategy Source Code
https://bigquant.com/codeshare/9ae9afb7-8dfc-47c6-a9fd-c63ab18fcb41