BigQuant Documentation

112 - Fama-French Three-Factor Model Strategy

Created by bq2qbou2, last modified by small_q; viewed by 30 users

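Strategy Introduction

In the 1990s, economists Eugene Fama and Kenneth French proposed the well-known Fama-French three-factor model, which extends the classic CAPM.

The Fama-French three-factor model uses three factors to explain stock returns:

  • Market factor (MKT): the return of the market as a whole
  • Size factor (SMB): the return spread between small-cap and large-cap companies
  • Value factor (HML): the return spread between companies with a high book-to-market ratio and companies with a low book-to-market ratio

These three factors vary only over time and not across the cross-section: on any given day, every stock has the same three factor values.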

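The three factors are constructed as follows:

  1. The market factor is the market-cap-weighted average return of all stocks in the market.
  2. The size and value factors are built as follows:
  • Sort all stocks along two dimensions into six groups:
     • By market cap: small (S) and big (B)
     • By book-to-market ratio: high (H), medium (M), and low (L)
  • This gives six groups: SH, SM, SL, BH, BM, BL. For each group we compute the market-cap-weighted average return.
  • These six values are then used to compute the size factor and the value factor: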

SMB = (1/3) * (SL + SM + SH) - (1/3) * (BL + BM + BH)

HML = (1/2) * (SH + BH) - (1/2) * (SL + BL)
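
As a rough illustration of the two formulas above, the sketch below runs the 2×3 sort for a single trading day with pandas. The column names `ret`, `cap`, and `book`, the helper `six_portfolio_factors`, the toy numbers, and the breakpoint handling (percentile rank at most 0.5 for S, at most 0.3 for L, above 0.7 for H) are assumptions made for this example, not BigQuant's implementation.

```python
import numpy as np
import pandas as pd


def six_portfolio_factors(day: pd.DataFrame) -> tuple[float, float]:
    """Return (SMB, HML) for one trading day.

    `day` needs the hypothetical columns ret (daily return), cap (market cap)
    and book (book equity). Raises KeyError if any of the six groups is empty.
    """
    size_rank = day["cap"].rank(pct=True)
    bm_rank = (day["book"] / day["cap"]).rank(pct=True)  # book-to-market

    size = np.where(size_rank <= 0.5, "S", "B")
    value = np.select([bm_rank <= 0.3, bm_rank > 0.7], ["L", "H"], default="M")
    label = pd.Series(np.char.add(size, value), index=day.index)

    # Cap-weighted average return inside each of the six groups.
    weighted = (day["cap"] * day["ret"]).groupby(label).sum()
    total_cap = day["cap"].groupby(label).sum()
    r = weighted / total_cap

    smb = (r["SL"] + r["SM"] + r["SH"]) / 3 - (r["BL"] + r["BM"] + r["BH"]) / 3
    hml = (r["SH"] + r["BH"]) / 2 - (r["SL"] + r["BL"]) / 2
    return smb, hml


# Toy cross-section of 12 stocks (made-up numbers).
cap = np.arange(1.0, 13.0)
bm = np.array([0.1, 0.5, 0.9, 0.2, 0.6, 1.0, 0.3, 0.7, 1.1, 0.4, 0.8, 1.2])
ret = np.array([0.01, 0.02, 0.03, 0.01, 0.02, 0.03,
                -0.01, 0.00, 0.02, 0.00, 0.00, 0.02])
day = pd.DataFrame({"ret": ret, "cap": cap, "book": bm * cap})

smb, hml = six_portfolio_factors(day)
print(smb, hml)
```

Every stock within a group was given the same return here, so each group return is simply that number: SMB works out to 0.05 / 3 and HML to 0.025.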

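Strategy Workflow

We regress each stock's daily return (Y) on the three factors (X) over a rolling 120-day window and keep the residual. A 20-day exponentially weighted moving average of that residual then serves as the factor for a single-factor strategy (the SQL implementation below smooths the squared residual and flips the sign). The factor goes through the usual factor preprocessing before it is used.

The rationale is that the three FF factors explain most of the sources of stock returns, so the residual they cannot explain is treated as the stock's excess return, and the strategy tries to capture that part.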
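
The regression-then-smoothing step can be sketched in NumPy as follows. This is an illustration under assumptions: the regression includes an intercept, only the residual of the last day in each window is kept, and the EMA uses `alpha = 2 / (span + 1)` seeded with the first available value. The actual behavior of `m_ols3d_last_resid` and `m_ta_ema` may differ, and the helper names here are hypothetical.

```python
import numpy as np


def last_residual(y: np.ndarray, X: np.ndarray) -> float:
    """Fit y ~ 1 + X by OLS and return the residual of the final row."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(y[-1] - A[-1] @ beta)


def rolling_last_residual(y: np.ndarray, X: np.ndarray, window: int) -> np.ndarray:
    """Residual of the most recent day for every trailing window; NaN before that."""
    out = np.full(len(y), np.nan)
    for t in range(window - 1, len(y)):
        rows = slice(t - window + 1, t + 1)
        out[t] = last_residual(y[rows], X[rows])
    return out


def ema(x: np.ndarray, span: int) -> np.ndarray:
    """Exponential moving average that skips leading NaNs."""
    alpha = 2.0 / (span + 1)
    out = np.full(len(x), np.nan)
    prev = np.nan
    for i, v in enumerate(x):
        if np.isnan(v):
            continue
        prev = v if np.isnan(prev) else alpha * v + (1 - alpha) * prev
        out[i] = prev
    return out


# Synthetic check: returns that are an exact linear function of the factors.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))                   # stand-ins for MKT, SMB, HML
y = 0.001 + X @ np.array([1.0, 0.5, -0.3])      # no unexplained part
resid = rolling_last_residual(y, X, window=120)
factor = -ema(resid ** 2, span=20)              # same shape as the SQL expression
```

Because the synthetic returns are fully explained by the three regressors, the residuals are zero up to floating-point error, which makes the sketch easy to sanity-check before feeding it real data.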

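Strategy Implementation

The SQL that builds the three factors is shown below.

First, we create a temporary table, data_base, holding all the required inputs, and then a second temporary table, data1, holding the return Y.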

WITH 
data_base AS (
    SELECT
        date,
        instrument,
        change_ratio,
        float_market_cap,
        total_owner_equity,
    FROM cn_stock_factors
),
data1 AS ( 
    SELECT 
        date, 
        instrument, 
        change_ratio, 
    FROM data_base 
), 

Next, we create a temporary table, data2, that builds the MKT factor as the market-cap-weighted average return of all stocks in the market.

data2 AS ( 
    SELECT 
        date, 
        instrument,
        c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS MKT
    FROM data_base 
), 
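
The data2 step can be mirrored in pandas to see what the cross-sectional aggregation produces. The sketch assumes that `c_sum` sums over all stocks on the same date (an assumption about BigQuant's `c_` prefix, not something stated above); the tickers `A` and `B` and their numbers are made up.

```python
import pandas as pd

panel = pd.DataFrame({
    "date": ["2024-01-02", "2024-01-02", "2024-01-03", "2024-01-03"],
    "instrument": ["A", "B", "A", "B"],          # hypothetical tickers
    "change_ratio": [0.01, 0.03, -0.02, 0.02],
    "float_market_cap": [100.0, 300.0, 100.0, 300.0],
})

# Per-date sums, broadcast back to every row of that date.
weighted = panel["float_market_cap"] * panel["change_ratio"]
cap_sum = panel.groupby("date")["float_market_cap"].transform("sum")
panel["MKT"] = weighted.groupby(panel["date"]).transform("sum") / cap_sum

print(panel[["date", "instrument", "MKT"]])
```

Both rows of a date receive the same MKT value (0.025 on the first day, 0.01 on the second), which is the time-series-only property described in the introduction.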

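Next, we sort all stocks into six groups along two dimensions, market cap and book-to-market ratio, compute each group's market-cap-weighted average return, and then derive the SMB and HML factor values.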

data3 AS (
    WITH 
    data3_0 AS (
        SELECT
            date,
            instrument,
            change_ratio,
            float_market_cap,
            c_pct_rank(float_market_cap)                       AS rank_sb,
            -- book-to-market (equity / market cap), so a high rank means H as in the text
            c_pct_rank(total_owner_equity / float_market_cap)  AS rank_lmh,
            CASE
                WHEN rank_sb  < 0.5 THEN 1
                ELSE 2
            END AS group_sb,
            CASE
                WHEN rank_lmh < 0.3 THEN 1
                WHEN rank_lmh > 0.7 THEN 3
                ELSE 2
            END AS group_lmh,
        FROM data_base
    ),
    data3_sl AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS SL
        FROM data3_0
        WHERE group_sb = 1 AND group_lmh = 1
    ),
    data3_sm AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS SM
        FROM data3_0
        WHERE group_sb = 1 AND group_lmh = 2
    ),
    data3_sh AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS SH
        FROM data3_0
        WHERE group_sb = 1 AND group_lmh = 3
    ),
    data3_bl AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS BL
        FROM data3_0
        WHERE group_sb = 2 AND group_lmh = 1
    ),
    data3_bm AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS BM
        FROM data3_0
        WHERE group_sb = 2 AND group_lmh = 2
    ),
    data3_bh AS (
        SELECT DISTINCT
            date,
            c_sum(float_market_cap * change_ratio) / c_sum(float_market_cap) AS BH
        FROM data3_0
        WHERE group_sb = 2 AND group_lmh = 3
    ),
    data3_merge AS (
        SELECT 
            data3_0.date,
            data3_0.instrument,
            (1/3) * (SL + SM + SH) - (1/3) * (BL + BM + BH) AS SMB,
            (1/2) * (SH + BH)      - (1/2) * (SL + BL)      AS HML,
        FROM data3_0
        JOIN data3_sl USING (date)
        JOIN data3_sm USING (date)
        JOIN data3_sh USING (date)
        JOIN data3_bl USING (date)
        JOIN data3_bm USING (date)
        JOIN data3_bh USING (date)
    )
    SELECT * FROM data3_merge
),

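Finally, we join all the data so that Y and X sit in a single table, run the rolling 120-day time-series regression with the m_ols3d_last_resid function, and apply the m_ta_ema function to take a 20-day exponentially weighted moving average of the squared residual; the result, multiplied by -1, is the factor value.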

data_merge AS ( 
    SELECT 
        date, 
        instrument, 
        change_ratio,
        MKT,
        SMB, 
        HML, 
        -1 * m_ta_ema(m_ols3d_last_resid(change_ratio, MKT, SMB, HML, 120) ^ 2, 20) AS factor,
    FROM data1 JOIN data2 USING (date, instrument) JOIN data3 USING (date, instrument)
    QUALIFY COLUMNS(*) IS NOT NULL
),
data AS (
    SELECT 
        *
    FROM data_merge
)

SELECT *
FROM data 
ORDER BY date, instrument

Treat the long SQL above as a string named alpha_sql and pass it into the following factor-processing SQL:

sql = f"""
    WITH
    data_alpha AS (
        {alpha_sql}
    ),
    data_alpha_origin AS (
        SELECT *
        FROM data_alpha
        QUALIFY COLUMNS(*) IS NOT NULL AND factor != 'Infinity' AND factor != '-Infinity'
    ),
    data_alpha_process AS (
        SELECT 
            date,
            instrument,
            factor,
            clip(factor, c_avg(factor) - 3 * c_std(factor), c_avg(factor) + 3 * c_std(factor)) AS clipped_factor,
            c_normalize(clipped_factor) AS normalized_factor,
            c_neutralize(normalized_factor, sw2021_level1, LOG(total_market_cap)) AS neutralized_factor,
        FROM data_alpha_origin JOIN cn_stock_factors_base USING (date, instrument)
        WHERE 1=1
        AND amount > 0
        AND st_status = 0
        AND trading_days > 252
        AND (instrument LIKE '%SH' OR instrument LIKE '%SZ')
        QUALIFY COLUMNS(*) IS NOT NULL
        ORDER BY date, instrument
    )
    SELECT 
        date, 
        instrument, 
        neutralized_factor AS factor 
    FROM data_alpha_process 
    ORDER BY date, factor DESC

"""
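
To make the three processing steps in the SQL above concrete (3-sigma clipping, standardization, neutralization), here is a single-date sketch in pandas and NumPy. It assumes that neutralization means taking the OLS residual on industry dummies plus log market cap and that the standard deviation is the sample one; whether `c_neutralize`, `c_normalize`, and `c_std` work exactly this way is not confirmed by the text. The function name and toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def process_cross_section(factor: pd.Series, industry: pd.Series,
                          market_cap: pd.Series) -> pd.Series:
    """Clip at 3 sigma, z-score, then remove industry and size exposure."""
    mu, sigma = factor.mean(), factor.std()
    clipped = factor.clip(mu - 3 * sigma, mu + 3 * sigma)
    z = (clipped - clipped.mean()) / clipped.std()

    # Regress on one dummy per industry (this spans the intercept) and log cap.
    dummies = pd.get_dummies(industry, dtype=float).to_numpy()
    A = np.column_stack([dummies, np.log(market_cap.to_numpy())])
    beta, *_ = np.linalg.lstsq(A, z.to_numpy(), rcond=None)
    return pd.Series(z.to_numpy() - A @ beta, index=factor.index)


# Toy cross-section for one date (made-up data).
rng = np.random.default_rng(0)
n = 60
factor = pd.Series(rng.normal(size=n))
factor.iloc[0] = 25.0                                   # deliberate outlier
industry = pd.Series(np.repeat(["bank", "tech", "energy"], n // 3))
market_cap = pd.Series(np.exp(rng.normal(loc=23.0, scale=1.0, size=n)))

processed = process_cross_section(factor, industry, market_cap)
```

After this step the processed factor has zero mean within every industry and zero linear exposure to log market cap, so the ranking in the final SELECT is not driven by sector or size.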

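Finally, we use the processed factor values to run a single-factor strategy that holds 10 stocks with a 5-day holding period, backtested from 2020-01-01 to the present.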
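
One possible reading of "10 stocks, 5-day holding period" is sketched below: every fifth trading day, pick the 10 instruments with the highest factor value and hold them until the next rebalance. This is only an illustration with a hypothetical helper and toy data; the linked strategy source may rebalance differently (for example, rolling a fraction of the portfolio each day), and no weighting or trading costs are modeled here.

```python
import pandas as pd


def rebalance_picks(factors: pd.DataFrame, top_n: int = 10,
                    holding_days: int = 5) -> dict:
    """Map each rebalance date to the top_n instruments ranked by factor."""
    trading_days = pd.Index(factors["date"].unique()).sort_values()
    picks = {}
    for day in trading_days[::holding_days]:
        today = factors[factors["date"] == day]
        picks[day] = today.nlargest(top_n, "factor")["instrument"].tolist()
    return picks


# Toy factor table: 15 made-up instruments over 12 business days.
dates = pd.bdate_range("2020-01-01", periods=12)
toy = pd.DataFrame(
    [(d, f"S{i:02d}", i + 0.01 * j) for j, d in enumerate(dates) for i in range(15)],
    columns=["date", "instrument", "factor"],
)

picks = rebalance_picks(toy)
for day, names in picks.items():
    print(day.date(), names)
```

With 12 trading days the sketch rebalances three times, and each time it selects S14 down to S05 because the toy factor grows with the instrument index.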

Strategy Source Code

https://bigquant.com/codeshare/9ae9afb7-8dfc-47c6-a9fd-c63ab18fcb41

Tags

Value factor, stock returns