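Background: in the blue-chip ("white horse") rally that began in 2019, my stock strategy earned well. Later, inspired by event-driven strategies, I started from my own stock pool and filtered candidates with the news (feed) index from Baidu Index. For each stock whose index hits a new high, I look up which news item caused the spike on that date and judge by hand whether it is an entry or an exit signal. By my own tally, this raises both returns and win rate considerably.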
Project used:
baidux: (web link)
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
'''
@File    : baidu.py
@Time    : 2021/02/08 23:40:03
@Author  : 野吉他
@Contact : WeChat: wildguitar
'''
from baidux.utils import test_cookies
from baidux import config
from baidux import BaiduIndex, ExtendedBaiduIndex
import pandas as pd
# Configure the browser cookie.
# The tail of this string is cut off in the post (it ends at "bcn=网页链接"); paste your own full cookie here.
cookies = """BIDUPSID=4CC599E65D0C06AD3D1EA4B3C1BDB551; PSTM=1583631035; ab_jid=48fcd4047247d3d1955cad12131f741399f5; ab_jid_BFESS=48fcd4047247d3d1955cad12131f741399f5; MCITY=-:; BAIDUID=A2420F0E7436228C14BEFFCD389D7E3C:FG=1; BDUSS=dzd1lqeHd0T09lWWRxY3ptUmwtOENSQTdWOEEwZXJJb3c1aEVaaUpSSU5sMDFnRVFBQUFBJCQAAAAAAAAAAAEAAABUG5kESGl3YXIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA0KJmANCiZgeX; BDUSS_BFESS=dzd1lqeHd0T09lWWRxY3ptUmwtOENSQTdWOEEwZXJJb3c1aEVaaUpSSU5sMDFnRVFBQUFBJCQAAAAAAAAAAAEAAABUG5kESGl3YXIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA0KJmANCiZgeX; __yjs_duid=1_cddfba1a17ffde390615add5e63982bc1614070318027; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; delPer=0; BAIDUID_BFESS=A2420F0E7436228C14BEFFCD389D7E3C:FG=1; BDRCVFR[feWj1Vr5u3D]=I67x6TjHwwYf0; ZD_ENTRY=baidu; H_PS_PSSID=33512_33272_31660_33594_33570_26350; PSINO=2; BA_HECTOR=012la50g24al0k20so1g4lfem0q; bdindexid=6431a5n5o9lpqqq0npjusov9l3; ab_sr=1.0.0_NzhhYTRiYmRjNGE5ZDk4YjI3ZTNkZWJiNDk0Mjc1NTRjZmJlYzEzNzJhMTUzYzQ3NmI3ODBjNjg2Zjg0NjE4NzRkODVmMWMyNWZlMmY2NTI2MjUwZTllNDM3MTIyNjE4; __yjsv5_shitong=1.0_7_10aef771dca20747b6f825abc3b9ecd21961_300_1615511006125_58.16.225.248_0b8b9be8; RT="z=1&dm=baidu.com&si=0sgnoju4lf5&ss=km5ljcz1&sl=4&tt=2rf&bcn=网页链接"""
# Test whether the cookies are configured correctly:
# True means the configuration works, False means it does not.
print(test_cookies(cookies))
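# Check whether a stock's Baidu news (feed) index has risen abnormally of late
def check_stock_hot_event(stock_pool, start, end):
    """
    param stock_pool: DataFrame of stocks with a 'display_name' column; it is
        turned into a keyword list such as [['新宝股份'], ['广州酒家']]
    param start: start date
    param end: end date
    Fetch the Baidu news (feed) index of every stock over the period (e.g. the
    last 30 days) and return each stock's peak row. A stock whose peak falls on
    the most recent day has hit a new high and is treated as a hot stock.
    """
    keywords = []
    for index, row in stock_pool.iterrows():
        # print(row)
        new_item = [stock_pool.at[index, 'display_name']]
        # Test code: baidux returned no value for these names and raised an error; can be deleted
        if ('博杰股份' in new_item[0]) or ('福莱特' in new_item[0]):
            print("filtered")
        else:
            # append rather than insert(index, ...): the DataFrame index need not be 0..n-1
            keywords.append(new_item)
    feed_index = ExtendedBaiduIndex(
        keywords=keywords,
        start_date=start,
        end_date=end,
        cookies=cookies,
        kind='feed'
    )
    data_source = []
    for record in feed_index.get_index():
        print(record)
        item = {}
        item['date'] = record['date']
        item['index'] = float(record['index'])
        item['keyword'] = record['keyword'][0]
        data_source.append(item)
    df = pd.DataFrame(data_source)
    # Day-over-day ratio of the index within each keyword (1.0 means unchanged)
    df['pct_ch'] = df.groupby('keyword')['index'].pct_change() + 1
    # Keep, for each keyword, the row with the highest index value
    result = df[df.groupby(['keyword'])['index'].rank(method="first", ascending=False) == 1]
    return result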
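The function above returns each keyword's peak row, but the check "did the peak happen on the most recent day?" is left to the caller. A minimal sketch of that check in plain Python follows. It assumes the records have the same shape as the items in data_source and that dates are ISO strings (YYYY-MM-DD), which may not match what baidux actually returns. The helper name latest_day_new_highs, the placeholder names StockA/StockB, and all sample values are made up for illustration.

```python
from collections import defaultdict


def latest_day_new_highs(records):
    """Return one entry per keyword whose latest index value beats every earlier one.

    `records` is a list of dicts with 'keyword', 'date' and 'index' keys.
    Assumption: 'date' is an ISO string, so string order equals date order.
    """
    by_keyword = defaultdict(list)
    for rec in records:
        by_keyword[rec["keyword"]].append((rec["date"], float(rec["index"])))

    hot = []
    for keyword, series in by_keyword.items():
        series.sort()  # oldest first
        latest_date, latest_value = series[-1]
        earlier_max = max((value for _, value in series[:-1]), default=None)
        # A single data point cannot be a "new" high, so it is skipped
        if earlier_max is not None and latest_value > earlier_max:
            hot.append({"keyword": keyword, "date": latest_date, "index": latest_value})
    return hot


# Made-up sample data, for illustration only
sample = [
    {"keyword": "StockA", "date": "2021-03-08", "index": 1200},
    {"keyword": "StockA", "date": "2021-03-09", "index": 1500},
    {"keyword": "StockA", "date": "2021-03-10", "index": 4200},
    {"keyword": "StockB", "date": "2021-03-08", "index": 900},
    {"keyword": "StockB", "date": "2021-03-09", "index": 2100},
    {"keyword": "StockB", "date": "2021-03-10", "index": 800},
]
print(latest_day_new_highs(sample))
```

Only StockA is reported here: its last value is the highest of the window, while StockB peaked a day earlier.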
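For backtesting event-driven stock strategies, two problems still need to be solved:
1. Deciding whether a hot event is good or bad news. The current idea is to train an NLP model on a custom corpus and let it classify the news. Maintaining that corpus is the hard part: which messages count as bearish and which as bullish is a classification rule that needs human intervention.
2. An event backtesting framework. Backtesting is somewhat like the game replay recordings I used to work on. With the Baidu index, the biggest problem is that when a stock's index hits a new high, it also has to be linked to the hot news of that day. Tying the discrete Baidu index data to the correct news data means maintaining a news library per stock; only then is a complete backtest possible.
3. A further idea: monitoring the index of stocks you already hold can serve as an exit condition; feel free to review your own trades against it. I am a stock-market beginner and this is roughly written, so criticism from more experienced readers is welcome. Thanks! The files are attached and ready to use out of the box.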
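The two open problems above (labelling news as good or bad, and tying each index peak to the news of that day) can be sketched together in a few lines. This is only an illustration under loud assumptions: the keyword lexicon stands in for a trained NLP model, and the names NewsStore, label_headline and replay_events, the word lists, and the headlines are all hypothetical. Real headlines would be in Chinese and would need a hand-curated corpus.

```python
from collections import defaultdict

# Hypothetical hand-maintained lexicons; a stand-in for a trained NLP classifier
BULLISH_WORDS = ("profit growth", "contract win", "buyback")
BEARISH_WORDS = ("share sale", "investigation", "loss")


def label_headline(headline):
    """Label a headline 'bullish', 'bearish' or 'neutral' by counting lexicon hits."""
    text = headline.lower()
    score = sum(w in text for w in BULLISH_WORDS) - sum(w in text for w in BEARISH_WORDS)
    if score > 0:
        return "bullish"
    if score < 0:
        return "bearish"
    return "neutral"


class NewsStore:
    """In-memory news library keyed by (keyword, date)."""

    def __init__(self):
        self._items = defaultdict(list)

    def add(self, keyword, date, headline):
        self._items[(keyword, date)].append(headline)

    def on(self, keyword, date):
        # .get avoids creating empty entries for dates without news
        return list(self._items.get((keyword, date), []))


def replay_events(peaks, store):
    """Attach labelled headlines to each index peak, replayed in date order."""
    events = []
    for peak in sorted(peaks, key=lambda p: p["date"]):
        headlines = store.on(peak["keyword"], peak["date"])
        events.append({
            "keyword": peak["keyword"],
            "date": peak["date"],
            "news": [(h, label_headline(h)) for h in headlines],
        })
    return events


# Made-up headlines and peaks, for illustration only
store = NewsStore()
store.add("StockA", "2021-03-10", "StockA forecasts profit growth")
store.add("StockB", "2021-03-09", "StockB shareholder plans share sale")
peaks = [
    {"keyword": "StockA", "date": "2021-03-10", "index": 4200.0},
    {"keyword": "StockB", "date": "2021-03-09", "index": 2100.0},
]
for event in replay_events(peaks, store):
    print(event)
```

A peak with no stored news comes back with an empty news list, which is the case where the backtest cannot be completed until the news library is filled in.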