Xiaoyu's backup (wolai)
Created: September 23, 2023 10:19 AM Last edited by: Pan Wanke Last edited time: December 25, 2023 8:23 AM Owner: xiaoyuzengpsy@gmial.com
Installing HDDM
Installing HDDM via its Docker image is recommended; otherwise you may hit countless bugs at this step.
Python-related
pandas filter data
df[df.name == 'Ben']  # rows where name is 'Ben'
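A minimal runnable sketch of this boolean-mask filter (the toy `name`/`rt` columns are invented for illustration):

```python
import pandas as pd

# Toy data; the 'name' and 'rt' columns are made up for illustration.
df = pd.DataFrame({"name": ["Ben", "Ann", "Ben"], "rt": [0.61, 0.74, 0.55]})

# Boolean-mask filtering: keep only the rows where name == 'Ben'.
ben = df[df.name == "Ben"]
print(ben)
```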
Getting-started demo
How many subjects and how many trials does DDM need? 20–40 subjects + 40–60 trials
@2020madspedersen_RLDDM_simultaneous hierarchical bayesian parameter estimation for reinforcement learning and drift diffusion models a tutorial and links to neural data
!2017_How many trials are required for parameter estimation in DDM
How to run HDDM
0. Install Docker and pull the container image, e.g.
docker pull madslupe/hddm
1. Open Docker and run the HDDM image

2. In a command prompt with administrator privileges, enter one of the following
mads image, Win10:
docker run -it --rm --cpus=6 -v /d/ZengXiaoyu/Jupyterlab/:/home/jovyan/data -p 8888:8888 madslupe/hddm jupyter notebook
hcp4715 image, HDDM 0.9 beta:
docker run -it --rm --cpus=6 -e NB_USER=jovyan -e CHOWN_HOME=yes -e CHOWN_EXTRA_OPTS='-R' -w /home/jovyan/ -v /d/ZengXiaoyu/Jupyterlab:/home/jovyan/hddm -p 8888:8888 hcp4715/hddm:0.9_beta jupyter notebook
How to run simulations: hddm.utils.post_pred_gen
How to check convergence?
http://ski.clps.brown.edu/hddm_docs/howto.html#assess-model-convergence
1. MC error statistic
There is a column called MC error. These values should be small relative to the posterior std (a common rule of thumb: no larger than about 1% of it).
2. Geweke statistic
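Both checks can be sketched on a synthetic trace with plain numpy. This is a hand-rolled batch-means MC-error estimator and the textbook Geweke z-score (real implementations typically use spectral variance estimates), not necessarily the exact computation PyMC/HDDM performs:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a single parameter's posterior trace; a real trace would come
# from the fitted model (e.g., via model.gen_stats()).
trace = rng.normal(1.5, 0.1, size=5000)

def mc_error(x, batches=50):
    """Batch-means estimate of the Monte Carlo error of the posterior mean."""
    means = x[: len(x) // batches * batches].reshape(batches, -1).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(batches)

# Rule of thumb: MC error should be tiny relative to the posterior std.
ratio = mc_error(trace) / trace.std(ddof=1)

def geweke_z(x, first=0.1, last=0.5):
    """Geweke diagnostic: z-score comparing the chain's early vs. late means."""
    a = x[: int(first * len(x))]
    b = x[-int(last * len(x)):]
    return (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

z = geweke_z(trace)
print(ratio, z)  # |z| < 2 is usually taken as consistent with stationarity
```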
Is a negative DIC reasonable? Why? (Yes: DIC is computed from the deviance, -2 * log-likelihood, and for continuous data such as RTs the likelihood density can exceed 1, so the deviance, and with it DIC, can go negative.)
What to do about lack of convergence
In the simplest case you just need to run a longer chain with more burn-in and more thinning. E.g.:
model.sample(10000, burn=5000, thin=5)
This will cause the first 5000 samples to be discarded. Of the remaining 5000 samples only every 5th sample will be saved. Thus, after sampling, our trace will have a length of 1000 samples.
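The burn/thin arithmetic, sketched on a dummy trace:

```python
import numpy as np

# The arithmetic of sample(10000, burn=5000, thin=5) on a stand-in trace:
raw = np.arange(10000)   # pretend these are 10000 MCMC draws
kept = raw[5000::5]      # drop the first 5000, then keep every 5th draw
print(len(kept))         # 1000 samples remain
```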
You might also want to find a good starting point for running your chains. This is commonly achieved by finding the maximum posterior (MAP) via optimization. Before sampling, simply call:
model.find_starting_values()
which will set the starting values to the MAP. Then sample as you would normally. This is a good idea in general.
If that still does not work you might want to consider simplifying your model. Certain parameters are just notoriously slow to converge; especially inter-trial variability parameters. The reason is that often individual subjects do not provide enough information to meaningfully estimate these parameters on a per-subject basis.
One way around this is to not even try to estimate individual subject parameters and instead use only group nodes. This can be achieved via the group_only_nodes keyword argument:
model = hddm.HDDM(data, include=['sv', 'st'], group_only_nodes=['sv', 'st'])

1. more samples, more burn-in, more thinning
2. add inter-trial variability
3. group level only: regressor + inter-trial variability
Fit the poorly converging parameters only at the group level.


model comparison
https://groups.google.com/g/hddm-users/c/qY_tUW737ds
http://ski.clps.brown.edu/hddm_docs/tutorial_post_pred.html
Should the boundary be included? It can be fixed.
Within-subject vs. between-subject designs
https://groups.google.com/g/hddm-users/c/7t9Hc3v-geI
2: Could I use depends_on for my experimental design? I found in many threads that depends_on is clearly meant for between-subject designs rather than within-subject designs. So is it completely unacceptable to use depends_on for a within-subject design?
—— Yes, it can't be used for that.
https://groups.google.com/g/hddm-users/c/MarwmUlHKqs/m/SMiPYXOdi-UJ
accuracy coding vs stimulus coding
RL: accuracy
(here we are using accuracy coding, where 1 means the more rewarding symbol was chosen and 0 the less rewarding); we flip error RTs to be negative.
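A small pandas sketch of this signed-RT convention (column names are hypothetical):

```python
import pandas as pd

# Hypothetical trial data: 'response' is accuracy-coded
# (1 = more rewarding option chosen, 0 = less rewarding).
df = pd.DataFrame({"rt": [0.50, 0.72, 0.63], "response": [1, 0, 1]})

# Flip error RTs to be negative, per the accuracy-coding convention.
df["rt_signed"] = df["rt"].where(df["response"] == 1, -df["rt"])
print(df["rt_signed"].tolist())  # [0.5, -0.72, 0.63]
```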
How to do a PPC (posterior predictive check)
http://ski.clps.brown.edu/hddm_docs/tutorial_post_pred.html
Problem: previously it seemed we couldn't plot, apparently because Jupyter didn't have enough space to render the figures. But if we export the data ourselves it's actually more convenient, and we can also plot a group-level PPC. So the more important task is exporting the PPC data ourselves; the plotting can then be done in either MATLAB or R.
This person seems to have proposed a way to extract the PPC data:
How to do a PPC (posterior predictive check) for a regression HDDM
Hu Chuanpeng's understanding should be correct:
https://groups.google.com/g/hddm-users/c/brFLEQbYKTg/m/vTpa3aZEBAAJ
Answer:
The returned data structure is a pandas DataFrame object with a hierarchical index.
—— like a table in MATLAB
A PPC can also be used for model comparison.
If you fit only at the group level, individual-level PPC is impossible.
https://groups.google.com/g/hddm-users/c/CxbtLY0UxMk/m/8BKLQMF-OU8J
Does fitting only the group level mean treating the data as if it all came from one big subject? Yes.
I also only discovered this difference later. Regarding the group_only_nodes / group_only_regressors arguments:
1. I still haven't found more detailed official documentation for these two arguments...
2. But my understanding is that as long as is_group_only has not been set to false (i.e., it stays at its default of true), HDDM fits a trial-subject-group hierarchical model; the only difference is whether subject-level estimates are also produced.
3. When HDDM convergence is poor, Thomas often suggests not fitting the subject level, but this does not mean setting is_group_only=false; rather, it means putting some parameters into group_only_nodes. Not fitting per-subject parameters often improves the fit.
4. In HDDMRegressor, after setting group_only_regressors, the betas only have a group distribution, but the intercept is still estimated for every subject, i.e., a subject-level random intercept is still fitted, so it is not treating all the data as coming from one subject.
Back to my original question: why do the posterior distributions of some parameters differ between fitting subject-level parameters and fitting only the group level? Perhaps we should first compare the fit of the two models, e.g., their DIC and PPC? It's quite possible that fitting only the group level fits better, but whether and how that would affect the results, I'm not sure.
Notes on PPC data
1. What does the sample column mean? How should it be averaged?
The second level contains the simulated data sets. Since we simulated 500, these will go from 0 to 499, each generated from a different parameter value sampled from the posterior.
—— No: we can't average all 500 samples; even sample 0 of sub0102 doesn't seem to line up, and the negative RTs are gone too.
———— Is the ordering shuffled? Could the regression model's PPC still be broken?
—————— Or is it that only the RT distribution can be plotted, without trial-level predictions?
—————— If so, can response also only be computed as a ratio? (TODO)
—— Where does the sample accuracy of 0.93 come from?
2. Do the epoch counts match up?
3. How were the 5% outliers removed? How do we match the original outliers with the ones in the PPC output?
Hi Chuanpeng! I'd like to ask about doing a PPC with an HDDM regression model. While searching I saw you raised exactly this on the Google group: post_pred_gen should indeed be able to generate PPC data for regression models. But I have two questions. 1. My understanding is that the function draws 500 samples from the posterior and produces PPC data for each sample. If we want to draw the PPC plot ourselves, should we average the RT over the 500 samples for each trial of each subject? But later, using post_pred_stats, I saw that each sample's accuracy differs, some high and some low, so I'm not sure whether averaging over the 500 samples is correct or best... 2. The regression model's PPC data seems to contain only rt and no response? My guess is that negative RTs correspond to option 0 and positive RTs to option 1, which would give the response, but that's pure guesswork and I'd like to confirm it with you.
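One hedged way to handle the simulated data sets, assuming a post_pred_gen-like long format (faked here with random numbers): summarize each simulated data set first, then average the summaries, rather than averaging raw RTs trial-by-trial across samples. The response is recovered from the sign of the sampled RT:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Fake post_pred_gen-like output: 5 simulated data sets ('sample') x 100 trials,
# with signed RTs (negative = error response). Purely illustrative.
frames = []
for s in range(5):
    rt = rng.gamma(2.0, 0.3, size=100) + 0.2
    sign = np.where(rng.random(100) < 0.8, 1, -1)   # ~80% "correct"
    frames.append(pd.DataFrame({"sample": s, "rt_sampled": sign * rt}))
ppc = pd.concat(frames, ignore_index=True)

# Summarize each simulated data set first, then average the summaries.
per_sample = ppc.groupby("sample")["rt_sampled"].agg(
    acc=lambda x: (x > 0).mean(),        # response recovered from the sign of rt
    median_rt=lambda x: x.abs().median(),
)
print(per_sample.mean())
```

This also explains why each sample's accuracy differs: every simulated data set comes from a different posterior draw, so the spread of `acc` across samples is itself informative and would be lost by averaging raw trials.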
Shared problem: the patsy regression parameters can no longer depend on a condition?
For example, in cmot the vselfmoney and vothermoney parameters can no longer be split by dis/ad.
My workaround at the time was to extract the data for the two conditions and fit them separately;
if that works, player1/player2 can be handled the same way.
How to correlate with neural data?
E.g., put EEG power data into HDDM to strengthen HDDM's explanatory and predictive power
http://ski.clps.brown.edu/hddm_docs/tutorial_python.html#loading-data
E.g., put EEG power data into the regression and see which parameter it predicts
A more advanced regression-model tutorial
https://github.com/hddm-devs/hddm/blob/master/docs/source/tutorial_regression_stimcoding.rst
Reminder: a regressor can actually be split by condition; could self money / other money share one group?

How to add an interaction term to the regression
https://groups.google.com/g/hddm-users/c/taD9oirzx3M
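HDDMRegressor takes patsy-style formulas, where `*` expands to main effects plus the interaction (e.g. "v ~ cond * power" expands to cond + power + cond:power). Numerically the interaction column is just the elementwise product, sketched here with invented column names:

```python
import pandas as pd

# Hypothetical design: a binary condition and a continuous covariate.
df = pd.DataFrame({"cond": [0, 0, 1, 1], "power": [0.2, 0.1, 0.4, 0.3]})

# In HDDMRegressor the formula would be written in patsy syntax, e.g.
#   "v ~ cond * power"   (expands to cond + power + cond:power)
# The interaction column patsy builds is the elementwise product:
df["cond_x_power"] = df["cond"] * df["power"]
print(df["cond_x_power"].tolist())  # [0.0, 0.0, 0.4, 0.3]
```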
Hypothesis testing & Bayesian hypothesis testing
http://ski.clps.brown.edu/hddm_docs/howto.html#hypothesis-testing
yuhongbo thinks DDM parameters are not suited to frequentist estimation... you have to use Bayesian statistics...

model recovery
https://github.com/hddm-devs/hddm/blob/master/docs/source/tutorial_regression_stimcoding.rst
https://github.com/hddm-devs/HDDM-paper
What does z > 0.5 mean?
z > 0.5 means bias towards the CORRECT boundary
https://groups.google.com/g/hddm-users/c/aWQsLtBJfTU/m/EuEtdKHjCQAJ
The stim vs accuracy coding distinction can sometimes be confusing and I think this might be the issue.
You would only want to flip the "z" (from z to 1-z for stim1 vs stim2) when using accuracy coding, e.g. if the subject has a bias to respond stim1, you estimate a bias toward stim1 by flipping z (so that z=.7 for stim1 and z=.3 for stim2 both represent a bias toward stim1 if the response column is accuracy coded).
Since you are using stim coding and the response columns are the responses themselves rather than correct/incorrect, you don't want to flip z at all, and you just estimate one z. The link function you would use is just lambda x: x, with no flipping, and you get a single estimate z of the bias toward stim1. You still flip the v, as in your example (b) above. (When using HDDMStimCoding you do this with split_param = v, but since you are doing the regression you can do it in the link function as you specified.)
I suspect that this might be the reason you were having issues with convergence (which is separate from anything to do with the inverse logit). If you flipped before it would mean that when stim1 is present then the bias is to stim1 but when stim2 is present the bias is to stim2, and that doesn’t make sense (ie the subject can’t know the stim before the trial to set the starting point), and instead it should only be the drift that flips.
Does the z bias have to be converted with a formula, and why can the inverse logit function no longer be used in regression models? Answer: no conversion is needed now.
https://groups.google.com/g/hddm-users/c/k8dUBepPyl8/m/8HuUjLOBAAAJ
HDDMRegressor options (google.com)
an important point has come to light about estimating regression models on the starting point, which I believe led to the issue you identified in this thread. (Ultimately you corrected it by choosing a different condition for your baseline, but it turns out that there was an actual problem in the first place, in contrast to what I thought.)
The issue is that your regression used the inverse logit transformation. This is meant to constrain z between 0 and 1, which is sensible, and is what we had used in our tutorial example for stimulus coded regression. But since then, HDDM now applies that transform to the prior for z (for all models with z, including the intercept of regression models), which constrains it. Applying the transform again in the link function leads to a bias (because the invlogit prior on an invlogit transformed variable then forces the intercept to be >0.5).
So, with recent versions of HDDM, one should not use an inverse logit transform on z in the link function for regression models (unless they change the prior in the guts of their version of HDDM). It is sufficient to use the constrained prior on the intercept with the regular linear link function (any extreme values of z that would go out of the [0,1] bounds on the full regression would get rejected by the sampler anyway). This also makes z regression coefficients more comparable to those for other model parameters (which are all usually linear), and easier to interpret the coefficients.
I confirmed this was the culprit for your original case that you shared in this thread, by simply refitting the model to data in which the baseline condition has z<0.5 and using the identity link function for z (ie lambda x:x), and it works properly (recovers z intercepts below and above .5). I also confirmed that the original tutorial code with model recovery on stim-coded regression would fail if run as originally specified (due to the altered prior), but that it works properly when the link function is fixed (in that case still needing to apply the 1-z swap). The upshot is that there is no bug in HDDM itself, but just that this particular tutorial code was outdated - just updated it now on github (the old docs are still on ski, will fix that).
Note, anyone who has performed a regression on starting points before and used invlogit in the link function, if you are concerned you can check your results and if got even a single z<0.5 (after transformed back), for any subject/condition, your results should be ok (ie, it would mean you did it in an earlier version of HDDM without that prior, so that the invlogit was applied just once).
h/t: Ian Krajbich and his lab for first noticing the problem with biased estimates- thanks!
Indeed, z no longer needs to be transformed
Regarding what link function to use for z: in the past, it has been suggested to use an inverse logit but that is now already incorporated in the prior. Therefore, one should instead just use the linear link function (lambda x:x - this just means that the sampler will directly estimate z instead of a transform of it). See this thread. For v, the link function usually is identity / linear also because it is not constrained.
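A quick numpy check of the point above: once the prior already applies the inverse logit, applying it again in the link function squeezes z into roughly (0.5, 0.731), so the intercept can never land below 0.5, which is exactly the bias described in the thread.

```python
import numpy as np

def invlogit(x):
    return 1.0 / (1.0 + np.exp(-x))

# The sampler's latent value is already invlogit-transformed by the prior.
latent = np.linspace(-10, 10, 1001)
z_once = invlogit(latent)             # spans (0, 1): the intended range
z_twice = invlogit(invlogit(latent))  # squeezed into (invlogit(0), invlogit(1)) ~ (0.5, 0.731)

print(z_twice.min(), z_twice.max())   # never below 0.5
```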
How to fix the bias z
https://groups.google.com/g/hddm-users/c/KpBu6mdcj0I/m/oTT2RMpDBAAJ

What exactly is the relationship among the DDM parameters, and do different parameters really affect choice differently?
I went back through the chat log; there seem to be two questions here. 1. What exactly is the relationship among a, z, and v in the DDM? 2. Do a, z, and v really affect choice/RT differently? The first question isn't really ours to answer; it's a question for Ratcliff and the developers of other DDM variants. Challenging it feels like asking too much; if it were me, I would cite a pile of literature showing the DDM is validated. For the second question, I think liang yuanchang's way of responding is good: use simulation to show that a, v, and z affect choice/RT differently; it's solid and intuitive. I used the same idea in an earlier study to dissociate the effects of a and z on choice/RT. Back then I used an R package for the simulation; HDDM should be able to do it too?
I just checked: at the time I used the RWiener package for the simulation, but plenty of packages can do it. You can vary one parameter at a time and look at its effect on the choice ratio and RT. I previously verified that z has a fairly large effect on the choice ratio, but it also affects RT, because when the a priori starting point changes, the time to reach a boundary naturally changes too; a, on the other hand, mainly affects RT rather than the choice ratio.
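A rough numpy sketch of that simulation idea (a hand-rolled Euler-Maruyama random walk, not HDDM's or RWiener's simulator): vary z and a separately with zero drift and compare the choice ratio and RT.

```python
import numpy as np

def simulate_ddm(v, a, z, n=2000, dt=0.001, sigma=1.0, max_t=10.0, seed=0):
    """Vectorized Euler-Maruyama DDM simulation. Returns (choice, rt);
    choice 1 = upper boundary, 0 = lower; rt is NaN for the rare trial
    that has not terminated by max_t."""
    rng = np.random.default_rng(seed)
    x = np.full(n, z * a)            # starting point as a fraction z of boundary a
    rt = np.full(n, np.nan)
    choice = np.zeros(n)
    active = np.ones(n, dtype=bool)
    for step in range(1, int(max_t / dt) + 1):
        x[active] += v * dt + sigma * np.sqrt(dt) * rng.standard_normal(active.sum())
        hit_up = active & (x >= a)
        hit_lo = active & (x <= 0.0)
        choice[hit_up] = 1.0
        rt[hit_up | hit_lo] = step * dt
        active &= ~(hit_up | hit_lo)
        if not active.any():
            break
    return choice, rt

# Vary z with zero drift: the starting point mainly shifts the choice ratio
# (P(upper) is about z when v = 0), though it also shortens RTs to the nearer bound.
c_mid, rt_mid = simulate_ddm(v=0.0, a=2.0, z=0.5)
c_hi, rt_hi = simulate_ddm(v=0.0, a=2.0, z=0.7)

# Vary a with a neutral start: a wider boundary mainly slows responses.
c_wide, rt_wide = simulate_ddm(v=0.0, a=3.0, z=0.5)

print(c_mid.mean(), c_hi.mean())                # ~0.5 vs ~0.7
print(np.nanmean(rt_mid), np.nanmean(rt_wide))  # wider boundary -> longer RTs
```

All parameter values here are arbitrary illustrations; the qualitative pattern (z moves the choice ratio and also RT, a mostly moves RT) matches the observation in the note.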
Bugs
plot error: the series has no node data
plot parameters
worker terminated

- possibly because the data contain NaN
- possibly because the regression model has multicollinear variables; consider dropping one of them
- it can also be that your code itself is wrong, e.g., a misspelled variable name so the variable can't be found in the data
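Two quick pre-fit sanity checks for the first two causes (the column names below are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical regressor table; the column names are made up.
df = pd.DataFrame({
    "rt": [0.5, 0.7, np.nan, 0.6],
    "power": [1.0, 2.0, 3.0, 4.0],
    "power_scaled": [2.0, 4.0, 6.0, 8.0],   # perfectly collinear with 'power'
})

# 1) NaNs anywhere? Sampling can crash or misbehave on missing values.
print(df.isna().any())

# 2) Collinearity among regressors: inspect the correlation matrix.
corr = df[["power", "power_scaled"]].corr()
print(corr)   # r = 1.0 -> drop one of the two columns before fitting
```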
Parallelism & running time
The command line seems to show the elapsed running time

One regression run, 10000 samples, thin=1: about three hours
- We really need to get Docker running on the server; this is too slow
- Parallel runs must have a timer and a progress bar, otherwise there's no way to tell how far along a run is or how much time remains, which makes progress very hard to estimate
30000 samples, thin=4: ten hours?
CMOTs, 10000, burnin=5000, thin=4: 1080.753824 s
CMOTs, 20000, burnin=10000, thin=4: 2562.616598 s, about 0.7 h
CMOTs, 50000, burnin=30000, thin=5: 5870.618478 s
—— Compared 50000 with 30000 discarded against 10000 with 5000 discarded: the slight early autocorrelation in the samples persists either way, i.e., the two are essentially equivalent, and the shorter run is faster.
simple full model, four parameters, 10000 samples, burnin=5000, thin=4: finished in just 107 seconds...
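A minimal pattern for the timing complaint above: wrap each chain in a timer and run the chains from a pool. The `run_chain` body is a placeholder (the sleep stands in for `model.sample(...)`); this sketch uses threads only to stay self-contained, whereas real HDDM chains are CPU-bound and would want separate processes.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_chain(chain_id):
    """Placeholder for one chain; swap the sleep for the actual sampling call."""
    t0 = time.perf_counter()
    time.sleep(0.1)   # placeholder for model.sample(...)
    return chain_id, time.perf_counter() - t0

with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_chain, range(2)))
for cid, secs in results:
    print(f"chain {cid}: {secs:.2f} s")
```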