CrystFEL 0.12.0 xsphere -- CC statistics don't go to zero as a function of resolution

I’ve recently been working on some TR-SFX data taken at Alvra, and was testing different merging options.

I discovered that the CC1/2 or CC* plots for data merged using the xsphere model don’t nicely approach zero as one would expect:

but unity appears normal:

The xsphere data aren’t messed up – in fact, they give slightly better refinement R-factors out to 2.0 Å for these data:

          xsphere   unity
R-work    0.1688    0.1760
R-free    0.1957    0.2061

I would like to understand what is going on here, and whether I should expect any systematic error/mischief in the xsphere data or can proceed with them with confidence. Has anyone seen this before, or does anyone have any insight?

@taw I feel bad bugging you all the time, but: who else would know???


Interesting plots. Now that you mention it, I think I’ve seen this effect myself although I never paid attention to it.

It’s just a fact of modelling that adding degrees of freedom to a fit will always “improve” the figures of merit. I think that’s what’s going on here: the extra degrees of freedom have absorbed some of the difference between the two half-datasets, creating a weak correlation.
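
A toy sketch of that mechanism, under loud assumptions: this is not the xsphere model or anything partialator does internally. Each "crystal" here is pure Gaussian noise (no true signal at all), and the only extra degree of freedom is one hypothetical linear scale factor per crystal, fitted by least squares against a reference. All function names are made up for the illustration. It compares the half-set correlation with no fitting, with scales fitted against a reference merged from all crystals, and with each half fitted only against its own reference.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def merge(crystals):
    """Average every reflection over the given crystals."""
    return [mean(column) for column in zip(*crystals)]


def fit_scales(crystals, reference):
    """Scale each crystal onto the reference: one fitted parameter per crystal."""
    scaled = []
    for obs in crystals:
        k = sum(r * o for r, o in zip(reference, obs)) / sum(o * o for o in obs)
        scaled.append([k * o for o in obs])
    return scaled


def half_set_cc(n_crystals=600, n_refl=300, seed=1):
    """Return (cc_plain, cc_joint, cc_separate) for pure-noise data."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_refl)]
            for _ in range(n_crystals)]
    half_a, half_b = data[0::2], data[1::2]

    # No per-crystal parameters: the halves are independent noise.
    cc_plain = pearson(merge(half_a), merge(half_b))

    # Scales fitted against a reference built from ALL crystals, then split.
    joint = fit_scales(data, merge(data))
    cc_joint = pearson(merge(joint[0::2]), merge(joint[1::2]))

    # Each half fitted only against its own reference.
    cc_separate = pearson(merge(fit_scales(half_a, merge(half_a))),
                          merge(fit_scales(half_b, merge(half_b))))
    return cc_plain, cc_joint, cc_separate


if __name__ == "__main__":
    print("plain %.3f  joint %.3f  separate %.3f" % half_set_cc())
```

In this toy, only the jointly fitted case should show a clearly positive correlation, because every crystal has been pulled towards a reference that contains both halves. How strongly a real merging model does this depends on the model and the data, so the numbers carry no quantitative meaning.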

To test this, you could try manually splitting the stream (needs a bit of script hacking since scripts/alternate-stream doesn’t work on recent data) and running partialator twice for each model (four times in total). I think the correlation will go away.
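
A minimal sketch of such a split, assuming the stream delimits frames with `----- Begin chunk -----` and `----- End chunk -----` lines and that everything before the first chunk is header material needed by both halves. The function name `split_stream` is made up here, and this is not a replacement for scripts/alternate-stream. It alternates over all chunks, including any without indexed crystals, so the two halves may not contain exactly equal numbers of crystals.

```python
CHUNK_START = "----- Begin chunk -----"
CHUNK_END = "----- End chunk -----"


def split_stream(src, out_a, out_b):
    """Copy the header to both outputs, then send whole chunks alternately
    to out_a and out_b. Returns the number of chunks written to each."""
    outputs = (out_a, out_b)
    counts = [0, 0]
    current = None        # lines of the chunk being read, None outside a chunk
    seen_chunk = False

    for line in src:
        marker = line.strip()
        if marker == CHUNK_START:
            current = [line]
            seen_chunk = True
        elif current is not None:
            current.append(line)
            if marker == CHUNK_END:
                target = (counts[0] + counts[1]) % 2
                outputs[target].writelines(current)
                counts[target] += 1
                current = None
        elif not seen_chunk:
            # Header lines (geometry, cell, command line) go to both halves.
            out_a.write(line)
            out_b.write(line)
        # Anything between chunks after the header is dropped, as is an
        # incomplete chunk at the end of the file.

    return tuple(counts)
```

For real data, the three arguments would be file handles from `open()`, for example an input stream opened for reading and two new files opened for writing. Each half would then go through its own partialator run per model, and the resulting pairs of reflection lists would be compared.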

@taw indeed you seem to be right: see below.

Your account makes sense as well, thanks! Basically at low resolution, xsphere might be able to overfit to noise (if I understood correctly).

I'm not sure whether that overfitting would have any significant negative effect on the data. It makes me a little nervous, but on the other hand the maps and R-factors look good.