Mr OlliW,
It looks to me like you are one who does not give up in pursuit of answers. This is familiar to me, as I tend to work the same way. The first thing you need to know is that there are no hard specifications. When you ask how much deviation is allowed, the answer is simply "the less, the better." Perhaps not what you want to hear, but it is the truth.
Watching your video, I'm fairly certain you do not have the GPS reception necessary to perform a tight position hold. This shows up first as vertical position uncertainty. And since vertical position is mostly determined by satellites at a perpendicular angle to your local vertical (that is, low toward the horizon), positioning yourself between two houses, which block exactly that part of the sky, only works against you.
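To put rough numbers on the geometry argument, here is a minimal sketch that computes horizontal and vertical dilution of precision (HDOP, VDOP) from satellite elevation and azimuth using the standard DOP formulation. The satellite positions and the 40 degree elevation mask standing in for "between two houses" are made-up assumptions for illustration, not measurements from your site.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    size = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(m)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate satellite geometry")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(size):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]

def dop(sats):
    """sats: list of (elevation_deg, azimuth_deg). Returns (hdop, vdop)."""
    rows = []
    for el_deg, az_deg in sats:
        el, az = math.radians(el_deg), math.radians(az_deg)
        # Line-of-sight unit vector (east, north, up) plus the clock-bias column.
        rows.append([math.cos(el) * math.sin(az),
                     math.cos(el) * math.cos(az),
                     math.sin(el),
                     1.0])
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(4)]
              for i in range(4)]
    q = invert(normal)
    return math.sqrt(q[0][0] + q[1][1]), math.sqrt(q[2][2])

# Hypothetical constellation: four low satellites, four mid, one near zenith.
OPEN_SKY = [(10, 0), (10, 90), (10, 180), (10, 270),
            (50, 45), (50, 135), (50, 225), (50, 315),
            (85, 0)]
# Hypothetical obstruction: everything below 40 degrees elevation is blocked.
MASKED = [s for s in OPEN_SKY if s[0] >= 40]

print("open sky HDOP, VDOP:", dop(OPEN_SKY))
print("masked   HDOP, VDOP:", dop(MASKED))
```

With only high-elevation satellites left, the "up" column of the geometry matrix becomes nearly indistinguishable from the clock-bias column, so the VDOP grows sharply in this example.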
Next, the antenna that you are using is not going to be anywhere near as effective as an active ceramic patch. I know that for a small machine, carrying such a burden is a problem, but it is necessary for any sort of precision. I'm not saying that you cannot get the dipole antenna to work to some degree, but it is far from ideal. A 35 mm active ceramic patch antenna with the proper ground plane will allow an M4 FC to perform to 10 cm precision. Yes, the antenna alone would weigh more than the entire gross weight of the craft, and that is exactly the problem. I've used 10 mm active ceramic patches with fairly good results, but far from perfect.
Then, to truly diagnose the situation, you are going to need a log of the flight. Due to its small size, the M4 does not have an onboard uSD interface, so you would need to attach an expansion board to get one. The logged data should paint a picture that shows exactly what is going on. You show a <0.5 m HACC while static (which I believe), but I don't know what is being reported during the actual flight.
For the altitude hold, there are now two modes, switched between automatically depending on the reported GPS accuracy. The primary mode is dominated by the GPS's reported vertical velocities. The second is a backup used when there is "far less than ideal" GPS data; it relies entirely on the barometer, ACCs and estimated attitude, and is mainly used indoors.
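The switching described above can be pictured with a minimal sketch like the following. This is not the AQ firmware's code: the names, the choice of the vertical accuracy estimate as the deciding value, and the 1.5 m threshold are all assumptions made up for illustration.

```python
from enum import Enum

class AltHoldMode(Enum):
    GPS_VELOCITY = "gps_velocity"    # primary: dominated by GPS vertical velocity
    BARO_INERTIAL = "baro_inertial"  # backup: barometer, ACCs, estimated attitude

# Hypothetical threshold; the real firmware's criterion is not stated here.
VACC_LIMIT_M = 1.5

def select_alt_hold_mode(gps_vacc_m):
    """Pick a mode from the GPS-reported vertical accuracy in metres.

    Pass None when there is no usable GPS fix at all (e.g. indoors).
    """
    if gps_vacc_m is not None and gps_vacc_m <= VACC_LIMIT_M:
        return AltHoldMode.GPS_VELOCITY
    return AltHoldMode.BARO_INERTIAL

# Example: a good outdoor fix, a degraded fix, and no fix.
for vacc in (0.8, 5.0, None):
    print(vacc, select_alt_hold_mode(vacc).value)
```

The point is only that a degraded accuracy report quietly drops you into the backup mode, which is why the in-flight log matters more than the static reading.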
As to the questions about the MAG sensors: if you have run through the onboard calibration routines, your MAGs are probably calibrated well enough to resolve to better than 5 degrees from actual. If you have a good GPS signal, the system will refine this as time goes on. In your video, I do not see any evidence that your estimated heading is causing you problems.
Let's all keep in mind that the M4 in a LadyBird configuration is less than 40 grams. It is amazing to me that such a setup is able to have any autonomous capabilities, much less precision. Please do not let this tarnish the true strength of the AQ firmware, which, with proper configuration, can perform incredibly well, even with the tiny M4 FC. The M4 in a native configuration is extremely fun to fly, but other than the novelty of having an autonomous mode, it is really just a toy, because the necessary antennas and shielding simply weigh too much. The idea is that this "toy" is able to be upgraded and used in a serious application without compromise.