On Thu, Aug 09, 2012 at 05:01:34PM +0300, Avi Kivity wrote:
On 08/09/2012 04:57 PM, Gerd Hoffmann wrote:
Hi,
+u64 kvm_tsc_khz(void)
+{
- u32 eax, ebx, ecx, edx, msr;
- struct pvclock_vcpu_time_info time;
- u32 addr = (u32)(&time);
- u64 khz;
- /* check presence and figure msr number */
- cpuid(KVM_CPUID_FEATURES, &eax, &ebx, &ecx, &edx);
- if (eax & KVM_FEATURE_CLOCKSOURCE2) {
msr = MSR_KVM_SYSTEM_TIME_NEW;
- } else if (eax & KVM_FEATURE_CLOCKSOURCE) {
msr = MSR_KVM_SYSTEM_TIME;
- } else {
return 0;
- }
- /* ask kvm hypervisor to fill struct */
- memset(&time, 0, sizeof(time));
- wrmsr(msr, addr | 1);
How can this work?
It did in my testing, although maybe by pure luck ...
There is a 64-byte alignment requirement.
64 bytes? Sure? The whole struct is only 32 bytes in size ...
er, the documentation says 4 bytes (so stack alignment works). I distinctly remember having a large alignment requirement so we don't cross a page or slot boundary... something's wrong here.
Easily fixable though, just need to grab some memory with memalign instead of using the stack.
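For illustration, a minimal sketch of the "grab aligned memory" idea. It assumes a hosted C11 environment and uses aligned_alloc as a stand-in for whatever aligned allocator SeaBIOS provides; pvclock_alloc and PVCLOCK_ALIGN are hypothetical names, and 64 is simply the stricter of the two figures mentioned above, not a confirmed requirement. The struct layout follows the KVM pvclock ABI as I understand it (32 bytes).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout of the structure the hypervisor fills in; 32 bytes in total. */
struct pvclock_vcpu_time_info {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    uint8_t  flags;
    uint8_t  pad[2];
};

/* Assumption: use the stricter alignment discussed in this thread. */
#define PVCLOCK_ALIGN 64

/* Hypothetical helper: returns a zeroed, 64-byte-aligned struct, or NULL. */
static struct pvclock_vcpu_time_info *pvclock_alloc(void)
{
    /* aligned_alloc wants the size to be a multiple of the alignment,
     * so allocate one full 64-byte block for the 32-byte struct. */
    struct pvclock_vcpu_time_info *t =
        aligned_alloc(PVCLOCK_ALIGN, PVCLOCK_ALIGN);

    if (t)
        memset(t, 0, sizeof(*t));
    return t;
}
```

A 32-byte struct at a 64-byte-aligned address cannot straddle a page boundary, which is the concern raised above.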
- wrmsr(msr, 0);
- if (time.version < 2 || time.tsc_to_system_mul == 0)
return 0;
- /* go figure tsc frequency */
- khz = pvclock_tsc_khz(&time);
- dprintf(1, "Using kvmclock, msr 0x%x, tsc %d MHz\n",
msr, (u32)khz / 1000);
- return khz;
That's a meaningless number. You can be migrated to a cpu or a machine with very different tsc.
You want accurate time on kvm, don't use the tsc.
SeaBIOS uses the tsc for timeout calculations only, so it doesn't need to be 100% accurate. The order of magnitude should be correct though. The Linux kernel uses the value for delay loops too, so using it for the given purpose can't be *that* horrible after all ...
It is certainly an improvement over the current code which tries to calibrate the tsc and gets totally broken results in case the busy host happens to schedule the guest in the middle of calibration.
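For reference, a rough sketch of the conversion the patch relies on when it calls pvclock_tsc_khz(). It assumes the usual pvclock scaling, ns = ((tsc << tsc_shift) * tsc_to_system_mul) >> 32, with a negative shift meaning a right shift; the function name tsc_khz_from_pvclock is made up for this example.

```c
#include <stdint.h>

/* Derive the TSC frequency in kHz from the pvclock scaling factors.
 * Inverting ns = ((tsc << shift) * mul) >> 32 gives
 * khz = (10^6 << 32) / mul, then undo the shift. */
static uint64_t tsc_khz_from_pvclock(uint32_t tsc_to_system_mul,
                                     int8_t tsc_shift)
{
    uint64_t khz;

    if (tsc_to_system_mul == 0)
        return 0;               /* struct was never filled in */

    khz = (1000000ULL << 32) / tsc_to_system_mul;
    if (tsc_shift < 0)
        khz <<= -tsc_shift;
    else
        khz >>= tsc_shift;
    return khz;
}
```

With tsc_to_system_mul = 0x80000000 and tsc_shift = 0, one TSC tick is half a nanosecond, so the function returns 2000000 (2 GHz). As noted above, that figure is only valid for the cpu the struct was filled in on.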
So what do you suggest? The options I see are:
(1) Use this patch (with the alignment issue fixed, of course).
(2) Do a full kvmclock implementation. Feels a bit like overkill.
(3) SeaBIOS can fall back to the PIT for timing on machines which have no TSC. We could do that too in case we detect kvm ...
What sort of timeouts are these? If seconds, maybe the rtc would be best.
I vote for 3, so nobody has to maintain kvmclock code in SeaBIOS and Gerd can fix the in-kernel PIT issues with GRUB (see Michael's message) while testing.
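To make option 3 a bit more concrete, here is a small sketch of the tick arithmetic a PIT-based timeout would need. This is not SeaBIOS code: the helper names are invented, and it assumes the nominal 1.193182 MHz PIT input clock and a counter that free-runs over the full 16-bit range.

```c
#include <stdint.h>

/* Nominal i8254 PIT input clock in Hz. */
#define PIT_HZ 1193182u

/* Convert a timeout in milliseconds to PIT ticks.
 * The result fits in 32 bits for timeouts up to roughly one hour. */
static uint32_t pit_ticks_from_ms(uint32_t ms)
{
    return (uint32_t)(((uint64_t)ms * PIT_HZ) / 1000);
}

/* Ticks elapsed between two reads of the 16-bit down-counter.
 * Unsigned 16-bit subtraction absorbs a single wraparound, so the
 * caller has to poll more often than once per 65536 ticks (~55 ms). */
static uint16_t pit_elapsed(uint16_t start, uint16_t now)
{
    return (uint16_t)(start - now);
}
```

A timeout loop would then accumulate pit_elapsed() deltas until the total reaches pit_ticks_from_ms(timeout).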