[SeaBIOS] [PATCH] tsc: use kvmclock for calibration

Marcelo Tosatti mtosatti at redhat.com
Thu Aug 9 21:09:13 CEST 2012


On Thu, Aug 09, 2012 at 05:01:34PM +0300, Avi Kivity wrote:
> On 08/09/2012 04:57 PM, Gerd Hoffmann wrote:
> >   Hi,
> > 
> >>> +u64 kvm_tsc_khz(void)
> >>> +{
> >>> +    u32 eax, ebx, ecx, edx, msr;
> >>> +    struct pvclock_vcpu_time_info time;
> >>> +    u32 addr = (u32)(&time);
> >>> +    u64 khz;
> >>> +
> >>> +    /* check presence and figure msr number */
> >>> +    cpuid(KVM_CPUID_FEATURES, &eax, &ebx, &ecx, &edx);
> >>> +    if (eax & KVM_FEATURE_CLOCKSOURCE2) {
> >>> +        msr = MSR_KVM_SYSTEM_TIME_NEW;
> >>> +    } else if (eax & KVM_FEATURE_CLOCKSOURCE) {
> >>> +        msr = MSR_KVM_SYSTEM_TIME;
> >>> +    } else {
> >>> +        return 0;
> >>> +    }
> >>> +
> >>> +    /* ask kvm hypervisor to fill struct */
> >>> +    memset(&time, 0, sizeof(time));
> >>> +    wrmsr(msr, addr | 1);
> >> 
> >> How can this work?
> > 
> > It did in my testing, although maybe by pure luck ...
> > 
> >> There is a 64-byte alignment requirement.
> > 
> > 64 bytes?  Sure?  The whole struct is only 32 bytes in size ...
> 
> er, the documentation says 4 bytes (so stack alignment works).  I
> distinctly remember having a large alignment requirement so we don't
> cross a page or slot boundary... something's wrong here.
> 
> > 
> > Easily fixable though, just need to grab some memory with memalign
> > instead of using the stack.
> 
> > 
> >>> +    wrmsr(msr, 0);
> >>> +    if (time.version < 2 || time.tsc_to_system_mul == 0)
> >>> +        return 0;
> >>> +
> >>> +    /* go figure tsc frequency */
> >>> +    khz = pvclock_tsc_khz(&time);
> >>> +    dprintf(1, "Using kvmclock, msr 0x%x, tsc %d MHz\n",
> >>> +            msr, (u32)khz / 1000);
> >>> +    return khz;
> >> 
> >> That's a meaningless number.  You can be migrated to a cpu or a machine
> >> with very different tsc.
> > 
> >> You want accurate time on kvm, don't use the tsc.
> > 
> > seabios uses the tsc for timeout calculations only, so it doesn't need
> > to be 100% accurate.  The order of magnitude should be correct though.
> > The Linux kernel uses the value for delay loops too, so using it for the
> > given purpose can't be *that* horrible after all ...
> > 
> > It is certainly an improvement over the current code which tries to
> > calibrate the tsc and gets totally broken results in case the busy host
> > happens to schedule the guest in the middle of calibration.
> > 
> > So what do you suggest?  The options I see are:
> > 
> >   (1) Use this patch (with alignment issue fixed of course).
> >   (2) Do a full kvmclock implementation.  Feels a bit like overkill.
> > >   (3) SeaBIOS can fall back to the PIT for timing on machines which
> >       have no TSC.  We could do that too in case we detect kvm ...
> 
> What sort of timeouts are these?  If seconds, maybe the rtc would be best.

I vote for (3), so nobody has to maintain kvmclock code in SeaBIOS, and Gerd
can fix the in-kernel PIT issues with GRUB (see Michael's message) while testing.
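
For reference, the frequency derivation that option (1) relies on can be
sketched roughly as below. This is a minimal sketch modeled on how the Linux
kernel derives a tsc frequency from the kvmclock scaling fields, not the code
from the patch: the struct is a cut-down stand-in carrying only
tsc_to_system_mul and tsc_shift, and the names pvclock_scale_sketch and
sketch_tsc_khz are made up for illustration.

```c
#include <stdint.h>

/* Illustrative subset of the kvmclock time info: only the two scaling
 * fields are kept.  This is NOT the full pvclock_vcpu_time_info layout. */
struct pvclock_scale_sketch {
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;          /* shift applied to the tsc delta first */
};

/* The hypervisor defines ns = ((tsc shifted by tsc_shift) * mul) >> 32.
 * Inverting that gives tsc ticks per millisecond, i.e. kHz.
 * Returns 0 when the fields are unusable (assumed error convention). */
static uint64_t sketch_tsc_khz(const struct pvclock_scale_sketch *t)
{
    uint64_t khz;

    if (t->tsc_to_system_mul == 0 || t->tsc_shift <= -32 || t->tsc_shift >= 32)
        return 0;

    khz = (1000000ULL << 32) / t->tsc_to_system_mul;
    if (t->tsc_shift < 0)
        khz <<= -t->tsc_shift;   /* tsc was shifted right: real rate is higher */
    else
        khz >>= t->tsc_shift;    /* tsc was shifted left: real rate is lower */
    return khz;
}
```

With tsc_to_system_mul = 0x80000000 and tsc_shift = 0 this yields 2000000 kHz
(2 GHz). As Avi notes, the value only describes the host the guest is running
on at the time the struct was filled.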



