Hypervisor Hunt

After getting burnt by Hyper-V, I decided to go for the tried and true and installed VMware Player on Windows 10. I had to install Ubuntu again on the new virtual machine, but it was a breeze thanks to VMware’s automated installation process. Everything that was missing in Hyper-V was there. I was able to use my laptop’s real resolution, networking over Wi-Fi was done automatically, audio magically started working, and even performance was noticeably better.

After a few weeks of heavy usage, I started noticing some problems with VMware. Every once in a while the guest OS would freeze for about a second. I didn’t pay too much attention to it at first, but it slowly started to wear on me. I eventually realized it always happened when I used tab completion in the shell, and that the real cause was sound playback. It’s still progress over Hyper-V’s inability to play any audio, but it was not exactly a pleasant experience.

The other, far more severe issue was a general lack of performance. It just didn’t feel like I was running Ubuntu on hardware, or even close to it. I experienced constant lag while typing, alt+tab took about a second to show up, compiling code was weirdly slow, video playback was unusable, and everything was just generally sluggish and unresponsive. Overall it was usable, but far from ideal.

Today I finally broke down and decided to give yet another hypervisor a shot. Next up came VirtualBox. I didn’t have high expectations, but VMware was starting to slow me down so I had to try something. Installation was even easier since VirtualBox can just use VMware images. Then came the pleasant surprise. Straight out of the box performance was noticeably better. Windows moved without lagging, alt+tab reaction was instantaneous, and sound playback just worked. Once I installed the guest additions and enabled video acceleration, video playback started functioning too. I still can’t play 4K videos, but at least my laptop doesn’t crawl to a halt on every video ad.

As a cherry on top, VirtualBox was also able to properly set the resolution on the guest OS at boot time. In VMware, I had to leave and enter full screen once after login for the real resolution to stick. Switching inputs between guest and host in VirtualBox is also easier. It requires just one key (right ctrl) as opposed to two with VMware (left ctrl+alt).

I realize these results depend on many things like hardware, drivers, host/guest versions, etc. I bet I could also solve some of these issues if I put some research into it. But for running Ubuntu 17.04 desktop on my Windows 10 Dell XPS 13 with the least hassle, VirtualBox is the clear winner. Let me know if you had a different experience or know how to make it run even smoother.

Things They Don’t Tell You About Hyper-V

I really wanted to like Hyper-V. It’s fully integrated into Windows and runs bare metal, so I was expecting stellar performance and a smooth experience. I was going to run a Linux box for some projects, get to work with Docker for Windows, and do it all with good power management, smooth transitions and without sacrificing performance.

And then reality hit.

  1. Hyper-V doesn’t support resolutions higher than 1920×1080 with Linux guests. And even that is only adjustable by editing grub configuration which requires a reboot. The viewer allows zooming, but not in full screen mode. With a laptop resolution of 3200×1800, that leaves me with a half empty screen or a small window on the desktop.
  2. Networking support is mostly manual, especially when Wi-Fi is involved. You have to drop into PowerShell to manually configure vSwitch with NAT. Need DHCP? Nope, can’t have it. Go install a third party application.
  3. Audio is not supported for Linux guests. Just like with the resolution issue, you’re forced to use a remote X server or xrdp. Both are a pain to set up and didn’t provide acceptable performance for me.
  4. To top it all off, you can’t use any other virtualization solution when Hyper-V is enabled. Do you want both Docker for Windows and a normal Linux desktop VM experience? Too bad… VMware allows you to virtualize VT-x/EPT so you can run a hypervisor inside your guest. Hyper-V doesn’t.

It seems like Hyper-V is just not there yet. It might work well for Windows guests or Linux server guests, but for a Linux desktop guest it’s just not enough.

Download PDB by GUID

Sometimes you get stuck with a broken dump, or no dump at all. You know what you’re looking for, but WinDBG just keeps refusing to load symbols as you continue to beg for mercy from the all-knowing deities of Debugging Tools for Windows. You know which PDB you need, but it just won’t load. The only thing you do know is that you don’t want to go digging for that specific version of your product in the bug report and build a whole setup for it just so you can get the PDB. For those special times, some WinDBG coercion goes a long way.

To download the PDB, create a comma-separated manifest file with three columns in each row. The columns are the requested PDB name, its GUID plus age (33 characters in total), and the number 1. Then call symchk and pass the path to the manifest file with the /im command line switch. Use the /v command line switch to get the download path of the PDB.

To demonstrate I’ll use everyone’s favorite debugging sample process.

C:\>echo calc.pdb,E95BB5E08CE640A09C3DBF3DFA3ABCB42,1 > manifest

C:\>symchk /v /im manifest
[...]
SYMSRV: Get File Path: /download/symbols/calc.pdb/E95BB5E08CE640A09C3DBF3DFA3ABCB42/calc.pdb
[...]
DBGHELP: C:\ProgramData\dbg\sym\calc.pdb\E95BB5E08CE640A09C3DBF3DFA3ABCB42\calc.pdb - OK

SYMCHK: FAILED files = 0
SYMCHK: PASSED + IGNORED files = 1
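
The manifest row and the symbol server path are built from the same three values, so both can be generated rather than typed by hand. Here is a minimal Python sketch, assuming the GUID is at hand in its usual dashed form and the age as an integer. The function names are made up for illustration, and the path layout is simply copied from the SYMSRV line above.

```python
import uuid

def pdb_signature(guid: str, age: int) -> str:
    """GUID as 32 uppercase hex digits followed by the age in hex.

    An age below 16 is a single digit, which gives the 33 characters
    mentioned above.
    """
    return uuid.UUID(guid).hex.upper() + format(age, "X")

def manifest_row(pdb_name: str, guid: str, age: int) -> str:
    """One row for symchk /im: PDB name, GUID plus age, and the number 1."""
    return f"{pdb_name},{pdb_signature(guid, age)},1"

def symsrv_path(pdb_name: str, guid: str, age: int) -> str:
    """Relative path in the <name>/<GUID plus age>/<name> layout."""
    return f"{pdb_name}/{pdb_signature(guid, age)}/{pdb_name}"

# Assumes the calc.pdb identifier above splits into this GUID and age 2.
print(manifest_row("calc.pdb", "E95BB5E0-8CE6-40A0-9C3D-BF3DFA3ABCB4", 2))
```

Redirect the printed row into the manifest file and run symchk as before.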

To force load the PDB you need to update the PDB path, turn SYMOPT_LOAD_ANYTHING on, and use the .reload command with /f to force and /i to ignore any so called mismatches.

kd> .sympath C:\ProgramData\dbg\sym\calc.pdb\E95BB5E08CE640A09C3DBF3DFA3ABCB42
kd> .symopt+0x40
kd> .reload /f /i calc.exe=0x00400000

You should now have access to all the data in the PDB file and stack traces should start making sense.

SCSIPORT debugging

Microsoft provides useful extensions for debugging SCSIPORT drivers in WinDbg. But with some versions of scsiport.sys, the symbol files don’t contain type information. This produces fun errors like the following.

kd> !scsikd.scsiext 8a392a38
*************************************************************************
***                                                                   ***
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information.  Contact the group that      ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: scsiport!_DEVICE_OBJECT                       ***
***                                                                   ***
*************************************************************************
scsikd error (3): ...\storage\kdext\scsikd\scsikd.c @ line 188

This makes the common task of getting your device extension object very daunting. After some digging, I came up with this code to at least get my device extension object from SCSIPORT’s device extension object.

!drvobj mydriver
* get relevant DevObj
!devobj <devobj>
* get DevExt
dt mydriver!MY_DEVICE_EXTENSION poi(<DevExt> + b4)

I’ve only tried it on Windows XP SP3. The offset may be different with other configurations. Does anyone know a better way around this? The preferable method would naturally be making scsikd work.
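
For anyone unfamiliar with the last command: WinDbg reads b4 as hexadecimal by default, and poi() dereferences a pointer-sized value at the given address. The Python sketch below mimics that one step on a fake block of memory, assuming a 32-bit little-endian target. The addresses and the helper name poi32 are invented for illustration and have nothing to do with a real SCSIPORT layout.

```python
import struct

def poi32(memory: bytes, address: int) -> int:
    """Read a little-endian 32-bit pointer, like poi() on a 32-bit x86 target."""
    return struct.unpack_from("<I", memory, address)[0]

# Fake memory: a pretend SCSIPORT device extension at 0x100 whose field at
# offset 0xb4 points at the driver's own extension at 0x300 (made-up values).
memory = bytearray(0x400)
scsiport_ext = 0x100
struct.pack_into("<I", memory, scsiport_ext + 0xB4, 0x300)

my_ext = poi32(bytes(memory), scsiport_ext + 0xB4)
print(hex(my_ext))
```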

Debug Xen Hosted Windows Kernel Over Network

Read the original at my company’s blog.

Blue screens are not a rare commodity when working with virtualization. Most of the time, full crash dumps do the trick, but sometimes live kernel debugging is required. Hard disk related crashes that prevent memory dumping are a good example of where it is required, but there are times when it’s just easier to follow the entire crash flow instead of just witnessing the final state.

Type 2 (hosted) virtualization usually comes with an easy solution. But type 1 (bare metal) virtualization, like Xen, complicates matters. Debugging must be offloaded to a remote Windows machine. The common solution, it seems, is to tunnel the hosted machine’s serial connection over TCP to another Windows machine where WinDBG is running, waiting anxiously for a bug check. There are many websites describing this setup in various component combinations. I have gathered here all the tricks I could find plus some more of my own to streamline the process and get rid of commercial software.

Let’s dive into the nitty gritty little details, shall we?

Hosted Windows

Kernel debugging requires some boot parameters. Windows XP includes a utility called bootcfg.exe that makes this easy.

bootcfg /copy /id 1 /d "kernel debug"
bootcfg /raw "/DEBUG /DEBUGPORT=COM1" /id 2 /a
bootcfg /raw "/BAUDRATE=115200" /id 2 /a
bootcfg /copy /id 2 /d "kernel debug w/ break"
bootcfg /raw "/BREAK" /id 3 /a

This assumes you have only one operating system configured in the Windows boot loader. If the boot loader menu shows up when Windows boots, you might need to add the flags on your own to C:\boot.ini.

Xen Host

Windows will now try to access the serial port in search of a debugger. Xen’s domain configuration file can be used to forward the serial port over TCP. Locate your domain configuration file and add the following line. The configuration files are usually located under /etc/xen.

serial='tcp::4444,server,nowait'

Debugger Machine

The server side is set and it’s time to move on to the client. As previously mentioned, WinDBG doesn’t care for TCP. Instead of the usual TCP to RS-232 solution, named pipes are used here. I wrote a little application called tcp2pipe (download available at the bottom) which simply pumps data between a TCP socket and a named pipe. It takes three parameters: IP, port and named pipe path. The IP address is the address of the Xen host and the port is 4444. For the named pipe path, use \\.\pipe\XYZ, where XYZ can be anything.

tcp2pipe.exe 192.168.0.5 4444 \\.\pipe\XYZ
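
The real tcp2pipe source is in the zip, and what follows is not that code. It is a rough Python sketch of the same idea: pump bytes in both directions between two endpoints until one side closes. To keep it portable, both endpoints here are plain sockets. In the actual setup one side would be the TCP connection to the Xen host and the other the named pipe. The names pump and relay are made up for illustration.

```python
import socket
import threading

def pump(read, write, bufsize=4096):
    """Copy chunks from read() to write() until read() reports EOF."""
    while True:
        chunk = read(bufsize)
        if not chunk:
            break
        write(chunk)

def relay(end_a, end_b):
    """Start one pump per direction between two connected sockets."""
    threads = [
        threading.Thread(target=pump, args=(end_a.recv, end_b.sendall), daemon=True),
        threading.Thread(target=pump, args=(end_b.recv, end_a.sendall), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return threads

# Local demonstration: two socket pairs stand in for the Xen host side
# and the debugger side.
host, relay_host_side = socket.socketpair()
relay_debugger_side, debugger = socket.socketpair()
relay(relay_host_side, relay_debugger_side)
host.sendall(b"hello")
print(debugger.recv(5))
```

One thread per direction keeps the sketch short. Each pump simply blocks on its own read, so neither direction can starve the other.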

All that is left now is to fire up WinDBG and connect it to \\.\pipe\XYZ. This can be done from the menus, or from command line.

windbg -k com:pipe,port=\\.\pipe\XYZ

To make this even simpler, you can use kdbg.bat and pass it just the IP. It assumes WinDBG.exe is installed in c:\progra~1\debugg~1. If that’s not the case, you’ll have to modify it and point it to the right path.

tcp2pipe

Source code is included in zip file under public domain.

Download tcp2pipe.zip (mirror).

Happy debugging!

Pragmatic variant

As mentioned in my previous post, I have been working on incorporating some more features into WinVer.nsh. Every little change in this header file requires testing on all possible versions and configurations of Windows. Being the Poor Open Source Developer™ that I am, I do not have sufficient resources to assemble a full-blown testing farm with every possible version of Windows on every possible hardware configuration. Instead, I have to settle for a bunch of virtual machines I have collected over the years. It is pretty decent, but has no standards and doesn’t cover every possible version. Still, it does its job well and has proven itself very effective.

Obviously, be it a farm or a mere collection of virtual machines, testing on so many different configurations carries with it a hefty fine. Testing a single line change could waste almost an hour. Imagine the time it would take to test, fix, retest, fix and retest again a complete rewrite of WinVer.nsh. As fascinating as that empirical scientific experiment would have been, I was reluctant to find out. Laziness, in this case, proved to be a very practical solution.

WinVer.nsh tests do not really need the entire operating system and its behavior as it relies on nothing but 4 parameters. All it requires is the return values of GetVersionEx for OSVERSIONINFO and OSVERSIONINFOEX. For nothing more than 312 bytes, I have to wait until Windows Vista decides it wants to execute my test, Windows NT4 gracefully connects to my network, Windows ME wakes up on the right side of the bed and doesn’t crash, Windows Server 2008 installs again after its license has expired and Windows 95…. Actually, that one works pretty well. So why wait?

Instead, I’ve created a little harvester that collects those 312 bytes, ran it on all of my machines and mustered the results into one huge script that tests every aspect of WinVer.nsh using every possible configuration of Windows in a few seconds. It required adding a hooking option to WinVer.nsh, but with the new !ifmacrondef, that was easy enough.
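
The structures really are that small. The sketch below is not the harvester and never calls GetVersionEx. It only lays out the ANSI variants of the two structures, as documented in the Windows SDK, with Python's ctypes, so that the sizes can be checked on any machine.

```python
import ctypes

class OSVERSIONINFOA(ctypes.Structure):
    _fields_ = [
        ("dwOSVersionInfoSize", ctypes.c_uint32),
        ("dwMajorVersion", ctypes.c_uint32),
        ("dwMinorVersion", ctypes.c_uint32),
        ("dwBuildNumber", ctypes.c_uint32),
        ("dwPlatformId", ctypes.c_uint32),
        ("szCSDVersion", ctypes.c_char * 128),
    ]

class OSVERSIONINFOEXA(ctypes.Structure):
    # Same leading fields, followed by the service pack and product details.
    _fields_ = OSVERSIONINFOA._fields_ + [
        ("wServicePackMajor", ctypes.c_uint16),
        ("wServicePackMinor", ctypes.c_uint16),
        ("wSuiteMask", ctypes.c_uint16),
        ("wProductType", ctypes.c_uint8),
        ("wReserved", ctypes.c_uint8),
    ]

print(ctypes.sizeof(OSVERSIONINFOA), ctypes.sizeof(OSVERSIONINFOEXA))
```

Together the two structures take 304 bytes. If each call's 4-byte return value is stored alongside them, which is an assumption about the harvester's format rather than something stated here, the total comes to 312.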

Currently, the script tests:

  • Windows 95 OSR B
  • Windows 98
  • Windows ME
  • Windows NT4 (SP1, SP6)
  • Windows 2000 (SP0, SP4)
  • Windows XP (SP2, SP3)
  • Windows XP x64 (SP1)
  • Windows Vista (SP0)
  • Windows Server 2008 (SP1)

If you have access to a configuration not listed here, please run the harvester and send me the results. More specifically, I could really use Windows 2003 and Windows Vista SP1. My Windows Vista installation simply refuses the upgrade to SP1. Again.

The test script also includes a hexdump of those 312 bytes for every configuration so anyone performing similar tests for another reason doesn’t have to parse the NSIS syntax. Feel free to use it for your testing.

Bigotry

Ladies and gentlemen, we interrupt the silence schedule to bring you shocking news. Hatred has reared its ugly head on the forsaken grounds of our dear old friend — Windows 98. It appears the bigots have set a new target for their cynical and non-politically-correct persecution. Big-boned dialogs and initialization-limited rectangulars are shamelessly discriminated against and abused for no acceptable reason. Exceptions, overflow errors, division errors and antique dialogs were thrown at the victims, reports say. We were unable to get comments from the alleged bigots.

We were unable to get pictures from the event, but luckily, it can be easily reproduced.

BOOL CALLBACK proc(HWND h, UINT m, WPARAM w, LPARAM l)
{
  return FALSE;
}
int main(int argc, char* argv[])
{
  char dt[24] = {0,};
  RECT r = {32757,};
  HWND dlg = CreateDialogIndirect(
    GetModuleHandle(NULL),
    (LPDLGTEMPLATE) dt,
    0,
    proc);
  MapDialogRect(dlg, &r); // BOOM!
  return 0;
}

Genuinely later

On every second Tuesday of the month, Microsoft indulges us with a slew of updates ranging from trivial to critical and sometimes even truly superior. Sadly, not even the most ardor-imbued zealot of Windows rejuvenation can bring the updates to life without a reboot. To ensure everyone does reboot, Microsoft has added the lovely “Restart Now” dialog we have all come to cherish.

Distressing as it may be, while loved and cherished, the dialog is often the center of attention in the Windows loath-fest. Getting rid of it, however, isn’t that difficult. All it takes is killing one service.

net stop wuauserv

But what if yours truly is not near the computer on patch Tuesday and the dialog starts its cheerful countdown to complete and total annihilation of the current session? While skimming through some Group Policies, I’ve noticed there’s one for disabling this annoying reboot countdown. Simply create a DWORD named NoAutoRebootWithLoggedOnUsers under HKLM\Software\Policies\Microsoft\Windows\WindowsUpdate\AU, set it to 1 and say bah-bye to Microsoft’s equivalent of the dreaded ad pop-up.
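
For those who would rather script it than click through regedit, here is a small Python sketch that builds the matching reg.exe command line. The helper name is made up for illustration, and the command only runs on Windows, where it needs an elevated prompt to write under HKLM.

```python
import subprocess
import sys

KEY = r"HKLM\Software\Policies\Microsoft\Windows\WindowsUpdate\AU"

def no_auto_reboot_command(enabled=True):
    """Build the reg.exe arguments that set NoAutoRebootWithLoggedOnUsers."""
    return [
        "reg", "add", KEY,
        "/v", "NoAutoRebootWithLoggedOnUsers",
        "/t", "REG_DWORD",
        "/d", "1" if enabled else "0",
        "/f",  # overwrite an existing value without prompting
    ]

# Only touch the registry on Windows; elsewhere this block does nothing.
if __name__ == "__main__" and sys.platform == "win32":
    subprocess.run(no_auto_reboot_command(), check=True)
```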

Microsoft’s Tim Rains has more details on the subject.