Users of application software that accesses hardware ports directly (mainly the parallel port) usually complain that dlportio.sys won't work on a 64-bit Windows version. The usual recommendation has been either to use a 32-bit Windows version instead, native or in a virtual machine, or to upgrade to a USB version of the hardware, if available, trashing the old hardware and software. As parallel ports are still available as PCI, PCI Express, or ExpressCard extensions, the problem remains that such software cannot be moved to 64 bit. (I assume the reader understands "64 bit" as AMD64, not the totally incompatible and almost dead IA-64 (Itanium) architecture.)
A replacement for giveio.sys and dlportio.sys is now available.
Results from Debug Session
The affected application programs are 32-bit, as 16-bit programs won't run on 64-bit Windows at all. Moreover, these programs have the I/O instructions inlined rather than calling a well-known DLL, mostly inpout32.dll, for each I/O access. For the latter case, an upgraded inpout32.dll has done the job for a couple of years.
As the AMD64 architecture was introduced long ago, I wonder why nobody else found such a solution before. Until now, I coped with the problem by redirecting the inlined I/O: I injected a DLL into the process in question that makes the program believe it runs on Windows 9x and catches exception 0xC0000096 (Privileged Instruction), decoding the faulting instruction and doing the actual I/O in kernel mode. But that is time-consuming and performs badly for data-acquisition software.
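The instruction decode mentioned above can be quite small, because the plain x86 port I/O instructions occupy only eight opcodes. The following is a minimal sketch of such a decoder, not the code of the injected DLL; the type `io_insn` and the function `decode_io_insn` are hypothetical names, and the string forms (INS/OUTS) and other prefixes are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Result of decoding one port I/O instruction (hypothetical type). */
typedef struct {
    bool    is_out;      /* true for OUT, false for IN             */
    uint8_t width;       /* operand size in bytes: 1, 2 or 4       */
    bool    port_in_dx;  /* true: port number is taken from DX     */
    uint8_t imm_port;    /* port number when port_in_dx is false   */
    uint8_t length;      /* total instruction length in bytes      */
} io_insn;

/* Decode a plain IN/OUT instruction of 32-bit code at code[0..len).
 * Returns false if the bytes are not one of E4..E7 or EC..EF,
 * optionally preceded by the 0x66 operand-size prefix. */
bool decode_io_insn(const uint8_t *code, size_t len, io_insn *out)
{
    assert(code != NULL && out != NULL);
    size_t i = 0;
    bool opsize16 = false;

    if (i < len && code[i] == 0x66) {   /* operand-size prefix */
        opsize16 = true;
        i++;
    }
    if (i >= len)
        return false;

    uint8_t op = code[i++];
    if ((op & 0xF4) != 0xE4)            /* matches E4..E7 and EC..EF only */
        return false;

    out->is_out     = (op & 0x02) != 0; /* bit 1: direction           */
    out->port_in_dx = (op & 0x08) != 0; /* bit 3: DX form             */
    out->width      = (op & 0x01) ? (opsize16 ? 2 : 4) : 1;
    out->imm_port   = 0;
    if (!out->port_in_dx) {             /* immediate form: one port byte */
        if (i >= len)
            return false;
        out->imm_port = code[i++];
    }
    out->length = (uint8_t)i;
    return true;
}
```

An exception handler built on this would read the port from DX or from `imm_port`, perform the access through a kernel-mode helper, and advance the instruction pointer by `length` before resuming.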
When I read the AMD64 manual, I noticed that the TSS (Task State Segment) is built the same way as in 32-bit mode, and that the IOPM (Input Output Permission Map) also works the same way. So it's worth investigating how the Win64 TSS is implemented and whether it's possible to enable the IOPM.
Using a kernel-mode debugger (windbg with a serial null-modem cable), I found that the TSS is a per-CPU structure with a limit of 0x67. Therefore it's 0x68 bytes long, the bare minimum, with no space for the IOPM. Interestingly, the IOPM pointer at offset 0x66 contains the "right" value 0x68, just beyond the limit, and the memory behind each processor's TSS is filled with zeroes for at least 8 KByte. (A zero bit enables access from user mode, while a one bit denies access. Port addresses that would result in testing bits beyond the TSS limit are also denied; therefore, the TSS limit value dictates the effective length of the IOPM and the maximum allowed port address. It's common to set the TSS limit so that there is a full 64 KBit = 8 KByte IOPM, to ease system software.)
Therefore, simply adjusting the limit value in the GDT entry for the TSS will do the job:
eb @gdtr+@tr+1 20
This debugger command adds 8 KByte to the TSS limit by writing 0x20 into the second limit byte of the descriptor. The value of TR (Task Register) seems to be fixed at 0x40 in Windows.
That command must be repeated for each processor. To switch the active processor, use:
~1s
where "1" is the desired processor number, counting from zero. As an alternative, the process in question must be bound to a specific processor by setting its affinity.
After these steps, any application with inlined I/O can execute without privilege exceptions. Most programs will complain that the driver still cannot be loaded, but that issue can be solved more easily. In some cases, you can trick the program in question into executing inlined I/O without checking the Windows version or driver first.
Making a Driver
Now it's time to put the solution into a functional driver. As a starting point, I used the source code of giveio.sys. However, the IOPM in 32-bit Windows is thread-specific, which is not the case for 64-bit Windows. Moreover, code had to be added to execute a piece of code on each processor. To execute specific assembly instructions, an assembly helper file was necessary because Microsoft's C compiler doesn't allow inline assembly for 64-bit targets.
The driver itself is totally straightforward: On load, it expands the TSS limit in the GDT entry; on unload, it shrinks it back to the original value. The "open" entry point exists solely for compatibility with dlportio.sys. Because both well-known 32-bit drivers work the same way but have different Win32 entry names, my driver simply offers two names. So either "\\.\giveio" or "\\.\dlportio" can be opened by CreateFile() with success. As stated above, nothing happens there.
I copied the code for running a routine on each processor from another project, namely my USB2LPT, which needed it to patch the IDT (Interrupt Descriptor Table) to catch kernel-mode I/O for redirection. The workhorse is an array of DPCs (Deferred Procedure Calls), where the processor that will run each procedure can be assigned beforehand.
While testing the driver, some unwanted behaviour occurred:
- When started without a debugger attached, PatchGuard terminates the Windows session with bluescreen 0x109 some minutes later. (There are solutions around for disabling PatchGuard.) Moreover, the driver doesn't work at all: Port access is still not possible.
- When started with a debugger attached, it worked sometimes.
The latter case revealed that I forgot to reload the Task Register explicitly. A breakpoint, triggered either by hitting Ctrl+Break or by a __debugbreak() call, reloads the Task Register as a side effect of the kernel debugger. It was easy to add an str + ltr sequence, but the result was an immediate double-fault bluescreen. Reading the AMD64 manual again, I found that ltr on a busy task faults. Next time I cleared the busy bit in the GDT entry first, and it works:
mov byte ptr[rdx+1],0x20
and byte ptr[rdx+5],not 2
The ltr (Load Task Register) instruction is needed because the CPU keeps an internal cached copy of the GDT entry. Therefore, as long as nobody executes ltr later, it's possible to revert the GDT entry to its previous state, and - voilà - PatchGuard won't see the hack!
mov byte ptr[rdx+1],0
I checked this for some hours, running one program after another: It really works! I'm lucky that nobody checks or overwrites the cached Task Register value. Unfortunately, any Microsoft engineer can use this publication to include an ltr instruction in a future PatchGuard update, and, if so, the driver's functionality will die without notice. (A bluescreen won't occur.) On the other hand, a user-mode component can catch the resulting exception and re-enable giveio.sys. Because ltr execution is time-consuming, PatchGuard cannot use it at high frequency; therefore, Microsoft cannot block this way of enabling port access for now without a processor (hardware) update.
Using the Driver
See the documentation. The archive contains the full source code, the driver binary, and a readme.txt file.
Driver services can be installed with the sc (Service Control) command-line tool. Services can be started and stopped using either sc or net. (net is the old way.) Starting a kernel-mode service is equivalent to loading a kernel driver (compare insmod on Linux). Therefore, I did not add user-mode installation software.
Moreover, the driver is now digitally signed. Therefore, it can be used without any Driver Signature Enforcement hassle.
- 150409 Successful IOPM activation in debugger
- 150420 Start writing the driver
- 150514 Driver signed