How Bochs works under the hood

1 Overview

The Bochs virtual PC consists of many pieces of hardware. At a bare minimum
there are always a CPU, a PIT (Programmable Interval Timer), a PIC
(Programmable Interrupt Controller), a DMA controller, some memory (this
includes both RAM and BIOS ROMs), a video card (usually VGA), a keyboard port
(also handles the mouse), an RTC with battery backed NVRAM, and some extra
motherboard circuitry.

There might also be a NE2K ethernet card, a PCI controller, a Sound Blaster
16, an IDE controller (+ harddisks/CDROM), a SCSI controller (+ harddisks), a
floppy controller, an APIC ...

There may also be more than one CPU.

Most of these pieces of hardware have their own C++ class - and if bochs is
configured to have more than one piece of a type of hardware, each will have
its own object.

The pieces of hardware communicate with each other over a couple of buses
- some of the things that the buses carry are reads and writes in memory
space, reads and writes in I/O space, interrupt requests, interrupt
acknowledges, DMA requests, DMA acknowledges, and NMI request/acknowledge. How
that is simulated is explained later FIXME.

Other important pieces of the puzzle are: the options object (reads/writes
configuration files, can be written to and queried while bochs is running) and
the GUI object. There are many different but compatible implementations of the
GUI object, depending on whether you compile for X (Unix/Linux), Win32,
Macintosh (two versions: one for Mac OS X and one for older OS's), BeOS,
Amiga, etc.

And then there is the supporting cast: debugger, config menu, panic
handler, disassembler, tracer, instrumentation.

2 Weird macros and other mysteries

Bochs has many macros with inscrutable names. One might even go as far as
to say that bochs is macro infested.

Some of them are gross speed hacks, to cover up the slow speed that C++
causes.
Others paper over differences between the simulated PC
configurations.

Many of the macros exhibit the same problem as C++ does: too much stuff
happens behind the programmer's back. More explicitness would be a big
win.

2.1 static methods hack

C++ methods have an invisible parameter called the this pointer -
otherwise the method wouldn't know which object to operate on. In many cases
in Bochs, there will only ever be one object - so this flexibility is
unnecessary. There is a hack that can be enabled by #defining BX_USE_CPU_SMF
to 1 in config.h that makes most methods static, which means they have a
"special relationship" with the class they are declared in but apart
from that are normal C functions with no hidden parameters. Of course they
still need access to the internals of an object, so the single object of their
class has a globally visible name that these functions use. It is all hidden
with macros.

Declaration of a class, from iodev/pic.h:

    ...
    #if BX_USE_PIC_SMF
    #  define BX_PIC_SMF  static
    #  define BX_PIC_THIS bx_pic.
    #else
    #  define BX_PIC_SMF
    #  define BX_PIC_THIS this->
    #endif
    ...
    class bx_pic_c : public logfunctions {
    public:
      bx_pic_c(void);
      ~bx_pic_c(void);
      BX_PIC_SMF void init(bx_devices_c *);
      BX_PIC_SMF void lower_irq(unsigned irq_no);
      BX_PIC_SMF void raise_irq(unsigned irq_no);
      ...
    };

    extern bx_pic_c bx_pic;

And iodev/pic.cc:

    ...
    bx_pic_c bx_pic;
    #if BX_USE_PIC_SMF
    #define this (&bx_pic)
    #endif
    ...
    void
    bx_pic_c::lower_irq(unsigned irq_no)
    {
      if ((irq_no <= 7) && (BX_PIC_THIS s.master_pic.IRQ_line[irq_no])) {
        BX_DEBUG(("IRQ line %d now low", (unsigned) irq_no));
        BX_PIC_THIS s.master_pic.IRQ_line[irq_no] = 0;
        BX_PIC_THIS s.master_pic.irr &= ~(1 << irq_no);
        if ((BX_PIC_THIS s.master_pic.irr & ~BX_PIC_THIS s.master_pic.imr) == 0) {
          BX_SET_INTR(0);
          BX_PIC_THIS s.master_pic.INT = 0;
        }
      }
      ...
      }
    }
    ...

Ugly, isn't it?
If we use static methods, methods prefixed with BX_PIC_SMF
are declared static and references to fields inside the object, which
are prefixed with BX_PIC_THIS, will use the globally visible object,
bx_pic. If we don't use static methods, BX_PIC_SMF evaluates to
nothing and BX_PIC_THIS becomes this->. Making BX_PIC_THIS evaluate to
nothing in that case would be a lot cleaner, but then the scoping rules would
change slightly between the two bochs configurations, which would be a load of
bugs just waiting to happen.

Some classes use BX_SMF, others have their own version of the macro, like
BX_PIC_SMF above.

2.2 CPU and memory objects in UP/SMP configurations

The CPU class is a special case of the above: if bochs is simulating a
uniprocessor machine then there is obviously only one bx_cpu_c object and the
static methods trick can be used. If, on the other hand, bochs is simulating
an SMP machine then we can't use the trick. The same seems to be true for
memory: for some reason, we have a memory object for each CPU object. This
might become relevant for NUMA machines, but they are not all that common --
and even the existing IA-32 NUMA machines bend over backwards to hide that
fact: it should only be visible in slightly worse timing for non-local memory
and non-local peripherals. Other than that, the memory map and device map
presented to each CPU will be identical.

In a UP configuration, the CPU object is declared as bx_cpu. In an
SMP configuration it will be an array of pointers to CPU objects
(bx_cpu_array[]). For memory that would be bx_mem and
bx_mem_array[], respectively.

Each CPU object contains a pointer to its associated memory object.

Access of a CPU object often goes through the BX_CPU(x) macro,
which either ignores the parameter and evaluates to &bx_cpu, or
evaluates to bx_cpu_array[x], so the result will always be a pointer.
The same goes for BX_MEM(x).

If static methods are used then BX_CPU_THIS_PTR evaluates to
BX_CPU(0)->.
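
The BX_CPU(x) idea can be shown in a reduced, self-contained form. This is
only a sketch under assumptions, not the Bochs source: demo_cpu_c, DEMO_SMP,
DEMO_CPU, set_eip and get_eip are hypothetical names made up for this
illustration. The point is that the accessor macro always yields a pointer,
so calling code looks the same in both configurations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for bx_cpu_c; the real class is far larger.
struct demo_cpu_c {
    std::uint32_t eip = 0;
};

// Set to 1 to model an SMP build. The flag name is an assumption.
#ifndef DEMO_SMP
#define DEMO_SMP 0
#endif

#if DEMO_SMP
// SMP: an array of pointers to CPU objects, indexed by the macro argument.
static demo_cpu_c demo_cpu_storage[2];
static demo_cpu_c *demo_cpu_array[2] = {
    &demo_cpu_storage[0], &demo_cpu_storage[1]
};
#define DEMO_CPU(x) (demo_cpu_array[x])
#else
// UP: one global object; the macro ignores its argument.
static demo_cpu_c demo_cpu;
#define DEMO_CPU(x) (&demo_cpu)
#endif

// Callers always dereference a pointer, whichever build is selected.
inline void set_eip(unsigned cpu_no, std::uint32_t value) {
    (void)cpu_no;  // unused in the UP build
    DEMO_CPU(cpu_no)->eip = value;
}

inline std::uint32_t get_eip(unsigned cpu_no) {
    (void)cpu_no;  // unused in the UP build
    return DEMO_CPU(cpu_no)->eip;
}
```

In the default (UP) build, set_eip(0, 0x1234) followed by get_eip(0) reads
back the same value, and any argument to DEMO_CPU() names the same object.
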
Ugly, isn't it?

2.3 BX_DEBUG/BX_INFO/BX_ERROR/BX_PANIC -- logging macros

These go through a generic tracing mechanism and can be switched on and off
individually. They might eat a lot of CPU time - I think there are some
BX_INFO calls for each instruction executed.

2.4 BX_TICK1, BX_TICKN(n), BX_TICK1_IF_SINGLE_PROCESSOR

BX_TICK1_IF_SINGLE_PROCESSOR is only used in cpu.cc -- and only
confuses the matter. It calls BX_TICK1 on a single-processor configuration and
does nothing on SMP.

3 CHECK_MAX_INSTRUCTIONS(count)

Only needed on SMP configurations without debugger support. I am going to
change the CPU emulation a lot (hopefully cleaning it up in the process), so
I've decided to lose every SMP thing that gets in my way. This is one of
them. Later, when UP works faster and better, I fully intend to restore SMP
functionality -- or work with somebody else who does.

3.1 BX_SIM_ID

When using cosimulation, does it have something to do with which simulator is
executing? In any case, I removed it from my own source tree.

3.2 BX_HRQ, BX_RAISE_HLDA, BX_INTR, BX_SET_INTR(b), BX_IAC()

3.3 Various macros associated with dynamic translation

Relics of Kevin Lawton's initial attempts at using dynamic translation to
IA-32 machine code instead of interpretive emulation. That development
continued in Plex86, which seems to be more or less abandoned for the moment.
Bochs will probably go in the direction of dynamic translation at some point
in the future but for now we will concentrate on better GUIs, better
configuration, better hardware emulation and better support for reverse
engineering. We would also very much like bochs to be faster but we will use
simpler methods for the foreseeable future. These relics will be cut out as
soon as possible.

3.4 Cosimulation support

For debugging changes in the CPU emulation, especially really big
optimizations, Kevin Lawton invented something he called
"cosimulation". The idea is to run two different CPU emulators in
lock-step and constantly compare their CPU state.
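
A minimal sketch of such a lock-step comparison follows. It is not Bochs'
actual cosimulation code: demo_state, step_fn, cosimulate and the two toy
"emulators" are hypothetical names invented for illustration, and the CPU
state is cut down to two registers.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical, heavily reduced CPU state. A real comparison would cover
// all registers, flags, segment caches and so on.
struct demo_state {
    std::uint32_t eip = 0;
    std::uint32_t eax = 0;
    bool operator==(const demo_state &o) const {
        return eip == o.eip && eax == o.eax;
    }
};

// A "simulator" here is any callable that executes one instruction.
using step_fn = std::function<void(demo_state &)>;

// Run both simulators in lock-step for at most max_steps instructions.
// Returns the index of the first step after which the states differ,
// or -1 if they agreed throughout.
inline long cosimulate(const step_fn &first, const step_fn &second,
                       demo_state a, demo_state b, long max_steps) {
    for (long i = 0; i < max_steps; ++i) {
        first(a);
        second(b);
        if (!(a == b)) return i;
    }
    return -1;
}

// Toy reference emulator: every "instruction" increments eax.
inline void reference_step(demo_state &s) {
    s.eax += 1;
    s.eip += 2;
}

// Toy "optimized" emulator with a deliberate bug when eax is 2.
inline void buggy_step(demo_state &s) {
    s.eax += (s.eax == 2) ? 2 : 1;
    s.eip += 2;
}
```

Calling cosimulate(reference_step, buggy_step, demo_state{}, demo_state{}, 10)
pinpoints the first instruction at which the two emulators diverge, which is
the whole value of the technique: the bug is caught at the instruction that
caused it instead of thousands of instructions later.
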
The idea is very good
-- and has been independently discovered by many people for decades -- but is
hard to put into practice. As Kevin Lawton wrote in some early docs:
(FIXME: something about how every time he uses cosimulation he has to hack on
the code to make it work.) I think the prudent thing would be to remove it
for the time being -- and hack in specific hooks the next time somebody wants
to use it. It should be maintained as a separate patch until we have found a
cleaner way of doing it.

4 Memory - An Introduction

Both RAM and BIOSes. BIOSes can be loaded individually. physical_read(),
physical_write(). All address translation and access checking has already
taken place in the CPU.

Some hardware interaction takes place through this object: VGA. This is
unfortunately hardcoded into the memory object at the moment :(

5 The Basic CPU

Simple CPU: no caches! Does have TLBs. Some real IA-32 implementations
distinguish between TLBs for code and for data -- we don't. We save some time
by having 1024 TLB entries, which is a lot more than almost all real CPUs have
at the moment. Different CPU levels -- level 5 is not complete, yet.

5.1 Some of the things we have to emulate - The IA-32

Real mode. Protected mode. 16-bit code, 32-bit code. Segments, TLB,
instruction prefetch queue, writes to memory can be executed
"immediately" (makes things a bit harder for us later on),
extraordinarily complex and varied instruction formats. Four different
privilege levels, and then a system management mode on top of that for some of
the CPUs. Six different segment registers (four on <386), capable of
overriding the default segment register for the instruction, usually DS, but
sometimes SS. Prefixes, address and operand size changes, tons of flags, tons
of special cases about which registers can be used for what purpose. Totally
free alignment of both code and data. Instructions can be one to fifteen
bytes.
IO Privilege level, IO privilege map, V86 mode.

Yada yada, you get the picture...

5.2 Some example instructions

(real mode, Intel syntax)

    INC AX
    MOV CX,[23+BX]

(protected mode, 32-bit default size, AT&T syntax)

    <something with two size prefixes, a 0x0F prefix, a lock prefix? and a
    complicated address. LOCK ADD [BX*4 + CX*2 + DX + 1234], 17 ? >

5.3 Decoding instructions is *HARD*

On nicer processors, decoding instructions is an easy task. It's especially
nice on the MIPS and the Alpha.

On IA-32 it's just about as lousy as it can get :/ In order to reduce the
complexity a bit, all the decoding of the operand fields is done first, by
BX_CPU_C::FetchDecode(), and then the instruction is executed by one of many
hundred small methods that don't have to care (much) about their operands.

5.4 BxInstruction_t

b1
modr/m
rep_used
imm8, imm16, imm32
jmp...
execute
resolvemodrm16
resolvemodrm32

5.5 The Main Loop - First cut

5.6 The Main Loop - Interrupts/Traps/Exceptions

5.7 The Main Loop - SMP

5.8 "Prefetching"

Should be called something else.

5.9 FetchDecode

5.10 Execute pointers

5.11 The Anatomy of Memory Accesses

segment : offset -> 32-bit linear -> 32-bit physical
segments, segment caches, base + limit, type

5.12 The Main Loop - Interrupts/Exceptions/Traps

5.13 So how was the prefetching in detail again?

prefetch
revalidate_prefetch_q
invalidate_prefetch_q
when is it invalidated?
when is it revalidated?
when do we actually have to do any of these?

5.14 Things I lied about

A20, expand-down segments, FPU, synchronization between CPU and
(potentially external) FPU. Reset of the CPU by forcing a triple-fault.
Debugger interface, config interface, temporary disabling of interrupts (after
SS changes) ;; might have to go below the following section

5.15 Flag handling

Lazy flags, 5 32-bit ints to describe the operation. Some macros that
evaluate the flags on demand.

5.16 How are exceptions implemented?

All instructions restartable from the register state + BxInstruction_t.

Commit EIP + ESP (why that?)
after successful execution of the whole
instruction.

Never possible to generate exception /after/ changing the visible
state.

longjmp(), setjmp()

5.17 What if we trip on an assertion?

Lots of checks all over the place. Also deep inside routines called by the
cpu main loop. Die/cont/alwayscont/quit in Control Panel - or a debugger. How
does it do that? Some variation on the exception scheme?

6 Specific tricks

6.1 4GB in real mode

What is the trick and how does bochs make sure that it works?

6.2 Switching from protected mode to real mode

reset + cmos

6.3 Typical reset thru keyboard controller

6.4 Triple-fault reset

6.5 Fast reset gate

6.6 A20 change

Should probably hitch a ride on the TLB paging mechanism for speed.

6.7 "CMOS" NMI gate

6.8 V86

6.9 V86 with virtual interrupt flag

6.10 APIC: IRQ rerouting to NMI

6.11 SMP: IPI (Inter-Processor Interrupt)

6.12 SMP: cache bounces

6.13 SMP: locked read-modify-writes

6.14 SMP: spinlocks

6.15 SMP: TSC potentially out of synch

6.16 SMP: BIOS and necessary tables

6.17 SMM - System Management Mode

Not implemented yet. Required for ACPI, I think.

6.18 Huge amounts of memory

Don't want bochs to push out other programs - handle swapping manually.
BIOS and memory size reporting
PAE
Small window - big memory file
Only need to swap in/out when TLB changes
Keep memory on 4K boundary and use mmap()
Needs > IA-32 machine (e.g.
Alpha or some other 64-bit behemoth) or LFS.

6.19 PnP

6.20 PCI - configuration

6.21 PCI controller

7 Things that make you go "hmmm"...

7.1 16K pages between 0xC0000 and 0xFFFFF with PCI

7.2 Why read_RMW_virtual_(byte|word|dword)?

8 Optimization Ideas

8.1 Traces

"Almost all programming can be viewed as an exercise in caching"
    -- Terje Mathisen

resolve16/32 can't be cached like this (example that uses registers to
generate an effective address)

8.2 Squish out flags handling

BX_NEED_FLAGS, BX_SETS_FLAGS

8.3 How to be lazy with addresses

Only retranslate seg:ofs -> linear -> physical when strictly
necessary.

8.4 Handle repeating instructions in bigger globs

Special versions of access_linear()

8.5 Split access_linear into read and write versions

8.6 Combine segment limits with TLB pages

A bit that says if everything is ok or the address has to be
reevaluated.

8.7 Better branch prediction for execute ptr calls

    switch (len) {
      /* every case deliberately falls through to the next one */
      case 7: i[len-7].execute(i[len-7]);
      case 6: i[len-6].execute(i[len-6]);
      ...
      case 0: i[len-0].execute(i[len-0]);
    }

9 Communication Between Devices

9.1 Ticks and hardware emulation

The non-cpu hardware in the Bochs virtual PC needs to run some code once in
a while to either do some real work, synchronize with the rest of the machine
or interact with the host OS.

Timers, based on simulated instructions retired count. The GUI is made like
this too -- that is probably a bad idea. BX_TICK1_IF_SINGLE_PROCESSOR()

Examples of worker functions: xxxxx.

9.2 Interrupts

9.3 DMA

HOLDA

9.4 IRQ pins

ISA IRQ2/9, IRQ3, IRQ4, IRQ5, IRQ6, IRQ7, IRQ...
PCI INTA, INTB, INTC, INTD - routing in PCI controller + on motherboard.

9.5 Interrupt routing

Level/edge triggered.
PIC
APIC
PCI controller

9.6 NMI

10 Communication between VGA and GUI

10.1 idle (HLT) and GUI

10.2 GUI and configuration

10.2.1 Floppy disk

/dev/fd0
A:
<path to disk file>
inserted/ejected
icon click -> set_status(inserted/ejected)
how to check with ioctl

10.2.2 CD ROM

/dev/cdrom
disk change
How does El Torito work?
Only in BIOS?
What about hardware ATAPI?

11 Various Hardware

11.1 CPU

11.2 CPU - SMP

11.3 APIC

11.4 PIT

11.5 PIC - master/slave

11.6 Slowdown

11.7 Realtime PIC

11.8 RTC + CMOS

11.9 FPU

FWait, exception handling, something about a weird exception + an IRQ
reserved for the FPU.

11.10 Memory

Some of the address range is handled by the i440 PCI chipset, which may
subdivide it further.

11.11 i440 PCI chipset

Also handles shadow ROMs.

11.12 AGP

11.13 DMA

Address bus sizes? Built into PCI chipset? Speed? Limited to ISA bus speed?
-- no. DMA happens as fast as devices want, provided the CPU allows it.

11.14 Floppy Controller

11.15 IDE

11.16 Harddisk

11.17 CDROM

11.18 Speaker

11.19 Sound Blaster

11.20 NE2K NIC

11.21 Mouse

11.22 Keyboard

11.23 Parallel port

11.24 Serial Port

11.25 USB

11.26 SCSI

11.27 IRQ in general

11.28 Ordinary BIOS

11.29 VGA BIOS

11.30 LBE

Both some BIOS calls and some "hardware".

12 How to register a new device

13 How to make snapshots

14 How to suspend/resume

15 How to make configurations easier

    #! ....
    bochs --help
    bochs -h
    bochs --version          // also prints compile options
    bochs -V                 (version)
    bochs <config filename>
    bochs -v                 // tells us which config file is used + all the
                             // options read from it.

16 Dreams for the future:

Suspend/resume - without APM (for debugging/wizards).
Suspend/resume - with APM.
Automatic floppy disk change detect.
Automatic cdrom disk change detect.
More than one boot device (.bochsrc -> CMOS, read by BIOS)
Check that longjmp()/setjmp() doesn't violate C/C++ rules about which variables are valid after a jump.
Net bridge
GTK+/Gnome GUI
Setup Wizard
Debugger interface in cTVision
Better mouse + keyboard handling - copy VMWare with XGrabKey/XGrabMouse
obviate the need for a client program to handle the mouse + keyboard
Linux console API for the screen
Easily run on real SVGA hardware, with only a thin debug/log layer in between
likewise for other hardware - tell bochs what I/O, IRQ, DMA, mem resources the hardware uses, let it negotiate with Linux to access and lock it. Might need a suid proxy.
SVGA is a special case :)
good term mode (bochs in curses - emulate MDA? CGA?)
bidirectional parallel port - same API as Linux 2.4 and VMWare
good idle handling in all GUIs
use shared memory, so many bochs instances will share a pool of memory
cut'n'paste between host and guest
GNU lightning JIT'ing
Port GNU lightning to Alpha
Use Xft/Render so copying vga fonts won't be necessary anymore
Compressed harddisks and undo logs/checkpoints
Imagine if our disk I/O ends up being faster than VMWare's ;)
Autodetect which mouse, the number of buttons and that sort of thing
Floppy/cdrom/etc icons -> dialogs that let you choose images
/Much/ better error messages!
MIDI support
Joystick support - w/mouse, analog joystick, digital joystick, etc. as input.
USB - access to real usb net
USB - proxy the mouse + printer to bochs' usb net?
USB - debugger/monitor
USB - tun/tap like interface to user provided simulated hardware
decide on good and consistent configuration strategy: use CMOS or configuration file.

17 Error messages

Check that floppy/cdrom/disk are accessible with the current privileges and
give the poor user some sensible error messages if not, INCLUDING examples of
commands to fix the problem(s).

With bigmem support: check that 1) glibc supports LFS, 2) that the kernel
supports it, 3) that the file system supports it, 4) that there is room enough
in the designated directory.

18 Tools/links

nasm - also contains ndisasm, a nice disassembler for 8086 real mode and
386 protected mode. http://nasm.sourceforge.net/

bcc - Bruce's C Compiler, by Bruce Evans. Generates either 8086 real mode
code or 386 protected mode code. Used to compile the BIOSes.

as86 - Assembler for 8086 real mode code, by Bruce Evans. version xxx -
older versions don't accept the -O (optimize forward jumps) flag. It is not
quite Intel syntax (and very far from AT&T syntax). Usually included with
ld86 in a package called bin86. Built into cygwin (FIXME: true?)

ld86 - Linker for 8086 real mode code, by Bruce Evans.
See as86.

PC Timing FAQ, by Kris Heidenstrom - his home page is at
http://home.clear.net.nz/pages/kheidens/ and has many interesting docs. Here's
the link to the FAQ:
ftp://ftp.simtel.net/pub/simtelnet/msdos/info/pctim003.zip

Serial Port FAQ release 19, by Christian Blum:
http://www.repairfaq.org/filipg/LINK/PORTS/F_The_Serial_Port.html There are
other versions floating around on the net but this was the newest I could
find. It's in HTML whereas the original I read so many years ago was a nice,
single plain text file. If somebody finds a link to that version I'd like to
know.