HP Forums
What if? - Printable Version



What if? - Joseph_21sv - 05-16-2016 02:45 AM

I have had a wacky-sounding idea for an alternate path that microcomputer development might have followed. It goes like this: what if, each time the data path width of motherboards was about to double, some engineer had built a design around twin copies of a CPU with a data path half as wide, instead of a single CPU with a data path of the full width? This might sound wacky at first because it amounts to using a chip that appears unnecessary, but on second thought it would have been (and may yet be) reasonable: whenever mass-market motherboards were (or are) about to double their data path width, the narrower CPUs were (or are) simply the better-known technology of the time, making it easier to find people to develop new software for the new platform or port existing software to it. Moreover, computers with 16-bit and wider motherboards have always used dedicated video RAM chips anyway, which obviates the need for color attribute video modes, and even some very early 8- and 16-bit CPUs were made from discrete chips, albeit ones less integrated than a narrower CPU. So why not suppose a parallel or alternate history of motherboard designs that use discrete copies of a CPU to build up a data path wider than that of the CPU itself?
To start, we could suppose that this engineer does the most obvious thing and makes a design that doubles a certain architecture known at the time, but more likely with dedicated video RAM chips, so that it does not need a higher-resolution color attribute video mode if it is based on an early color computer.


RE: What if? - Garth Wilson - 05-16-2016 07:33 AM

That sounds like bit slice which has been around for decades.


RE: What if? - Joseph_21sv - 05-17-2016 12:23 AM

It is bit slicing, just with the prebuilt instruction set architecture of a full microprocessor. That is what always would have made it a reasonable thing to do. However, nobody has ever done it physically, or else somebody has but nobody else has thought anything constructive of the design. That is why I am posing the question of what a microprocessor-based bit-sliced architecture might look like in terms of real hardware.
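
For readers unfamiliar with the term, here is a minimal sketch of the classic bit-slice idea: two 8-bit slices chained through a carry line to form one 16-bit adder. The function names are made up for illustration, and this only models a plain ALU slice; it is an assumption that two full microprocessors could be coupled in a comparable way.

```python
def add8(a: int, b: int, carry_in: int = 0):
    """One 8-bit slice: returns (8-bit sum, carry out)."""
    total = (a & 0xFF) + (b & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def add16_sliced(a: int, b: int):
    """16-bit add built from two 8-bit slices chained by the carry."""
    # The low slice handles bits 0-7 and produces a carry.
    lo, carry = add8(a & 0xFF, b & 0xFF)
    # The high slice handles bits 8-15 and consumes that carry.
    hi, carry = add8(a >> 8, b >> 8, carry)
    return (hi << 8) | lo, carry


result, carry_out = add16_sliced(0x12FF, 0x0001)
print(hex(result), carry_out)  # 0x1300 0
```

The point of the sketch is that the high slice cannot finish until the low slice has produced its carry, so the slices must cooperate on every operation rather than run independently.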


RE: What if? - Paul Berger (Canada) - 05-17-2016 01:44 AM

The first problem that I see is that while your data bus may be twice the width of each individual CPU, there will only be one address bus, so your options are to either have the two CPUs run in lock-step or have hardware to arbitrate access to the address bus. If each of the CPUs had its own local cache memory then they would not necessarily need access to the address bus 100% of the time, but I would expect that there would be contention quite often. This contention could possibly be reduced with tuned caches.

In the real world, one of the reasons for having an external data bus that is wider than the processor's data bus is that the DRAM we usually think of as system memory is very slow compared to the processor's speed; this is why modern processors have multiple levels of faster SRAM cache. In DRAM it takes the same amount of time to read or write a location in an 8 bit wide array as it does in a 64 bit wide array; the difference is that in the second case you are accessing 8 times the amount of data. Accessing the DRAM over a very wide data bus, coupled with caching, can be a great advantage. Say your DRAM array was 64 bytes wide: if your cache line was also 64 bytes, you could load and store whole cache lines in a single memory operation, which would greatly speed up access to the otherwise slow DRAM. The downside is that this greatly increases the complexity of the PC boards and, along with that, the cost.
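
To put rough numbers on the cache line argument, here is a small sketch that counts how many memory operations it takes to move one 64 byte cache line over buses of different widths. It assumes, as described above, that one DRAM access costs the same regardless of width, and it ignores burst modes and other refinements; the function name is made up for illustration.

```python
def transfers_per_line(line_bytes: int, bus_bytes: int) -> int:
    """Memory operations needed to move one cache line over the bus."""
    # Ceiling division: a partial final transfer still costs a full operation.
    return -(-line_bytes // bus_bytes)


LINE_BYTES = 64  # cache line size from the example above

for bus_bits in (8, 64, 512):
    n = transfers_per_line(LINE_BYTES, bus_bits // 8)
    print(f"{bus_bits:>3}-bit bus: {n} transfer(s) per {LINE_BYTES}-byte line")
```

Under those assumptions an 8 bit bus needs 64 operations per line, a 64 bit bus needs 8, and a 64 byte (512 bit) bus needs just one.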

BTW any computer with a built in display device, either built into the system board or in the form of an adapter card, has to have dedicated memory for the display, because in order to keep the image on the screen the display hardware has to cycle through its refresh buffer, drawing the image over and over. The alternative is to use high persistence phosphor CRTs and have the CPU draw the screen and then go away; the image will remain and only slowly fade before the CPU has to come back and redraw it. However, having seen systems like that, I would sooner stick with display hardware with a refresh buffer like any PC; even the old CGA and MDA display cards in the first IBM PC had dedicated memory for a refresh buffer.


RE: What if? - Joseph_21sv - 05-17-2016 11:28 PM

Beyond that first problem that you see, Paul, the problem I have realized is that while my idea may be a complete thought, it does not say what I had intended it to say. It slipped my mind to state that the address bus can be bit-sliced too, and that radically changes the character of the situation: each individual CPU can then have private access to one slice of a wider common address bus, so the two CPUs do not contend for access to the bits of the narrower address bus of an individual CPU.
In the real world, it is only during the later end of the 32-bit era that DRAM has come to be thought of as "generally very slow" compared to a CPU (a late model Pentium II could be clocked at 450 MHz while the SDRAM it was supporting could not have an I/O bus clock faster than 200 MHz). Therefore, caches of SRAM started to be built into the CPUs themselves so as to avoid the delays of accessing slow SDRAM every time they needed to fetch data. 8- and 16-bit CPUs never really needed caching, because their heyday was in the days when, for the most part, you could even expect a computer to have a CPU that was very slow compared to its DRAM. Moreover, the heyday of 8-bit CPUs was so early in computing history that most logic and memory chips simply could not switch fast enough to make it easy to time-division multiplex CPU and VDU accesses to DRAM. Most computers of the day solved this problem with a register that the VDU's logic would rewrite every time the raster switched into and out of the blanking period; the CPU could read this register to see when it was safe to write to the screen. This had the disadvantage that it relied on software not simply ignoring that register and writing to the screen anyway, but the designs must have been done trusting that software developers would not purposefully do that.
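
A toy simulation of the status register trick might look like the sketch below. The line counts, the bit layout of the register and all names are hypothetical and not taken from any particular machine; on real hardware the raster advances by itself, whereas here the loop advances the simulated raster by hand.

```python
VISIBLE_LINES = 192  # hypothetical number of displayed scanlines
TOTAL_LINES = 262    # hypothetical scanlines per frame, blanking included


class VDU:
    """Minimal raster model exposing a one-bit blanking status register."""

    def __init__(self) -> None:
        self.line = 0

    def tick(self) -> None:
        # Advance the raster by one scanline, wrapping at the end of the frame.
        self.line = (self.line + 1) % TOTAL_LINES

    @property
    def status(self) -> int:
        # Bit 0 is set while the raster is in the blanking period.
        return 1 if self.line >= VISIBLE_LINES else 0


def safe_write(vdu: VDU, vram: list, addr: int, value: int) -> int:
    """Poll the status register, then write; returns scanlines waited."""
    waited = 0
    while not (vdu.status & 1):
        vdu.tick()  # stand-in for time passing while the CPU spins
        waited += 1
    vram[addr] = value
    return waited


vdu = VDU()
vram = [0] * 16
print(safe_write(vdu, vram, 3, 0xAA))  # 192
```

Nothing in the sketch stops a program from assigning to `vram` directly without polling, which is the weakness described above.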
Ultimately, though, this only ever applied to the cheaper models of those days. Expensive enough ones already had either a video adapter with separate memory for the VDU to read from, or logic and memory chips fast enough that time-division multiplexing CPU and VDU accesses to DRAM was easier or even effectively unnecessary. Moreover, these technologies became cheaper to deploy so quickly that computer designers simply stopped designing new computers around the slow chips for which the status register trick had been a necessary cost-cutting measure, and nobody really missed them either, having known firsthand what the computers with these chips were like. For my part, I have known no computers like that firsthand, so I cannot say whether I would prefer CPU and VDU accesses to DRAM to be easy, hard or effectively unnecessary to time-division multiplex, but I do not expect to prefer the path of greater resistance if I ever get to know what it is like.


RE: What if? - Paul Berger (Canada) - 05-18-2016 01:20 AM

(05-17-2016 11:28 PM)Joseph_21sv Wrote:  Paul, how wide an address bus do you mean, as wide or twice as wide as each individual CPU? If twice as wide, I would not expect that the two CPUs would ever need to contend address bus access because each individual CPU would then have exclusive access to an individual slice of the address bus, thus obviating the need for either option.

Ok let's take a hypothetical new system board that has a 32 bit data bus and a 32 bit address bus. 32 bits of address will allow you to access about 4.29E9 memory locations. If, as you propose, we split both the address and data bus between the two processors, so that each processor has 16 bits of data and 16 bits of address, well, 16 bits of address only allows you to access 65.5E3 memory locations. Yes, it would work, but that would hardly be adequate for a modern system. Not only that, but since this system board is designed for 32 bits of data and 32 bits of address, the memory array is going to be laid out as an array of 4.29E9 32 bit words. The word being accessed in that array will be the one pointed to by the combination of all the bits active on the whole 32 bit address bus, so with what you propose you would have two separate processors each providing half of that address word; the result, to say the least, would be chaotic. For this to work as you propose, with a split address and data bus, the memory array would need to be designed so that it could be reconfigured into two separate 64K word arrays, and if someone was designing a system board for a 32 bit processor with a 32 bit address bus, why would they do such a thing?
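
The arithmetic and the "chaotic" case can be sketched as follows. This assumes one CPU drives the upper 16 address lines and the other the lower 16, which is one hypothetical reading of the split; the function names are made up for illustration.

```python
ADDR_BITS_PER_CPU = 16  # each hypothetical CPU drives 16 address lines


def locations(bits: int) -> int:
    """Number of distinct addresses reachable with the given bus width."""
    return 1 << bits


def combined_address(high_cpu_addr: int, low_cpu_addr: int) -> int:
    """Address the memory array sees when each CPU drives half the lines."""
    mask = (1 << ADDR_BITS_PER_CPU) - 1
    return ((high_cpu_addr & mask) << ADDR_BITS_PER_CPU) | (low_cpu_addr & mask)


print(locations(16))  # 65536
print(locations(32))  # 4294967296

# Each CPU believes it is addressing its own location, but the memory
# array decodes a single word from both halves taken together.
print(hex(combined_address(0x0001, 0x8000)))  # 0x18000
```

Neither CPU asked for word 0x18000; it is simply what the 32 address lines add up to when two independent programs drive them.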


RE: What if? - Joseph_21sv - 05-20-2016 03:49 PM

In the days of the TI TMS9900 and GI CP16x0 processors that your hypothesis assumes to be on the system board of this computer, 131E3 (128 Ki) memory locations and a 17 bit address bus would have been plenty to work with. However, even at the time when PCs were transitioning from 16 to 32 bits, the dominant 16 bit processors (i80286 and M68000) had a 24 bit address bus (the M68000 internally had a 32 bit address space, but Motorola had no room on the package for pins carrying the highest-order byte of the address). If one assumes either of these historically accurate processors on the hypothetical system board, there are then 33.6E6 (32 Mi) memory locations and a 25 bit address bus (with an internal 33 bit address space). In those days, that would have been plenty to work with (the 32 bit ARM2 processor itself had only 26 bit addressing). Even so, I meant that there would be extra logic for encoding addresses, to build up an address bus as wide as the sum of those of the individual CPUs. That way only two CPUs are needed to slice up an address bus as wide as the sum of their individual address busses, which is what I intended the hypothetical design to involve.
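
The 17 bit and 25 bit figures above can be reproduced with a short calculation, on the assumption that each CPU gets its own bank of memory and one extra address line selects between the two banks. That reading of the numbers is an inference, and the function name is made up for illustration.

```python
def doubled_space(addr_bits: int, cpus: int = 2):
    """Total locations and bus width when each CPU owns one memory bank."""
    total = cpus * (1 << addr_bits)
    # Width needed to give every location across all banks a unique address.
    bus_bits = (total - 1).bit_length()
    return total, bus_bits


print(doubled_space(16))  # (131072, 17)
print(doubled_space(24))  # (33554432, 25)
```

Note that this doubles the number of locations rather than the width of the address bus: a bus as wide as the sum of two 16 bit buses would be 32 bits, not 17.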


RE: What if? - Paul Berger (Canada) - 05-20-2016 08:10 PM

Yes, but you would still have to have extra hardware on the memory array end to change the way the address decoding works.


RE: What if? - Joseph_21sv - 05-21-2016 05:16 PM

That would be TTL and CMOS address decoders (2-4, 3-8, 4-16, …), which could have been integrated into a single circuit for that application even as early as 1973.
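
As a rough illustration of what such a decoder does, here is a sketch of an n-to-2^n line decoder used to pick a memory bank from the top address bits. It models active-high outputs for readability, whereas many real TTL parts use active-low outputs, and the function names and the 17 bit example are hypothetical.

```python
def decode(select: int, n_inputs: int) -> list:
    """n-to-2^n decoder: exactly one output line is 1 for a given input."""
    n_outputs = 1 << n_inputs
    if not 0 <= select < n_outputs:
        raise ValueError("select value does not fit in n_inputs bits")
    return [int(i == select) for i in range(n_outputs)]


def bank_select(address: int, addr_bits: int, bank_bits: int) -> list:
    """Feed the top bank_bits of an address into the decoder."""
    bank = (address >> (addr_bits - bank_bits)) & ((1 << bank_bits) - 1)
    return decode(bank, bank_bits)


print(decode(2, 2))                # [0, 0, 1, 0]
print(decode(5, 3))                # [0, 0, 0, 0, 0, 1, 0, 0]
print(bank_select(0x18000, 17, 1)) # [0, 1]
```

In the last line, a 17 bit address with its top bit set enables the second of two 64K banks.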
In addition, since a dual design could simply have been two copies of a pre-existing design in the same computer, it might have been a machine programmed by nothing but sixteen toggle switches, with no real video or sound driver, just sixteen LEDs and 256 words of memory. Of course, most pre-existing designs used real video (and sound) drivers and thousands of words of memory, even if the video they drove was a 1-bit, strictly text display with no semigraphics or even lowercase letters (the font could have had only 64 characters, or been too blocky to distinguish case or make use of semigraphics), and the sound was one channel with fixed sound effects. Since the design would be doubled, the resulting machine would have double the capabilities of the basic design.