A computer system includes a first integrated circuit that has a central processing unit (CPU) and a graphics controller. An I/O hub, which is coupled to a plurality of input/output buses, includes a RAMDAC. An interconnect bus couples the first integrated circuit and the I/O hub and carries both graphics data to or from a frame buffer and asynchronous system data between the processor and the input/output integrated circuit. The frame buffer may be located in the I/O hub to reduce graphics traffic over the interconnect bus.
1. A computer system comprising: a first integrated circuit, including a central processing unit (CPU) and a graphics controller; an input/output integrated circuit coupled to a plurality of input/output buses, the input/output integrated circuit including a random access memory digital to analog converter (RAMDAC); and a communication link coupling the first integrated circuit and the input/output integrated circuit, the communication link for carrying graphics data to or from a frame buffer and for carrying asynchronous system data between the processor and the input/output integrated circuit. 2. The computer system as recited in 3. The computer system as recited in 4. The computer system as recited in claim 1 wherein the first integrated circuit includes a graphics processor in the graphics controller and wherein the computer system further comprises a memory controller on the first integrated circuit and system memory coupled to the memory controller. 5. The computer system as recited in 6. The computer system as recited in 7. The computer system as recited in 8. The computer system as recited in 9. The computer system as recited in 10. An integrated circuit providing an input/output hub for a computer system, the integrated circuit comprising: a plurality of interfaces for coupling to a plurality of input/output buses of a computer system and coupled to receive and transmit input/output data; a RAMDAC providing analog video outputs; and a link interface for coupling to a communication link, and coupled to the plurality of interfaces and the RAMDAC, the link interface coupled to receive frame buffer data over the link and coupled to receive and transmit input/output data over the link. 11. The integrated circuit as recited in 12. The integrated circuit as recited in 13. The integrated circuit as recited in 14.
A method for communicating frame buffer data to an input/output integrated circuit including a RAMDAC over a communication link connecting the input/output integrated circuit to a first integrated circuit that includes a graphics controller and a CPU, the method comprising: transferring frame buffer data from the first integrated circuit to the input/output integrated circuit over the communication link; transferring input/output data between the CPU and the input/output integrated circuit over the communication link; and transferring input/output data between the memory controller and the input/output integrated circuit over the communication link. 15. The method as recited in writing the frame buffer data into a frame buffer in the input/output integrated circuit; and reading data from the frame buffer and into the RAMDAC for output on a display device. 16. The method as recited in 17. The method as recited in 18. The method as recited in 19. The method as recited in 20. The method as recited in 21. The method as recited in 22. The method as recited in
This application is a continuation-in-part of co-pending application Ser. No. 09/098,854, filed Jun. 17, 1998, entitled “Bus Optimized for Personal Computer Data Traffic” naming Larry D. Hewitt and Dale E. Gulick as inventors, which application is incorporated herein by reference. 1. Field of the Invention The invention relates to computer systems and more particularly to the graphics subsystem. 2. Description of the Related Art Traditional personal computer architectures partition the computer system into the various blocks shown in the exemplary prior art system illustrated in FIG. 1. One feature of this prior art architecture is the use of the Peripheral Component Interconnect (PCI) bus 101 as the connection between the “north bridge” integrated circuit 103 and the “south bridge” integrated circuit 105. North bridge 103 functions generally as a switch connecting CPU 107, a graphics bus 109 such as the Accelerated Graphics Port (AGP) bus, PCI bus 101 and main memory 111. North bridge 103 contains the memory controller function. The architecture also includes the “host bus” connection 108 between north bridge 103 and CPU 107. The south bridge 105 provides an interface to various input/output (I/O) portions of the computer system by providing, e.g., a bridge function between the PCI and legacy ISA bus 115, the Integrated Device Electronics (IDE) disk interface 117 and the Universal Serial Bus (USB) 119. Other devices, buses and functions may also be included in the South Bridge 105. In the illustrated prior art architecture, PCI bus 101 also functions as a major input/output bus for add-in functions such as network connection 121. The various busses and devices shown in The exemplary graphics subsystem 123, illustrated in AGP 109 offloads graphics traffic from the PCI bus 101 and allows the graphics controller direct access to main memory 111 for graphics information (e.g. 3-D textures) stored in system memory. 
The graphics subsystem conventionally includes a graphics processor 124 for performing video calculations and video memory including frame buffer 125. The frame buffer provides a digital representation of the screen image. Video memory also typically includes memory storing data used in video calculations performed by the video processor. The size of video memory varies, but typical graphics cards have memory on the order of, e.g., 4-16 Mbytes. Another function provided by the graphics subsystem 123 is provided by the random access memory digital to analog converter (RAMDAC). The RAMDAC converts the digital representation of the screen stored in frame buffer 125 into analog data for display device 127. The RAMDAC may be, for instance, a 230 MHz RAMDAC with three 8 bit DACs providing the red, green and blue (RGB) signals to display 127. One approach for providing lower cost personal computers is to provide an integrated graphics and north bridge function in order to try to reduce the number of system components. One disadvantage of such an approach is that the RAMDAC consumes a lot of power, e.g., on the order of 3 W, making the power budget for the north bridge considerably higher than otherwise. Another cost savings approach, which may be used in conjunction with an integrated graphics and memory controller, is to use system memory 111 for the video memory rather than providing for separate video memory. Such an approach is known as Unified Memory Architecture (UMA). The UMA approach can provide lower cost systems by eliminating separate video memory at the price of reduced graphics performance. It would be desirable to reduce the cost of personal computers by providing a more highly integrated system without paying the penalty of high power consumption caused by the RAMDAC. Accordingly, in one embodiment a computer system includes a first integrated circuit that has a central processing unit (CPU) and a graphics controller.
An input/output integrated circuit, e.g., an I/O hub, which is coupled to a plurality of input/output buses, includes a RAMDAC. A communication link couples the first integrated circuit and the input/output integrated circuit and carries both graphics data to or from a frame buffer and asynchronous system data between the processor and the input/output integrated circuit. The frame buffer may be located in the I/O hub, which further reduces graphics traffic over the communication link. Another embodiment provides a method for communicating frame buffer data to an input/output integrated circuit that includes a RAMDAC, over a communication link connecting the input/output integrated circuit to a first integrated circuit that includes a graphics controller and a CPU. The method includes transferring frame buffer data from the first integrated circuit to the input/output integrated circuit over the communication link. The method further includes transferring input/output data between the CPU and the input/output integrated circuit over the communication link and transferring input/output data between the memory controller and the input/output integrated circuit over the communication link. If the frame buffer is located in the input/output integrated circuit, the frame buffer data is data written into the frame buffer. Otherwise, the frame buffer data is data being read from the frame buffer and provided to the input/output integrated circuit. The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings in which use of the same reference symbols in different drawings indicates similar or identical items. Referring to Each of the data lines 213 and 220 may include 2^n data bits, where n ≥ 0. Thus, a minimum implementation has one data bit in each direction.
However, each unidirectional connection is typically in byte multiples of 8, 16, 32 bits or larger and is fixed in size. Note, however, that the link does not have to be symmetrical. In other words, data lines 213 may be one byte wide, while data lines 220 are four bytes wide. Each unidirectional connection 208 and 210 includes one control line, regardless of the number of data lines on that unidirectional portion. In an exemplary embodiment, processor module 201 provides the major processing function in the computer system and includes memory controller 202, one or more central processing units (CPUs) 204, and graphics controller 206. Processor module 201 may include one or more integrated circuits. For example, processor module 201 may be a daughter card that is populated by a number of separate integrated circuits and that plugs into a motherboard on which is mounted interface module 203. Alternatively, processor module 201 may be a single integrated circuit. System memory (not shown) is coupled to the memory controller 202. Interface module 203 in the exemplary embodiment shown in In one implementation, as described in the co-pending patent application Ser. No. 09/098,854, entitled “Bus Optimized for Personal Computer Data Traffic”, filed Jun. 17, 1998, previously incorporated herein by reference, link 209 provides guaranteed bandwidth and latency to each isochronous stream such as RAMDAC data, audio data, and 1394 isochronous streams while also attempting to minimize latency to asynchronous accesses such as CPU-initiated accesses and PCI-initiated accesses. That may be accomplished by guaranteeing a certain amount of bandwidth to isochronous streams during each predetermined time frame on the link. Additional details can be found in Ser. No. 09/326,304, entitled “Computer Interconnection Bus Link Layer”, filed Jun. 4, 1999, by Dale Gulick, et al., which application is incorporated herein by reference.
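The idea of guaranteeing bandwidth to isochronous streams during each time frame can be illustrated with a minimal sketch. It is not the mechanism of the referenced applications, which is not reproduced here. The function name `schedule_frame`, the packet tuples, and the byte budgets are all hypothetical and chosen only for illustration.

```python
from collections import deque


def schedule_frame(iso_queue, async_queue, frame_bytes, iso_budget):
    """Pick the packets sent during one link time frame (illustrative only).

    iso_queue, async_queue: deques of (name, size_in_bytes) packets.
    frame_bytes: total bytes the link can carry in one frame (assumed).
    iso_budget: bytes reserved for isochronous traffic per frame (assumed).
    """
    if iso_budget > frame_bytes:
        raise ValueError("isochronous budget cannot exceed the frame size")
    sent = []
    used = 0
    # Isochronous packets go first, up to their guaranteed share of the frame.
    while iso_queue and used + iso_queue[0][1] <= iso_budget:
        name, size = iso_queue.popleft()
        sent.append(name)
        used += size
    # The rest of the frame is available to asynchronous accesses.
    while async_queue and used + async_queue[0][1] <= frame_bytes:
        name, size = async_queue.popleft()
        sent.append(name)
        used += size
    return sent, used


iso = deque([("ramdac", 64), ("audio", 16), ("ramdac2", 64)])
asy = deque([("cpu_read", 32), ("pci_write", 32), ("cpu_write", 32)])
sent, used = schedule_frame(iso, asy, frame_bytes=160, iso_budget=96)
print(sent, used)
```

Packets that do not fit in the current frame stay queued for the next one, so the isochronous streams see a bounded wait of roughly one frame under these assumptions.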
In another embodiment, providing guaranteed bandwidth and latency may be accomplished by always giving highest priority to any isochronous packets, such as graphics data that would flow to RAMDAC 238. The latter approach assumes that the amount of isochronous data is small enough to not unduly impact the latency of CPU accesses. The key aspect of the link for the invention described herein is the ability to provide sufficient throughput to support both isochronous and asynchronous data streams over the same link connecting an integrated north bridge and an I/O hub such as interface module 203 within appropriate latency requirements for the respective streams. Current north bridge architectures utilizing the PCI bus do not provide that capability. In one embodiment, referring to In one embodiment there is a virtual channel in the link layer for each functional unit connected to the bus. For example, processor module 201 ( Referring to The link layer hardware, unlike the protocol layer, is specific to the requirements of the local integrated circuit on which the link layer is implemented. FIFOs and queues are designed to the specific requirements of the channels being serviced by the link. Generally, the FIFOs of each channel are optional, based on the channel requirements. For example, one would not expect any isochronous transmit or receive FIFOs for the interface module's PCI block, since the PCI protocol does not allow for guaranteed isochronous data transfers (although nothing prevents a designer from including these for the PCI bus channel). The interface module's RAMDAC interface 238 (FIG. 2), which provides data for the display, would typically require a single isochronous receive FIFO and potentially no asynchronous access FIFOs at all. FIFOs utilized in the link layer may be designed with the following considerations in mind.
The FIFO may contain isochronous or asynchronous data; the FIFO may transmit data to the link or receive data from the link; the FIFO has a predetermined size (in bytes); the FIFO may be a master (controlled by local timing) or a slave (responding only to accesses); if a FIFO is a slave, then design considerations include what causes the slave FIFO to be loaded. The FIFO target address may be static or it may increment with each byte, in which case it will be reloaded periodically. The processor module's system memory channel has special requirements, since multiple streams from the interface module may attempt to access it. It requires several isochronous transmit and receive FIFOs and, potentially, multiple asynchronous FIFOs. In the illustrated embodiment, link 209 is a point to point link physically connecting precisely two integrated circuits. In that way, transfer speed across the link may be maximized. A bit time on link 209 is half a clock (CLK) period in duration. That is, two data bits are transmitted on each wire per cycle on the link. For example, for a clock (CLK) rate of 400 megahertz, data is provided on each edge of the clock which results in 800 million bit times per second. Link 209 transfers packet based information to and from a unit or function, which is a logical entity within a node, such as memory controller 202 that acts as a source or a destination for transactions. A node is a physical entity that connects to one end of the link. A transaction is a sequence of packets exchanged between two or more nodes in the system which results in a transfer of information. A source is the node that starts a transaction. A target is the node that ultimately services the transaction on behalf of the source. There may be intermediary nodes between the source and the destination. That is, the link may be configured as a daisy chain. 
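The bit-time arithmetic above (two bit times per clock period, so 800 million bit times per second at 400 MHz) can be checked with a short calculation. The function names below are illustrative, and the byte-rate helper assumes that every data line of a unidirectional connection carries one bit per bit time.

```python
def bit_times_per_second(clk_hz):
    # Data is provided on each clock edge, so a bit time is half a CLK period.
    return 2 * clk_hz


def link_bytes_per_second(clk_hz, data_width_bits):
    # Assumption: each data line moves one bit per bit time in one direction.
    return bit_times_per_second(clk_hz) * data_width_bits // 8


clk = 400_000_000  # 400 MHz, the example clock rate from the text
print(bit_times_per_second(clk))       # 800000000
print(link_bytes_per_second(clk, 8))   # 800000000
print(link_bytes_per_second(clk, 16))  # 1600000000
```

Under these assumptions an 8 bit wide connection clocked at 400 MHz moves 800 Mbytes per second in one direction.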
In addition to the data, CTL and CLK signals, each device in the link receives a PwrOk signal (not shown) indicating that power is stable and clocks have been running for a predetermined amount of time. Each device on the link also receives a reset signal (not shown). The packet structure described herein is described with relation to 8 bit wide data lines. The packet structure for wider links, e.g., 16 and 32 bit links, can be derived from the 8-bit link packet structure by combining the fields within adjacent bit times. Here, BT_N^m represents the Nth bit time within a packet for a link of width m and “&” represents concatenation. Thus, it takes two bit times to transfer the same information on an eight bit link as can be transferred in one bit time on a sixteen bit link. Note that if all packets are multiples of 4 bytes long, packet boundaries will always fall on bit time boundaries in the 16 and 32 bit case, as well as the 8 bit case, and thus CTL may only be asserted on 4 byte boundaries in certain implementations. The link in the embodiment described herein is assumed to be a pipelined split transaction link in which transactions are tagged by the source and responses can return out of order. Write command and read response packets always have an associated data packet. The data packet associated with the write command is the write data. The data packet associated with the read response is the requested read data for a previously sent read command. In one embodiment, the link transmits control packets and data packets. The control signals 212 or 219 distinguish between the two types of packets. Control packets provide control information. For example, the control information may indicate the kind of command (read or write) or an address to which the command is directed. Data packets carry data related to corresponding read or write operations.
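The derivation of wider-link bit times from the 8-bit packet structure, described above as combining the fields of adjacent bit times, might be sketched as follows. The function name `widen_bit_times` is hypothetical. The placement of the earlier 8-bit bit time in the low-order byte is an assumption made for illustration, because the text does not state the concatenation order.

```python
def widen_bit_times(bt8, width_bits):
    """Combine 8-bit-link bit times into bit times for a wider link.

    bt8: list of byte values, one per bit time on an 8 bit link.
    width_bits: 8, 16 or 32.
    """
    if width_bits not in (8, 16, 32):
        raise ValueError("width must be 8, 16 or 32 bits")
    group = width_bits // 8
    if len(bt8) % group != 0:
        raise ValueError("packet length must be a multiple of the link width")
    wide = []
    for n in range(0, len(bt8), group):
        value = 0
        for k in range(group):
            # ASSUMPTION: the earlier 8-bit bit time lands in the low-order byte.
            value |= (bt8[n + k] & 0xFF) << (8 * k)
        wide.append(value)
    return wide


packet = [0x11, 0x22, 0x33, 0x44]  # a 4 byte packet as sent on an 8 bit link
print([hex(v) for v in widen_bit_times(packet, 16)])  # ['0x2211', '0x4433']
print([hex(v) for v in widen_bit_times(packet, 32)])  # ['0x44332211']
```

A 4 byte packet therefore occupies four bit times on an 8 bit link, two on a 16 bit link, and one on a 32 bit link, which is why packets that are multiples of 4 bytes always end on a bit time boundary.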
In order to reduce latency for information in control packets and to provide flexibility, the CTL signal allows control packets to be inserted in the middle of data packets. That is particularly advantageous when data packets are long. In addition, the control signal can also be used to insert delays in the middle of control packets. A data packet may not always immediately follow the last bit time of its associated control packet. That is, another control packet can be placed between a control packet and its associated data packet. In the illustrated embodiment, CTL may only be deasserted when data transfer due to a previously transmitted control packet is pending, or in the middle of a control packet. An exemplary packet structure is illustrated in Nodes with multiple logical I/O streams may own multiple Unit ID values. For example, interface module 203 may be a node with multiple IDs associated with the various interfaces 230-238. Sequence ID (SeqId[3:0]) is used to tag groups of requests which were issued as part of an ordered sequence by a device, and must be strongly ordered within a virtual channel. All requests within the same I/O stream and virtual channel that have matching nonzero sequence ID fields have their ordering maintained. The sequence ID value of 00 is reserved to mean that a transaction is not part of a sequence. Transactions with this value have no sequence ordering restrictions, although they may be ordered for other reasons. PassPW indicates that this packet is allowed to pass packets in the posted request channel in the same I/O stream. Otherwise, it must stay in order behind them. Bytes three and four of the four byte command packet are command specific. The last four bytes (Addr[39:8]) represent the doubleword address accessed by the command. Not all address bits are included in all command types. Where finer granularity is required, byte enables may be used. An exemplary control packet for a response is illustrated in FIG.
5C and is four bytes long. The bridge field indicates whether the response packet was placed onto the link by the host bridge, and is used to distinguish responses travelling upstream from responses travelling downstream. The Error bit is present in all responses, and is used to indicate that the issued request could not be completed. If the Error bit is set, the NXA (Non-eXistent Address) bit is valid. If the NXA bit is set, the request could not be completed because no agent on the chain accepted the request. If the NXA bit is clear, the request packet reached its addressed target, but could not be completed by the device. The command field will of course vary from system to system. Exemplary commands include NOPs (no operation), write commands, read commands, and read responses. In addition, commands may relate to initialization of the links. Referring to The key aspect of interface module 203 relevant to the present invention is the presence of RAMDAC 238. RAMDAC 238 converts the digital screen image from the frame buffer into analog data, which is provided to display 252. The frame buffer stores in digital form the video image displayed on display 252. In a typical application, the RAMDAC reads 400-600 Mbytes/second from the frame buffer. As the construction and operation of RAMDACs is well known in the art, RAMDACs are not described in detail unless necessary for greater understanding of the invention. One advantage of placing the RAMDAC on interface module 203 rather than processor module 201 is that the RAMDAC consumes a relatively large amount of power, e.g., 3 W. Moving the RAMDAC off of processor module 201 may simplify process technologies used in fabricating the integrated processor module 201 by eliminating a significant number of analog components which need to operate at voltages undesirable for the processor module 201. In addition, removing the RAMDAC from processor module 201 can significantly reduce its power budget.
The frame buffer may be located in processor module 201 as part of graphics controller 206. Alternatively, the frame buffer may instead utilize system memory 250 in a Unified Memory Architecture (UMA) approach that was previously described. The UMA approach does tend to reduce graphics performance; however, such a performance reduction may not be a concern for a wide variety of applications. In a typical scenario, the RAMDAC requires frame buffer data at the rate of, e.g., 400-600 Mbytes per second. That results in 400-600 Mbytes per second peak of frame buffer traffic on link 209. If the link is operating at, e.g., 800 Mbytes per second, graphics traffic is consuming 50-75% of the bus cycles. While that may be tolerable in some systems, another option can significantly reduce the amount of RAMDAC traffic on link 209. Vertical refresh time reduces time interval. Specifically referring to In addition to an analog output, interface module 203 can also provide a digital video output port 256 to support digital display 258. Such digital displays include flat panel displays such as those found in notebook computers. Such displays utilize digital data as inputs to describe the screen display, rather than the analog outputs provided by RAMDAC 238. In an exemplary embodiment, digital video port 256 provides 16 bits (4 control bits and 12 data bits) and supports a screen resolution of up to 1024 by 768. As has been described, locating the RAMDAC in interface module 203 reduces the power consumed by processor module 201 containing the CPU. Further advantages may be obtained by locating the frame buffer in interface module 203, which can significantly reduce link traffic. The embodiments described so far have an integrated CPU/memory controller/graphics controller. In other embodiments, the memory controller and the graphics controller may be integrated, while the CPU is on a separate chip.
In such a configuration it is still advantageous to locate the RAMDAC in interface module 203 for the same reasons given above.
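The link-traffic estimate given earlier (400-600 Mbytes per second of frame buffer traffic on a link operating at 800 Mbytes per second) can be reproduced with a short calculation. The function name `link_utilization` is illustrative only, and the figures are the example values from the text rather than measurements.

```python
def link_utilization(stream_mbytes_per_s, link_mbytes_per_s):
    # Fraction of link cycles consumed by one stream.
    return stream_mbytes_per_s / link_mbytes_per_s


link_rate = 800  # Mbytes per second, example link rate from the text
for ramdac_rate in (400, 600):
    share = link_utilization(ramdac_rate, link_rate)
    print(f"{ramdac_rate} Mbytes/s of frame buffer traffic uses {share:.0%} of the link")
```

The remainder, 25-50% of the link in this example, is what is left for asynchronous traffic when the frame buffer is not located in the interface module.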