Last time, I explored the inside of the Super Nintendo cartridges. Today I am going through its video system.
I put myself in the shoes of a Nintendo engineer working in Masayuki Uemura (上村雅之)'s team[1][2] by studying what was available in 1989, namely a TV set, to understand what decisions had to be made while designing the SNES video system.
Here is the summary of what I learned. Perhaps you will enjoy tagging along.
The screen upon which the SNES outputs video is a standard TV set. Usually it is used to watch Captain Tsubasa, Cobra, Astro Boy, Captain Herlock, Saint Seiya, or Dragon Ball.
There is an antenna on the roof of the house which catches the analog TV broadcast (NTSC), a cable bringing the signal to a tuner, and finally the part where the image is displayed, called a cathode ray tube (CRT).
More importantly for the topic at hand, there are auxiliary (AUX) inputs. A basic TV set would have a composite connector (in yellow) which carries a video signal. The auxiliary stereo audio signals are carried over dedicated jacks (in white and red on my ugly drawing).
The CRT is a super line drawing machine. At the time, they were rated at 15kHz which means they could draw in the vicinity of 15,000 lines per second.
Inside the CRT is a gun with three electron cannons. The cannons always shoot straight in front of them, and two sets of magnets (one vertical and one horizontal) route them up/down and left/right.
In the drawing above, I colored the rays from the cannons but only so the reader can follow. Electrons have no color. There is a mask in front of the phosphor strips to make sure the electrons from each cannon land in the appropriate color strip.
There are no pixels in the world of CRTs. A slot is not a pixel. The drawing below zooms into a scanline where various parts of slots are hit. The one guarantee is that electrons from a cannon always land in the correct color strip.
A HD TV has smaller slots, better able to render the color signal. In the drawing below, the same line is rendered horizontally with more fidelity thanks to the higher density of slots.
A CRT consumes five signals, carried over four wires. There is one wire for each of the Red, Green, and Blue signals. They are directly connected to the cannons of the gun. The higher the signal, the more electrons are shot and the brighter the phosphor strips are. No signal on all three wires means no electrons being shot, resulting in black being displayed on that line.
The white wire in the drawing above carries the synchronization signals. There are two, named Horizontal Sync (HSYNC) and Vertical Sync (VSYNC). The two signals use the same wire so it is called Composite Sync (CSYNC).
With my PC programming background, I was used to "Wait for VSYNC", which gave me the false idea that the CRT emitted it. That is wrong. A CRT emits nothing. It only consumes signals and tries to synchronize the cannon with them.
The CRT draws a line (a.k.a raster) from left to right. When it receives a HSYNC event, it "returns" to the left of the screen (X=0). When it receives a VSYNC event, it goes back to the top of the screen (Y=0).
Observant readers will see a problem with these events. There is no way to go down. The system driving the CRT can issue as many HSYNC and VSYNC as it wants, but the same line at the top of the screen will end up being drawn over and over.
The key to understanding CRTs is to assimilate that the cannon moves towards the right of the screen with a downward slope[3]. Upon HSYNC, the CRT returns to X = 0 but because the cannon will have aimed downward, the next line will be drawn below the previous one.
This opens the door to cool tricks. The drawing above shows a signal where VSYNC is issued at the same time as the last HSYNC. The lines are always drawn at the same location on the screen. But look below what happens if a VSYNC is issued between two HSYNC.
Because it only drew half a line at the bottom, the CRT starts drawing the next line at the top of the screen at the same X position. The next set of lines will be interlaced with the previous set.
Line sets are called "fields". The mode where fields are drawn at the same location is called "progressive" scan ("p"). The mode where fields are interlaced is abbreviated "i". In i mode, the tradeoff is that the vertical resolution is doubled but the refresh rate of each line is halved.
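To convince myself of the difference between "p" and "i", here is a toy model of the beam. It is only a sketch under simplifying assumptions of mine: time is counted in line durations, the beam sinks by exactly one line height per line duration, and retrace is instantaneous. The function name `line_start_heights` and its parameters are made up for this illustration.

```python
import math

def line_start_heights(lines_per_field, fields=2, lines_to_show=3):
    """Toy model (assumption): HSYNC fires at every integer time and
    VSYNC fires every `lines_per_field` line durations. The height of a
    line is the time elapsed since the last VSYNC when the line starts."""
    per_field = []
    for field in range(fields):
        vsync_time = field * lines_per_field
        first_hsync = math.ceil(vsync_time)  # first full line after VSYNC
        per_field.append(
            [first_hsync + k - vsync_time for k in range(lines_to_show)]
        )
    return per_field

progressive = line_start_heights(262)    # VSYNC lands on an HSYNC
interlaced = line_start_heights(262.5)   # VSYNC lands between two HSYNC

print(progressive)  # both fields start their lines at the same heights
print(interlaced)   # the second field starts half a line lower
```

With 262 lines per field, every field lands on the same heights. With 262.5, the second field is shifted by half a line, which is the interlacing described above.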
NTSC interlaces two fields per frame, each pair forming a full picture at 30-ish Hz. Therefore all CRTs provisioned enough space between lines for interlacing. When drawing in progressive, scanline spacing is visible. It results in black space between lines[4] which is characteristic of CRT rasterization.
Visible scanline gaps. Photo source: retrogameboards.com
The CRT is numeric when it comes to drawing lines but analog when it comes to what is inside a line[5]. As seen in the drawing, the three cannons are directly connected to the three RGB wires. A system is free to change the color signal as much as it wants (hence use any horizontal resolution). The only limit is signal propagation and the slot mask density.
While the SNES designers could issue what they wanted on the wires, they still had to make sure the CRT would be able to deal with it. Since the hardware is designed to display a NTSC signal, whatever they decided on had to be close to these specifications[6][7].
59.94Hz is such a weird number. Isn't the power grid running at 60Hz and TVs used that AC frequency directly? Black and White NTSC used to be 60Hz. When broadcast engineers had to find a way to add color to the NTSC signal without breaking backward compatibility they decided to reduce frequency by 0.1% to avoid artifacts[8][9].
Now that we know how a CRT works, it is time to play at being a Nintendo engineer and craft a video system.
The first choice to make is how many lines we want. NTSC uses 262.5 lines per field but the half-line is there to interlace fields. We can use 262 to make it progressive. With a target framerate of 59.94, that requires 262 * 59.94 ≈ 15,704 lines per second, which is within 5% of the 15kHz rating.
The CRT screen has an aspect ratio of 4:3. If we use 350 dots horizontally, we will match exactly that aspect ratio and there will be no distortion when the console image is converted into scanlines.
262 lines at 59.94Hz, each with 350 dots means we need a dot clock pulsing at 262 * 350 * 59.94 = 5,496,498Hz. We can craft an ASIC which counts dot ticks. Every 350 ticks it issues a HSYNC. Every 350*262 = 91,700 ticks, it issues a VSYNC[10]. I guess we are done?
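The naive counter can be sketched in a few lines. This is only an illustration of the idea, not how the real ASIC works. The names `count_syncs`, `LINES`, `DOTS`, and `RATE_HZ` are mine.

```python
LINES = 262      # lines per frame, progressive
DOTS = 350       # dots per line in the naive design
RATE_HZ = 59.94  # target refresh rate

# Dot clock needed to draw LINES * DOTS dots RATE_HZ times per second.
dot_clock_hz = LINES * DOTS * RATE_HZ

def count_syncs(ticks, dots=DOTS, lines=LINES):
    """Count dot ticks, issuing HSYNC every `dots` ticks and
    VSYNC every `dots * lines` ticks."""
    hsync = vsync = 0
    for tick in range(1, ticks + 1):
        if tick % dots == 0:
            hsync += 1
        if tick % (dots * lines) == 0:
            vsync += 1
    return hsync, vsync

# Run the counter for the duration of two frames.
hsyncs, vsyncs = count_syncs(2 * DOTS * LINES)
print(round(dot_clock_hz), hsyncs, vsyncs)
```

Over two frames' worth of ticks, the counter issues 524 HSYNC and 2 VSYNC, with the last HSYNC of each frame coinciding with the VSYNC.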
There are two issues with this naive design.
When the gun position is reset horizontally or vertically, it continues to shoot electrons. If it were to keep on shooting, it would create visible artifacts.
Another thing to consider is that TVs tend to over-scan their screen area[12], which means the picture on the screen is a little larger than the display. How much the TV over-scans varies from TV to TV. This happens to hide wobbling.
When the gun vertical position is reset to Y=0 (after VSYNC), it is going to undulate up and down for a while. You only get straight lines after a few µs. The same problem happens horizontally after HSYNC.
The solution to all these problems is to "stop" the CRT cannon a little bit after VSYNC and after HSYNC. These time spans during which no electrons are shot are called respectively VBLANK and HBLANK.
All gaming systems of that era used blanking. Here is a summary of the SNES competitors.
Machine | Year | Lines | VBLANK lines | Visible lines | Lines per second | Framerate |
---|---|---|---|---|---|---|
Capcom arcade CPS-1 | 1989 | 262 | 38 | 224 | 15,622 | 59.6294[13] |
Sega Genesis | 1989 | 262 | 38 | 224 | 15,700 | 59.9227[14] |
Neo-Geo AES[15] | 1990 | 264 | 40 | 224 | 15,625 | 59.18 [16] |
If we look closely at the recap table above, we see that all the competing systems, namely the Megadrive, the Neo-Geo, and Capcom's CPS-1 used 224 visible lines.
They probably did not pick that number at random. 224 is evenly divisible by 16 (224/16 = 14), which means it plays nicely with the tilemaps of the graphics rendering pipeline.
My best guess is that Nintendo did not want to reinvent the wheel. They did not need higher resolution but better graphics. What made the system stand apart was its PPUs.
In the end, they went the safe way and split their 262 lines per frame into 224 visible + 38 blanks (as the drawing on the right shows).
Arcade games could afford to be as peculiar as they wanted on a per-title basis.
The designers of R-Type at Irem were unsatisfied with the default "standard" 224 active lines of a CRT.
They calibrated their M72-System registers to draw 284 lines, 512 dots, and used an 8 MHz dot-clock. Leaving 128 dots to HBLANK and 28 lines to VBLANK resulted in an active resolution of 384x256 which was higher than other arcade titles at the time.
The trade-off was a vertical refresh rate of 55.017605 Hz which was visually less pleasing and dangerously far (roughly 8%) from the CRT recommended values. This refresh rate is difficult to replicate for "modern" emulators but what an impressive feat for a 1987 system!
R-Type (1987) has a whopping 256 visible lines (photo credit: wikipedia)!
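The R-Type numbers are easy to double-check. Here is a small sketch using the figures quoted above (the function name `refresh_rate_hz` is my own).

```python
def refresh_rate_hz(dot_clock_hz, dots_per_line, lines_per_frame):
    """A frame takes dots_per_line * lines_per_frame dot ticks."""
    return dot_clock_hz / (dots_per_line * lines_per_frame)

# M72 timings as quoted in the text: 8 MHz dot clock, 512 dots, 284 lines.
rtype_hz = refresh_rate_hz(8_000_000, 512, 284)

# Active area once HBLANK (128 dots) and VBLANK (28 lines) are removed.
active = (512 - 128, 284 - 28)

print(f"{rtype_hz:.6f} Hz, active area {active[0]}x{active[1]}")
```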
So far we have picked a number of lines per frame (262). We also know we won't be able to pick a dotclock. We have to use the Master clock (21.47727MHz) and use a divider to end up close to NTSC dotclock. That leaves us with using a 21.47727 MHz / 4 = 5.3693175 MHz dot clock.
Lines, dots, dot clock and refresh rate are inter-connected via the framerate equation.
refresh rate = dot clock / (lines * dots)
Given that our target refresh rate is 59.94Hz, we don't have much of a choice for the number of dots per line.
dots = 5369317.5 (dot clock) / (262 (lines) * 59.94 (rate)) ≃ 342
Except that for gory reasons involving carrier artifacts when using composite outputs, Nintendo engineers had to use 341 dots per line instead of 342. This leaves the SNES with a framerate of:
refresh rate = 5369317.5 / (341 * 262) = 60.098Hz
60.098Hz is not NTSC's 59.94 Hz but since, as seen previously with R-Type, CRTs have tolerance, it works. If you enjoyed this part, Nerdy pleasure has plenty more[17].
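For those who like to check the arithmetic, here is the same reasoning as a short sketch. The constant names are mine, and the values are the ones quoted above.

```python
MASTER_CLOCK_HZ = 21_477_270        # NTSC SNES master clock
DOT_CLOCK_HZ = MASTER_CLOCK_HZ / 4  # 5,369,317.5 Hz
LINES = 262                         # lines per frame, progressive
TARGET_HZ = 59.94                   # NTSC refresh rate

# Solve the framerate equation for the number of dots per line.
ideal_dots = DOT_CLOCK_HZ / (LINES * TARGET_HZ)

# Plug the 341 dots Nintendo settled on back into the equation.
snes_hz = DOT_CLOCK_HZ / (341 * LINES)

print(f"ideal dots per line: {ideal_dots:.2f}")
print(f"refresh rate with 341 dots: {snes_hz:.3f} Hz")
```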
Of these 341 dots, not all of them are usable, for the same wobbling, artifact hiding, and TV overscan reasons. The SNES needs a horizontal overscan during which it issues a blank signal.
The constraints are:
A third constraint was to allow enough time for the PPU to populate its sprite line buffer during HBLANK. My guess is that up to 128 sprites was a lot of data to retrieve and the PPU needed more than the 7µs granted by 37 dots of HBLANK if 304 visible dots were to be picked as horizontal resolution.
In the end, Nintendo decided on 256 visible dots per line with 85 dots of HBLANK. This means the PPU has 16µs to retrieve sprite data during HBLANK. This also means the aspect ratio was not 4:3 but 8:7 which results in slight distortion when the CRT displayed what the PPU generated.
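The HBLANK budgets and the aspect ratio can be recomputed from the dot clock. This is a quick sketch with names of my own choosing (`hblank_us`, `DOTS_PER_LINE`).

```python
from fractions import Fraction

DOT_CLOCK_HZ = 21_477_270 / 4  # 5,369,317.5 Hz
DOTS_PER_LINE = 341

def hblank_us(visible_dots):
    """Duration of HBLANK in microseconds for a given visible width."""
    blank_dots = DOTS_PER_LINE - visible_dots
    return blank_dots / DOT_CLOCK_HZ * 1e6

print(f"304 visible dots: {hblank_us(304):.1f} us of HBLANK")
print(f"256 visible dots: {hblank_us(256):.1f} us of HBLANK")

# Fraction reduces 256:224 to its simplest form.
aspect = Fraction(256, 224)
print(f"aspect ratio {aspect.numerator}:{aspect.denominator}")
```

Going from 304 to 256 visible dots more than doubles the time available to the PPU on each line.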
So far we have designed the SNES video system with only progressive mode in mind.
Overscan resolution: 341x262
Visible resolution: 256x224
Framerate: 60.098Hz
Even though this is what 99% of games ended up using, the SNES also had high-resolution modes. It can double its resolution vertically and/or horizontally.
Doubling the resolution vertically to 448 lines is easy. We can just change the counter to issue a VSYNC half a line after the last HSYNC to interlace fields. That means drawing 262.5 lines per field but each line is now refreshed at only 60.098/2=30.049Hz. It will cause flickering and it won't be very pleasant but the vertical resolution will be higher[18].
Doubling the horizontal resolution however is much more difficult since the console doesn't have the dotclock for it.
The hack is that the SNES shifts every second field horizontally a bit, so the dots of the field end up between the dots of the previous field. You end up with something running at half the framerate and massive color bleeding. Quite a few titles used it, mainly for menu screens as detailed in fullsnes.txt.
Hires Software
Air Strike Patrol (mission overview) (whatever mode? with Interlace)
Bishoujo Wrestler Retsuden (some text) (512x448, BgMode5+Interlace)
Ball Bullet Gun (in lower screen half) (512x224, BgMode5)
Battle Cross (in game) (but isn't hires?) (512x224, BgMode1+PseudoH)(Bug?)
BS Radical Dreamers (user name input only) (512x224, BgMode5)
Chrono Trigger (crash into Lavos sequence) (whatever mode? with Interlace)
Donkey Kong Country 1 (Nintendo logo) (512x224, BgMode5)
G.O.D. (intro & lower screen half) (512x224, BgMode5)
Jurassic Park (score text) (512x224, BgMode1+PseudoH+Math)
Kirby's Dream Land 3 (leaves in 1st door) (512x224, BgMode1+PseudoH)
Lufia 2 (credits screen at end of game) (whatever mode?)
Moryo Senki Madara 2 (text) (512x224, BgMode5)
Power Drive (in intro) (512x448, BgMode5+Interlace)
Ranma 1/2: Chounai Gekitou Hen (256x448, BgMode1+InterlaceBug)
RPM Racing (in intro and in game) (512x448, BgMode5+Interlace)
Rudra no Hihou (RnH/Treasure of the Rudras)(512x224, BgMode5)
Seiken Densetsu 2 (Secret of Mana) (setup) (512x224, BgMode5)
Seiken Densetsu 3 (512x224, BgMode5)
Shock Issue 1 & 2 (homebrew eZine) (512x224, BgMode5)
SNES Test Program (by Nintendo) (Character Test includes BgMode5/BgMode6)
Super Play Action Football (text) (512x224, BgMode5)
World Cup Striker (intro/menu) (512x224, BgMode5)
Notes: Ranma is actually only 256x224 (but does accidentally have interlace enabled, which causes some totally useless flickering).
We are still not done. In Europe, TVs don't use NTSC but PAL and the French even use SECAM. The framerate expected is exactly 50Hz and there are 312.5 lines per field.
That is actually a simple problem to solve. These versions of the SNES ship with an oscillator running at 17.7344750MHz (instead of NTSC 21.4772700MHz). There is an S-CLK chip which does 6/5 and then the same /4 divider to give a dot clock of 17.734475 * (6/5) /4 = 5.32034250MHz[19].
The problem is that only 224 lines of graphics are going to result in big black bands above and below the active zone. This is solved via an "Overscan mode" which increased the number of visible lines to 240 (that is 16 lines which is one tile tall).
What a blessing for game developers willing to port a game to the European market you may say. In practice, "overscan mode" was almost never used. Most titles were tailor made for 224 lines so the developers did not know what to put in these 16 extra lines. In total, only twelve titles ever used it[20]. Nintendo still managed to do something awesome with their flagship title Super Mario World by increasing the vertical view range.
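Here is the clock chain side by side for both regions, as a sketch. The function `dot_clock_hz` and its `multiplier` parameter are names I made up. The 6/5 step stands for what the S-CLK chip does on PAL units, while NTSC units use the oscillator as-is.

```python
def dot_clock_hz(oscillator_hz, multiplier=1.0, divider=4):
    """Oscillator -> master clock (optional multiplier) -> /4 dot clock."""
    master_hz = oscillator_hz * multiplier
    return master_hz / divider

ntsc_dot = dot_clock_hz(21_477_270)
pal_dot = dot_clock_hz(17_734_475, multiplier=6 / 5)

print(f"NTSC dot clock: {ntsc_dot / 1e6:.7f} MHz")
print(f"PAL dot clock:  {pal_dot / 1e6:.7f} MHz")
```

The two dot clocks end up within 1% of each other, which is why the same 341 dots per line can be reused.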
NTSC (256x224) | PAL (256x240) |
Note that both NTSC and PAL screens use the same 4:3 aspect ratio so the PAL image is a little bit more compressed vertically than the NTSC one.
NTSC screen (4:3) | PAL screen (4:3) |
Besides the annoying black band, the game code was also rarely revised to account for the VSYNC which occurred at 50.00697891Hz instead of 60.098Hz. This resulted in games running 17% slower than intended. European gaming was a real dumpster fire. But luckily without the internet we did not know about it.
So far we have only considered the "pure" signals needed to drive a CRT. However, few TV sets allowed feeding the CRT directly. Most sets only had a yellow composite jack input in the back while some high-end models had S-Video inputs.
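The PAL refresh rate and the slowdown can be recomputed with the framerate equation. One assumption of mine in this sketch: I use 312 lines per progressive frame, obtained by dropping the half-line from PAL's 312.5 the same way 262 was derived from 262.5.

```python
DOTS_PER_LINE = 341

NTSC_DOT_CLOCK_HZ = 5_369_317.5
PAL_DOT_CLOCK_HZ = 5_320_342.5

NTSC_LINES = 262
PAL_LINES = 312  # assumption: 312.5 minus the interlacing half-line

ntsc_hz = NTSC_DOT_CLOCK_HZ / (DOTS_PER_LINE * NTSC_LINES)
pal_hz = PAL_DOT_CLOCK_HZ / (DOTS_PER_LINE * PAL_LINES)

# A game that advances its logic once per VSYNC slows down by this much.
slowdown = 1 - pal_hz / ntsc_hz

print(f"NTSC: {ntsc_hz:.3f} Hz, PAL: {pal_hz:.8f} Hz")
print(f"slowdown: {slowdown:.1%}")
```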
The SNES does something pretty cool to handle this diversity. It converts the CRT signals to both composite and S-Video[21].
None of the signals are discarded. Thanks to the design of its AV output, gamers get a la carte access to the pure "RGB/CSync" signal, the "Composite" signal, and the S-Video.
Pin | Signal | Pin | Signal |
---|---|---|---|
1 | Red | 7 | Luminance (S-Video) |
2 | Green | 8 | Chrominance (S-Video) |
3 | C-Sync | 9 | Composite Video |
4 | Blue | 10 | +5V DC |
5 | Ground | 11 | Left Audio |
6 | Ground | 12 | Right Audio |
European TVs, especially those in France, came with SCART connectors (a.k.a Prise peritel). This allowed gamers to craft cables feeding the CRT directly[22].
That way we could enjoy our 17% slower, black-banded games at the highest level of visual fidelity.