TL;DR: Yes, the NES has borders; the discussion above talked about the SMS too much, but what it says is common to ALL consoles intended for CRTs. And an image is not stretched to fit a CRT except horizontally.
The long version: I will try to explain this in a different way that will hopefully make it clear why. I will again focus on NTSC; NTSC, PAL and SECAM differ in details (such as number of lines, or number of frames per second), but their workings are the same. I will not make the mistake of focusing on any given console, TV, video recorder or anything else to avoid confusing the issue even more.
NTSC has 525 lines; no more, no less. These are split into two 262.5-line fields (or 262-line fields, for progressive display, as in consoles) which are drawn alternately to the even rows and odd rows (progressive display always uses the same set of rows). The field rate is 60 Hz for black & white, or 60/1.001 Hz for color signals (due to the added requirement in terms of bandwidth, as well as for backwards compatibility with older TVs). For interlaced display, this means one 525-line image per frame at 30/1.001 Hz; for progressive display, it means one 262-line image per "frame" at 60/1.001 Hz. I will ignore interlaced displays from now on because they are generally irrelevant for consoles (but Sonic 2 2p mode says hi).
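As a quick check of the rates above, here is a minimal Python sketch. The variable names are mine, and it only restates the numbers already given in this post, using exact fractions so that the 1.001 factor does not get lost in rounding.

```python
from fractions import Fraction

LINES_PER_FRAME = 525                      # NTSC total, as stated above
COLOR_FIELD_RATE = Fraction(60000, 1001)   # 60/1.001 Hz for color NTSC

# Interlaced: two fields per frame, so each field has half the lines
# and the frame rate is half the field rate.
interlaced_lines_per_field = Fraction(LINES_PER_FRAME, 2)
interlaced_frame_rate = COLOR_FIELD_RATE / 2

print(f"lines per interlaced field: {float(interlaced_lines_per_field)}")
print(f"color field rate:           {float(COLOR_FIELD_RATE):.3f} Hz")
print(f"interlaced frame rate:      {float(interlaced_frame_rate):.3f} Hz")
```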
Of the 262 lines in progressive display in NTSC, 243 make up the active display; the rest are used for vertical retrace of the electron beams. Thus your video must output 243 lines per field, plus 19 lines' worth of vertical blanking. If it outputs any fewer, the picture will roll. Thus, ANY console, VCR, or whatever that outputs NTSC signals needs to output 262 lines' worth of video, of which 243 need to have actual image data (again, progressive display). Borders count as image data here, because different CRTs have different overscan sizes; so the NES, too, has vertical borders.
Now come the lines, also known as scanlines. First off, there is a fundamental difference between CRTs (for which NTSC, PAL and SECAM were originally developed) and modern displays (LCDs, plasma, etc): while a CRT has a discrete number of scanlines, each scanline in a CRT is controlled by an analog signal. This means that it technically has infinite resolution, if you can manage to generate a signal fast enough; but in truth, the details will most likely wash out if the signal is too fast, and it may not even work with most older TVs and cables. Put in other words, a "pixel" is whatever width your analog video signal makes it. This is usually incorrectly referred to as "stretching" or "extending" the image.
The signal used to generate the center image in the NES, SMD, SNES and H32 Genesis runs at around 5.37 MHz; if you do the math, this means about* 341 "pixels" per scanline. Of these, 288 "pixels" are in the active region (the rest are for horizontal retrace of the electron beams, and are in overscan). Thus, anything that generates a signal with that frequency will have to generate 341 "pixels" of signal, 288 of which compose the active image. Since a NES image is 256 pixels wide, it needs to generate a border to fill the rest of the active region (this is required because of the frequency of the generated signal); so the NES, too, has horizontal borders.
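For anyone who wants to redo the math, here is a rough Python sketch of the line and pixel budgets described above. The exact clock value is an assumption on my part (I only use the "around 5.37 MHz" figure), the line rate is derived from the 525-line, 30/1.001 Hz frame, and the even left/right split of the border is purely illustrative; a real console need not center its image.

```python
from fractions import Fraction

# Line rate from the 525-line frame at 30/1.001 Hz (about 15.734 kHz).
LINE_RATE_HZ = 525 * Fraction(30000, 1001)

# ASSUMPTION: round figure taken from "around 5.37 MHz" in the text.
PIXEL_CLOCK_HZ = 5_370_000

pixels_per_line = PIXEL_CLOCK_HZ / LINE_RATE_HZ
print(f"pixels per scanline: {float(pixels_per_line):.2f}")

# Horizontal budget, using the figures from the post.
TOTAL_PIXELS = 341
ACTIVE_PIXELS = 288
NES_IMAGE_WIDTH = 256

horizontal_blanking = TOTAL_PIXELS - ACTIVE_PIXELS
border_total = ACTIVE_PIXELS - NES_IMAGE_WIDTH
border_per_side = border_total // 2   # ASSUMPTION: image is centered

# Vertical budget for progressive display.
TOTAL_LINES = 262
ACTIVE_LINES = 243
vertical_blanking = TOTAL_LINES - ACTIVE_LINES

print(f"horizontal blanking: {horizontal_blanking} pixels")
print(f"NES border: {border_total} pixels total, {border_per_side} per side if centered")
print(f"vertical blanking: {vertical_blanking} lines")
```

Plugging in a more precise clock or line rate shifts the fractional part of the first result, which is exactly the rounding caveat in the footnote.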
As I mentioned above, overscan varies from TV to TV, and even changes as a set ages; this applies horizontally (within scanlines) as well as vertically. The blanking part is always in overscan; but part of the active image data is likely to be in overscan too. Consoles used a "safe" region of the active image that covered about either 80% or 90% of the vertical and horizontal ranges. This changed over time as TVs were better manufactured and more powerful consoles were made.
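To put rough numbers on the safe region, here is a small Python sketch. It assumes the percentage applies to each axis separately and that the active image is the 288 by 243 area from earlier; `safe_region` is just a helper name I made up for this example.

```python
ACTIVE_WIDTH = 288    # active "pixels" per scanline, from the text above
ACTIVE_HEIGHT = 243   # active lines per field, from the text above

def safe_region(percent):
    """Return (width, height) covering `percent` of each axis, rounded."""
    width = round(ACTIVE_WIDTH * percent / 100)
    height = round(ACTIVE_HEIGHT * percent / 100)
    return width, height

for percent in (80, 90):
    width, height = safe_region(percent)
    print(f"{percent}% safe region: about {width} x {height}")
```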
* The result depends on really high precision numbers and some rounding, so results may vary depending on which numbers you use.
Edit: And for what it is worth, I am taking a class on digital processing of sound and video that covers a lot of this stuff.