Created 28 June 1996
Last updated 4 September 1996
Copyright © 1996 David H Dennis | All Rights Reserved
The World Wide Web has opened up room for a whole host of businesses, from web page design and construction to web hosting and software development. This FAQ is a road map to the various types of business opportunities available, and a repository for much-needed information on these subjects - business aspects in particular. How much of it is hype, how much of it is reality, and what are a few of the most-asked technical questions and answers?
I'm David Dennis, and I'm the author of the truly massive Internet Access Provider FAQ. Since there now seems to be more interest in putting together web sites than in providing standard Internet access services, I thought I'd poke my way into this field and see what people think. Note that there is a certain degree of overlap between the two documents; if you want more details on many of the subjects covered here, please refer to the Internet Access Provider FAQ as well. This is particularly true when it comes to connection-related issues.
My original desire was to create an Internet provider service myself. However, I never quite got it off the ground; by the time I had the capital together to start something, services had become very standardized. It is only in the web provider world that creativity and innovation continue to be a major part of the business.
This is not, of course, to slight my friends in the Internet Provider industry; it is, rather, to recognize that the winds of change are blowing, and now the opportunities seem to be mainly with the creation and maintenance of content.
If you have questions or comments on this FAQ, please drop me a line! For best response, please try to keep your lines below 80 characters, and don't mail messages as MIME attachments. (This pretty much means not using Microsoft Exchange as your mailer if you can possibly avoid it. In fact, don't use Microsoft products at all if you can possibly avoid it - but that's another story.)
My thanks to Michael Dillon for contributing substantial information about Apache and OS/2 web servers.
What is a web space provider? What do they do?
What's the best hardware platform?
The first answer is always "Whichever you're comfortable with." The differences between platforms are not so dramatic that you should pass over the one that's easiest for you.
That said, if you're creating your own pages, by far the best tool is a Silicon Graphics WebForce Indy. It has all the hardware and software you might need to create and serve wonderful web pages, and for what it does, the price isn't bad. Owner loyalty among SGI users is higher than that of any other computer product I've seen.
Like any platform, the Indy does have some drawbacks. The most important one is that you must buy something called the Iris Development Option, or IDO, for about $960. It's their C compiler. Just about any other platform you can name will let you obtain GCC and run C programs obtained from the net. Irix lacks the crucial include files that let you do this, so you must buy their C compiler even if you want to use GCC. Don't ask why; don't reason why; just do it. You need it, unfortunately.
The second most important one is that SGI dramatically overcharges for memory and disk drives - prices are something like 1,000% worse than you'll see anywhere else. Buy your basic system from SGI, but deal with third-party resellers for any memory. Don't be deceived by false marketing promises on the part of your reseller; the Indy uses bone-standard 72-pin SIMMs. They should cost between $8 and $10 a megabyte, as of now (28 June 1996).
For information on used SGI systems, see my Old SGI FAQ. Unfortunately, since the full WebForce software package is not included with most used systems, you may not find this information as useful as it should be.
If you cannot afford an Indy, you might want to consider a Sun clone from companies such as CERAM. I own a CERAM Sun clone and it's given me superb service over the year or so I've had it. It cost about $4,600 for the 85 MHz SS5 with 32MB RAM and a 2GB hard drive. I'm switching to the SGI platform, but only because of the better user interface and software that you can get for it.
If you cannot afford even a CERAM Sun clone, I don't think this is the business for you, for reasons we will cover a little later. However, if you insist on using PCs, it's quite possible to create a decent web server using Linux or FreeBSD. This is probably a good route for you to take if you're an authentic PC hardware maven; however, you'd better read the various hardware FAQs for Linux and/or FreeBSD before you begin. My first server was a Linux system, and I listened to the recommendations of a very nice fellow who knew a lot about PCs and setting up Novell servers. Turns out he recommended hardware that was only marginally compatible with Linux, and so my system was a bit of a dog from the start. Hopefully you can learn from my unfortunate experience.
FreeBSD is probably slightly more reliable than Linux, but there are more drivers for Linux, and support over the newsgroups is far better for Linux. Because of this, I'd lean towards Linux as my PC system of choice.
Note that if you are creating your own pages, and you're not buying a WebForce Indy, you will probably want to invest in a Power Macintosh. Most graphics software still runs far better on the Mac than any PC platform, and the PC you could use as a web server would have to run a Unix variant anyway, so you're not going to be able to combine authoring and serving in one package like you can with the Indy.
Michael Dillon suggests checking out an OS/2 web server. "For people who want to avoid Unix they will get better performance and better support with an OS/2 WWW server and still get similar scripting and database back-end flexibility that you get on Unix." For web servers, he recommends GoServe, IBM's free non-secure Internet Connection or IBM's secure Internet connection.
Michael also recommends checking out Lotus Internotes Web Server if database integration is your main requirement.
If you are willing to be a Microsoft toady, you can put together a Windows NT server for a price between the cost of a Linux system and a CERAM Sun clone. You'll need substantially more hardware to do the same job; NT is a very memory-greedy system. You'll also feel like a second-class citizen in the Internet services world - and that's exactly what you'll be.
If you are just creating your pages and storing them on your local ISP's server, you're reading the wrong FAQ, although it will probably still contain at least some information helpful to you.
How do I connect my web server to the Internet?
If you thought the advice in the last section looked really pricey, you're not going to like what you'll see now.
To run a web server business, you need a minimum of a T1 to the Internet.
A T1 will cost you between $1,000 and $3,000 per month, depending on your location, how close you are to a telephone central office, and whether you deal with a local ISP or connect directly to the backbone. The equipment required to pipe a T1 into your home or office will cost around $3,500.
Check my Internet Provider FAQ for a detailed discussion of connection types, required equipment and prices.
For a web-only business, there is one other option: Co-location with an existing provider. This is far cheaper than getting your own line, but you have no control over the bandwidth that's available; you could be eating up very little, while the pornographic site in the server next to you is chewing up the entire line. That can give you truly miserable service.
The best way to deal with co-location is to go with a sizable provider with bandwidth to spare, such as Net Access, which has a T3. That way, you can be assured of getting ample bandwidth for all but the most taxing applications. (Note: Net Access is also a sponsor of this FAQ.) High-speed hosting services can cost from $300 to $3,000 a month; expect to get a small slice of someone's T1 at the low end and an ultra-high-speed 10 Mbps cut out of a T3 at the high end.
What's the best web server software?
Most old-line ISPs who run web servers use Apache. The main reason is that the source code is included, so you can make extensive modifications to it to fit your needs. If you see a site that shows you customized versions of server messages, they're almost certainly running Apache. The basic Apache server is free; Apache-SSL is $495. Apache-SSL is what you need to conduct secure credit card transactions over the net.
Michael Dillon informs me that Apache is the single most popular WWW server, with 33% of Internet sites. See the Netcraft Survey. The modules feature also makes for greater performance; if you do your CGI in Perl, there is a Perl module. Or investigate FastCGI for a dedicated CGI server setup. PHP/FI for Apache is an extension language that adds speed and flexibility to server-side includes. For the latest in Apache happenings, check out Apache Week.
Commercial support for Apache can be found at Cygnus Support and UKWeb.
The CERN and NCSA servers are considered basically obsolete; hardly anyone uses them anymore, and you probably shouldn't start a new installation with either one.
The Netscape FastTrack server, for just $295, looks like an appealing alternative to Apache. It's certainly the cheapest way to get SSL (very ironic considering the $5,000 price of the first Netscape Commerce Server). Some people don't care for its lack of source code. On the other hand, the GUI used for server configuration is very smooth, and the server is generally considered easier to set up than Apache.
Netscape also has another product, the Enterprise server, that bundles an Informix SQL database with their main server product. This would probably be ideal for a catalogue site or other database handling, but I haven't tried it yet.
GNNServer has gained some popularity, mainly by bundling a free runtime version of the Illustra database. This lets you have a nice client/server system without paying the big bucks for Microsoft BackOffice.
Finally, Microsoft's IIS works in conjunction with Microsoft BackOffice. Although IIS itself is free, BackOffice costs in the range of $1,000, making Netscape's servers price-competitive with Microsoft's. Because IIS runs only under Windows NT, I cannot recommend it.
How do I connect a database to a web site?
This really depends on the database.
If you have a small database - like under 10,000 records or so - you can use a program called mSQL. It's relatively easy to connect mSQL databases to the web using C programs; I can do just about anything you want in this area in a day or two of work. A license to use mSQL commercially costs under $200 US, and it's free for a 15-day evaluation period.
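To give a feel for what "connecting a database to the web" involves, here is a minimal sketch in Python. It is not mSQL code: Python's built-in sqlite3 module stands in for the database, and the products table, its columns, and the search_page function are all made up for illustration. The shape is the same whatever database you use: take a search term, run a query, and wrap the rows in HTML.

```python
import html
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'products' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("Modem cable", 9.95), ("Ethernet card", 79.00), ("Modem", 129.00)],
    )
    return conn


def search_page(conn, keyword):
    """Return an HTML page listing products whose name contains keyword."""
    # A parameterized query keeps user input out of the SQL text itself.
    rows = conn.execute(
        "SELECT name, price FROM products WHERE name LIKE ? ORDER BY name",
        ("%" + keyword + "%",),
    ).fetchall()
    items = "".join(
        "<li>%s: $%.2f</li>" % (html.escape(name), price) for name, price in rows
    )
    return "<html><body><ul>%s</ul></body></html>" % items


if __name__ == "__main__":
    # A CGI program writes a header, a blank line, then the document.
    print("Content-Type: text/html")
    print()
    print(search_page(build_demo_db(), "modem"))
```

The one design point worth copying into any real version is the parameterized query: the search term comes from a stranger, so it should never be pasted directly into the SQL.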
If you have people who already use Microsoft SQL Server to maintain their database, you may, unfortunately, have no choice but to get an NT server and run Microsoft BackOffice. This means your entire hardware and software setup is completely dictated by the need to run SQL Server.
Fortunately, Netscape has come out with an Enterprise Edition of their server that includes full industrial strength database connectivity through Informix. Unfortunately, I haven't heard any reports on how well this platform is being received; it sounds quite promising, and the pricing is reasonable at $1,500.
What is a CGI Script?
A CGI script is a program that runs on your web server that either responds to user input, generates HTML documents on the fly, or - most often - both.
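As a rough illustration of "both", here is a minimal sketch of a CGI program in Python. It reads the user's input from the QUERY_STRING environment variable (which is how the CGI interface passes the part of the URL after the "?") and generates an HTML page on the fly. The "name" form field and the handle_request function are hypothetical names chosen for this example.

```python
import html
import os
from urllib.parse import parse_qs


def handle_request(query_string):
    """Build a complete CGI response (header plus HTML) for a query string."""
    fields = parse_qs(query_string)
    # "name" is a hypothetical form field used only for this illustration.
    name = fields.get("name", ["stranger"])[0]
    # Escape the value so user input cannot inject markup into the page.
    body = "<html><body><h1>Hello, %s!</h1></body></html>" % html.escape(name)
    # The header section ends with a blank line; the document follows.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING before running the script.
    print(handle_request(os.environ.get("QUERY_STRING", "")), end="")
```

A request for a URL ending in ?name=Dave would produce a page greeting Dave; with no query string, the page greets "stranger".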
Letting users of your system run CGI scripts they've written on your system is often a hazardous security hole; see the next question.
It's often useful to make standardized CGI scripts available to implement a mail-to form and other common features. Typically, your customers can get most of the functionality they need without custom programming.
What do I do about users who want to run their own CGI Scripts?
CGI scripts are potentially a gigantic security hole in your server, since they can often access and alter data anywhere in your system. Because of this, it's recommended that you examine any CGI scripts you are given very carefully to make sure your users aren't trying to break into your system through the back door.
In most cases, people who need custom CGI scripts are better off co-locating their own server at your site. Unless you write the scripts yourself, vetting them is often not at all cost-effective.
What is Microsoft FrontPage?
FrontPage is an HTML editor and a set of extensions used to add automated publishing features to the web. Through FrontPage, you can automate uploading of pages to your server, and give your users access to various pre-configured scripts, called "bots".
Unfortunately, the current implementations of FrontPage require that you run your web server as root, which opens up all kinds of nasty security holes. In addition, there is no per-user security scheme - anyone can upload to anyone else's directory. This is, needless to say, potentially fatal to ISPs.
I would add that FrontPage adds a level of overhead to your system that may significantly damage the efficiency of your hardware.
On the whole, it's recommended that you avoid FrontPage whenever possible. If you must use it for some reason, confine it to a special server you have set up exclusively for it. That will minimize the potential security problems.
Note that AOLServer (formerly GNNserver) is said to have most of the glossier FrontPage features, and it's free and non-Microsoft. This may be appealing to many.
What can I do about security?
Read the operating system release notes and FAQs; keep up with the patches. Because most vendors take a while to release corrected software, you may find it better to obtain and compile your own versions of security-critical software such as sendmail and web servers. That way, you can take advantage of the latest upgrades in this software as soon as it is released, instead of waiting for it to filter through your OS vendor.
The SGI FAQs have a truly excellent section on security that's religiously updated; if you own an SGI system, be sure to read the changes to those FAQs on a regular basis.
If you keep credit cards on your server, you're asking for trouble. If you're receiving orders, immediately send them off to another system, using a CGI script written in C so the source code cannot be obtained by a hacker who breaks into your main system.
What about adult sites, porn and censorship?
Curiously enough, the worst threat to X-rated sites is not people like Paul Cardin and the OCAF. The threat, rather, comes in the bandwidth they consume. A high-quality X-rated site can sap up bandwidth like a kid eats candy; hit rates of 300,000 a day and up are not uncommon. In other words, you can consume an entire T1 with just one customer.
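The arithmetic behind that claim can be sketched in a few lines of Python. The only hard number here is the nominal T1 line rate of 1.544 Mbps; the sketch assumes the line runs flat out around the clock and ignores protocol overhead, and the function names are mine.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate
SECONDS_PER_DAY = 24 * 60 * 60


def t1_bytes_per_day():
    """Bytes a T1 can carry in a day if it runs flat out, ignoring overhead."""
    return T1_BITS_PER_SECOND // 8 * SECONDS_PER_DAY


def saturating_hit_size(hits_per_day):
    """Average bytes per hit at which the given hit rate fills the whole T1."""
    return t1_bytes_per_day() // hits_per_day


def fraction_of_t1(hits_per_day, avg_bytes_per_hit):
    """Share of the T1's daily capacity used by the given traffic."""
    return hits_per_day * avg_bytes_per_hit / t1_bytes_per_day()


if __name__ == "__main__":
    print(t1_bytes_per_day())            # 16675200000
    print(saturating_hit_size(300_000))  # 55584
```

So at 300,000 hits a day, an average hit of roughly 55 KB, which is one decent-sized image, fills the line even if traffic were spread perfectly evenly. Real traffic bunches up at peak hours, so the line saturates well before that.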
If you have someone with this type of site, be aware that it's likely to take all your bandwidth and more, and set your rates accordingly. If you have an individual customer, paying a low rate for the connection, you will probably want to "throttle" her site. (The creators of the best sites of this type are, of course, almost always female). This can be done through Apache if you can hack a little C program to check for specific high-traffic sites. You may want to provide low-priority web serving for certain sites, where it would only serve the pages if the system's load average and/or use of bandwidth is below a certain level. This would let people enjoy the sites, but not so much as to cause a disaster.
It's a good idea, of course, to make this policy apply to all your personal home pages without discrimination, even though in practice it's almost never going to apply to pages that aren't sexually oriented in nature.
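The low-priority idea boils down to a small decision rule, sketched here in Python. This is not Apache code, just the logic you would have to hook into the server somehow. It assumes personal home pages live under /~username/ paths, and the load threshold of 2.0 is an arbitrary figure you would tune for your own machine.

```python
import os

# Assumptions for illustration: personal pages live under "/~user/" and
# a 1-minute load average of 2.0 is the cutoff. Tune both for your server.
LOW_PRIORITY_PREFIXES = ("/~",)
MAX_LOAD_FOR_LOW_PRIORITY = 2.0


def is_low_priority(path):
    """True if the requested path belongs to a throttled (personal) site."""
    return path.startswith(LOW_PRIORITY_PREFIXES)


def should_serve(path, load_average):
    """Decide whether to serve a request at the given 1-minute load average."""
    if not is_low_priority(path):
        return True  # regular sites are always served
    return load_average < MAX_LOAD_FOR_LOW_PRIORITY


def current_load():
    """Return the 1-minute load average (Unix only)."""
    return os.getloadavg()[0]
```

Because the rule keys on the path prefix rather than on a list of named customers, it applies to every personal page equally. The same structure would work with a bandwidth measurement in place of the load average.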
If you do handle sexually oriented pages, be sure to have a "shield" that tells people that they must promise that they're over 18 and legally permitted to view adult materials. I haven't seen any real gain by using the new adult verification services such as Validate and Adult Check, and users - quite predictably - don't much like them.
There is more information on this subject available in my Internet Access Provider FAQ, referenced at the top of this FAQ.
Are there any tricks I can use to "fool" the search engines?
This answer was inspired by a question on Market-L. The questioner asked if there was a way to "fool" the search engines into showing your page first - or at least among the top 30 or so. He used a search for "sex" as an example.
I believe older search engines, such as WebCrawler, had an easily manipulated system where they took the frequency of your keyword in a document and used that to compute a "relevance index". So you could get to the top of a search for 'sex' by just having a page like this:
<title>Sex Sex Sex!</title>
<body bgcolor="#000000" text="#FFFFFF">
<h1>Sex!</h1>
<a href="sex2.html">Enter our sexiest pages!</a><p>
<font size="2" color="#000000">
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex
sex sex sex sex sex sex sex sex sex sex sex sex sex sex sex ...
</font>
(If you had a black background, this would be invisible in Netscape.)
Note a few tricks here: Text within the <title> </title> and first <h1></h1> is often ranked higher than text outside them. And the endless repetition of the word should be self-explanatory.
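To see why the trick worked, here is a sketch in Python of the kind of naive scoring described above. It is a guess at the general idea, not the actual formula used by WebCrawler or any other engine: the score is the keyword's share of all words on the page, with extra credit for hits inside the title and first h1. The emphasis_weight of 3 is an arbitrary assumption, and the example uses a tamer keyword.

```python
import re


def tokens(text):
    """Lowercase words in text, with HTML tags stripped."""
    return re.findall(r"[a-z0-9]+", re.sub(r"<[^>]*>", " ", text.lower()))


def tag_text(html_source, tag):
    """Text inside the first <tag>...</tag> pair, or '' if absent."""
    match = re.search(
        r"<%s[^>]*>(.*?)</%s>" % (tag, tag), html_source, re.I | re.S
    )
    return match.group(1) if match else ""


def relevance(html_source, keyword, emphasis_weight=3):
    """Naive score: keyword frequency plus a bonus for title and h1 hits."""
    keyword = keyword.lower()
    words = tokens(html_source)
    if not words:
        return 0.0
    emphasized = sum(
        tokens(tag_text(html_source, tag)).count(keyword)
        for tag in ("title", "h1")
    )
    return (words.count(keyword) + emphasis_weight * emphasized) / len(words)


if __name__ == "__main__":
    stuffed = "<title>Modem Modem Modem!</title><h1>Modem!</h1>" + "modem " * 50
    honest = "<title>Hardware tips</title><h1>Cables</h1>One mention of a modem."
    print(relevance(stuffed, "modem") > relevance(honest, "modem"))  # True
```

A scorer like this has no notion of whether the text is visible or meaningful, which is exactly the weakness the hidden block of repeated words exploits.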
Unfortunately for those who try to manipulate them, the spiffier new search engines compensate for the repetition and vindictively dump such sites at the very bottom of listings. They accept a "meta" tag at the top of the page that lets you directly specify a description and keywords for the page. If they don't see one, they still index the page, but again, they check for excessive repetition.
I think that something like this might still work, though:
<title>Sex Sex Sex!</title>
<body bgcolor="#000000" text="#FFFFFF">
<h1>Sex!</h1>
<a href="sex2.html">Enter our sex pages!</a><p>
Note that this is the same as the previous example minus the hidden block of repetition; sex is still emphasized in the title and h1, and it's common without being obviously manipulative.
I'm not sure if you could combine the new methods (the meta tag) with the old (endless repetition) and get both types of search engines to behave as you would want them to.
Incidentally, I did a search on AltaVista for 'sex' and it came up with a random but non-commercial hodgepodge, so it looks like the reports of these manipulations failing are true. Oddly enough, I don't know if this is necessarily good; the old-fashioned output of dozens of commercial sites was surely more like what the searcher would like to see than the random garbage the new age engine produces.
Are there any membership organizations for web site developers?
The Internet Developers' Association is a new association founded to support the Internet Developer community. IDA membership is open only to serious developers with references.
The HTML Writers' Guild is an organization that supports a number of HTML-related mailing lists. It's open to anyone. The mailing list has a notoriously poor signal-to-noise ratio, but the information is still often interesting.