Web caching is a technology that can significantly enhance your Web browsing experience and reduce bandwidth usage.

What is Web caching?

A Web cache is a temporary storage place for files requested from the Internet. After an original request for data has been successfully fulfilled and that data has been stored in the cache, further requests for those files (a Web page complete with images, for example) result in the information being returned from the cache rather than from the original location.

We'll start with a simple analogy. It is not a perfect match, but it will give you the basic idea. Each morning before Joe goes to get the daily paper, he asks his roommate, Bill, if he has already purchased one. If Bill already has the paper, there is no reason for Joe to walk all the way to the store and spend money on the exact same information. He'll just read the paper that's already there, saving his money and his time.

Web caching enhances Web browsing in much the same way. When a user visits a site such as www.ala.org, the Web cache (if in use and available) retrieves the page from the ala.org Web server and stores a copy of that page on your local area network (LAN). The next time a user requests www.ala.org, the Web cache delivers the locally cached copy of the page. The user experiences a very fast download because the request did not have to traverse the entire Internet; the files all came from a local source. Also, the bandwidth that would normally be used to download the Web site is not required and is free for other information retrieval or delivery.

Caching is useful for any library. Faster response to users' requests and saved bandwidth are never a bad thing. Caching makes particular sense for libraries that feel they must purchase more bandwidth to keep up with increased usage.
In such cases a cache server or cache appliance could very likely lower the demand on the existing bandwidth, thus making a costly bandwidth upgrade unnecessary. Suppose an upgrade from a 256K data circuit to a full T1 will increase your monthly Internet bill by $500. A $3,000 investment in caching (and caching solutions often can be implemented for much less than that) pays for itself after six months of avoided upgrade charges and shows a return from then on.

Doesn't cached Web content get stale?

Web sites are continually updating their content. News headlines change, stock quotes change, weather changes. It may seem that caching is not worthwhile if it is returning dated material. A traffic report that is two hours old doesn't do you much good. Fortunately, there are checks and balances in place to ensure that the content you are viewing is current.

Web sites are made up of many small pieces that come together to make a complete page. A site might have logos, photographs, tables, text, and sounds. Each item is cached as a separate object, and some items may not be cached at all. For example, if you go to CNN.com frequently, your cache may hang on to the CNN logo object, some advertising bars, and the other elements that make up the basic look of the CNN Web site. But the news items will not sit in cache because they change so often. In this case your cache has made the CNN site much easier and faster to download because all the static graphics are already on hand and the only thing needed to complete the picture is the news content. For a simple diagram of the basics of how caching works, check out this diagram.

So how does your cache know what to hang on to and what to let go? That depends on choices made by the Web developer as well as the way the user configures the cache. As mentioned above, Web sites are made up of individual pieces. Each one of these pieces is encoded with specific information that tells your cache how to handle it.
This information may specify, “Don't cache this item,” in which case the cache will not store it. The item may have a “max age” specified, which tells the cache that after a set amount of time it must check in with the Web site for a newer version of that object. The “expires” field serves roughly the same purpose. The item might also have a “last modified” field, which lets your cache ask the Web server whether the object has been modified since your last visit. If it has, the cache gets a new copy; if not, the cache just hangs on to the copy it already has. The Web site administrator controls each of these items.

There are many cache products available. Each has many configuration options to help ensure that your data is current. Caches can be configured to accept all, some, or none of the priorities that the Web site administrator sets.

Types of Web caches

Browser Cache

A browser cache is the caching function built into a Web browser application. The two most popular Web browsers, Netscape Communicator and Microsoft Internet Explorer, have this function built in. Cached pages are stored on the local hard drive and are used only by that computer. This caching is automatically enabled. Users can control some features, such as cache size.

Your library may decide to use browser cache only. If so, there are steps that can maximize the effectiveness of the browser cache. Familiarize yourself with the caching features of your specific browser. You will find useful information in the application help files. If you are using Internet Explorer or Netscape Communicator, refer to the following links for more information: http://support.microsoft.com/?kbid=263070 and http://help.netscape.com/default.jsp.

There are also products that improve on the browser's built-in cache. A list of these along with descriptions can be found at Web Caching.com, a great resource for more advanced Web caching information.
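Returning to the freshness checks described above (“don't cache,” “max age,” “expires,” and “last modified”), the decision a cache makes for each stored object can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular cache product works: the `CachedObject` structure and the `is_fresh` and `revalidate` functions are hypothetical names invented here, times are plain numbers of seconds, and the choice to check “max age” before “expires” is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedObject:
    """Metadata a cache might keep for one stored item (hypothetical structure)."""
    stored_at: float                       # when the cache saved its copy, in seconds
    no_cache: bool = False                 # the "Don't cache this item" instruction
    max_age: Optional[float] = None        # seconds the copy may be reused unchecked
    expires_at: Optional[float] = None     # time after which the copy is stale
    last_modified: Optional[float] = None  # when the server says the item last changed


def is_fresh(obj: CachedObject, now: float) -> bool:
    """Return True if the copy can be served without asking the Web server."""
    if obj.no_cache:
        return False
    if obj.max_age is not None:
        return (now - obj.stored_at) <= obj.max_age
    if obj.expires_at is not None:
        return now < obj.expires_at
    # No freshness information at all: this sketch plays safe and asks the server.
    return False


def revalidate(obj: CachedObject, server_last_modified: float, now: float) -> str:
    """Compare last-modified values, as a stale cache would with the Web server."""
    if obj.last_modified is not None and server_last_modified <= obj.last_modified:
        obj.stored_at = now  # copy confirmed current, so its age starts over
        return "keep cached copy"
    return "fetch new copy"


# A logo stored at t=1000 that may be reused for one hour without checking.
logo = CachedObject(stored_at=1000.0, max_age=3600.0, last_modified=500.0)
print(is_fresh(logo, now=1600.0))   # True: only 600 seconds old
print(is_fresh(logo, now=8200.0))   # False: older than max age, so ask the server
print(revalidate(logo, server_last_modified=500.0, now=8200.0))  # keep cached copy
```

In this sketch a static item such as a site logo is served locally until its “max age” runs out, and even then it is downloaded again only if the server reports a newer “last modified” time. A frequently changing news item would simply carry a short “max age” or the “don't cache” instruction.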
Cache Appliance/Cache Server

A cache appliance is a hardware and software caching solution all in one unit. A cache server is a software-only solution; the software is installed on an existing server. Unlike a browser cache, which benefits only one user, cache servers and appliances are shared and benefit every user on the network. The cache server or appliance sits on the local area network.

There are many cache servers and cache appliances available. Below are a few options in both the appliance and software categories.

Cache appliances can be expensive. The Sun Cobalt Qube 3 Server Appliance is one of the more affordable solutions. In addition to caching, the Qube offers many other services. Swell Technology also makes a lower-priced cache appliance. Other cache appliance manufacturers include Cisco, Blue Coat Systems, and Iprism.

Below are software packages that can be added to existing hardware to create a caching server:

Squid-Cache - Squid-Cache is free. It is designed to run on Unix systems, including Linux. It is a standard part of most Linux distributions.
Smoothwall - Smoothwall is a free firewall product that includes a caching feature (Squid). It is a modified version of Linux. Smoothwall is installed on a standard PC, turning that PC into a dedicated Smoothwall firewall.
Clark Connect - This is another firewall/server solution built on top of Linux. It includes caching features and costs under $200.
Internet Security and Acceleration - This is Microsoft's firewall and caching package. It requires Windows 2000 Server. It's more expensive but fully loaded.
WinProxy - A product from OSITIS. It is a moderately priced multi-service package that includes caching.

This list is not exhaustive, but it's a good start.

Wondering how a caching server or appliance might fit into your network?
Refer to the following diagrams to see where caching hardware fits into your network, whether you have just a single network (look at the local area network diagram) or multiple networks sharing a single Internet access point (as in the wide area network scenario).

Conclusion

Installing a cache appliance or cache server on your organization's local network can significantly improve the experience of Internet users because locally cached Web data is accessed more quickly than data on remote servers. Caching data also reduces bandwidth usage.

Web administrators and cache operators can control the freshness of data in the cache, which keeps cached data from becoming stale or out of date.

Most Web browsers cache data automatically. The cached data benefits only that specific machine. The user has some control over browser cache behavior, and many products extend the capabilities of the browser cache.

Caching can be implemented either as a combined hardware and software solution, called a cache appliance, or as software only. Cache appliances can be expensive but offer the convenience of an all-in-one package that requires no extra hardware. Cache software is installed on existing hardware. Free, high-quality caching software is available. If your library is not able to set this up in-house, there are also commercial products built on Squid whose vendors provide installation and support services.
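As a closing illustration of the shared-cache behavior summarized above, here is a minimal Python sketch of ten users on one network requesting the same page. It is a toy model under stated assumptions, not a real cache server: the `SharedCache` class and the `fake_origin` function are hypothetical names invented for this example, the page size of 50,000 bytes is made up, and freshness checks are left out entirely.

```python
from typing import Callable, Dict


class SharedCache:
    """A toy shared cache: one stored copy serves every user on the network."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0               # requests answered from the local copy
        self.misses = 0             # requests that had to go out to the Internet
        self.bytes_from_origin = 0  # bandwidth actually used on the Internet link

    def get(self, url: str) -> bytes:
        if url in self._store:
            self.hits += 1
            return self._store[url]
        body = self._fetch(url)
        self.misses += 1
        self.bytes_from_origin += len(body)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    """Stand-in for a remote Web server (assumed 50,000-byte page)."""
    return b"x" * 50_000


cache = SharedCache(fake_origin)
for _user in range(10):
    cache.get("http://www.ala.org/")

print(cache.misses, cache.hits, cache.bytes_from_origin)  # 1 9 50000
```

Only the first request crosses the Internet link. The other nine are served locally, so the link carries 50,000 bytes instead of the 500,000 bytes that ten separate downloads would require.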
