Paper presented to AUUG'95 & Asia-Pacific WWW'95 Conference, Sydney, Australia, 18-21 September 1995.

Ensuring High Quality
in Multifaceted Information Services

Dr T.Matthew Ciolek,
Coombs Computing Unit,
Research Schools of Social Sciences & Pacific and Asian Studies,
Australian National University,
Canberra ACT 0200, Australia
A fast and efficient production cycle, network circulation, and the archiving of the Australian National University's "What's New in WWW Asian Studies" online newsletter all depend on skillful application of several diverse technologies (WWW & form-based pages, CGI scripts, email, mailing-list, gopher server, WAIS server & database). The paper describes and evaluates a synthesis of these technologies into an elegant, networked, scholarly information resource.
electronic newsletter, quality control measures, asian studies


The world continues to experience an ever-widening explosion of networked documents. In January 1995, according to the Lycos crawler database [1], there were over 2 million WWW documents published online. Half a year later, in early August 1995, Lycos had to keep track of 5.07 million web pages. Assuming that an average WWW-based information system publishes some 200 Web documents (e.g. 20 projects with 10 documents each), it can be estimated that in August 1995 there were some 25,000 Web sites [2]. This is not a substantial, yet not an insignificant, fraction of the 6.6 million hosts counted all over the world by the end-of-July 1995 Internet Domain Survey [3].
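
The estimate above amounts to a single division. A minimal sketch of the arithmetic, using only the figures quoted in this paragraph (the variable names are illustrative, not taken from any cited source):

```python
# Figures quoted in the paragraph above.
lycos_pages_aug_1995 = 5_070_000   # Web pages tracked by Lycos, early August 1995
docs_per_site = 200                # assumed average: 20 projects x 10 documents
hosts_jul_1995 = 6_600_000         # hosts counted by the Internet Domain Survey

# Estimated number of Web sites, and their share of all counted hosts.
estimated_sites = lycos_pages_aug_1995 / docs_per_site
share_of_hosts = estimated_sites / hosts_jul_1995

print(f"Estimated Web sites: {estimated_sites:,.0f}")   # 25,350, i.e. "some 25,000"
print(f"Share of all hosts:  {share_of_hosts:.2%}")     # about 0.38%
```

The result is only as good as the assumed 200 documents per site; halving that assumption doubles the estimated number of sites.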

Clearly, not all of this mass of material is worth accessing, since not all of the existing WWW sites are worth their electricity. Frankly, several tens of thousands of the existing Web documents are so thin in their substance, and so amateurish in structure, that they are perfectly irrelevant and forgettable. The reasons for this disappointing state of affairs are manifold.

Firstly, there is a startling lack of commonly accepted standards for both the minimally acceptable content and the minimally acceptable format of an online resource. The ease with which anybody can publish a Web page leads to a situation where documents are not only whimsical and messy (= "postmodernist") but also promiscuous in the way they establish links to other sites. Many Web pages get encumbered with links which have no regard for their target's factual accuracy, stability of service or coherence of presentation.

Secondly, the WWW is becoming an unbalanced place. Judging by the developments in the field of Asia-Pacific studies [Ciolek95a], roughly two thirds of the WWW resources are catalogues, indices and directory pages to online material, while the other third are the data themselves, that is, online papers, abstracts, software, databases and e-journals. In other words, as far as the social sciences, humanities and Asia-Pacific studies are concerned, legions of catalogues are established all over the world to cover the same minuscule and still hesitant set of networked data.

Thirdly, the growth of the Web seems to be driven by an inexplicable need to ape, re-work and regurgitate topics already successfully handled elsewhere. Very few Web maintainers are able to admit another site's superiority and expertise. Very few of them are gutsy enough to spend time and resources on delineating, advertising and cultivating their own unique area of specialisation. Instead, people blindly and patiently replicate general work already completed elsewhere. The net result is an ever-growing circularity of links.

It would be easy to be contemptuous and dismissive of the whole phenomenon of today's online information services. Yet, the sheer scale of Web activities and the volume of Web-based information systems forcibly present the Internet community with several urgent and nagging questions.
These are vital questions. They address both the current situation and the future of Internet-based services. They suggest that one should be able not only to build trustworthy, attractive and ergonomic information facilities, but also to spell out the many rules, codes of practice and principles underpinning the design and management of such facilities [Liu94, Ciolek94a]. By doing so it can be hoped that the construction and operation of networked information services will gradually leave the realm of art, guess-work and intuition and enter the realm of craft, routine decision-making and logic.

This paper will provide an account of an information facility explicitly designed to make such a quantum leap.

An Information System and its Organisational Context

Coombs Computing Unit (CCU) is a nine-person team responsible for the provision of computing, networking and statistical services to over 800 researchers and staff of the Research Schools of Social Sciences & Pacific and Asian Studies at the Australian National University (ANU), Canberra. One of the CCU's responsibilities is the development and maintenance of five large-scale information systems, built around FTP, email, WAIS, gopher and WWW technology [4]. One of these systems, the Coombsweb, is a WWW server which was established in January 1994 to provide convenient access to the online publications of the Research Schools (lodged mainly with their Coombspapers - ANU Social Sciences Research Data Bank, which comprises over 1900 research documents, or 71.5Mb of text files) [5], and to keep track of relevant and dependable social sciences and humanities resources elsewhere [6].

To this end the Coombsweb server hosts eight WWW Virtual Libraries (which are a part of a large-scale project initially established and managed by CERN, Switzerland, and later taken over by MIT, USA) [7] covering Aboriginal-, Asian-, Buddhist-, Demography & Population-, Pacific- and Tibetan-Studies as well as the Social Sciences and the History of Science, Technology & Medicine.

The most important of these incipient and steadily developing "knowledge systems" is the Asian Studies WWW Virtual Library [8]. It was established in March 1994 to develop online social sciences resources about the countries and sub-regions of the Middle East & Caucasus, Central Asia, South Asia, South-East Asia, Australasia, and, finally, East Asia. By the end of July 1995, the Asian Studies WWW VL comprised 56 closely interlinked Web pages with annotated links to specialist WWW documents, FTP archives, electronic mailing lists, online databases and electronic journals, as well as registers of telnet connections to Asian libraries and catalogues. These documents offered a cumulative total of over 1320 links to networked resources all over the world and were accessed a total of 1682 times/day (= over 610,000 a year) [9].

The "What's New in WWW Asian Studies" Newsletter

An essential component of the Asian Studies WWW VL operations is an online Newsletter entitled "What's New in WWW Asian Studies" (ISSN 1323-9368) [10]. The Newsletter started its operations in April 1994 and is formally registered with the National Library of Australia. Its aim is to provide a dependable, timely, and high-grade current-awareness service by reporting URLs and brief summaries for new or recently upgraded WWW, Gopher and FTP sites dealing with Asia and Asian studies [11]. It is, so far, the world's only [12] electronic periodical specialising in these matters and is closely integrated with pages of the Coombsweb's Virtual Libraries.

The Newsletter is available free of charge, 24 hours a day and 7 days a week, and, despite its narrowly defined focus, it continues to gain in popularity.

A summary of the Newsletter's activities is given in Table 1.
Table 1. The WWW edition of the "What's New in WWW Asian Studies" newsletter (Apr94-Aug95)

Issue         News items   Size in Kb   Average no. of accesses/week
Apr-Jun 94         7           7.2          127
Jul-Sep 94         7           8.9          148
Oct-Dec 94        11          15.6          223
Jan 95             6           5.3          397
Feb 95            11          10.2          448
Mar 95            13           8.3          497
Apr 95            33          18.7          393
May 95            55          26.4          484
Jun 95            75          30.4          759
Jul 95            57          24.8          606
Aug 95            51          30.0          678

During the 14 months since its launch the Newsletter underwent a number of changes. They were aimed at increasing the speed of its production, streamlining its maintenance and widening the circle of its audience. The Newsletter continually strives to enhance the quality of the information conveyed to its readers and thus is bound to change details of its appearance as well as of its production and dissemination.

The Newsletter - Format of the Data

Each issue of the Newsletter consists of a series of news-items prefaced by a masthead and concluded by a footer with links to the Coombsweb system.

All items contain six mandatory fields (marked here in capitals) and three optional ones:
(a) DATE of the information (in dd Mmm yyyy format);
(b) TITLE of the resource in question;
(c) its main URL (address of the top page);
(d) ORGANISATION where the resource is published;
(e) COUNTRY where the resource is located;
(f) short DESCRIPTION of the resource; and,

whenever possible, details of the contact person or of the person supplying the announcement about the resource.

Such a template, once fleshed out with real-life information, yields the following results:

14 Jul 1995
Korea WebWeekly
Kim Software, Inc., USA
News digests, editorials, Internet resources and other info on North and South Koreas.
Information supplied by: Young S. Kim (

13 Jul 1995
Stone Bridge Press, USA
Publisher of Japan-related books in English: language learning, literature in translation, culture, business, etc.
Information supplied by: Peter Goodman (
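
The field structure described above can be sketched as a small record type. This is only an illustration, not the Newsletter's actual software: the class name, the `problems` helper and the example URL are hypothetical, and the three optional fields are assumed to be the contact's name, surname and email address.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

MANDATORY_FIELDS = ("date", "title", "url", "organisation", "country", "description")


@dataclass
class NewsItem:
    # Six mandatory fields.
    date: str            # dd Mmm yyyy, e.g. "14 Jul 1995"
    title: str
    url: str
    organisation: str
    country: str
    description: str
    # Three optional fields (an assumption: contact name, surname, email).
    contact_name: Optional[str] = None
    contact_surname: Optional[str] = None
    contact_email: Optional[str] = None

    def problems(self) -> List[str]:
        """Return a list of reasons why this item is not yet publishable."""
        issues = [f"missing {name}" for name in MANDATORY_FIELDS
                  if not getattr(self, name).strip()]
        if self.date.strip():
            try:
                # %b expects English month abbreviations in the default locale.
                datetime.strptime(self.date.strip(), "%d %b %Y")
            except ValueError:
                issues.append("date is not in dd Mmm yyyy format")
        return issues


item = NewsItem(
    date="14 Jul 1995",
    title="Korea WebWeekly",
    url="http://example.org/korea/",   # hypothetical URL, not the real address
    organisation="Kim Software, Inc.",
    country="USA",
    description="News digests, editorials, Internet resources and other info "
                "on North and South Koreas.",
    contact_name="Young S.",
    contact_surname="Kim",
)
print(item.problems())
```

An item with an empty title or a date such as "1995-07-14" would come back with a non-empty list of problems, which an editor could use to decide whether to repair or reject the entry.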

The Newsletter - Sources of Information

The news-items are compiled from a variety of sources. These are:
  1. Announcements supplied to the Newsletter via a forms-based data input page [14]. The page, together with an accompanying CGI script, performs several tasks. It

    In addition, the data input page carries an explanatory note which reserves the Newsletter's right to edit or reject all entries in accordance with the CCU standards. The note also asks (though not always successfully) that readers abstain from using this specialised communication channel for posting messages containing inquiries, comments, requests for help, etc.
  2. Email messages dispatched to the "" address by those readers of the Newsletter, as well as of the Asian Studies WWW VL, whose Web browsers do not support forms-pages. Authors of these postings are encouraged to structure their messages in terms of the six mandatory and three optional fields mentioned above.
  3. Various email messages sent to this author in his capacity as the administrator of six (6) of the eight (8) Coombsweb WWW Virtual Libraries.
  4. Registers of new WWW sites such as :
  5. Finally, use is made of the three specialist electronic mailing lists
The first two sources of information are established by the CCU to generate and direct the flow of the Internet's "news" towards the Newsletter's mailbox. Their function is two-fold. Firstly, they attempt to impose a standard format on the arriving data. Secondly, and perhaps more importantly, they ensure that instead of the Newsletter chasing new information, it is the news-items which chase the Newsletter.

The "external" sources are inspected on a regular, almost daily, basis. They supply information which is presented according to a variety of formats, and which is endowed with varying degrees of detail. Such material, except for that coming from the NCSA's "What's New" site, requires pruning, sub-editing and re-organisation before it can match the Newsletter's publication standards.

An analysis of 275 news items published by the Newsletter in Jan-Jul 1995 period reveals the following pattern:
Table 2. Source of 275 news-items published in the "What's New in WWW Asian Studies" newsletter (Jan95-Jul95)

Source                                     Share
Announcements & mail to the Newsletter      45 %
Browsing the WWW News sites                 40 %
Mail to WWW VLs at                          10 %
Mailing lists                                5 %
Total                                      100 %

These percentages are, of course, averaged values and they do not tell the whole story. For instance, over the last 7 months it is possible to detect a slowly decreasing reliance on browsing the WWW News sites and a corresponding growth in the percentage of information submitted as announcements & mail to the Newsletter. At the same time the WWW VL mail and the mailing lists continue to play their subsidiary roles.

The Production Cycle of the Newsletter

As noted before, information received via the forms-page is pre-processed by a locally written CGI script in Perl, and thus arrives at the editor's desk in the shape of an email message which is already formatted and html'ed. Nevertheless, before it is accepted for publication, it still has to undergo mandatory data selection, validation and editing. Information obtained from other sources goes, step by step, through the full sequence of quality-assurance procedures involving (a) data selection; (b) data validation; (c) data editing; (d) data standardisation, and (e) HTML-markup.

(a) Data Selection

All candidate material for publication is subjected to detailed evaluation. Several test-questions are used in a sequence of cascading check-points and filters. Their function is to separate the proverbial chaff from the wheat as well as to gradually enhance the overall quality of the processed material. Naturally, this stage of operation greatly benefits from earlier methodological work and discussions of what constitutes "good practice codes" for networked publishing [Ciolek94b, Brown95, Ciolek95b, Wilson95].

(b) Data Validation

This is an important step, as the obtained URLs may contain minor but frustrating errors. Whenever feasible, the Newsletter attempts to correct any faulty addresses. Unresolved cases, however, are excluded from further consideration. Furthermore, all URLs are tested to ascertain that they do, in fact, lead to the most central, or the most useful, part of a new resource.
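
Part of such validation can be automated before an editor visits the resource. The following is a minimal sketch of a purely syntactic check, not the Newsletter's actual procedure: the function name and the example addresses are hypothetical, and the set of accepted schemes is an assumption based on the WWW, Gopher and FTP sites the Newsletter reports on. Whether a URL leads to the most useful part of a resource still has to be judged by hand.

```python
from typing import List
from urllib.parse import urlsplit

# Assumption: only the three resource types covered by the Newsletter.
ACCEPTED_SCHEMES = {"http", "gopher", "ftp"}


def url_problems(url: str) -> List[str]:
    """Return the syntactic faults found in a submitted URL (empty if none)."""
    url = url.strip()
    issues = []
    if " " in url:
        issues.append("contains a space")
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return issues + ["cannot be parsed"]
    if parts.scheme.lower() not in ACCEPTED_SCHEMES:
        issues.append("missing or unsupported scheme")
    if not host:
        issues.append("missing host name")
    return issues


# Hypothetical examples: a well-formed address and one with a dropped colon.
print(url_problems("http://example.org/asia/"))
print(url_problems("http//example.org/asia/"))
```

A check of this kind only catches the "minor but frustrating" typing errors; an address can be well formed and still point to a dead or peripheral page.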

(c & d) Data Editing and Standardisation

All Newsletter entries are sub-edited so that they conform with the Newsletter style. The aim is to produce an announcement which is uniform, brief, detailed, and has a pleasing and crisp appearance:

Original TITLES of resources, taken from the page's headings or the title-field, are used whenever possible. On occasions when these are not informative enough or are too wordy, a substitute title is devised. Whenever possible the INSTITUTION and the COUNTRY of publication are stated. If the publisher cannot be ascertained, details of the host and domain are provided instead. For DESCRIPTIONS, the original wording of an announcement is used. If it is not acceptable, a new description is drafted. All notes are sub-edited to make them uniform, succinct and informative. Technical terms and definitions are retained where possible. Computer jargon is avoided. Grammatical and typing errors are corrected where necessary.

(e) HTML markup

Next, the cleaned information is marked up so that it may be added to the Newsletter. An analysis of news-items published between Jan-Jul 1995 indicates that, on average, the HTML code represents about 25% of the final size of a typical Newsletter entry. All entries follow the same template:
<P></P>
<I>DATE</I>
<BR>
<A HREF="URL-details">TITLE</A>
<BR>
<BR>
<BR>
URL url-details
<BR>
Information supplied by: NAME SURNAME (EMAIL)
After this stage news items are ready for inclusion into the current issue of the Newsletter.

Three Modes of Publication of the Newsletter

To promote the Newsletter's circulation and usefulness it is released in three parallel formats: (a) as a WWW document; (b) as a series of email announcements, and (c) as a regularly updated WAIS database.

(a) The WWW edition of the Newsletter is published as a rolling electronic journal. This means that once a month a new page is established and that it continues to grow through the addition of fresh information. The current issue always bears the same URL (i.e. while the past issues receive their respective URLs and get archived as a part of the Coombsweb system [17].

The current issue is arranged in the form of a single column of text on a single, continuous page:
		HEADER - containing the name and mission 
		statement of the Newsletter, its ISSN number,  
		reverse links to the central WWW Virtual Library
	 	site, to the Asian Studies WWW VL, as well as to  
		the archive of the past issues of the Newsletter.
		The header also provides a link to the forms-page 
		for acquisition of fresh data and to the WAIS dbase
		(see below) edition of the Newsletter.

		FOOTER containing details of the Coombs Computing
 		Unit, ANU, copyright notice and links to other 
		Coombsweb resources.
Initially, in 1994, the Newsletter issues were created on a quarterly basis at the outset of each quarter (Apr-Jun, Jul-Sep, Oct-Dec). However, after a period of experimentation with a bi-monthly format, in 1995 the Newsletter started appearing on a monthly basis. The change to the frequency of publication was dictated by the growth in the amount of data carried by the e-journal. The objective is to keep the size of the issues as small as possible, in order to minimise the time needed to access them. An average monthly issue of the Newsletter contains approx. 16.9 Kb of data in the form of approx. 29-30 announcements, with each news-item being about 0.5Kb strong (see again Table 1).
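
The averages quoted above can be checked against Table 1. The sketch below simply re-types the table's figures; it suggests that the quoted values correspond to the mean over all eleven issues listed there, quarterly and monthly alike (the variable names are illustrative).

```python
# (issue, news items, size in Kb, average accesses/week), copied from Table 1.
TABLE_1 = [
    ("Apr-Jun 94", 7, 7.2, 127),
    ("Jul-Sep 94", 7, 8.9, 148),
    ("Oct-Dec 94", 11, 15.6, 223),
    ("Jan 95", 6, 5.3, 397),
    ("Feb 95", 11, 10.2, 448),
    ("Mar 95", 13, 8.3, 497),
    ("Apr 95", 33, 18.7, 393),
    ("May 95", 55, 26.4, 484),
    ("Jun 95", 75, 30.4, 759),
    ("Jul 95", 57, 24.8, 606),
    ("Aug 95", 51, 30.0, 678),
]

issue_count = len(TABLE_1)
total_items = sum(row[1] for row in TABLE_1)
total_size_kb = sum(row[2] for row in TABLE_1)

avg_items_per_issue = total_items / issue_count
avg_size_per_issue_kb = total_size_kb / issue_count
avg_size_per_item_kb = total_size_kb / total_items

print(f"Average items per issue: {avg_items_per_issue:.1f}")
print(f"Average issue size (Kb): {avg_size_per_issue_kb:.1f}")
print(f"Average item size (Kb):  {avg_size_per_item_kb:.2f}")
```

On these figures an issue averages between 29 and 30 items and 16.9 Kb, and a single news-item works out at somewhat over half a kilobyte.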

(b) There is also an email edition of the Newsletter. Each time a new announcement is placed on the Web page it is also posted as a message [18] to a custom-built "Asia-WWW-Gopher-News-L" [now "asia-www-monitor" - tmc, Apr 1999] mailing list [19]. The subject line of each of these messages states the resource's TITLE prefaced by one of three codes: "WWW>", "GOPHER>" or "FTP>", thus identifying its general type [20]. The body of the message consists of the html'ed text of the news-item placed between a pair of <HTML>, </HTML> tags so that the emails can be easily handled by WWW email browsing software. These postings are dispatched 2-5 times a week, depending on the availability of new materials. The Asia-WWW-Gopher-News-L list, which was established in April 1994, utilises the majordomo software and it runs on the machine. In early August 1995 the list had 165 subscribers. However, the circle of readers of the Newsletter appears to be much wider. Messages distributed by the list are known to be re-posted by their recipients. It can be assumed, therefore, that the actual audience of the emailed Newsletter is about 1.5 times the number of actual subscribers.
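
The construction of such a posting can be sketched as follows. This is an assumption-laden illustration, not the script used by the Newsletter: the function name and all addresses are hypothetical, the type code is assumed to be derived from the URL scheme, and the single space between the code and the title is a guess at the subject-line layout.

```python
from email.message import EmailMessage
from urllib.parse import urlsplit

# Assumed mapping from URL scheme to the three subject-line codes.
TYPE_CODES = {"http": "WWW>", "gopher": "GOPHER>", "ftp": "FTP>"}


def build_announcement(title: str, url: str, item_html: str,
                       sender: str, list_address: str) -> EmailMessage:
    """Build one mailing-list posting for a single news-item."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in TYPE_CODES:
        raise ValueError(f"no subject code for URL scheme {scheme!r}")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    msg["Subject"] = f"{TYPE_CODES[scheme]} {title}"
    # The marked-up news-item is wrapped in a pair of HTML tags.
    msg.set_content(f"<HTML>\n{item_html}\n</HTML>\n")
    return msg


message = build_announcement(
    title="Korea WebWeekly",
    url="http://example.org/korea/",             # hypothetical URL
    item_html="<I>14 Jul 1995</I>",              # shortened news-item markup
    sender="editor@example.org",                 # hypothetical address
    list_address="news-list@example.org",        # hypothetical address
)
print(message["Subject"])
print(message.get_content())
```

Handing the resulting message to a mail transport, and the list software's own processing of it, are outside the scope of this sketch.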

(c) Each time a news item is sent to the list, the majordomo system places an additional copy of it in a dedicated subdirectory on machine. These archived postings form records of the "ANU-Asia-WWW-Gopher-News-L" WAIS database [21]. This database is one of over 90 WAIS databases constructed and maintained by the CCU. The CCU databases utilise the WAIS public-domain software augmented by extensive local modifications [22]. The first of them were developed in early 1992 and were made accessible world-wide through the wais client software. In 1993 access to the Coombswais system was enhanced with the establishment of additional links from the CCU gopher system [23]. In early 1994, when the Coombsweb system was launched, the Coombs gopher was used again, this time as a simple yet reliable WAIS <=> WWW gateway. Naturally, all WAIS databases have the full-text search capability. Therefore, any record in the ANU-Asia-WWW-Gopher-News-L database can be located and retrieved on the basis of any arbitrary keyword. The database is regularly updated through an automatic cron job carried out each day at 03:30 am. This operation re-indexes the complete contents of the postings' archive. This means that every 24 hours the database edition of the Newsletter is made to match the WWW edition. In early August 1995 the average usage rate of the Newsletter's database was 97 accesses/day (=35,000 a year).
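
The principle of full-text retrieval over an archive of postings can be illustrated with a toy inverted index. This is emphatically not the WAIS software, whose indexing is far more elaborate; the function names and posting identifiers are hypothetical, and the two postings reuse the sample entries quoted earlier in the paper.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def build_index(postings: Dict[str, str]) -> Dict[str, Set[str]]:
    """Re-index the complete archive: map every word to the postings using it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for posting_id, text in postings.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(posting_id)
    return index


def search(index: Dict[str, Set[str]], keyword: str) -> List[str]:
    """Return the identifiers of all postings containing the keyword."""
    return sorted(index.get(keyword.lower(), set()))


archive = {
    "posting-001": "Korea WebWeekly. News digests, editorials, Internet "
                   "resources and other info on North and South Koreas.",
    "posting-002": "Stone Bridge Press. Publisher of Japan-related books in "
                   "English: language learning, literature in translation, "
                   "culture, business, etc.",
}
index = build_index(archive)
print(search(index, "Japan"))
print(search(index, "internet"))
```

Rebuilding the whole index from the archive on each run, as the daily job described above does, is simple and keeps the database in step with the archive, at the cost of repeating work for postings that have not changed.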

The use of three different publication formats confers a number of advantages. Firstly, the WWW edition provides a current awareness service with a pleasing interface. Secondly, the news items are placed in the context of both earlier and subsequent announcements. Thirdly, this publication mode means that all Web issues of the Newsletter are noticed by Web databases such as crawlers and Harvest brokers. The email version of the Newsletter distributes new information directly to the interested subscribers. The WAIS edition of the Newsletter enables its readers to pinpoint and quickly retrieve any past news item. In sum, harnessing all three publication techniques together makes the Newsletter a truly dependable online service, that is, one which never fails to be available to its readers despite any possible down-times of any of the servers involved (WWW, Majordomo or WAIS).

Concluding Remarks

At the beginning of the paper, it was said that the bulk of current ills troubling the WWW universe can be attributed to the lack of clearly formulated standards for presentation and substance, and to the proliferation of indiscriminate hypertext linkages as well as to the unrestrained development of resources' catalogues at the expense of development of the resources themselves.

It is hoped that this quasi-ethnographic account of the "What's New in WWW Asian Studies" Newsletter has illustrated that some of the ravages of the post-modernist developments on the Network can be effectively repaired by the adoption of disciplined and explicit rules for online publications.

Similarly, the dangers of promiscuity of links can be contained by the quiet determination that "only the very best sites deserve being cited and linked to". In this way the networked rubbish may be, hopefully, denied the life-giving oxygen of attention and connectivity.

Finally, it appears that some of the aforementioned catalogue/data imbalances, especially apparent in the social sciences and Asian studies areas of the Web, may eventually be rectified if the ideas of coordination, specialisation and clear-cut division of work between the major and most prestigious Web sites gain wider acceptance and currency.


