`
Jean-Claude MARCOVICI
`
`DGT/DACT/Direction de Programme Teletel
`
In September 1987, there were more than three million Videotex users in
France. Most of them have Minitel standalone terminals. Every day,
24 hours a day, these users can quickly and efficiently obtain
directory information on all 24 million telephone subscribers in
France.
`
The Electronic Directory Service (EDS) successfully melds
state-of-the-art methods in two areas. Its documentary data base has
been the source of significant advances in user-friendly query
procedures, computer software, data storage and retrieval. The system
has also pioneered efficient approaches to data communications between
a very large number of simple user terminals and remote computers.
`
`This paper presents the architecture of the Electronic Directory System
`and describes the main search modes and algorithms.
`
`
`
THE ELECTRONIC DIRECTORY SERVICE
`
`INTRODUCTION
`
In September 1987, there were more than three million Videotex users in
France. Most of them have Minitel standalone terminals, either received
free of charge for the EDS or rented from French Telecom. Every day,
24 hours a day, these users can quickly and efficiently obtain
directory information on all 24 million telephone subscribers in
France.
`
In many reported cases, people have been able to enter into contact
with long-lost relatives and friends. Even when the EDS is employed for
everyday purposes, users are mostly unaware that its simplicity and
user-friendliness are a real technical achievement and hide a very
complex system.
`
The EDS successfully melds state-of-the-art methods in two areas. Its
documentary data base has been the source of significant advances in
user-friendly query procedures, computer software, data storage and
retrieval. The system has also pioneered efficient approaches to data
communications between a very large number of simple user terminals and
remote computers.
`
Developing the EDS involved finding elegant solutions to a string of
complex technical requirements:

- storing "white page" and "yellow page" directory information for 24
  million telephone subscribers, representing over 25 billion
  characters, in a unified data base system,

- making an average of 40,000 updates to this data base each day
  without interrupting or degrading access for users,

- allowing computer-illiterate users to rapidly obtain wanted
  information by means of simple query procedures,

- supporting thousands of simultaneous consultations of short duration
  by different users, round the clock.
`
These requirements have been met by what is now the world's largest
distributed computer system. Technically and economically viable
solutions were developed and proven through the combined efforts of
French Telecom and its partners in industry.
`
CAP GEMINI SOGETI and SESA, CAP SOGETI-SESA's parent companies, built
the Electronic Directory System for the French Telecommunications
Administration. The EDS runs on DPS6 computers (HONEYWELL-BULL).
`
This paper describes the architecture of the Electronic Directory
System and the main search modes and algorithms.
`
`
`
`
`1 - SYSTEM ARCHITECTURE
`
To call the EDS, a user dials "11" and is connected through the
telephone network to the nearest EDS Videotex Access Point (VAP). The
decentralization of the VAPs minimizes local telephone transmission
costs. They are linked to the central system either through leased
lines or through a public packet switching data network. The VAP
returns a query form for display on the user's terminal. When the user
has completed this form by entering information through the terminal
keyboard, he or she presses the SEND key and the query is forwarded
by the VAP to the nearest Inquiry Center.
`
If the query concerns the same geographical zone as the calling
user, as is the case for 80% of calls, the Inquiry Center sends
the query to the Regional Documentation Center. In other cases, the
query is sent to the National Documentation Center by dedicated
data links or the Transpac packet data network.
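The routing rule described above can be illustrated with a minimal
sketch. The `Query` type, its field names and the zone labels are
hypothetical and are not taken from the EDS software; the sketch only
shows the regional/national decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    caller_zone: str  # zone served by the caller's Inquiry Center (assumed label)
    target_zone: str  # zone of the locality being searched (assumed label)


def route(query: Query) -> str:
    """Return the kind of Documentation Center that should receive the query."""
    # Same geographical zone: the Regional Documentation Center answers.
    if query.target_zone == query.caller_zone:
        return "regional"
    # Any other zone: the query goes to the National Documentation Center.
    return "national"
```

Under these assumptions, a caller in the Rennes zone looking for a
subscriber in the Lille zone would be routed to the national center.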
`
`1.1 Inquiry Center
`
The Inquiry Center (I.C.) is the functional component which
dialogues with the users. It assists the user in entering correct
and full information in his request, i.e., input of all
information required for successful data-base retrieval.
`
The Inquiry Center acquires this information and checks its
validity. In case of ambiguity, it requests additional information
from the user. The Inquiry Center has dialogue aids in the form of
files concerning, for instance, localities, business headings, and
streets.
`
Provided that it is correctly formulated, each request is forwarded
by the I.C. to the appropriate Documentation Center. Then, the I.C.
receives the answer from the interrogated D.C., converts it to user
`
An Inquiry Center running on a DPS6 model 95 handles more than 150
simultaneous users with a response time which does not exceed 3
seconds.
`
`
`
`
One of the toughest issues in EDS development was to allow
computer-illiterate members of the general public to easily and
reliably interrogate the data base to obtain directory information.
It required development of software to translate into computer
language the queries formulated by users in ordinary language and
to correct or complete the queries when necessary.
`
This formidable challenge has been met by extremely reliable
Inquiry Centers. They act, so to speak, as reception desks for
the EDS data base, receiving all queries from users and forwarding
them to the Documentation Centers. However, they first perform
correctness and completeness checks on each query. If a user has
wrongly spelt the name of a town, for example, the concerned
Inquiry Center automatically corrects the mistake when possible.
Similarly, if a user types "vet", the Inquiry Center automatically
converts this to "veterinary surgeon", which is the official
professional heading in the telephone directory.
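As an illustration only, the two checks just described (correcting a
misspelt town and converting an informal term into the official
heading) could be sketched as follows. The reference lists, the
similarity cutoff and the use of Python's `difflib` are assumptions
made for this sketch; the paper does not describe how the Inquiry
Center actually implements these corrections.

```python
import difflib

# Hypothetical reference data standing in for the Inquiry Center files.
LOCALITIES = ["RENNES", "PARIS", "BORDEAUX", "LILLE", "MARSEILLE"]
HEADING_SYNONYMS = {"VET": "VETERINARY SURGEON"}


def correct_locality(text, cutoff=0.8):
    """Return the known locality closest to the input, or None if none is close."""
    word = text.strip().upper()
    if word in LOCALITIES:
        return word
    # Generic string similarity; not the method used by the EDS.
    matches = difflib.get_close_matches(word, LOCALITIES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def official_heading(text):
    """Replace an informal term by the official heading when one is known."""
    word = text.strip().upper()
    return HEADING_SYNONYMS.get(word, word)
```

A real system would ask the user for more information when no
candidate is close enough, which the sketch signals by returning
`None`.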
`
`1.2 Documentation Centers
`
The Documentation Center (D.C.) is the functional component
which stores standard and auxiliary directory information on a
set of subscribers. The auxiliary information includes
advertising pages. The D.C. retrieves information from its files
in response to the queries forwarded from the I.C. and sends
information back to it: entries 8 by 8 plus, if needed, one page
of advertising and the information necessary to re-start the
query from the last point.
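The "8 by 8" delivery with restart information can be pictured with
the following minimal sketch. It assumes, purely for illustration,
that the restart information is a position in an ordered list of
matching entries; the actual format used by the Documentation Centers
is not given in this paper.

```python
PAGE_SIZE = 8  # entries are returned 8 by 8


def fetch_page(matches, restart=0):
    """Return (entries, next_restart) for one answer sent to the Inquiry Center.

    `restart` is a hypothetical cursor: the index of the first entry to send.
    `next_restart` is None when no entries remain.
    """
    page = matches[restart:restart + PAGE_SIZE]
    following = restart + PAGE_SIZE
    return page, (following if following < len(matches) else None)
```

The caller passes the returned cursor back to obtain the next eight
entries, so the Documentation Center does not need to keep a session
open between two answers.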
`
Regional Documentation Centers each store complete directory
information on a subset of telephone subscribers. This modular,
decentralized approach has advantages such as easier expansion of
call-handling capacity to match an increasing user population and
growing traffic, and logical grouping of information for access by
users (most queries are intra-regional).
`
The first two Regional Documentation Centers were tested as early
as 1982 in Rennes (Brittany) and Paris. There are now additional
RDC sites at Bordeaux, Lille and Marseille.
`
`
`
`
The National Documentation Center, also at Rennes, lies at the heart
of the EDS data base network, storing a copy of directory
information on all subscribers in France. Constituted progressively
from 1982 and completed towards the end of 1984, its nationwide data
base is employed notably for inter-regional queries.
`
Duplicated storage of information in both the National Documentation
Center and the various Regional Documentation Centers affords high
security and protection against computer or data link failures.
`
An EDS Regional Documentation Center has a configuration with up to
14 disk units (800 Mbytes each) for storing a maximum of 6 million
duplicated directory inscriptions. It can process 25 queries per
second.
`
The National Documentation Center has a triplex configuration in
which the three blocks are linked through their front-end units. The
capacity of the National Documentation Center is more than 25
million subscriber records.
`
The requisite security and service continuity are assured by the
modular, multiprocessor structure of the front-end unit and by
distributing traffic over several DPS6 computers.
`
The specified speed, accuracy and flexibility in retrieving
information to answer queries were obtained through advanced data
base systems employing associative search methods with direct access
to blocks of information and multicriteria comparison of their
contents against the user input. These DIRAM 32 data base machines
were developed by the Copernique company.
`
1.3 Documentation Management Center
`
Data base updating is effected automatically by a Documentation
Management Center at Rennes, which receives the update requests from
two sources:

• French Telecom commercial agencies (ACTELS) for basic directory
information,

• the Office d'Annonces selling agency for "yellow page"
advertising.
`
`
`
`
The Documentation Management Center: performs coherency checks on
update requests from these two sources; scans the requests to
identify equivalent terms (e.g. full name of a company and its
acronymic form) and multiple search words (e.g. the words "power"
and "lighting" in Everytown Power and Lighting Company); generates
a corresponding number of data base entries to maximize the success
rate in query response; and then forwards the updates to the
concerned Documentation Centers.
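The idea of generating one data base entry per search word can be
sketched as below. The function name, the tuple layout and the small
set of linking words are illustrative assumptions; they are not the
formats used by the Documentation Management Center.

```python
def index_entries(listing_name, linking_words=frozenset({"AND", "OF", "THE"})):
    """Return one (access_word, listing_name) pair per search word.

    `linking_words` is a tiny assumed sample of words that never give access.
    """
    entries = []
    for word in listing_name.upper().split():
        if word in linking_words:
            continue  # prepositions and the like do not become access words
        entries.append((word, listing_name))
    return entries
```

With this sketch, "Everytown Power and Lighting Company" yields four
entries, so a user who types only "power" or only "lighting" can still
reach the listing.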
`
Each Documentation Center integrates the received updates in its
data base during the night.
`
The Documentation Management Center performs automatic daily
updating of directory information stored at all Documentation
Centers (DCs), using three computers:

- the first computer receives update requests, which concern new
  directory inscriptions or all or part of existing inscriptions,

- the second computer administers a master file of inscriptions
  indexed by telephone numbers, performs checks on the coherency of
  update requests with the existing documentation, and generates the
  actual data base updates for Documentation Centers,

- the third computer transmits updates to the Documentation
  Centers and supervises their integration in the EDS data base.
`
1.4 Supervisory Center
`
Lastly, a Supervisory Center at Rennes:

• supervises the operation of Inquiry Centers and Documentation
Centers, receives alarm messages from these centers, and
transmits necessary reconfiguration commands,

• records data on the traffic handled by Inquiry Centers and
Documentation Centers,

• generates reference files (lists of localities, professional
headings, street names, etc.) and transmits them to Inquiry
Centers for use in validating queries.
`
`
`
`
`2 - SEARCH MODES
`
`2.1 General description
`
EDS provides a basic service and a number of additional
features. The basic service allows the user to get the telephone
number of any subscriber by providing some information such as
his name, business or address.
`
As with paper directories, there are two main search modes:

• search by family or company name in an indicated locality,
corresponding to the "white pages" of a conventional
directory. The name can be specified completely or partially.
Phonetic spelling is accepted when the user is not sure how a
name is actually spelt. Additional search information can be
supplied by the user, such as forename, profession and/or
address.

This search is also offered in an entire department. In that
case, phonetic spelling is not accepted (for purposes of
administration and local government, France is divided into
36,000 localities grouped in 100 departments).

• search by professional or business category in an indicated
locality or in a department, corresponding to the "yellow
pages" of a conventional directory. The user does not have to
know the official professional/business category because
equivalent terms or approximations are accepted. For example,
the system is not stumped if a user looking for a
paediatrician enters doctor or physician or simply health.
`
The other search modes are:

- search by name and professional category
- search by address (street and numbers) in the largest cities
  such as Paris, Lyon, Marseille
- emergency numbers
- administrative information
- postal codes
- tariff information
`
2.2 The main files
`
`The Documentation Center function uses different files to
`process the search :
`
`- alphaphonetic file
`- alphabetic file
`- business file
`- address file
`
`
`
`
An entry can be in several files and be accessed by the
corresponding searches.

For instance, a yellow page entry is in the alphabetic and
alphaphonetic files for access by name or corporate name, in the
business file for access by business heading and in the address
file for access by address if this address is codified.
`
2.2.1 Phonetic spelling search

To perform this search, the DC uses the alphaphonetic file which
is classified by locality. It is the default search when the name
field is keyed in by a user without a specific character at the
end of a word.
`
In this case, the name word set is processed by the reduced
alternate spelling algorithm which was developed by CNET, the
French Telecom Research Center. The first significant word is used
to determine the area of the alphaphonetic book where the listing
should be if it exists. The other words, business code, first
name, street code or name are used as filters.
`
2.2.2 Alphabetic search

To perform this search, the DC uses the alphabetic file which is
classified by department. This search is used:

- when an incomplete word is entered (specific character at the
  end of the word),

- for a search by family or company name in an entire department.
`
2.2.3 Business search

The business file is used if a business heading is keyed in by the
user and codified by the Inquiry Center. If the user adds
information in the name field, either the alphaphonetic or the
alphabetic file will be used.

In the first case, the search is performed to select entries which
have a business code matching the input business code.

In the second case, the search is performed to select entries
which have a corresponding name and then the business codes are
compared. The entry will be selected only if it contains a
business code which matches the one in the input.
`
`
2.2.4 Address search

In all the other searches, the address field is used as a secondary
selection criterion.

For the search by address, the address must be complete to street
level to allow the Inquiry Center to determine the administrative
code, and the name and business fields must be empty to allow the
system to select the entry from the address file, which contains
the street names.

It is also possible to do a narrow search at a given street number.
And if the business field contains a single specific character,
i.e. an asterisk, the system searches for all the business entries
at this address.
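The file selection rules of sections 2.2.1 to 2.2.4 can be summarized
in a short sketch. The field names, the return labels and the
truncation character are assumptions made for illustration: the paper
only says that an incomplete word ends with "a specific character",
and the asterisk is used here as a placeholder for it.

```python
TRUNCATION_MARK = "*"  # assumption: the paper does not name this character


def select_file(name="", business="", street="", whole_department=False):
    """Return the file used as the main access path for a query, or None."""
    name, business, street = name.strip(), business.strip(), street.strip()
    if name:
        # A name is present: business and address only act as filters.
        truncated = any(w.endswith(TRUNCATION_MARK) for w in name.split())
        if truncated or whole_department:
            return "alphabetic"
        return "alphaphonetic"  # default search, phonetic spelling accepted
    if business and business != "*":
        return "business"
    if street:
        # Name empty, business empty or "*": search by address.
        return "address"
    return None
```

For example, `select_file(name="Dupont", business="plumber")` selects
the alphaphonetic file, the business code then being compared as a
second step.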
`
`2.3 Algorithms
`
The software translates ordinary user language into standard
formulations that can be properly understood by the EDS computers.
For example:

• searching is correctly initiated from any word of a multiword
name. The system will supply information on the local water
company even if the user enters only "water", for example,

• the system recognizes and automatically corrects common spelling
mistakes concerning family or company names, forenames,
professions, street names and localities (e.g. "bayker" or
"bucher"),

• the system also correctly recognizes alternative names for
professions (e.g. masseur for physiotherapist),

• searching can be progressively extended to neighboring localities
or related professional/business categories (e.g. from
physiotherapists to physicians, rest and convalescence homes,
etc).
`
2.3.1 Reduced alternate spelling algorithm

The reduced alternate spelling algorithm is used to obtain a
"normalized" word set from a field in its current spelling form
(i.e. as entered).

This algorithm is used by every center both when creating or
updating files, and when searching into the existing files.
`
`
`
`
A word is a character string without word delimiters.

A word can be:

• Significant:
It is a word which has significance. Each significant word in a
listing is an access word.

• Insignificant:
It is a word which is used to select between word sets which have
been accessed using a significant word.

• Linking:
It is a word which is not used in selecting a word set. For
instance, a preposition is a linking word.

• Handle:
A handle is a word that can usually be concatenated to another
word.

• Equivalent:
An equivalent word is a word which is commonly used in place of
another word. For instance, Co is an equivalent word of Company.
`
The word set processing uses auxiliary tables to define:

- linking words
- insignificant words
- handles
- equivalent words

Different tables are used to process:

- individual or company names
- business headings
- address word sets
`
The algorithms make use of a number of elementary processes.

Handle processing

Handles are concatenated to the following word to create one or
several additional words in the word set.

For instance:

Van Der Horst will result in

- Horst
- Derhorst
- Vanderhorst
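A minimal sketch of handle processing, reproducing the Van Der Horst
example, is given below. The handle table is a small assumed sample,
and keeping trailing handles unchanged is a choice made for this
sketch; the paper does not cover that case.

```python
HANDLES = {"VAN", "DER", "DE", "LE", "LA"}  # assumed sample of the handle table


def process_handles(words):
    """Attach pending handles to the next ordinary word, nearest handle first."""
    result = []
    pending = []  # handles waiting for the following ordinary word
    for word in words:
        upper = word.upper()
        if upper in HANDLES:
            pending.append(upper)
            continue
        result.append(upper)
        joined = upper
        for handle in reversed(pending):
            joined = handle + joined  # HORST, then DERHORST, then VANDERHORST
            result.append(joined)
        pending = []
    result.extend(pending)  # handles with nothing to attach to are kept as-is
    return result
```

Calling `process_handles(["Van", "Der", "Horst"])` gives the three
words of the example, in upper case.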
`
`
`
`
Special character mark processing

Some characters can be considered as special punctuation characters
and, as a result of special character mark processing, the words
which are at each side of these characters are concatenated. For
instance, if - (hyphen) is processed as a special character mark,
inter-continental will result in

- inter
- continental
- intercontinental
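The hyphen example can be sketched as follows. The function name and
the default set of marks are assumptions for illustration; which
characters are treated as special marks is defined by tables that the
paper does not list.

```python
def process_special_marks(token, marks="-"):
    """Split a token on special marks and add the concatenated form."""
    parts = [token]
    for mark in marks:
        parts = [piece for part in parts for piece in part.split(mark)]
    parts = [p for p in parts if p]  # drop empty pieces from leading/trailing marks
    if len(parts) <= 1:
        return parts  # nothing to concatenate
    return parts + ["".join(parts)]
```

A token without any special mark is returned unchanged, so the
function can be applied to every word of a word set.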
`
Phonetic processing

During this processing, predefined rules are applied to each word to
obtain a reduced alternate spelling.
For instance, ll can be reduced to l
`or
`our
`ail
`el
`
Equivalent word processing

During this processing, when equivalent words exist for a word in the
word set, equivalent words are added to the word set.
For instance, Co can result in adding Company, and Ltd can result in
adding Limited.
`
Insignificant word processing

In the word set, all the insignificant words are marked to prevent both
duplication in the files and the search with these words. The first
word of the entered word set is never considered insignificant.
`
`Linking word processing
`
The linking words are deleted from the entered word set.
`
`
`
Acronyms processing

All separated letters are concatenated to produce a word if the
separation character between each letter is the same.

For instance:

I.B.M.          produces             IBM
E.D.F.-G.D.F.   produces two words   EDF and GDF
I B M           produces             IBM

But:

IB M    does not produce IBM but word IB and letter M
I BM    does not produce IBM but word BM and letter I
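One possible way to express the acronym rule is sketched below with
regular expressions. It assumes that only the period and the space act
as separation characters, which is what the examples show; the real
tables may allow others.

```python
import re

# Single letters each followed by a period, e.g. "I.B.M." (at least two letters).
DOTTED = re.compile(r"(?<![A-Za-z])(?:[A-Za-z]\.){2,}")
# Single letters separated by single spaces, e.g. "I B M", not touching other letters.
SPACED = re.compile(r"(?<![A-Za-z.])[A-Za-z](?: [A-Za-z])+(?![A-Za-z.])")


def process_acronyms(text):
    """Concatenate separated letters, then return the resulting words."""
    text = DOTTED.sub(lambda m: m.group(0).replace(".", ""), text)
    text = SPACED.sub(lambda m: m.group(0).replace(" ", ""), text)
    return re.findall(r"[A-Za-z]+", text)
```

Because each pattern handles one separator only, a mixed form such as
"I.B M" is left as separate letters, in line with the requirement that
the separation character be the same between each letter.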
`
`2.3.2 Search algorithms
`
For exact alphabetic or phonetic spelling search, three levels of
precision are defined:

1  the words of the request are the exact words which
   are in the record

2  the words of the request are all among the words
   which are in the record

3  some words of the request are among the words which
   are in the record.

Words may be both significant and insignificant. Linking words
are ignored. The word order is irrelevant.

The exact alphabetic search is performed at level 2 only.

The phonetic search is performed in two phases:

Phase 1:

The system searches at level 2 with exact spelling.

Phase 2:

If there is no result during phase 1, or on further request from
the Inquiry Center, the data base centre processes all of the
words in the request to obtain the reduced alternate spelling.
`
`
`
Then, with these new words, the search is first performed
at level 2. If there is no result, the search is performed
at level 3 with all the combinations of two words and, if
again there is no result, the search is performed with each
word separately.
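The two-phase cascade can be sketched as follows. The `reduce_word`
function is only a toy stand-in (upper-casing and collapsing doubled
letters) for the CNET reduced alternate spelling rules, which are not
detailed in this paper, and records are represented as plain lists of
words for illustration.

```python
from itertools import combinations


def reduce_word(word):
    """Toy stand-in for the reduced alternate spelling: collapse doubled letters."""
    out = []
    for ch in word.upper():
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def _containing(records, wanted, key):
    """Records whose (transformed) words include every wanted word."""
    wanted = set(wanted)
    return [rec for rec in records if wanted <= {key(w) for w in rec}]


def phonetic_search(request, records):
    # Phase 1: level 2 with the exact spelling.
    hits = _containing(records, [w.upper() for w in request], key=str.upper)
    if hits:
        return hits
    # Phase 2: level 2 with the reduced spelling of every word.
    reduced = [reduce_word(w) for w in request]
    hits = _containing(records, reduced, key=reduce_word)
    if hits:
        return hits
    # Level 3: all combinations of two words, then each word separately.
    for size in (2, 1):
        hits = []
        for combo in combinations(reduced, size):
            for rec in _containing(records, combo, key=reduce_word):
                if rec not in hits:
                    hits.append(rec)
        if hits:
            return hits
    return []
```

Each step is tried only when the previous one returns nothing, so a
correctly spelt request never pays for the wider, less precise
searches.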
`
`2.3.3 Business heading comprehension algorithm
`
This algorithm is used to understand a business heading. To
provide information to the algorithm, specific files are
used. Two specific files are defined here:

- heading file
- synonym file

The heading file contains all the business headings that
can be used to classify the business entries. In this file,
a business heading code is associated with each business
heading word set.
The synonym file contains records which are identified by a
specific code:
`
1. Code E: The record contains a word which must be
           eliminated if it is found in an entered word
           set.

2. Code S: The record contains a word and a substitution
           word. If the first word is found in an entered
           word set, the substitution word will replace
           it.

3. Code A: The record contains an ambiguous word and
           alternate word sets. If the user enters only an
           ambiguous word, the system asks the user to
           choose one of the associated alternate word
           sets.

4. Code V: The record contains a word which is
           insignificant if it is alone in the word set.

5. Code R: The record contains a word set which is an
           alternate form of a business heading.
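A minimal sketch of how the five record codes might be applied to an
entered word set is given below. The order of application, the return
values and all sample records are assumptions made for illustration;
the paper defines the codes but not the exact procedure.

```python
# Hypothetical synonym records: word -> (code, payload).
RECORDS = {
    "SPECIALIST": ("E", None),
    "MASSEUR": ("S", "PHYSIOTHERAPIST"),
    "DOCTOR": ("A", [["GENERAL", "PRACTITIONER"], ["PAEDIATRICIAN"]]),
    "SERVICE": ("V", None),
}
# Hypothetical code R records: alternate word set -> official heading word set.
ALTERNATE_FORMS = {("VET",): ("VETERINARY", "SURGEON")}


def comprehend(words, records=RECORDS, alternate_forms=ALTERNATE_FORMS):
    """Return (status, data) for an entered business heading word set."""
    words = [w.upper() for w in words]
    if len(words) == 1:
        code, payload = records.get(words[0], (None, None))
        if code == "A":
            return "ask", payload  # the user must choose an alternate word set
        if code == "V":
            return "insignificant", []  # this word alone cannot start a search
    cleaned = []
    for word in words:
        code, payload = records.get(word, (None, None))
        if code == "E":
            continue  # eliminated word
        cleaned.append(payload if code == "S" else word)  # substitution
    heading = alternate_forms.get(tuple(cleaned), tuple(cleaned))  # code R
    return "heading", list(heading)
```

The resulting word set would then be looked up in the heading file to
obtain the business heading code.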
`
`
`
`
From these two files, the data management center builds a
composite file.

The word sets in the composite file are processed in the
following sequence:

- special character mark processing
- linking word processing
- equivalent word processing
- insignificant word processing
- reduced alternate spelling processing

There are as many records as there are significant words in
the word set.
`
`
`
`
Conclusion:

Directory assistance for telephone subscribers traditionally relies on
paper directories that are increasingly voluminous and, above all,
increasingly difficult and costly to update, backed up by a
labor-intensive and loss-making directory inquiries service.

Designed to overcome these problems, the EDS is a truly mass-market
on-line information service with no counterpart anywhere in the world.
Statistics show that both professional and residential users have very
quickly made the system part and parcel of their everyday life: each
Minitel user calls the EDS twice a week. This success is due to the
service's speed, efficiency, user friendliness and very low costs.
`
`
`
`
ANNEX 1

EDS TRAFFIC IN JUNE 1987

| Number of Minitels in homes and offices | 3 million                               |
| Average EDS traffic in Erlangs          | 4,640                                   |
| Total EDS connect-time                  | 950,000 hours (i.e. 19 min per Minitel) |
| Number of EDS calls                     | 24 million (i.e. 8 per Minitel)         |
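The per-Minitel figures quoted for June 1987 follow directly from the
totals, as the short check below shows. The average call duration is
derived here from the same totals and is not a figure stated in the
annex.

```python
minitels = 3_000_000        # Minitels in homes and offices
connect_hours = 950_000     # total EDS connect-time in June 1987
calls = 24_000_000          # number of EDS calls in June 1987

minutes_per_minitel = connect_hours * 60 / minitels  # connect-time per Minitel
calls_per_minitel = calls / minitels                 # calls per Minitel
minutes_per_call = connect_hours * 60 / calls        # derived, not in the annex

print(minutes_per_minitel, calls_per_minitel, minutes_per_call)
```

Eight calls per Minitel in a month is consistent with the "twice a
week" usage mentioned in the conclusion.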
`
CAPACITY OF THE ELECTRONIC DIRECTORY SERVICE

|                                          | Dec. 85 | Dec. 86 | June 87 |
| Number of Videotex Access Points for EDS |      26 |      44 |      55 |
| Number of VAP access ports for EDS       |   6,240 |  10,560 |  13,500 |
| (= number of simultaneous EDS calls)     |         |         |         |
| Number of Inquiry Centers                |      14 |      23 |      29 |
| Number of Documentation Centers          |       9 |      10 |      12 |
`
`