Whether you enter the data yourself, or receive it direct from contributors, each entry will consist of a series of data fields, defined in the STAR format described by Syd Hall [(1991), J. Chem. Inf. Comput. Sci. 31, 326-333]. It is probably most convenient if every individual entry is in a separate file, though it is possible to combine several entries in a file. Indeed, when your work of data collection is complete, you will send to Chester a single STAR file containing all the entries for your country.
The STAR file begins with a data_xxxx block name, then a "loop_" statement followed by a list of data names defining the data that will follow. Any particular data item will be recognised by processing software according to its position within the file; if it is the fifth data item, its nature is defined by the fifth data name in the list. It is therefore essential that the file structure is maintained, otherwise it would be possible to lose track of which data name referred to any given fragment of data.
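The positional rule can be illustrated with a short Python sketch. This is not QUASAR or any part of the distributed software. The data names `_name` and `_city` are hypothetical, and the sketch assumes a single loop_ per block, no multi-line text fields, and no data values that begin with an underscore.

```python
import shlex

# Hypothetical entry: the data names _name and _city are illustrative only.
SAMPLE = """
data_hentzau
loop_
_name
_city
'Rupert Hentzau' Zenda
"""

def parse_loop(text):
    """Pair each value in a single STAR loop_ with its data name by position."""
    # shlex keeps quoted strings together and drops '#' comments.
    tokens = shlex.split(text, comments=True)
    if len(tokens) < 2 or not tokens[0].startswith("data_") or tokens[1] != "loop_":
        raise ValueError("expected a data_ block name followed by loop_")
    block = tokens[0][len("data_"):]
    rest = tokens[2:]
    # Data names start with an underscore; the first other token ends the list.
    n = 0
    while n < len(rest) and rest[n].startswith("_"):
        n += 1
    names, values = rest[:n], rest[n:]
    if not names or len(values) % n != 0:
        # A missing or extra value shifts every later item onto the wrong name.
        raise ValueError("value count is not a multiple of the data name count")
    rows = [dict(zip(names, values[i:i + n])) for i in range(0, len(values), n)]
    return block, rows

block, rows = parse_loop(SAMPLE)
print(block, rows)
```

Dropping one value from the sample makes the count check fail, which is the kind of structural damage a validator needs to catch before the names and values drift out of step.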
The program QUASAR is a general-purpose STAR handling tool which can be used to validate the structure of any STAR file. You can test it by typing the following sequence of commands:
You should run QUASAR on any STAR file you receive, and rerun it any time you edit such a file.
The program keychk is a pure filter, so you should invoke it under Unix by typing
keychk < test.wdc
or
cat test.wdc | keychk
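A pure filter reads its standard input and writes its standard output, which is why both invocations above are equivalent. The Python sketch below shows the pattern only; it is not keychk. It assumes one keyword per input line rather than a STAR file, and its keyword lists contain only the few terms that appear in the sample session in this document.

```python
import io
import sys

# Assumption: tiny stand-in lists built from the sample session output.
KEYWORD_LISTS = {
    "Methods": {"crystallography"},
    "Compounds": {"carbon_compounds", "diamond", "minerals"},
    "Attributes": {"non-crystalline"},
}

def classify(keyword):
    """Report which list, if any, contains the keyword."""
    for list_name, members in KEYWORD_LISTS.items():
        if keyword in members:
            return f"{keyword} is in the {list_name} list"
    return f"{keyword} is not in any list"

def run_filter(src, dst):
    """Read keywords from src, one per line, and write one report line each to dst."""
    for line in src:
        keyword = line.strip()
        if keyword:  # skip blank lines
            dst.write(classify(keyword) + "\n")

# Demonstration on an in-memory stream instead of a real file.
demo = io.StringIO()
run_filter(io.StringIO("diamond\nquartz\n"), demo)
sys.stdout.write(demo.getvalue())
```

To use the sketch as a real filter, call `run_filter(sys.stdin, sys.stdout)` so that it works with either `<` redirection or a pipe.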
Valent allows you to:
Try option 1 (input from the keyboard) with your own data. It calls a program named input, which lets you add as many entries as you want and checks their structure and the validity of their keywords. New data are added to the file created by input, called "your_country_acc.wdc". This file is deleted only when its contents are added to your national database. You can edit it if you find that some entries contain invalid keywords.
If you use a different version of the Unix operating system, or if you use similar tools on a different operating system, you may not be able to follow this model exactly. However, it should suggest to you the way in which you might like to proceed with your data collection and validation.

[ THE EDITOR READS HIS MAIL AND DISCOVERS A MESSAGE FROM A CONTRIBUTOR. THE CONTRIBUTOR HAS HELPFULLY GIVEN HIS NAME IN THE Subject: FIELD ]

$ mail
Mail version SMI 4.0 Thu Jul 23 13:52:20 PDT 1992 Type ? for help.
"/usr/spool/mail/bm": 1 message 1 new
>N 1 rupert@zenda.bitnet Fri Nov 13 14:33 134/4682 WDC entry for hentzau

[ THE EDITOR SAVES THE MESSAGE TO A FILE hentzau.ent IN HIS WORKING DIRECTORY. HIS MAIL SYSTEM WILL APPEND THE MESSAGE TO ANY EXISTING FILE OF THAT NAME. HIS PLAN IS TO HAVE EVERY ENTRY IN A FILE NAMED AFTER THE CONTRIBUTOR - THIS WILL MAKE COLLATION EASIER LATER. EACH FILE IS GIVEN THE SUFFIX ".ent" (FOR "entry") ]

& w /usr/home/subed/wdc/hentzau.ent
"/usr/home/subed/wdc/hentzau.ent" [New file] 132/4657
& q

[ THE EDITOR NOW CHANGES TO HIS WORKING DIRECTORY AND RUNS VALENT ]

$ cd /usr/home/subed/wdc
$ valent
Input new data from Keyboard [1] from a file [2] Exit [3]
2
Give filename hentzau.ent
STAR File Processor (May 18 92)
--------------------------------
STAR archive file is hentzau.ent
Checking archive file for logical integrity.
Error >>> Data structure error at data item dshgfadshgfdsahgfkadsjhgfadsjhg
Fatal error -- archive line 1
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Copyright (c)1992 International Union of Crystallography
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

[ OOPS - FORGOT TO REMOVE THE MAIL HEADER! SOME MAIL SOFTWARE WILL DO THIS AUTOMATICALLY; OR IT MAY BE EASY TO WRITE A SCRIPT TO DO IT ]

Add to country list [y|n] ?
[ EXIT AT THIS POINT ]

n

[ EDIT THE FILE AND RERUN ]

$ vi hentzau.ent
$ valent
Input new data from Keyboard [1] from a file [2] Exit [3]
2
Give filename hentzau.ent
STAR File Processor (May 18 92: mod 920712 BM)
--------------------------------
STAR archive file is hentzau.ent
Checking archive file for logical integrity.
Checking complete and correct.
Check keywords [y|n] ?

[ NOW CHECK THE KEYWORDS ]

y
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Copyright (c)1992 International Union of Crystallography
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
(Hentzau):
crystallography_of_carbon_compounds is not in any list
crystallography is in the Methods list
of_carbon_compounds is not in any list
(of) - ignore
carbon_compounds is in the Compounds list
diamond is in the Compounds list
non-crystalline_minerals is not in any list
non-crystalline is in the Attributes list
minerals is in the Compounds list
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Add to country list [y|n] ?

[ LOOKS OK. NOW THE EDITOR WILL EXAMINE THE OTHER FIELDS. HE HAS DECIDED TO KEEP THE ENTRIES IN SEPARATE FILES FOR NOW ]

n
$

[ SOME CONSIDERABLE TIME LATER, WHEN THE EDITOR HAS COLLECTED AND VALIDATED ALL THE RURITANIAN ENTRIES, HE WILL CREATE HIS COUNTRY LIST AT ONE SITTING. FIRST HE ENSURES THAT ANY OLD COPIES OF HIS COUNTRY LIST ARE GONE ]

$ rm ruritania.wdc

[ NOW HE TYPES A LOOP INSTRUCTION TO PROCESS EVERY ENTRY IN ALPHABETIC ORDER SO THAT ENTRIES WILL BE ADDED TO THE COUNTRY LIST ALREADY COLLATED. EVEN THOUGH HE HAS TAKEN CARE WITH ALL HIS EDITING AND RERUN QUASAR EVERY TIME HE HAS EDITED A FILE, HE REMAINS CAUTIOUS AND PROCESSES EVERY ENTRY THROUGH VALENT ]

$ for F in *.ent
> do
> valent $F
> done

[ . . . SOME HOURS LATER ]

$ mail -s "WDC country list for Ruritania" teched@iucr.org < ruritania.wdc
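The session notes that a script could remove the mail header before validation. A minimal Python sketch of such a script follows. It is an illustration, not part of the distributed software, and it assumes the usual mail convention that the header ends at the first blank line. To avoid damaging a file that has already been cleaned, it does nothing unless the first line looks like a header line.

```python
import re

# A header line is either the mbox "From " separator or "Field-Name:".
HEADER_LINE = re.compile(r"^(From |[A-Za-z][A-Za-z0-9-]*:)")

def strip_mail_header(message):
    """Return the message body, i.e. everything after the first blank line."""
    lines = message.splitlines(keepends=True)
    if not lines or not HEADER_LINE.match(lines[0]):
        return message  # no header present: leave the text untouched
    for i, line in enumerate(lines):
        if line.strip() == "":
            return "".join(lines[i + 1:])
    return ""  # header only, no body

# Illustrative message; the header values are made up for the example.
mail_text = (
    "From rupert@zenda.bitnet Fri Nov 13 14:33:00 1992\n"
    "Subject: WDC entry for hentzau\n"
    "\n"
    "data_hentzau\n"
    "loop_\n"
)
print(strip_mail_header(mail_text), end="")
```

Because a STAR file begins with a data_ block name, which does not match the header pattern, running the function twice on the same text is harmless.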
Please send your comments and suggestions to Yves Epelboin, epelboin@lmcp.jussieu.fr.