SLIS 5200: IOP Template, Summer 2008




Brian George

Spring 2013

SLIS 5200 TxWI_A

Final Draft 4

Midland Music Library American Music Collection:

Information Organization System

1. Project description

1.1. Collection and information objects

The American music collection consists of approximately 3,000 CDs spanning a wide range of genres, including blues, rock, country, and classical, and artists as diverse as Bob Dylan, Bruce Springsteen, Garth Brooks, Beethoven, and Bach. It is housed at the hypothetical Midland Music Library in Midland, Texas, and is frequented at any given time by an average of about 250 students of both Midland College and the University of Texas of the Permian Basin, as well as approximately 350 non-student music lovers ranging in age from approximately 20 to 45. The library is supported by both public funds and private donations, and adds new music on a regular basis. The collection is an expression of the area's appreciation for well-performed, solidly arranged music, and the library partners with surrounding colleges to expand the study of music theory and practice.

1.2. Users' demographics and knowledge

Frequent visitors to the library are Midland residents, both male and female, ranging in age from 20 to about 50. They generally have at least a partial college education and mostly fall into the middle/upper class. The higher socio-economic status enables the majority of these users to afford devices for playing music and the generally white collar jobs allow the free time to enjoy the luxury of music.

There are four types of knowledge that determine a user’s ability to satisfy an information need. The level of knowledge a user has of each type generally determines how the user conducts an information search, as well as how successful the attempt will ultimately be. The four types of knowledge this user group exhibits in its search for information are general, domain, system, and information seeking. General knowledge is the knowledge gathered, processed, and stored while going through the daily routine of life. Overall, the average level of general knowledge for the user group is moderate. While older patrons do frequent the library, the demographics mostly skew younger, with less life experience and more to learn. Domain knowledge is the level of knowledge obtained in a given subject area. The patrons’ average level of domain knowledge runs to the high side of moderate. The library is dedicated solely to music, so its patrons are users who already have a certain love for, and thus a certain amount of knowledge of, music. System knowledge is the ability to recognize and decode the processes, functions, and minutiae of a given system. The general level of system knowledge for the user group is moderate: the general population is moderate in computer ability, and the students have not yet reached the stage of expert in the field. Information seeking knowledge is the ability of a user to adapt successfully to various unfamiliar search methods. For much the same reasons as system knowledge, information seeking knowledge runs moderate in the user group. The level of knowledge of a given user defines how that user seeks out information.

Based on these characteristics, a system for this group would require a moderate level of complexity which allows for an advanced search by at least artist and album, and likely genre. Because the users tend to know their music needs, and because they tend to have generally sufficient ability to conduct their own searches with minimal required assistance, allowing sufficiently advanced search options is crucial to aiding in the success of the search.

1.3. Users' problems and questions

The users of the Midland Music Library generally choose the collection for two main reasons: they wish to be entertained by the music, and they wish to study compositions by ear. They are generally looking for songs they may have heard on the radio, for CDs created by particular artists, groups, or labels, or simply for music similar to other artists they enjoy, often based on genre and other attributes. Attributes are the physical traits that describe the objects. Within the search system a certain amount of precision is necessary to conduct a successful search. Precision is the proportion of the objects retrieved by a search that are relevant to the user's need; a high-precision search returns the desired object or objects and little else. Recall is the proportion of all relevant objects in the collection that the search retrieves, whether or not irrelevant objects are retrieved along with them. In the case of these users, inquiries sometimes call for high precision and low recall, as when they come in with specific titles they wish to find. At other times a general genre or record label is sought, and precision decreases, usually accompanied by an increase in overall recall.
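As a rough illustration of the two measures, the sketch below computes precision and recall for a single search. The function name and the sample sets of record numbers are hypothetical and are not drawn from the actual Midland database.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: set of record IDs the search returned
    relevant:  set of record IDs in the collection that answer the query
    """
    hits = retrieved & relevant  # relevant records that were actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 records returned, 2 of them relevant,
# out of 5 relevant records held in the collection.
retrieved = {1, 2, 3, 4}
relevant = {3, 4, 5, 6, 7}
print(precision_recall(retrieved, relevant))  # (0.5, 0.4)
```

A known-item request such as user question 2 is satisfied by one relevant record, so precision matters most; a request for "all the albums" depends on recall.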

User question 1: I would like to check out all the albums recorded by Bruce Springsteen in the 1970s.

Object attributes: album, artist, and year

Desired precision: low

Desired recall: high

User question 2: I’m looking for the album Saturate Before Using by Jackson Browne. Can you help me find it?

Object attributes: album and artist

Desired precision: high

Desired recall: low

User question 3: I really like Alternative rock music. Can you get me some CDs in that genre?

Object attributes: genre

Desired precision: moderate

Desired recall: moderate

User question 4: I would like to check out some CDs in karaoke style. Can I get a few CDs available to check out in that format?

Object attributes: special features

Desired precision: moderate

Desired recall: moderate

The attributes suggested in the queries are: Album, Artist, Year, Genre, and Special Features (multiple discs, live albums, instrumental/karaoke, none, etc.). Other relevant attributes which may be added to the record are Track List, Record Label (publishing company), Subject, and Classification.

2. Representation of information objects

2.1. Entity level

An entity is any object about which statements regarding its properties may be made. When compiled, the total of all statements about a given entity comprises that entity’s record. A record is simply a collection of attribute statements which represent the entity. In order to create a record effectively, the cataloger must be aware of the entity level of each entity in the collection. The entity level is the full scope of the entity represented by the record. Entity levels may be sized in various ways depending on what object is being described. For instance, an entity level may be an entire book, the chapters in a book, or individual articles within a collection. Depending on the level, the attributes that describe the entity will differ. For the purposes of this collection the entity level is one whole CD. This is the appropriate entity level for this user group because the main search terms generally fall into the album title or artist categories. Searches are rarely performed for individual songs because the album is traditionally the object sought.

2.2. Metadata elements and semantics

Metadata elements represent the physical form and provide a physical description of the information object. The Functional Requirements for Bibliographic Records (FRBR) provides a set of four functions the metadata elements and semantics should support in assisting users in meeting their information needs. These four functions are: find, identify, select, and obtain.

1. The find process is the process whereby users determine whether or not the information object sought is likely to exist. This process is established at the public computer (OPAC) with the initial search.

2. Once they have found their set, users then set out to identify items within the search results. The identify process is where the user determines the items most likely to fulfill his search request. This process takes place just after the initial search often aided by elements such as Author and Title.

3. Identification is followed by selecting the appropriate items within the results which most closely match the desired end information results. The select process allows comparisons between objects that may or may not meet the user’s requirements and needs.

4. Finally, the user travels to the appropriate section in order to obtain (or physically acquire) those items determined to match the user’s information needs. The obtain process is done at the shelf, usually with the book in hand. This process does not usually have a preset list of relevant elements as these elements may not be fully known until the actual obtain process occurs.

Metadata elements aid the user’s search for information objects by providing relevant descriptions that help the user differentiate between objects and thereby speed up selection. The Midland Library record contains nine metadata elements. The initial search for objects in the Midland Music Library (finding them) may be aided by the metadata elements of Artist, Album, and to some extent Genre and Subject. These elements tend to encapsulate the initial searches of the users of the library. Once a search has listed results, identifying elements include the Year of the album release, the Classification of the object, and the Record Label, which further enable the library's users to distinguish between otherwise similar items. Selecting objects within the library often falls to the elements of Special Features (such as foreign language, multiple discs, instrumental/karaoke, and others) and Track List, which allows the user to determine whether specific songs he may be searching for can be found on the disc. Once the selection has been made, the user may then advance to the shelves to obtain his item, a step which may draw on any or all of the above as well as a host of other elements the user may not even be aware of initially.

See Appendix A: Metadata Elements and Semantics for a complete list of metadata elements and semantic physical descriptions.

2.3. Record structure and specifications

Every record in the American Music Collection has eleven fields. The eleven fields include the original nine found in the metadata elements, translated directly, as well as two extra fields: RecordID and RecordDate. RecordID is automatically entered in the database for each new record whereby the record is assigned a unique number to distinguish it from the other records. RecordDate automatically records the time and date the record is entered into the database.
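The eleven-field record can be sketched as a simple data structure. This is only an illustration: the Python names, the field types, and the counter standing in for the database's automatic numbering are assumptions, and the authoritative definitions remain those in Appendix B.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import count
from typing import List

_next_id = count(1)  # hypothetical stand-in for the database's automatic numbering


@dataclass
class Record:
    # The nine metadata elements
    album: str
    artist: List[str]
    year: str
    genre: str = ""
    special_features: List[str] = field(default_factory=list)
    track_list: List[str] = field(default_factory=list)
    record_label: str = ""
    subject: List[str] = field(default_factory=list)
    classification: str = ""
    # The two system-generated fields
    record_id: int = field(default_factory=lambda: next(_next_id))
    record_date: datetime = field(default_factory=datetime.now)


# The cataloger supplies the descriptive fields; RecordID and RecordDate fill themselves in.
example = Record(album="Saturate Before Using", artist=["Jackson Browne"], year="1972")
print(example.record_id, example.record_date)
```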

There are four technical specifications which help within the database to denote and describe records. These four specifications are field type, indexing, entry validation, and content validation.

Field type determines what kind of data goes into a given field. Most fields are text based because most descriptions use either numbers or letters (words). Thus Year, Artist, Album, Record Label, Genre, etc. are all text based because they can all be described alphanumerically.

Indexing allows search terms to be linked together through access points. Term and Word indexing determine the ways the database matches the end user's search with the initial cataloging of the record in the database. Term indexing matches searches through words or sets of words based on what the user enters into the search and how the record was initially logged. This allows phrases or parts of phrases to be found even when a single word or term is searched. Word indexing is much more exact. A search for the word "existentialism," for example, would not turn up the phrase "existential crisis" in a word index as it would in a term index. Consequently, Term indexed fields provide far more matches in any given search than fields which are Word indexed. All fields in this collection except for Year, RecordID, and RecordDate are both Term and Word indexed to allow for maximum results in a search.

Entry validation sets the limits and defines the requirements for all information entered into the database. Entry Validation is split into three categories: single, required, and unique, the knowledge of which is essential for a cataloger to correctly enter information into the record. Single validation allows a given field to contain only one term and no more. It is used most often in fields which may not have significant deviation from one term. The Year field, for instance, may be a place where single validation is used because there is only one copyright date for any particular album; on the other hand fields such as Artist and Album may not be good candidates for this validation because an album may contain more than one artist or title.

Required validation is used when a certain field must contain data input from the cataloger. It is usually reserved for those fields which the users search the most and which therefore need to contain information. A required field may also be unique or single, but it nonetheless requires input. In the Midland Collection the three fields most often searched are Album, Artist, and Year. Because these fields are the most popular access points, the cataloger is required to fill them in.

Unique validation is used for those fields which contain a description applicable to only one record. If the same term is used for two different records when unique validation is enabled, the second record will not be allowed to use the term. The Midland Music Collection contains no unique validation because it has no real relevance for any of the fields found in the database.
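The three kinds of entry validation can be sketched as checks run before a record is accepted. The function below is a hypothetical illustration rather than the database's own mechanism; it assumes each field holds a list of entries and uses the field assignments described above (Album, Artist, and Year required; Year single; nothing unique).

```python
REQUIRED = {"Album", "Artist", "Year"}  # fields the cataloger must fill in
SINGLE = {"Year"}                       # fields limited to one entry
UNIQUE = set()                          # this collection uses no unique validation


def validate_entry(record, existing_records=()):
    """Return a list of entry-validation problems for one record.

    record maps field names to a list of entries, e.g. {"Year": ["1972"]}.
    """
    problems = []
    for name in sorted(REQUIRED):
        if not record.get(name):
            problems.append(f"{name} is required")
    for name in sorted(SINGLE):
        if len(record.get(name, [])) > 1:
            problems.append(f"{name} allows only one entry")
    for name in sorted(UNIQUE):
        for other in existing_records:
            # A unique field may not repeat an entry used by another record.
            if set(record.get(name, [])) & set(other.get(name, [])):
                problems.append(f"{name} must be unique")
                break
    return problems


draft = {
    "Album": ["Saturate Before Using"],
    "Artist": ["Jackson Browne"],
    "Year": ["1972", "1973"],  # two entries in a single-validated field
}
print(validate_entry(draft))  # ['Year allows only one entry']
```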

Content validation controls the display and accessibility of the information. The two types of content validation are lists and masks.

Lists are accessed through a controlled vocabulary which allows the cataloger to select words or terms from a drop down box. This requires that all information entered into the field(s) match a pre-authorized entry before the data will be accepted into the database. Lists are most effective with a small number of choices, because a drop down list with hundreds or thousands of entries would be too cumbersome to use. Where the list is small, however, it is quite helpful for the user because it allows every search a certain degree of success even if the user's knowledge is below average. There are currently two validation lists in the Midland Music Library, under Genre and Special Features. These were chosen because particular Genres and Special Features have found their way into the collection with which not all users may be familiar. There are, however, few enough choices within these fields that navigating the list would not be too time consuming or troubling.

Masks allow the cataloger to control the format in which information is placed. All data entered into a mask must fit a pre-approved format, character limit, or sequence length. This type of control is most effective for dates, for instance, because it allows all the information to be in the same format and thus makes it easier to access by searches because all the information is similar in style. There are no fields within this collection requiring a mask.
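Both kinds of content validation reduce to a simple check on the value being entered. The sketch below is illustrative only: the genre list is a hypothetical subset, and the four-digit year mask shows what a mask could look like even though this collection does not use one.

```python
import re

# Hypothetical validation list; the real entries are defined in the appendices.
GENRES = {"Alternative", "Blues", "Classical", "Country", "Rock"}

# Illustrative mask: exactly four digits, as a year field might require.
YEAR_MASK = re.compile(r"\d{4}")


def in_list(value, allowed):
    """List validation: accept only values found in the controlled list."""
    return value in allowed


def fits_mask(value, mask=YEAR_MASK):
    """Mask validation: accept only values that match the whole pattern."""
    return mask.fullmatch(value) is not None


print(in_list("Rock", GENRES))  # True
print(in_list("Rok", GENRES))   # False, a misspelling is rejected
print(fits_mask("1972"))        # True
print(fits_mask("72"))          # False, wrong length
```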

See Appendix B for more details on the record structure and specifications of the American Music Collection.

2.4. Record content and input rules

Record content and input rules ensure the content is entered into the database correctly and consistently across all objects entered. Content rules determine and regulate the data in a given field. Input rules govern the mechanical entry of information into the database. They maintain uniformity and standardization across the entire database, ensuring one search finds all relevant information. Rather than requiring the user to search through a multitude of spellings and pseudonyms individually, input rules focus the search so all relevant results may be obtained much more easily and quickly. They enable a cataloger to view any content he comes in contact with and decide the best way to add the information to the database so the users on the other end of the computer may search similar fields and find the information and objects they seek.

In order to input content into the record the cataloger must have the necessary information. The bulk of a record’s information will be found within the chief source of information. The chief source of information is the place the cataloger goes to find information regarding the metadata elements in order to enter the data into the record’s fields. The chief sources of information for this system are generally the CD case and the CD booklet. This maintains quality and consistency among the objects added by providing universal identifiers that all catalogers may enter in various fields to maintain accurate and cross searchable terms. Appendix C lists all fields identified for each record, including: the number and name of the field; a semantic description of the field providing accurate physical properties associated with the individual field; the chief source (or sources) of information provided by the object; input rules providing a foundation for catalogers to address field descriptions in a consistent manner; and an example of each field illustrating how the completed field should look. Appendix G provides sample records of the Midland Music Library American Music Collection illustrating how the complete record should appear as defined by the content and input rules.

3. Access and authority control

Authority control is the process of establishing a standard of preselected terms used to describe subject material and names within a cataloging system. It is important in cataloging because it allows different users at the computer catalog to search within various terms and still obtain the same and similar results for books or other materials. Authority control establishes precedents for users in which words, subjects, names, etc. are spelled and used consistently in order to reduce errors in searching the catalog. Authority control also establishes relationships between similar words and related subjects so the user may find what he seeks much quicker and more accurately. A well thought out control of authority may increase precision and recall significantly. Three types of authority control exist within a system: Controlled vocabulary, name authority control, and subject authority control.

A controlled vocabulary is a set of pre-established words, generally for names or subject headings, and often within a drop down menu through which the user may select words or phrases from a predetermined, i.e. authorized, set of words. A validation list is an example of a controlled vocabulary.

Subject authority control governs the use of sets of authorized subject terms drawn from the controlled vocabulary. Controlled vocabularies set the semantic interrelationships of terms and subject headings: they show the relationships among broader, narrower, and other related terms. One example of subject authority control is a thesaurus. A thesaurus provides a selection of words and subjects with links between similar subjects, and shows how those links may be used to find the item sought by changing the search. It employs a syndetic structure, that is, a cross-reference system through which a particular word or term may be traced to the similar terms that may be used in its place and still obtain the same search results. The Subject field is very broad and involves a number of complex interrelationships between terms, making it ideal for a thesaurus.

Name authority control regulates proper names of people, businesses, organizations, groups, and so on. This control normalizes all spelling, capitalization, and other variations between proper names through the use of a name authority file. This file cross-references multiple spellings, punctuation, etc. against the proper form of the name. Artist is the most likely candidate for name authority control, but Record Label may also employ it.

4. Representation of information content

4.1. Subject access

Subject access is the method through which a subject or subjects in any given catalog may be accessed and/or searched by the end user. The user searches those fields, or access points, determined to yield the best results in his hunt for information. Subjects are often the best places to provide access to the cataloging system because they contain the main aspects and fields of the cataloging system.

An important requirement for successfully accessing subject fields is subject representation. Subject representation is simply the way a subject is described within the cataloging system. By providing accurate and detailed information for each object within the collection, the cataloger prepares the user for a more successful and satisfying search which yields much greater precision and recall.

Further within the area of subject access is an important step called subject analysis. Subject analysis is determining what the item or object is about. There are two types of subject analysis which must be considered when creating a database: query analysis and document analysis. Document analysis is the cataloger side of the access equation and requires a deep level of knowledge regarding the subject or subjects of the document in order to devise the best way to describe and represent the document in the cataloging system. Query analysis is the end-user side of the equation; this analysis involves studying the search habits of the end users of the system in order to identify those searches, search terms, and other ways in which the users seek their information needs. By applying the two types together a holistic approach to subject analysis forms. Once completed this analysis allows the catalog to contain the best terms, names, vocabularies, and relationships to allow the users to accurately obtain their desired information objects.

In this system, the subject related fields are Subject and Genre. The subject access for Genre is a drop down controlled vocabulary and for the Subject field is a thesaurus. The complex relationships between words and subject categories render a thesaurus the appropriate tool. The relative size and exactness of the Genre field require a narrower search and make a controlled vocabulary the most useful tool for providing the most relevant search results. Here, the user chooses a genre from a drop down set of pre-determined genres chosen by the cataloger according to a prescribed set of rules. This allows the correct genre to provide the most relevant albums to whatever the user is seeking.

4.2. Thesaurus structure

The Subject field is set up under the thesaurus structure. This field contains the largest degree of terms and items, as well as containing the most complicated relationships between words making it the ideal candidate for preparing a thesaurus. A thesaurus is a document which categorizes items by like subjects connecting relationships between similar ideas. The syndetic structure of the thesaurus can be classified in three ways: hierarchical, equivalent, and associative.

The thesaurus is set up this way in order to accommodate the relationships between terms. The relationship indicators are Related Term (RT), Broader Term (BT), Narrower Term (NT), USE, which points from an unauthorized term to the authorized term that replaces it, and Used For (UF), which lists under an authorized term the unauthorized terms it replaces. BT and NT are the hierarchical relationships within the syndetic structure, because the broader term refers to a wider concept while the narrower term refers to a more specific aspect of the subject. In the music collection, for instance, the term growth is a broader term which encompasses the narrower subject of existentialism. USE and UF are reciprocal and express equivalence. Unauthorized terms are not searchable in the database, while authorized terms are. In the Midland Music Collection the authorized term Awareness is substituted for, and searched in place of, the unauthorized term Intuition. Related terms are associative: they fall in the same basic category without one being a kind of the other, such as the term affection, which is related to the term love in the collection.

Thesaurus domain defines the range of the subjects covered, whereas scope refers to the limitations imposed upon the domain. This collection has a medium domain, as the subjects covered vary somewhat but overall share similar themes.

Specificity in a thesaurus refers to how exactly a term represents the idea it is portraying. In this collection specificity is low to medium, as many of the subjects covered in the songs are grand and broad concepts. Exhaustivity refers to the extent to which the actual topics of an object are covered by the chosen terms. The more terms chosen to represent the subject of a song, the greater the exhaustivity. For this collection the exhaustivity is low, as a smaller number of broader terms is sufficient to convey the subjects covered accurately.
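The relationships named above can be sketched as a small lookup structure built from the examples in this section (growth and existentialism, Intuition and Awareness, affection and love). The dictionary layout and function names are assumptions made for illustration, not the thesaurus as implemented in the database.

```python
# Authorized terms with their broader (BT), narrower (NT), and related (RT) terms.
THESAURUS = {
    "Growth": {"BT": [], "NT": ["Existentialism"], "RT": []},
    "Existentialism": {"BT": ["Growth"], "NT": [], "RT": []},
    "Awareness": {"BT": [], "NT": [], "RT": []},
    "Love": {"BT": [], "NT": [], "RT": ["Affection"]},
    "Affection": {"BT": [], "NT": [], "RT": ["Love"]},
}

# USE references: unauthorized term -> authorized term.
USE = {"Intuition": "Awareness"}


def authorized(term):
    """Follow a USE reference; authorized terms are returned unchanged."""
    return USE.get(term, term)


def used_for(term):
    """UF is the reciprocal of USE: the unauthorized terms this term replaces."""
    return sorted(u for u, a in USE.items() if a == term)


def expand(term):
    """Return the authorized term followed by its narrower and related terms."""
    head = authorized(term)
    entry = THESAURUS.get(head, {})
    return [head] + entry.get("NT", []) + entry.get("RT", [])


print(authorized("Intuition"))  # Awareness
print(used_for("Awareness"))    # ['Intuition']
print(expand("Growth"))         # ['Growth', 'Existentialism']
```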

4.3. Classification scheme

Classification is the process of categorizing a set of words into like terms. It is useful for establishing connections between two items which may not otherwise be found in one search. Classification methods include hierarchical and faceted. Hierarchical separates terms and words into broader and narrow related categories. This creates classes of terms and sub-classes of other related terms going from the broad, general subjects down to much more specific sub-subjects in the same category. Faceted classification separates the parts from the whole. It recognizes more sideways relationships between words rather than higher to lower. Often several equally broad or narrow facets will be used to comprise a whole object rather than separating the broad from narrow.

This project’s classification scheme can be broken down into three facets followed by the record number. The first identifier for each item is the first three letters of the genre of music into which the item falls. The second identifier is the first three letters of the band name or of the individual artist’s last name. The third identifier is made up of the first letter of each word in the album title. The RecordID is appended as the final element. As an example, the record for Jackson Browne’s Saturate Before Using would be ROC.BRO.SBU.1: ROC for the Rock genre, BRO for Browne’s last name, SBU for Saturate Before Using, the album title, and 1 for the RecordID.
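The construction of a class code can be sketched as follows. The function and parameter names are hypothetical, and, following the worked example ROC.BRO.SBU.1, the title element is assumed to be the initial of each word in the album title. The caller is assumed to pass the band name or the artist's last name as the second argument.

```python
def class_code(genre, sort_name, album_title, record_id):
    """Build a code such as ROC.BRO.SBU.1 from the facets and the RecordID."""
    genre_part = genre[:3].upper()        # first three letters of the genre
    name_part = sort_name[:3].upper()     # first three letters of band or last name
    # Initial letter of each word in the album title.
    title_part = "".join(word[0] for word in album_title.split()).upper()
    return f"{genre_part}.{name_part}.{title_part}.{record_id}"


print(class_code("Rock", "Browne", "Saturate Before Using", 1))  # ROC.BRO.SBU.1
```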

See Appendix E for Classification Scheme.

5. Name authority control

Name authority control is an important aspect of the database search. It provides a connection between variations of the same name for a person, corporation, group, etc., in order to ensure records are not lost if a minor difference in spelling or a different form of a name is used in a given query.

This is done through the Name authority file which provides the database with a list of acceptable name variations any of which, if searched, will bring up all records containing any of the other acceptable forms of the same person, company, or group. The MML employs name authority control in the field of Artist. This is most helpful for the system because the user may search any variation of a group or artist’s name and still obtain the full list of records of the artist. For instance, if a user searches for albums released by The Wallflowers, the name authority file will pull up all records under the Artists: The Wallflowers, Wallflowers, and Wallflowers, The. This ensures that no matter which variation is entered by any cataloguing employee all relevant records still appear.
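The behavior of the name authority file in the Wallflowers example can be sketched as a lookup that expands any accepted variant into the full set of variants before matching. The data structure, function names, and sample records here are illustrative assumptions, not the database's actual implementation.

```python
# Authorized form -> all accepted variants (including the authorized form itself).
NAME_AUTHORITY = {
    "The Wallflowers": ["The Wallflowers", "Wallflowers", "Wallflowers, The"],
}


def _key(name):
    """Normalize case and spacing so comparisons ignore trivial differences."""
    return " ".join(name.lower().split())


# Reverse index: any variant -> its authorized form.
VARIANT_TO_AUTHORIZED = {
    _key(variant): auth
    for auth, variants in NAME_AUTHORITY.items()
    for variant in variants
}


def search_artist(query, records):
    """Return records whose Artist matches the query or any variant of it."""
    auth = VARIANT_TO_AUTHORIZED.get(_key(query))
    if auth is None:
        wanted = {_key(query)}  # no authority entry: match the query as typed
    else:
        wanted = {_key(v) for v in NAME_AUTHORITY[auth]}
    return [r for r in records if _key(r["Artist"]) in wanted]


# Illustrative records; the second was entered under a different name form.
sample = [
    {"Album": "Breach", "Artist": "Wallflowers, The"},
    {"Album": "Nevermind", "Artist": "Nirvana"},
]
print(search_artist("The Wallflowers", sample))
```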

6. System evaluation and development

6.1. Performance test

The test environment was a small, quiet room with no one except the system designer and the tester present. The test was performed on a laptop at a table. The tester chosen is a late 30’s white female with a college degree and several years’ experience in the professional world. The user is middle class and has a moderate to high level in the four areas of general, domain, system, and information seeking knowledge, thereby effectively representing the user group demographics for the library. The user was given the thesaurus and instructed in its use but not the name authority file. The name authority file was withheld for the reason that it was an unnecessary component more suited for behind the scenes than learned through instruction. Because each patron of the MML may or may not have his own variation of an artist’s name, the most effective method for producing relevant and complete search results is to simply create the name authority file behind the scenes rather than attempting to explain its use and function to each individual user. The user was then instructed on how to search the system and finally told to read through the questions and search the database for the best results.

User question 1: I would like to check out all the albums recorded by Bruce Springsteen in the 1970s.

Object attributes: album, artist, and year

Desired precision: low

Desired recall: high

Probable precision: low

Probable recall: low

Query Formulation 1: Searched the Artist field for “Bruce Springsteen” AND the Year field for “1970.”

Analysis of Results: The user determined the Artist field would be a good place to start for the particular artist’s albums. In order to find the decade of albums, the user chose to search 1970 in the Year field, assuming the one year would bring up the full decade’s worth of material. This search returned no results because the collection held no albums released in 1970, the system had no link that would let the year bring up other albums released within the decade, and there was no option to search a range of dates, which would have been helpful.

Query Formulation 2: Searched the Artist field for “Bruce Springsteen.” This search found the Bruce Springsteen album successfully completing the query.

Analysis of results: This second search found the album results sought.

User question 2: I’m looking for the album Saturate Before Using by Jackson Browne. Can you help me find it?

Object attributes: album and artist

Desired precision: high

Desired recall: low

Probable precision: high

Probable recall: low

Query Formulation: The user chose in this instance to search specifically by Album title “Saturate Before Using.”

Analysis of results: For the relative ease of the desired results this simple query was effective in discovering the appropriate album.

User question 3: I really like Alternative rock music. Can you get me some CDs in that genre?

Object attributes: genre

Desired precision: moderate

Desired recall: moderate

Probable precision: moderate

Probable recall: moderate

Query formulation: The user chose to search the Genre field with the keyword Alternative. The query successfully brought up three albums: Breach by The Wallflowers, Nevermind by Nirvana, and Mad Season by Matchbox 20.

Analysis of results:

User question 4: I would like to check out some CDs in karaoke style. Can I get a few CDs available to check out in that format?

Object attributes: special features

Desired precision: moderate

Desired recall: moderate

Probable precision: moderate

Probable recall: low

Query formulation 1: Intuitively, the user chose the field of Genre to search for the keyword “karaoke.”

Analysis of results: The results came back without success.

Query formulation 2: For this second search, the user chose the field of Special Features to search for “karaoke.”

Analysis of results: In this search the results successfully came back with the one karaoke item in the collection.

The system was inadequate in several ways. Searching by particular fields did not always yield results based on the input of descriptions in those particular fields. One mistake was a mislabeling of a certain keyword which should have been placed in the Genre field but ended up in the Special Features field due to cataloger error.

6.2. Change and development

In going over the test it is apparent that changes can be made to the system so that it operates more efficiently and leaves less room for end-user error. For the types of searches most common in music, the system should support searching multiple years or, preferably, a full range of years in a single query. Often a user enjoys the work of a particular artist at a particular period in the artist’s life. If the system allows, for instance, a decade to be searched, the user is much more likely to find the desired information objects. Similarly, this would accommodate users who enjoy music from all artists within a particular period, say ’70s classic rock. This fairly simple change would make the system more effective at returning the most appropriate results for a search.
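A year-range search of the kind proposed here might look like the following minimal sketch. It does not reflect how the textbase software works; the `search_by_year_range` function and the in-memory record layout are assumptions for illustration, using albums and years from the Appendix G sample records.

```python
# Illustrative subset of the Appendix G sample records.
RECORDS = [
    {"album": "Saturate Before Using", "artist": "Jackson Browne", "year": 1972},
    {"album": "Born to Run", "artist": "Bruce Springsteen", "year": 1975},
    {"album": "No Fences", "artist": "Garth Brooks", "year": 1990},
    {"album": "Nevermind", "artist": "Nirvana", "year": 1991},
    {"album": "Mad Season", "artist": "Matchbox Twenty", "year": 2000},
]

def search_by_year_range(records, start, end, artist=None):
    """Return records released from start to end inclusive.

    If artist is given, only that artist's records are returned.
    """
    return [
        r for r in records
        if start <= r["year"] <= end
        and (artist is None or r["artist"] == artist)
    ]

# All albums from the 1970s, regardless of artist.
seventies = search_by_year_range(RECORDS, 1970, 1979)
print([r["album"] for r in seventies])
```

The optional `artist` argument covers the case of a user who wants one artist's work from a particular period.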

Another problem that could be fixed through system development is the lack of a multiple-field search. For instance, the problem found in the fourth question was that the keyword “karaoke” appeared in only one field. If a single query could search several fields at once, a keyword would be found regardless of which field contains it, and allowing the same keyword to be entered in more than one field would further increase the chance of a match.
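The idea of a multiple-field search can be sketched as follows. The records and field values below are hypothetical and chosen only to mirror the karaoke example; `search_fields` is an illustrative helper, not a feature of the existing system.

```python
# Hypothetical records for illustration; the field values are assumptions.
RECORDS = [
    {"album": "Karaoke Opera", "genre": ["Classical"],
     "special_features": ["Karaoke", "Multiple Discs"]},
    {"album": "No Fences", "genre": ["Country"],
     "special_features": ["None"]},
]

def search_fields(records, keyword, fields):
    """Return records where keyword appears in any of the listed fields.

    Matching is case-insensitive and looks for the keyword inside each entry.
    """
    needle = keyword.lower()
    matches = []
    for record in records:
        for field in fields:
            if any(needle in value.lower() for value in record.get(field, [])):
                matches.append(record)
                break  # one matching field is enough for this record
    return matches

# Searching Genre alone misses the item; searching both fields finds it.
print(search_fields(RECORDS, "karaoke", ["genre"]))
print([r["album"] for r in search_fields(RECORDS, "karaoke", ["genre", "special_features"])])
```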

Ultimately, the best way for any system to become more efficient is to engage in periodic testing. Testing the database with real-life users at regular intervals eventually reveals the right changes to make, at the right times and to the right degrees.

7. Project summary

I had quite a bit of fun with this project despite a few roadblocks and some misunderstandings in the initial phases. I chose the collection because I love music, and it was a lot of fun to create the system; I also like the author/genre code better than the Dewey system my library employs for music. If I had it all to do over again I would, admittedly, coordinate my music selection better with my questions so that more than one question would return multiple hits. I also ran into the problem of deciding the best fields in which to place different descriptions. This came to a head when my database was lost in a computer crash about two weeks before final draft 4 was due. Scrambling to recreate the database, I was fortunate to have saved the previous drafts to work from, although, if I recall, there was one error in the database creation. However, the crash served one positive purpose: it persuaded me to look into cloud computing, which is much like a library system in that if one cataloging computer goes down, the database is kept separately and is never lost. This would have helped me in the long run, but the recreation was also successful.

My decisions were generally based on my existing knowledge of the music in the collection. This did not always work, because large collections contain unknowns that a system needs to take into account. This system did so to some degree, but not always to the extent it would need to.

I feel a much deeper level of respect and find a newfound appreciation for the cataloging community. It is not by any means an easy job and requires a tremendous level of dedication. It is also a much more interesting topic on the theory side of things and one I could see involving myself with over the years.

Appendix A. Metadata elements and semantics

|No. |Element name |Semantics |

|1 |Artist |The creator or composer of the cd |

|2 |Album |The name of the cd |

|3 |Year |The date the cd was released to the public |

|4 |Genre |Topical subjects of the music in question |

|5 |Special Features |Additions not typical of the average cd |

|6 |Track List |A list of the song names contained on the cd |

|7 |Record Label |The company responsible for producing the cd |

|8 |Subject |Meaning of work, what it’s about |

|9 |Classification |Organization of information into related categories |

Appendix B. Record structure and specifications

1. Record structure specifications

|No. |Field name |Field type |Indexing |Entry validation |Content validation |

|1 |RecordID |Autonumber |Term |None |None |

|2 |RecordDate |Autodate |Term |None |None |

|3 |Artist |Text |Term and Word |Required |None |

|4 |Album |Text |Term and Word |Required |None |

|5 |Year |Date |Term |Required |None |

|6 |Genre |Text |Term and Word |None |Validation List |

|7 |Special Features |Text |Term and Word |None |Validation List |

|8 |Track List |Text |Term and Word |None |None |

|9 |Record Label |Text |Term and Word |None |None |

|10 |Subject |Text |Term and Word |None |None |

|11 |Classification |Text |Term and Word |Required |None |
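The entry and content validation rules in the table above can be expressed as a short check. This is a sketch under stated assumptions rather than the database's own validation: the `validate_record` function and dictionary layout are hypothetical, the validation lists are copied from the input rules in Appendix C, and list matching is assumed to be case-insensitive.

```python
# Fields marked "Required" in the record structure table.
REQUIRED_FIELDS = ["Artist", "Album", "Year", "Classification"]

# Validation lists taken from Appendix C, stored in lowercase (assumption:
# matching ignores case, so "Alternative rock" is accepted).
VALIDATION_LISTS = {
    "Genre": {"alternative rock", "classical", "country", "grunge",
              "metal", "pop", "rap", "rock"},
    "Special Features": {"foreign release", "instrumental", "karaoke",
                         "live album", "multiple discs", "none"},
}

def validate_record(record):
    """Return a list of validation error messages (empty if the record is valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"{field} is required")
    for field, allowed in VALIDATION_LISTS.items():
        for value in record.get(field, []):
            if value.lower() not in allowed:
                errors.append(f"{field}: '{value}' is not in the validation list")
    return errors

record = {
    "Artist": "Jackson Browne", "Album": "Saturate Before Using",
    "Year": 1972, "Genre": ["Rock", "Pop"],
    "Special Features": ["None"], "Classification": "ROC.BRO.SBU.1",
}
print(validate_record(record))
```

In the actual system a cataloger may add new terms to a validation list, so a rejected value here would prompt a decision rather than a hard failure.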

2. Textbase structure

Textbase Structure

Textbase Information

Textbase: C:\Users\Brian\Documents\SLIS 5200\brianfile2

Created: 2/21/2013 4:52:59 AM

Modified: 2/28/2013 8:29:32 PM

Field Summary:

1. RecordID: Automatic Number(next avail=5, increm=1), Term

2. RecordDate: Automatic Date(Both Date and Time,When Created), Term

3. Artist: Text, Term & Word

Validation: required

4. Album: Text, Term & Word

Validation: required

5. Year: Date, Term

Validation: required

6. Genre: Text, Term & Word

Validation: valid-list

7. Special Features: Text, Term & Word

Validation: valid-list

8. Track List: Text, Term & Word

9. Record Label: Text, Term & Word

10. Subject: Text, Term & Word

11. Classification: Text, Term & Word

Validation: required

Log file enabled, showing 'RecordID'

Leading articles: a an the

Stop words: a an and by for from in of the to

XML Match Fields:

1. RecordID

Textbase Defaults:

Default indexing mode: SHARED IMMEDIATE

Default sort order:

Textbase passwords:

Master password = ''

0 Access passwords:

No Silent password

Appendix C. Record content and input rules

Field #: 1

Field Name: RecordID

Semantics: A number for the object generated by the database.

Chief Source of Information: Automatic input by the database.

Input Rules: Automatic input by database.

Example: 3

Field #: 2

Field Name: RecordDate

Semantics: The date and time the record was created or modified.

Chief Source of Information: Automatic input by the database.

Input Rules: Automatic input by database.

Example: 2/21/2013 5:00:27

Field #: 3

Field Name: Artist

Semantics: The creator or composer of the cd.

Chief Source of Information: Cover of cd booklet or on the cd itself.

Input Rules: Enter the artist exactly as stated on cd or cover booklet.

Example: Jackson Browne

Field #: 4

Field Name: Album

Semantics: The name of the cd.

Chief Source of Information: Cover of cd booklet or spine of cd.

Input Rules: Enter the album title exactly as stated on cover booklet matching punctuation exactly.

Example: Mad Season

Field #: 5

Field Name: Year

Semantics: The date the cd was released to the public.

Chief Source of Information: Back of cd case or inside cover of cd booklet.

Input Rules: Input the copyright year as noted on the cd or back of the case.

Example: 1975

Field #: 6

Field Name: Genre

Semantics: Topical subjects of the music in question.

Chief Source of Information: Automatic input by the database.

Input Rules: List – choose one or more of the following or, if not available in the list, add to the list. Press F7 between each genre to denote new genre.

Alternative Rock

Classical

Country

Grunge

Metal

Pop

Rap

Rock

Example: Rock

Field #: 7

Field Name: Special Features

Semantics: Additions not typical of the average cd.

Chief Source of Information: Usually contained within the cd booklet.

Input Rules: Choose one or more of the following or, if no available term, add term:

Foreign Release

Instrumental

Karaoke

Live Album

Multiple Discs

None

Example: Multiple Discs

Field #: 8

Field Name: Track List

Semantics: Titles of the songs contained on the cd.

Chief Source of Information: Back of cd case or possibly within first page of front booklet.

Input Rules: Type the song titles exactly as printed on the cd. Press F7 to add multiple songs within a record.

Example:

; Angry

; Black & White People

; Crutch

; Last Beautiful Girl

; If You're Gone

; Mad Season

; Rest Stop

; The Burn

; Bent

; Bed of Lies

; Leave

; Stop

; You Won't Be Mine

Field #: 9

Field Name: Record Label

Semantics: The company responsible for producing the cd.

Chief Source of Information: Bottom of back of cd case or perhaps on the cd itself.

Input Rules: Note the publishing company in this section. Adding the title Records after the name of each company is unnecessary.

Example: Columbia

Field #: 10

Field Name: Subject

Semantics: Meaning of work, what it’s about

Chief Source of Information: CD case or CD booklet.

Input Rules: Add all authorized terms from the thesaurus in Appendix D.

Example: Love

Field #: 11

Field Name: Classification

Semantics: Organization of information into related categories

Chief Source of Information: CD case or CD booklet.

Input Rules: Find the genre that is entered into the database first, the artist or band name, the album title, and the RecordID. Take the first three letters from the main genre, the first three letters from the artist’s last name or the band’s name (excluding The), the first letter of each main word in the album title, and the RecordID; then separate each one with a period, except for after the RecordID which has no period following it.

Example: ROC.BRO.SBU.1
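The input rule for the Classification field can be sketched in code. This is an illustration under assumptions, not part of the database: the `classification_code` function is hypothetical, the `is_band` flag is assumed to be supplied by the cataloger, "minor words" are assumed to be the textbase stop words, and a band name is reduced to its first word after dropping "The".

```python
# Assumption: "minor words" are the stop words defined for the textbase.
STOP_WORDS = {"a", "an", "and", "by", "for", "from", "in", "of", "the", "to"}

def classification_code(genre, artist, album, record_id, is_band=False):
    """Build a code such as ROC.BRO.SBU.1 from the main genre, artist, album, and RecordID."""
    genre_part = genre[:3].upper()

    # Drop "The", then use the band's first word or the individual's last name.
    words = [w for w in artist.split() if w.lower() != "the"]
    name = words[0] if is_band else words[-1]
    artist_part = name[:3].upper()

    # First letter of each main word; skip stop words and tokens like "1,".
    album_part = "".join(
        w[0].upper() for w in album.split()
        if w.lower() not in STOP_WORDS and w[0].isalpha()
    )
    return f"{genre_part}.{artist_part}.{album_part}.{record_id}"

print(classification_code("Rock", "Jackson Browne", "Saturate Before Using", 1))
print(classification_code("Alternative rock", "The Wallflowers", "Breach", 10, is_band=True))
```

Passing the main genre explicitly keeps the function independent of how multiple genres are stored in a record.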

Appendix D. Sample Thesaurus

KEY:

Bold: Authorized Terms

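BT: Broader Term

NT: Narrower Term

RT: Related Term

UF: Used For (the unauthorized term that this authorized term replaces)

USE: Use (points from an unauthorized term to the authorized term)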

Awareness

BT understanding

NT intuition

RT recognition

Realization

UF Insight

Change

BT Transition

NT Metamorphosis

RT Permutation

Refinement

Dream

BT Vision

RT Fancy

Idea

Ideal

Reverie

NT Mysticism

Fun

BT Party

NT Enjoy

RT Enjoyment

Amusing

Merry

Hope

RT longing

Dream

Faith

Desire

Fancy

NT Ambition

Aspiration

Honesty

RT Openness

Truth

Identity

BT Philosophy

NT Existentialism

RT Character

Distinctiveness

Personhood

Insight

USE Awareness

Introspection

USE Reflection

Life

RT Experience

Heart

Spirit

Loss

RT Defeat

Bereavement

Hurt

Harm

Love

BT Romance

NT Crush

RT Devotion

Passion

Affection

UF Passion

Maturity

NT Evolution

RT Expansion

Improvement

Progress

Milestone

Passion

USE Love

Promise

RT Vow

Reflection

UF Introspection

Sacrifice

BT Denial

NT Self-denial

RT Refusal

Surrender

Work

RT Blue collar

Exertion

Sweat-of-the-brow

Hold down

NT Opus

Performance

Youth

BT Adolescence

RT Spring of life

Immaturity

Inexperience
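The USE references in the thesaurus can be applied mechanically when a cataloger or patron enters a subject term. The sketch below is illustrative: `authorized_term` is a hypothetical helper, and the set of authorized terms is an assumption based on the entries above that carry no USE reference, since the bold formatting is not preserved here.

```python
# Unauthorized term -> authorized term, from the USE entries above.
USE_REFERENCES = {
    "Insight": "Awareness",
    "Introspection": "Reflection",
    "Passion": "Love",
}

# Assumption: every top-level entry without a USE reference is authorized.
AUTHORIZED_TERMS = {
    "Awareness", "Change", "Dream", "Fun", "Hope", "Honesty", "Identity",
    "Life", "Loss", "Love", "Maturity", "Promise", "Reflection",
    "Sacrifice", "Work", "Youth",
}

def authorized_term(term):
    """Return the authorized form of a subject term, or None if it is unknown."""
    key = term.strip().title()
    if key in AUTHORIZED_TERMS:
        return key
    return USE_REFERENCES.get(key)

print(authorized_term("Passion"))        # follows the USE reference
print(authorized_term("introspection"))  # case is normalized first
print(authorized_term("Sadness"))        # not in this sample thesaurus
```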

Appendix E. Classification scheme

1. Scheme

| |A |B |C |

| |Genre |Artist/Band Name |Album Title |

|1 |ROC (Rock) |First three letters of band name or of artist’s last name. |First letter of album title (or second if two first letters are identical). |

|2 |COU (Country) | | |

|3 |CLA (Classical) | | |

|4 |ALT (Alternative Rock)| | |

|5 |POP (Pop) | | |

|6 |GRU (Grunge) | | |

|7 |MET (Metal) | | |

|8 |RAP (Rap) | | |

2. Notation rules

Facet name: Genre

Chief source of information: Title page of CD jacket.

Notation rules: Use the first three letters of the main Genre with which the artist is involved. Capitalize all three letters. End with a period.

Facet name: Artist/Band Name

Chief source of information: Front of CD jacket, title page of

Notation rules: Use the first three letters of the artist’s last name if the artist is an individual, or the first three letters of the band name. Capitalize all three letters in the name. End with a period.

Facet name: Album Title

Chief source of information: CD jacket or case.

Notation rules: Use the first letters of the main words in the album title where possible. Do not include the first letters of any minor words. Capitalize all letters. End with a period.

3. Rule for unique number

Use the RecordID automatically generated from database for the unique number. Do not punctuate.

4. Example: ROC.SPR.BR.3

The classification code ROC.SPR.BR.3 signifies the Bruce Springsteen album Born to Run: the genre Rock (ROC), the artist Bruce Springsteen (SPR), and the album title Born to Run (BR), with the RecordID (3) at the end.
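Reading a classification code back into its facets is the reverse of the notation rules. The following sketch assumes the four-part, period-separated form shown in the example; `parse_classification` is a hypothetical helper, and the genre table is copied from the scheme in this appendix.

```python
# Genre notations from column A of the scheme.
GENRE_CODES = {
    "ROC": "Rock", "COU": "Country", "CLA": "Classical",
    "ALT": "Alternative Rock", "POP": "Pop", "GRU": "Grunge",
    "MET": "Metal", "RAP": "Rap",
}

def parse_classification(code):
    """Split a code such as ROC.SPR.BR.3 into its facets and unique number."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four parts separated by periods: {code!r}")
    genre, artist, album, record_id = parts
    if genre not in GENRE_CODES:
        raise ValueError(f"unknown genre notation: {genre!r}")
    if not record_id.isdigit():
        raise ValueError(f"RecordID must be a number: {record_id!r}")
    return {
        "genre": GENRE_CODES[genre],
        "artist": artist,
        "album": album,
        "record_id": int(record_id),
    }

print(parse_classification("ROC.SPR.BR.3"))
```

A check like this could also catch a code whose final number does not match the record it is attached to.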

Appendix F. Name authority file

1. Record structure specifications

|No. |Field name |Field type |Indexing |Entry validation |

|1 |RecordID |Autonumber |— |— |

|2 |RecordDate |Autodate |— |— |

|3 |AuthorizedName |Text |Term and Word |Required, Single |

|4 |VariantNames |Text |Term and Word |Required |

|5 |SourcesUsed |Text |Term and Word |Required |

2. Textbase structure

Textbase Structure

Textbase Information

Textbase: C:\Users\BJG\FileNameTextbase

Created: 5/2/2013 1:10:46 AM

Modified: 5/2/2013 1:10:46 AM

Field Summary:

1. RecordID: Automatic Number(next avail=6, increm=1), Term

2. RecordDate: Automatic Date(Both Date and Time,When Created), Term

3. AuthorizedName: Text, Term & Word

Validation: required, single-only

4. VariantNames: Text, Term & Word

Validation: required

5. SourcesUsed: Text, Term & Word

Validation: required

Log file enabled, showing 'RecordID'

Leading articles: a an the

Stop words: a an and by for from in of the to

XML Match Fields:

1. RecordID

Textbase Defaults:

Default indexing mode: SHARED IMMEDIATE

Default sort order:

Textbase passwords:

Master password = ''

0 Access passwords:

No Silent password

3. Record content and input rules

Field #: 1

Field name: RecordID

Semantics: Record number generated by the database.

Input rules: Automatically generated by the database.

Example: 1

Field #: 2

Field name: RecordDate

Semantics: Date and time the record was generated and/or modified.

Input rules: Automatically generated by the database.

Example: 5/2/2013 1:10:46 AM

Field #: 3

Field name: AuthorizedName

Semantics: The name of the artist or band as entered into and stored in the database for search.

Input rules: Before entering the artist or band name check to make sure there is no variant used for the authorized name. Then, for each new artist or band, check the cd book jacket for the appropriate name and enter it exactly as found on the jacket.

Example: Jackson Browne

Field #: 4

Field name: VariantNames

Semantics: The various misspellings, alternate spellings, and known pseudonyms used by the artist or band or made by patrons.

Input rules: Enter any alternate spellings of names, common misspellings, pseudonyms, or any other mistakes patrons have made or might make.

Example: Jackson Brown

Field #: 5

Field name: SourcesUsed

Semantics: Place(s) where the officially authorized spelling of the name is likely to appear.

Input rules: Search the book jacket within the cd case, including the outside cover of the jacket, inside cover, and any pages which may contain the correct spelling.

Example: CD Book Jacket

4. Sample records

RecordID 1

RecordDate 5/2/2013 1:14:33

AuthorizedName Jackson Browne

VariantNames Clyde Jackson Browne

; Browne, Jackson

; Jackson Brown

SourcesUsed CD Book Jacket

$

RecordID 2

RecordDate 5/2/2013 1:16:56

AuthorizedName Matchbox Twenty

VariantNames Matchbox 20

; 20

; Matchbox

; Matbox 20

SourcesUsed CD Book Jacket

$

RecordID 3

RecordDate 5/2/2013 1:18:36

AuthorizedName Lynyrd Skynyrd

VariantNames Linard Skinard

; Lenard Skenard

; Leonard Skinner

; Lynard Skynard

SourcesUsed CD Book Jacket

$

RecordID 4

RecordDate 5/2/2013 1:20:20

AuthorizedName Wolfgang Amadeus Mozart

VariantNames Mozart

; Wolfgang

; Wolfgang Mozart

; Amadeus

; Motsart

; Motzart

SourcesUsed CD Book Jacket

$

RecordID 5

RecordDate 5/2/2013 1:21:50

AuthorizedName The Wallflowers

VariantNames The Wallflower

; Wallflowers

; Wallflower

; Wallflowers, The

SourcesUsed CD Book Jacket

$
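The purpose of the name authority file, mapping whatever a patron types to the authorized name, can be sketched as a lookup. The names and variants below are copied from the sample records above; the `build_index` and `resolve_name` functions and the case-insensitive matching are assumptions for illustration.

```python
# Authorized name -> variant names, from the sample authority records.
AUTHORITY_FILE = {
    "Jackson Browne": ["Clyde Jackson Browne", "Browne, Jackson", "Jackson Brown"],
    "Matchbox Twenty": ["Matchbox 20", "20", "Matchbox", "Matbox 20"],
    "Lynyrd Skynyrd": ["Linard Skinard", "Lenard Skenard", "Leonard Skinner",
                       "Lynard Skynard"],
    "Wolfgang Amadeus Mozart": ["Mozart", "Wolfgang", "Wolfgang Mozart",
                                "Amadeus", "Motsart", "Motzart"],
    "The Wallflowers": ["The Wallflower", "Wallflowers", "Wallflower",
                        "Wallflowers, The"],
}

def build_index(authority_file):
    """Map every authorized and variant name (lowercased) to the authorized name."""
    index = {}
    for authorized, variants in authority_file.items():
        index[authorized.lower()] = authorized
        for variant in variants:
            index[variant.lower()] = authorized
    return index

def resolve_name(name, index):
    """Return the authorized name for a patron's input, or None if unknown."""
    return index.get(name.strip().lower())

INDEX = build_index(AUTHORITY_FILE)
print(resolve_name("Matchbox 20", INDEX))
print(resolve_name("leonard skinner", INDEX))
```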

Appendix G. Sample records

RecordID 1

RecordDate 2/21/2013 4:53:37

Artist Jackson Browne

Album Saturate Before Using

Year 1972

Genre Rock

Pop

Special Features None

Track List Jamaica Say You Will

A Child in These Hills

Song for Adam

Doctor My Eyes

From Silver Lake

Something Fine

Under the Falling Sky

Looking Into You

Rock Me on the Water

My Opening Farewell

Record Label Asylum

Subject Dream

Identity

Love

Loss

Maturity

Classification ROC.BRO.SBU.1

RecordID 2

RecordDate 2/21/2013 4:56:34

Artist Matchbox Twenty

Album Mad Season

Year 2000

Genre Alternative rock

Pop

Special Features None

Track List Angry

Black & White People

Crutch

Last Beautiful Girl

If You're Gone

Mad Season

Rest Stop

The Burn

Bent

Bed of Lies

Leave

Stop

You Won't Be Mine

Record Label Atlantic

Subject Change

Honesty

Identity

Life

Maturity

Classification ALT.MAT.MS.2

RecordID 3

RecordDate 2/21/2013 4:58:12

Artist Bruce Springsteen

Album Born to Run

Year 1975

Genre Rock

Special Features None

Track List Thunder Road

Tenth Avenue Freeze Out

Night

Backstreets

Born to Run

She's the One

Meeting Across the River

Jungleland

Record Label Columbia

Subject Dream

Hope

Identity

Sacrifice

Work

Youth

Classification ROC.SPR.BR.3

RecordID 4

RecordDate 2/21/2013 5:00:27

Artist Nirvana

Album Nevermind

Year 1991

Genre Grunge

Alternative rock

Special Features None

Track List Smells Like Teen Spirit

In Bloom

Come as You Are

Breed

Lithium

Polly

Territorial Pissing

Drain You

Lounge Act

Stay Away

On a Plain

Something in the Way

Endless, Nameless

Record Label DGC

Subject Awareness

Fun

Classification GRU.NIR.N.4

RecordID 5

RecordDate 4/3/2013 0:10:21

Artist Kansas

Album Somewhere to Elsewhere

Year 2000

Genre Pop

Rock

Special Features None

Track List Icarus II

When the World Was Young

Grand Fun Alley

The Coming Dawn (Thanatopsis)

Myriad

Look at the Time

Disappearing Skin Tight Blues

Distant Vision

Byzantium

Not Man Big

Geodesic Dome (Hidden track)

Record Label Magna Carta

Subject Awareness

Dream

Love

Promise

Reflection

Sacrifice

Classification POP.KAN.SE.5

RecordID 6

RecordDate 4/3/2013 0:15:33

Artist Garth Brooks

Album No Fences

Year 1990

Genre Country

Special Features None

Track List The Thunder Rolls

New Way to Fly

Two of a Kind, Workin' on a Full House

Victim of the Game

Friends in Low Places

Wild Horses

Unanswered Prayers

Same Old Story

Mr. Blue

Wolves

Record Label Capitol

Subject Fun

Love

Sacrifice

Classification COU.BRO.NF.6

RecordID 7

RecordDate 4/3/2013 0:17:30

Artist Lynyrd Skynyrd

Album The Encore Collection

Year 1998

Genre Rock

Special Features Live Album

Track List What's Your Name

We Ain't Much Different

Berneice

Free Bird

Sweet Home Alabama

Home is Where the Heart Is

O.R.R.

Bring it On

You Got That Right

Travelin' Man

Record Label BMG Special Products

Subject Awareness

Change

Honesty

Identity

Reflection

Classification ROC.LYN.EC.7

RecordID 8

RecordDate 4/3/2013 0:18:53

Artist Wolfgang Amadeus Mozart

Album Violinkonzert nos. 1, 2, 3

Year 1987

Genre Classical

Special Features None

Track List Konzert fur Violine und Orchester Nr. 1 Allegro Moderato

Adagio

Presto

Konzert fur Violine und Orchester Nr. 2 Allegro Moderato

Andante

Rondo. Allegro

Konzert fur Violine und Orchester Nr. 3 Allegro

Adagio

Rondo. Allegro

Record Label MCS Ltd.

Subject Identity

Classification CLA.MOZ.VN.8

RecordID 9

RecordDate 4/3/2013 0:21:41

Artist Various

Album Karaoke Opera

Year 1992

Genre Classical

Special Features Multiple Discs

Track List Disc 1

Votre Toast Je Peux Vous Le Rendre (Toreador's Song)

Segudilla

Habanera

Au Fond Du Temple Saint (Orchestra Only)

Au Fond Du Temple Saint (With Baritone)

Au Fond Du Temple Saint (With Tenor)

Mon Coeur S'ouvre A Ta Voix

Belle Nuit, O Nuit D'Amour

Belle Nuit, O Nuit D'Amour

Belle Nuit, O Nuit D'Amour

Un Bell Di Vedremo

Un Bell Di Vedremo

Largo AL Factotum

Brindisi

Che Gelida Manina

Quando M'En Vo

La Donna E Mobile

Mio Babbino Caro

Madamina

Vesti La Giubba

Nessun Dorma

Disc 2

Votre Toast Je Peux Vous Le Rendre (Toreador's Song)

Segudilla

Habanera

Au Fond Du Temple Saint (Orchestra Only)

Mon Coeur S'ouvre A Ta Voix

Belle Nuit, O Nuit D'Amour

Un Bell Di Vedremo

Largo AL Factotum

Brindisi

Che Gelida Manina

Quando M'En Vo

La Donna E Mobile

Mio Babbino Caro

Madamina

Vesti La Giubba

Nessun Dorma

Record Label Innovative Music Productions Ltd.

Subject Fun

Reflection

Classification CLA.VAR.KO.9

RecordID 10

RecordDate 4/3/2013 0:26:37

Artist The Wallflowers

Album Breach

Year 2000

Genre Alternative rock

Special Features None

Track List Letters from the Wasteland

Hand Me Down

Sleepwalker

I've Been Delivered

Witness

Some Flowers Bloom Dead

Mourning Train

Up from Under

Murder 101

Birdcage

Babybird

Record Label Interscope

Subject Awareness

Change

Dream

Hope

Identity

Loss

Reflection

Classification ALT.WAL.B.10
