Character Encoding in MARC records (MARC-8 and Unicode)

“While MARC records are encoded in MARC21 (if not XML) there is generally a choice on character encoding between MARC8 and Unicode. So, what is MARC8?”

“The newest version of the SobekCM MARC Library includes the most common combinational characters and a mapping between MARC8 to Unicode/UTF8, although alternate character sets are still not supported. Libraries which deal heavily in alternate character sets are more likely to be aware of character encoding issues and export in unicode encoding. What is more common is to have just a spattering of characters which decode incorrectly.”

Mark V. Sullivan is the “primary architect of widely-adopted open-source SobekCM METS Editor software to assist libraries with creation of metadata files for inclusion in digital repositories.”
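To make the "combinational characters" point concrete, here is a minimal Python sketch of the idea. It is not the SobekCM MARC Library's code; the function name `marc8_to_unicode` and the small lookup table are illustrative assumptions. It relies on one well-known difference between the two encodings: in MARC-8 a combining diacritic comes before its base letter, while in Unicode the combining mark comes after it. The table covers only a handful of ANSEL diacritics, and escape sequences that switch to alternate character sets are not interpreted.

```python
import unicodedata

# Partial, illustrative table: a few MARC-8 (ANSEL) combining diacritics
# mapped to the corresponding Unicode combining marks.
MARC8_COMBINING = {
    0xE1: "\u0300",  # grave
    0xE2: "\u0301",  # acute
    0xE3: "\u0302",  # circumflex
    0xE4: "\u0303",  # tilde
    0xE8: "\u0308",  # umlaut / diaeresis
    0xF0: "\u0327",  # cedilla
}


def marc8_to_unicode(data: bytes) -> str:
    """Decode a MARC-8 byte string that uses only ASCII plus the
    diacritics in MARC8_COMBINING (hypothetical helper, sketch only)."""
    out = []
    pending = []  # diacritics seen but not yet attached to a base letter
    for b in data:
        if b in MARC8_COMBINING:
            # MARC-8 puts the diacritic BEFORE the base character.
            pending.append(MARC8_COMBINING[b])
        elif b < 0x80:
            # ASCII base character: emit it, then the waiting diacritics,
            # because Unicode puts combining marks AFTER the base.
            out.append(chr(b))
            out.extend(pending)
            pending.clear()
        else:
            # Byte not covered by this sketch: mark it as undecodable.
            out.append("\ufffd")
            pending.clear()
    # Compose base + combining mark into a single code point where possible.
    return unicodedata.normalize("NFC", "".join(out))


print(marc8_to_unicode(b"caf\xe2e"))
```

Reading the same bytes as Latin-1 or UTF-8 instead produces the "spattering" of wrongly decoded characters the quote mentions, since the diacritic byte is then treated as a letter of its own or as an invalid sequence.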
