DOCUMENT TYPES
Digital Documents
Converting documents to an electronic format will preserve those documents, but how would such a process be organized? And then, how could the electronic documents be distributed? A digital library for books and articles can be built by digitizing the books and articles and storing them in an indexed database.
Mark-ups
Mark-up is everything in a document that is not content.
Procedural mark-up: codes that contain information on how a specific application should process the document (example of a procedural mark-up format: Microsoft Word).
Presentational mark-up: codes that describe how the document should be presented or laid out, either on a computer screen or on a printed page (example of a presentational mark-up language: HTML).
Descriptive mark-up: codes that describe the logical structure of the document, so that it can be processed by many different software applications (example of a descriptive mark-up meta-language: XML).
Document templates, Microsoft Word, Rich Text Format
To reduce the time needed to create documents of the same type or class, such as memos, letters, technical reports, research articles and invoices, document templates can help.
A template contains the style sheet used to format this type of document, together with a framework of elements such as a standard front page, headers and footers, and a standard set of sections and headings.
Word processing software uses the most common form of procedural mark-up. A word processing format, such as Word, is useful when you have to create or edit a document. The mark-up in a word processor serves to specify how the document should be laid out when printed, and to control the functions of the word processing application.
Using a word processor such as Microsoft Word, you can set the style sheet, apply templates and create a visual structure for your document. Microsoft Word uses a proprietary, binary format: this causes problems in terms of standardization.
To address these problems, Microsoft created another procedural format, RTF (Rich Text Format), a plain-text format used as an exchange format between word processing applications.
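To illustrate what "plain text" means here, the following is a minimal sketch in Python that assembles a small RTF document as an ordinary string. The helper name `make_rtf` and the sample paragraphs are hypothetical. The sketch handles ASCII text only and uses just a few basic RTF control words (`\rtf1`, `\ansi`, `\fonttbl`, `\pard`, `\par`).

```python
def make_rtf(paragraphs):
    """Build a minimal RTF document from a list of plain ASCII paragraphs.

    Hypothetical helper for illustration; it performs only basic escaping.
    """
    def escape(text):
        # Backslash and braces are RTF syntax characters and must be escaped.
        return (text.replace("\\", "\\\\")
                    .replace("{", "\\{")
                    .replace("}", "\\}"))

    # Header: RTF version, character set, and a one-entry font table.
    header = r"{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}"
    # Each paragraph: reset paragraph formatting, pick font 0 at 12 pt (\fs24).
    body = "".join(
        r"\pard\f0\fs24 " + escape(p) + r"\par" + "\n" for p in paragraphs
    )
    return header + "\n" + body + "}"


rtf_text = make_rtf(["Dear reader,", "This memo uses {braces}."])
print(rtf_text)
```

Because the result is plain text, it can be inspected in any text editor, which is what makes the format usable for exchange between applications.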
HTML stands for Hypertext Markup Language. It is a language in which documents can be transferred over the Internet and read by a web browser.
Simple HTML documents can be created easily using any text editor. All content is enclosed in the mark-up "tags" of HTML, which are containers for whatever you put in the document. Using HTML you can define the basic presentation of a document (headers, paragraphs, lists and tables), hyperlinks and multimedia information.
Conversions: from Word (doc) to HTML/PDF, from Word (doc) to XML, and from XML to HTML/PDF.
Different renditions serve different purposes: a rendition in a word processing format, such as Microsoft Word, is useful when creating or editing the document; an HTML rendition is useful when viewing it on the Web; and a page rendition as a bitmap graphic or in PDF format may be useful when a read-only page layout view is required.
Conversion can be carried out:
manually, when a person creates the rendition by re-keying the document content and inserting the necessary mark-up;
automatically, using one or more computer programs that convert the document from one format to another.
Microsoft Word is often chosen as the original document creation application.
However, many organizations are beginning to use XML to hold the source documents because it is easy to transform to other renditions. Moreover, its mark-up captures the logical meaning of the content, and it is an open standard, well defined by public specifications. There are a number of tools available on the market that plug in to Word to help make the transformation to XML.
These tools generally use Word styles to drive the transformation, so they rely on users having applied styles and templates correctly and consistently. If they have not, it is quite difficult to make a fully automated transformation from Word to XML.
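The dependence on consistent styles can be sketched as follows. This is an illustrative Python example, not how any particular plug-in works: the paragraphs are given as (style name, text) pairs standing in for content already extracted from a Word file, and the `STYLE_TO_ELEMENT` mapping and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word style names to XML element names.
STYLE_TO_ELEMENT = {
    "Title": "title",
    "Heading 1": "heading",
    "Normal": "para",
}


def styled_paragraphs_to_xml(paragraphs):
    """Convert (style, text) pairs into an XML tree.

    Returns the root element and a list of paragraphs whose style
    was not recognised and would therefore need manual attention.
    """
    root = ET.Element("document")
    unmapped = []
    for style, text in paragraphs:
        tag = STYLE_TO_ELEMENT.get(style)
        if tag is None:
            unmapped.append((style, text))
            continue
        ET.SubElement(root, tag).text = text
    return root, unmapped


paragraphs = [
    ("Title", "Annual Report"),
    ("Heading 1", "Introduction"),
    ("Normal", "This report covers the past year."),
    ("My Bold 14pt", "Results"),  # ad hoc formatting instead of a heading style
]
root, unmapped = styled_paragraphs_to_xml(paragraphs)
print(ET.tostring(root, encoding="unicode"))
print("Needs manual review:", unmapped)
```

The last paragraph looks like a heading on the page but carries an unknown style, so the automated step cannot place it in the logical structure.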
One of the great advantages of XML is that it is very easy to transform XML mark-up to another format. Extensible Stylesheet Language Transformations (XSLT) offers a standard way to transform XML, and there are many XSLT processors available, both as open source and as commercial products.
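The idea of a rule-based transformation can be sketched in Python. Note that this is not XSLT: the Python standard library has no XSLT processor, so the example walks the tree with `xml.etree.ElementTree` and applies a hypothetical `RULES` table that plays the role of a very simple style sheet. The sample document and element names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SOURCE_XML = """<article>
  <title>Document Types</title>
  <section>
    <heading>Mark-up</heading>
    <para>Mark-up is everything in a document that is not content.</para>
  </section>
</article>"""

# Hypothetical rules: descriptive element name -> HTML element name.
RULES = {"title": "h1", "heading": "h2", "para": "p"}


def xml_to_html(xml_text):
    """Produce an HTML rendition by walking the XML in document order."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for element in source.iter():
        target = RULES.get(element.tag)
        if target is not None:
            # Copy the text of each recognised element into its HTML counterpart.
            ET.SubElement(body, target).text = element.text
    return ET.tostring(html, encoding="unicode", method="html")


html_text = xml_to_html(SOURCE_XML)
print(html_text)
```

Because the source mark-up is descriptive, the same document could be given a different rendition simply by swapping the rule table, which is the property a real XSLT style sheet exploits.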
There is also a standard way to transform XML into page-formatted renditions such as PDF, PostScript or RTF: XSL-FO. XSL-FO (XSL Formatting Objects) is a set of XML elements that represent objects such as pages, text blocks, tables, lists and footnotes.
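As a rough illustration of what such formatting objects look like, the following Python sketch builds a minimal XSL-FO document with one page master and a few text blocks. The helper names (`fo`, `build_fo_document`), the master name "A4" and the sample text are hypothetical. The sketch only produces the XML; turning it into PDF would require a separate FO processor, which is not run here.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(paragraphs):
    """Build a minimal XSL-FO tree: one page master, one flow of blocks."""
    root = ET.Element(fo("root"))

    # Page geometry: a single A4 page master with one body region.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4",
        "page-width": "210mm",
        "page-height": "297mm",
    })
    ET.SubElement(page, fo("region-body"))

    # Content: a page sequence whose flow fills the body region.
    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "A4"})
    flow = ET.SubElement(sequence, fo("flow"),
                         {"flow-name": "xsl-region-body"})
    for text in paragraphs:
        ET.SubElement(flow, fo("block"), {"space-after": "6pt"}).text = text
    return root


fo_root = build_fo_document(["First paragraph.", "Second paragraph."])
fo_text = ET.tostring(fo_root, encoding="unicode")
print(fo_text)
```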
GIF, JPG, PNG
The photograph or scanned image is sampled and mapped as a grid of dots or picture elements (pixels).
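The sampling step can be sketched in a few lines of Python. The "image" here is a hypothetical function `disc` that gives the brightness at any point of the unit square; `sample_to_grid` evaluates it at the centre of each pixel. No real image file format is written, only the grid of values.

```python
def sample_to_grid(intensity, width, height):
    """Sample intensity(x, y), defined on the unit square, at pixel centres.

    Returns a list of rows, each a list of integer values from 0 to 255.
    """
    grid = []
    for row in range(height):
        y = (row + 0.5) / height
        grid.append([
            round(255 * intensity((col + 0.5) / width, y))
            for col in range(width)
        ])
    return grid


def disc(x, y):
    """Hypothetical test image: a white disc on a black background."""
    return 1.0 if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.16 else 0.0


pixels = sample_to_grid(disc, 8, 8)
for row in pixels:
    # Show bright pixels as '#' and dark pixels as '.'.
    print("".join("#" if value else "." for value in row))
```

Increasing `width` and `height` gives a finer grid and a smoother disc, at the cost of more pixels to store.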
PDF (Portable Document Format) is a procedural mark-up language that allows page-formatted documents to be viewed and printed in their original format on almost any software platform. PDF is an ideal format for scientific documents that contain unusual symbols, and for multilingual documents.
The compression and incremental loading features of PDF make it well suited for transmission of documents over the Internet. Many software packages can be used to create PDF documents, and PDF viewers are available free of charge.
A PDF document contains a set of pages which are described by three main object types: path objects, image objects and text objects. Embedded TIFFs are PDF documents in which each entire page is a TIFF image.
XML, born as a profile of SGML, is an open standard for descriptive mark-up, used as an exchange format between applications.
An XML document is well formed if it follows the basic rules of XML syntax. A Document Type Definition (DTD) and XML Schema are sets of rules which specify the logical structure that is allowable for a particular type of document. An XML document is valid if it complies with the rules set out in a DTD or XML Schema with which it is associated.
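The difference between the two checks can be sketched in Python. The standard library's `xml.etree.ElementTree` parser checks well-formedness only and does not validate against a DTD or XML Schema. The structural rule in `is_valid_memo` is therefore a hand-written, hypothetical stand-in for a real DTD or schema, and the `memo` vocabulary is invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses, i.e. obeys the basic XML syntax rules."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical structural rule standing in for a DTD or XML Schema:
# a <memo> must contain exactly <to>, <from>, <body>, in that order.
REQUIRED_CHILDREN = ["to", "from", "body"]


def is_valid_memo(xml_text):
    """Return True if the text is well formed and follows the memo rule."""
    if not is_well_formed(xml_text):
        return False
    root = ET.fromstring(xml_text)
    children = [child.tag for child in root]
    return root.tag == "memo" and children == REQUIRED_CHILDREN


good = "<memo><to>Ann</to><from>Bob</from><body>Hello</body></memo>"
missing_body = "<memo><to>Ann</to><from>Bob</from></memo>"
broken = "<memo><to>Ann</from></memo>"

for label, text in [("good", good),
                    ("missing_body", missing_body),
                    ("broken", broken)]:
    print(label, is_well_formed(text), is_valid_memo(text))
```

The second sample is well formed but not valid, and the third is not even well formed, so it cannot be valid either.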
A Cascading Style Sheet (CSS) is a separate style sheet which contains simple rendering instructions for an XML document. Extensible Stylesheet Language Transformations (XSLT) is used to create style sheets which define transformations from XML to other XML or non-XML formats.