Natural and programming languages
07.01.2008 v0.2 – initial draft, Pikaro Tarmo
11.02.2011 v0.3 – updated, Pikaro Tarmo

Programming languages and software development
• Definition: URL: http://en.wikipedia.org/wiki/Programming_language
• There is a huge number of artificially created programming languages: C, C++, Perl, Python, assembler, Java, C# and so on.
• Software development goes hand in hand with natural language: clear, consistent, self-describing terminology improves software quality and simplifies communication between development teams.
• Reference links:
  • The Importance Of Terminology, URL: http://www.computing.surrey.ac.uk/ai/pointer/report/section1.html
  • Case study: Good documentation reduces costs and increases sales, URL: http://www.techscribe.co.uk/techw/cssdl.htm
• Programming does not consist only of writing code in a programming language; it involves documentation as well. For example, the GNU coding standards require documentation to be written: URL: http://www.gnu.org/prep/standards/standards.html#Documentation
• If software is not documented properly, the worst-case scenario is to end up with something analogous to:
  • The International Obfuscated C Code Contest, URL: http://www.ioccc.org/ (highly complicated code that is understood only by the compiler, not by the people who read it)
  • Symbian OS Design Faults, URL: http://www.codeproject.com/KB/mobile/Symbian_OS_design_faults.aspx
• A programming language by itself is nothing without natural language.

Natural languages and communication
• Definition: URL: http://en.wikipedia.org/wiki/Natural_language
• They lack a clear language structure; what structure they have is mostly bound to history.
• They take time to learn.
• They can easily be misused and misunderstood; for example, ‘politics’ is quite often associated with a “does not end up anywhere” kind of discussion.
• “Communication usually fails except by accident” (Wiio’s law)
• Synonyms (words with similar meaning) quite often make it more difficult to identify what is common and what is uncommon, for example in software development. (They pollute the language.)

Artificial natural languages (or constructed languages)
• Definition: URL: http://en.wikipedia.org/wiki/Constructed_language
• Quite often designed with some particular idea in mind, suitable for people and communication.
• Commonly not widely used, because learning each language takes time.
• Language: Lojban
  • URLs: http://en.wikipedia.org/wiki/Lojban , http://www.lojban.org/tiki/tiki-index.php?page=Home+Page&bl
  • Not used very widely. Designed to simplify computer parsing. Cannot be used as a programming language.
• Language: Toki Pona
  • URL: http://www.tokipona.org/
  • A simplified natural language, which demonstrates how efficiently a language can be organized: the language itself contains only 118 words.
  • There already exists an attempt to use this language in psychology.
• Language: Inform 7
  • URL: http://www.inform-fiction.org/I7/Inform%207.html
  • A programming language with English-like syntax, used for writing interactive fiction; it attempts to use English as the basis for writing literature.

Conclusion
• Learning any language takes time, whether it is a programming language or a natural language. It is time for a language to offer more than it normally offers, meaning that natural language and programming language must be merged together.
• The desired new language, let’s call it ‘Simple language’ (for the time being), should possess the features of natural languages (e.g. being native and easily understandable) and the features of programming languages (being able to run on a computer).
• The starting point of the new language design is English (natural). Where it does not correspond to the language design needs, it needs to be tweaked in the right direction. (Modularization and structuring)

Natural language: More ideas
• Every (programming and natural) language has a certain lexicon and semantics.
• The lexicon and semantics of a language determine how flexibly sentences can be expressed: what you can and cannot express in that language.
• Natural language typically has more flexible rules than a programming language, allowing language to be “compressed”: you can use fewer words to express more information.
• This flexibility comes at a cost: a sentence that is understandable to one person may not be understood by another.
• Language also reflects our mindset. We typically think “in natural language”.
• Language misuse allows language “shortcuts” to happen, for example ‘terrorism’.

Simple language requirements
• Usable as a natural language (native)
• Usable as a programming language (understood by a machine)
• As easy to use as possible.
• Clear, consistent, understandable
• As simply structured as possible.
• It should be possible to learn it, for example in parallel with a normal natural language, e.g. through a user interface that displays English words and the corresponding simple language words side by side.
• Simple language must be designed for people, from the people’s perspective, not for computer / compiler architectures.
• All design subjects and decisions need to be documented from the very beginning, in case the language needs to be redesigned / restructured at some point.

Simple language: “Words” of simple language
• The main data types will be similar to those of the Perl language: (http://perldoc.perl.org/perlguts.html , http://perldoc.perl.org/perlapi.html)
  • Scalar (= String or integer or …)
  • Array
  • Hash / associative array
• Generalized into ‘Word’ (the same as a programming language’s ‘variable’)
• Any more complex data structure will be built from those three types. (Complex “type trees” as well)
• Design: Typically in C we operate on structures; they are hashes, optimized for processor access. C++ libraries provide a wide range of data types, like std::string and std::vector; however, all of them lack structure recursion. (Hash of hashes of arrays of hashes of scalars…)

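The three “word” kinds and their recursive nesting can be sketched in Python, whose built-in str/int, list and dict roughly correspond to Perl’s scalar, array and hash. This is only an illustration under that assumption: the names kind_of, type_tree and project are hypothetical and are not part of any simple language design.

```python
# Minimal sketch: a "word" is a scalar, an array, or a hash, nested freely.
# kind_of, type_tree and project are hypothetical names used for illustration.

def kind_of(word):
    """Classify a word as one of the three basic kinds."""
    if isinstance(word, (str, int)):
        return "scalar"
    if isinstance(word, list):
        return "array"
    if isinstance(word, dict):
        return "hash"
    raise TypeError(f"not a word: {word!r}")

def type_tree(word):
    """Replace every value with its kind, keeping the nesting intact."""
    kind = kind_of(word)
    if kind == "scalar":
        return "scalar"
    if kind == "array":
        return ("array", [type_tree(item) for item in word])
    return ("hash", {key: type_tree(value) for key, value in word.items()})

# A hash of arrays of hashes of scalars.
project = {
    "name": "simple",
    "modules": [
        {"title": "bootstrap", "lines": 1200},
        {"title": "generator", "lines": 3400},
    ],
}

tree = type_tree(project)
print(tree[0])             # kind of the top-level word
print(tree[1]["modules"])  # nested description of the array of hashes
```

The point of the sketch is that no fourth kind is needed: any depth of nesting is described by the same three cases applied recursively.
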
Simple language: Compiler / generator
• It should be possible to restructure each program into a complex type tree using simple language “words”.
• Back translation into the original language (for example, source code generation) should be provided.
• The simple language compiler will consist of:
  • Bootstrap (in ANSI C): modules responsible for “bootstrapping” the simple language processor / compiler
  • Simple language (in simple language): main modules responsible for everything else. (Code / source generation)
• Design: Help to improve existing software development.

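The round trip described above (program to type tree, then back to source) can be illustrated with Python’s standard ast module standing in for the simple language compiler, which does not exist yet. The helper names to_words and from_words are hypothetical; ast.unparse requires Python 3.9 or later.

```python
import ast

# Sketch only: Python's own parser plays the role of the "compiler",
# and plain dicts / lists / scalars play the role of simple language words.
# to_words and from_words are hypothetical helper names.

def to_words(node):
    """Turn a syntax tree into nested hashes, arrays and scalars."""
    if isinstance(node, ast.AST):
        words = {"_type": type(node).__name__}
        for name, value in ast.iter_fields(node):
            words[name] = to_words(value)
        return words
    if isinstance(node, list):
        return [to_words(item) for item in node]
    return node  # scalar: identifier, number, None, ...

def from_words(words):
    """Rebuild a syntax tree from the nested words."""
    if isinstance(words, dict):
        node_class = getattr(ast, words["_type"])
        fields = {k: from_words(v) for k, v in words.items() if k != "_type"}
        return node_class(**fields)
    if isinstance(words, list):
        return [from_words(item) for item in words]
    return words

source = "total = price * 2"
words = to_words(ast.parse(source))

# Back translation: regenerate source code from the type tree.
rebuilt = ast.fix_missing_locations(from_words(words))
regenerated = ast.unparse(rebuilt)
print(regenerated)
```

Because the intermediate form is ordinary data, it can be inspected or rewritten before the source is regenerated, which is the kind of tooling the “Design” bullet points at.
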
Notes to myself…
• …

Software: Data type reflection
• Some useful links on data type reflection:
  • Java is often criticized for slowness, and reflection is one contributor: reflective operations are slower than their direct counterparts. URL: http://www.awprofessional.com/articles/article.asp?p=26872&seqNum=1&rl=1
  • C# type information: see previous pages (on intermediate language)
  • XML is a comparatively new technology, which suffers heavily from over-abstraction and unnecessary complexity. W3C XML Schema provides something similar to type information.
    XSD links: URL: http://www.w3.org/TR/xmlschema-1/ URL: http://www.w3.org/XML/Schema
    Relax NG: URL: http://www.relaxng.org/
  • RelaxNG compared to the W3C XSD specification: http://www.xml.com/lpt/a/2002/01/23/relaxng.html
  • Nemerle, a new language born on the .NET / C# platform, makes its own type information available. URL: http://nemerle.org/Code_Completion#Building_the_Type_Tree
  • http://nemerle.org/Image:TypeTreeNew.png (quite good pictures of a type tree)

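As a small illustration of what data type reflection provides, the following Python sketch walks declared types at run time and produces a type tree. It uses only the standard dataclasses and typing modules; the classes SourceModule and Project and the function describe are hypothetical examples, not taken from any of the linked systems.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import List, get_args, get_origin, get_type_hints

# Hypothetical example types, used only to have something to reflect on.
@dataclass
class SourceModule:
    title: str
    lines: int

@dataclass
class Project:
    name: str
    modules: List[SourceModule]

def describe(tp):
    """Build a type tree from type information available at run time."""
    if is_dataclass(tp):
        hints = get_type_hints(tp)
        # Structure: a hash from field name to the field's own type tree.
        return {f.name: describe(hints[f.name]) for f in fields(tp)}
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        # Array: described by the type tree of its items.
        return [describe(item_type)]
    # Anything else is treated as a scalar and described by its name.
    return tp.__name__

print(describe(Project))
# {'name': 'str', 'modules': [{'title': 'str', 'lines': 'int'}]}
```

The result is again made of hashes, arrays and scalars, so type information obtained by reflection fits the same three-kind model used for the “words” of simple language.
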