I'm in the process of learning Clojure and I can't understand some language design decisions: Why does a language with immutable strings like Clojure also need Keyword and Symbol data types? Couldn't strings just have optional namespaces, metadata and all this stuff? For immutable strings, comparison could just as well be identity-based, no?
Or, since interop with Java is a must-have for Clojure, at least have the Java String type and a single KeywordSymbol data type.
I find this String/Keyword/Symbol "trichotomy" especially weird since Clojure seems very focused on "purity" and keeping things simple in other aspects.
They fill very different roles within the language:
- Vars are used to give names to things. They implement Runnable and can be used directly to invoke functions. You cannot run a string.
- Keywords are names by themselves, and look themselves up in maps. They really help Clojure keep its "data driven" flavor. Strings do not implement the required interfaces to look themselves up in maps.
- Strings are just strings. They do what they need to do and not much more.
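The three roles above can be sketched at the REPL. This is only an illustration: the names person and greet are made up for the example and are not from the original answer.

```clojure
;; Hypothetical example data: one keyword key and one string key.
(def person {:name "Ada" "name" "string key"})

;; A keyword is a function that looks itself up in a map.
(:name person)                       ; => "Ada"
(:missing person :none)              ; => :none (optional default value)

;; A var can be invoked directly; it calls the function it holds.
(defn greet [n] (str "Hello, " n))
(#'greet "Ada")                      ; => "Hello, Ada"
(instance? Runnable #'greet)         ; => true

;; A string is not a function, so it needs get to be used as a key.
(instance? clojure.lang.IFn "name")  ; => false
(get person "name")                  ; => "string key"
```

Calling ("name" person) instead would throw a ClassCastException, since java.lang.String does not implement clojure.lang.IFn.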
One of the core principles in the design of Clojure was to embrace the host platform. Thus in Clojure strings are Java strings, and you never need to wrap a Java string in some convert-to-clojure-string function in order to get it into the Clojure ecosystem. This necessitated using unmodified Java strings, as well as the Java numeric types. Keywords and symbols are new constructs added by Clojure, so it is only necessary to make them accessible in a useful way from the rest of the Java ecosystem. Symbols and Keywords make themselves accessible by simply being classes that implement an interface. It was believed in the beginning that in order for a new language to succeed in the JVM ecosystem, it needed to fully embrace Java and minimise the "impedance mismatch" (sorry for the buzzwordism), even if that required adding more to the language than would have been required without this goal.
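A quick way to see the "embrace the host" point is to inspect the classes involved. This is a small sketch; the var name s is only for illustration.

```clojure
;; A Clojure string literal is a plain java.lang.String,
;; so Java methods can be called on it without any conversion.
(def s "hello")
(class s)                            ; => java.lang.String
(.toUpperCase s)                     ; => "HELLO"

;; Integer literals are ordinary Java boxed numbers.
(class 42)                           ; => java.lang.Long

;; Keywords and symbols are Clojure classes implementing Clojure interfaces.
(class :a)                           ; => clojure.lang.Keyword
(class 'a)                           ; => clojure.lang.Symbol
(instance? clojure.lang.IFn :a)      ; => true
(instance? clojure.lang.Named 'a)    ; => true
```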
edit:
You can sort of turn a symbol into a keyword by def-ing it to itself:
user> a
; Evaluation aborted.
user> :a
:a
user> (def a 'a)
#'user/a
user> a
a
user>
Keywords evaluate to themselves.
I think Clojure values "practicality" (if that's the correct word) somewhat more than "purity". This can be seen in the fact that Clojure has syntax for maps, vectors and sets in addition to lists, and uses it to define the language. In Scheme, which is much more concerned with purity (IMO), you only have syntax for lists.
As Arthur Ulfeldt points out, strings, keywords and symbols have their intended use cases, and using them as intended makes Clojure code easier to read. It's similar to what is happening with HTML 5, which adds semantic mark-up: things like <article> and <section>, which you can represent with <div class="article"> and <div class="section"> in HTML 4.
Oh, and you're wrong about comparing strings just by identity. This is guaranteed to work only for interned strings. And you don't want to intern too many strings: on older JVMs (before Java 7) they are stored in the so-called permgen, which is quite limited in size and rarely garbage collected.
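The difference between equality and identity for strings can be checked directly. In this sketch the names s1 and s2 are illustrative, and (String. "abc") is used only to force two distinct string objects with the same contents.

```clojure
;; Two distinct String objects holding the same characters.
(def s1 (String. "abc"))
(def s2 (String. "abc"))

(= s1 s2)                               ; => true  (value equality)
(identical? s1 s2)                      ; => false (different objects)

;; Interning maps equal strings to one canonical object.
(identical? (.intern s1) (.intern s2))  ; => true

;; Keywords are interned by Clojure, so identity comparison works for them.
(identical? (keyword "abc") :abc)       ; => true
```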