<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Xophmeister&#039;s World</title>
	<atom:link href="http://xoph.co/feed/" rel="self" type="application/rss+xml" />
	<link>http://xoph.co</link>
	<description>Renaissance Blog</description>
	<lastBuildDate>Tue, 24 Apr 2012 03:45:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Just Married</title>
		<link>http://xoph.co/20120422/just-married/</link>
		<comments>http://xoph.co/20120422/just-married/#comments</comments>
		<pubDate>Sun, 22 Apr 2012 18:30:48 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Personal]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=758</guid>
		<description><![CDATA[]]></description>
			<content:encoded><![CDATA[<p> <img src='http://xoph.co/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120422/just-married/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>The Xophmeister Distance</title>
		<link>http://xoph.co/20120410/xophmeister-distance/</link>
		<comments>http://xoph.co/20120410/xophmeister-distance/#comments</comments>
		<pubDate>Tue, 10 Apr 2012 15:23:28 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Computational]]></category>
		<category><![CDATA[Human-Computer Interaction]]></category>
		<category><![CDATA[NLP]]></category>
		<category><![CDATA[Phonetics and Phonology]]></category>
		<category><![CDATA[Theory]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=741</guid>
		<description><![CDATA[A couple of posts back, we briefly touched upon the Levenshtein Distance: an orthographic metric which measures the number of edit operations (inserts, deletes and modifications) needed to turn one string into another. In computer science and information theory, this &#8230; <a href="http://xoph.co/20120410/xophmeister-distance/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A couple of posts back, <a href="http://xoph.co/20111007/building-a-better-search-engine/">we briefly touched upon the Levenshtein Distance</a>: an orthographic metric which measures the number of edit operations (inserts, deletes and modifications) needed to turn one string into another. In computer science and information theory, this &#8212; and variations on the theme &#8212; prove to be quite useful. In this post, I propose an eponymous* linguistic counterpart which <em>may</em> be useful in the fields of phonology and natural language processing.</p>
<p><em>(*) If Vladimir can do it, so can I&#8230;and because I&#8217;m too lazy to check Wikipedia to see whether it already exists <img src='http://xoph.co/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </em><br />
<span id="more-741"></span></p>
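<p>For reference, the Levenshtein Distance mentioned above can be sketched in a few lines of Python. This is the textbook dynamic-programming formulation rather than anything from the earlier post; the function name <code>levenshtein</code> is my own choice.</p>

```python
def levenshtein(source: str, target: str) -> int:
    """Return the minimum number of single-character inserts, deletes
    and modifications needed to turn source into target."""
    # previous[j] is the distance between the part of source processed
    # so far and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into nothing takes i deletes
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # modify (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

<p>Every edit costs 1 here, regardless of which characters are involved; the definitions that follow differ precisely in making the cost depend on how similar the two items are.</p>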
<h1>Definitions</h1>
<p>For the sake of simplicity, we shall consider <em>phonetic words</em> to be made up of one or more <em>syllables</em>, which themselves are simply strings of <em>phonetic segments</em> under an <a href="http://en.wikipedia.org/wiki/Autosegmental_phonology">autosegmental model</a>. We will not impose syllabic or word structure, but we are aware that such structure exists and that scrutinising it may prove useful in refining the definitions herein.</p>
<p>The <strong>Xophmeister Phoneme Distance</strong> (XPD) is a metric between two phonetic segments to the natural numbers (zero included: we assume this throughout and will no longer mention it explicitly). It is loosely defined to be the featural difference between two segments under the working autosegmental model. Specifically:</p>
<ul>
<li>For binary features: Switching polarity incurs a cost inversely proportional to the feature&#8217;s distance to the tree root on that tier. That is, switching the root&#8217;s first child node incurs the maximum cost for that tier, whereas switching the extreme leaf nodes incurs a cost of 1.</li>
<li>For univalent features: Deleting or inserting such features incurs a cost inversely proportional to the feature&#8217;s distance to the tree root on that tier, as with binary features.</li>
</ul>
<p>The <strong>Xophmeister Syllable Distance</strong> (XSD) is a metric between two syllables to the natural numbers. It is defined to be the sum of the segment insertions and deletions needed to turn one syllable into the other, each multiplied by its <em>segment cost</em>, plus the XPDs incurred by modifying existing phonemes.</p>
<p>The <strong>segment cost</strong> is the sum of all necessary features, across all tiers and levels, to uniquely define that segment. Depending on the segment&#8217;s complexity, this may be relatively large.</p>
<p>The <strong>Xophmeister Distance</strong> (XD) is a metric between two phonetic words to the natural numbers. It is defined to be the sum of syllable insertions, deletions <em>and</em> transpositions, from one word to another, multiplied by a <em>syllable cost</em>, with the XSDs from modifying existing syllables. (We include transpositions because, anecdotally, it&#8217;s a fairly common phenomenon.)</p>
<p>The <strong>syllable cost</strong> is the maximum XSD of the two syllables in question.</p>
<h1>Example</h1>
<p>It&#8217;s fairly laborious to work these out by hand, so we shall only consider a very simple example: Chris ~ Dish.</p>
<p>Here, orthographic and phonetic words coincide and each is only one syllable in length: that makes our life much easier! We see the syllable nuclei align, so the XD is reduced to a sum of XPDs. XPDs comprise the smallest changes to our metric, which coincides with our intuition that &#8220;Chris&#8221; and &#8220;Dish&#8221; sound very similar:</p>
<p style="text-align: center;">XD(kʰɹɪs, tɪʃ) = XPD(kʰ, ∅) + XPD(ɹ, t) + XPD(s, ʃ)</p>
<p>The stress and tone tiers in English, for these two words, are irrelevant, and the timing tier only comes into play because of the extra phonemic segment [kʰ]; thus, we&#8217;re only interested in the segmental tier.</p>
<p>In our autosegmental model &#8212; you&#8217;ll have to trust me a bit, here &#8212; there is a maximum of three layers in the feature geometry. Thus, the cost to insert the [kʰ] is something like 8. [s] and [ʃ] are very similar, just two leaf features change, so that&#8217;s a cost of 2. Finally, the [ɹ] and [t] are reasonably similar and give a distance of 5. Thus, the Xophmeister Distance between &#8220;Chris&#8221; and &#8220;Dish&#8221; is 15.</p>
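<p>The arithmetic above can be written out as a tiny Python sketch. The three costs are simply the rough figures quoted in the example, entered by hand; nothing here computes them from a feature geometry, and the names <code>xpd_costs</code> and <code>xophmeister_distance</code> are illustrative only.</p>

```python
# Featural costs quoted in the worked example; these are rough
# estimates entered by hand, not derived from a feature geometry.
xpd_costs = {
    ("kʰ", "∅"): 8,  # inserting or deleting the extra segment [kʰ]
    ("ɹ", "t"): 5,   # reasonably similar segments
    ("s", "ʃ"): 2,   # just two leaf features change
}


def xophmeister_distance(costs):
    """Sum the per-segment XPDs for an aligned pair of one-syllable words."""
    return sum(costs.values())


print(xophmeister_distance(xpd_costs))  # 15
```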
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120410/xophmeister-distance/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Life with a Smart Phone</title>
		<link>http://xoph.co/20120327/life-with-a-smart-phone/</link>
		<comments>http://xoph.co/20120327/life-with-a-smart-phone/#comments</comments>
		<pubDate>Tue, 27 Mar 2012 09:27:11 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Rants]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=722</guid>
		<description><![CDATA[]]></description>
			<content:encoded><![CDATA[<p><a href="http://xoph.co/wp-content/uploads/2012/03/159.gif" rel="lightbox[722]" title="Loading..."><img class="aligncenter size-full wp-image-723" title="Loading..." src="http://xoph.co/wp-content/uploads/2012/03/159.gif" alt="Loading..." width="128" height="128" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120327/life-with-a-smart-phone/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Shame</title>
		<link>http://xoph.co/20120309/shame/</link>
		<comments>http://xoph.co/20120309/shame/#comments</comments>
		<pubDate>Fri, 09 Mar 2012 17:10:26 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Cycling]]></category>
		<category><![CDATA[Personal]]></category>
		<category><![CDATA[Writing]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=704</guid>
		<description><![CDATA[Suffer through green, Gradient and light; Sun and shadow Fuelling the fight. I exchange blood To taste limitless sky. Sweat, metal and rubber. Three weeks in July. An eruption of grey And cacophonous noise: A circuitous route Without nature or &#8230; <a href="http://xoph.co/20120309/shame/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Suffer through green,<br />
Gradient and light;<br />
Sun and shadow<br />
Fuelling the fight.</p>
<p>I exchange blood<br />
To taste limitless sky.<br />
Sweat, metal and rubber.<br />
Three weeks in July.</p>
<p>An eruption of grey<br />
And cacophonous noise:<br />
A circuitous route<br />
Without nature or poise.</p>
<p>Choked and bloated<br />
Is my life now.<br />
I traded horizons<br />
For a joyless cash cow.</p>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120309/shame/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lazy Duplication of Records in Oracle</title>
		<link>http://xoph.co/20120308/lazy-duplication-of-records-in-oracle/</link>
		<comments>http://xoph.co/20120308/lazy-duplication-of-records-in-oracle/#comments</comments>
		<pubDate>Thu, 08 Mar 2012 10:37:02 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[PL/SQL]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=697</guid>
		<description><![CDATA[Occasionally, one needs to duplicate a table record; either in the same table, or into another table (e.g., a history or audit table). Of course, when copying data into the same table, you will need to modify a few fields &#8230; <a href="http://xoph.co/20120308/lazy-duplication-of-records-in-oracle/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Occasionally, one needs to duplicate a table record; either in the same table, or into another table (e.g., a history or audit table). Of course, when copying data into the same table, you will need to modify a few fields to avoid breaking any uniqueness constraints.</p>
<p>In vanilla Oracle SQL, this can be done like so:</p>
<pre class="brush: plsql; gutter: true; first-line: 1; highlight: []; html-script: false">insert into myTable (id, description, cost)
select id + 1,
       description,
       150
from   myTable
where  id = 123;</pre>
<p>Fine. However, I&#8217;m a lazy programmer and what happens if there are 50 fields in your table and you only need to change two of them? Even in the above example, having three fields is pushing it!</p>
<p>One approach would be to create a temporary table (<abbr title="Create Table As Select">CTAS</abbr>) and then <code>insert</code> from there:</p>
<pre class="brush: plsql; gutter: true; first-line: 1; highlight: []; html-script: false">create table temp as select *
                     from   myTable
                     where  id = 123;

update temp set id   = 124,
                cost = 150;

insert into myTable select * from temp;

drop table temp;</pre>
<p>What a pain in the arse! Moreover, DDL commands like <code>create</code> and <code>drop</code> will commit your transaction, which isn&#8217;t necessarily what you want. So here&#8217;s an alternative, using PL/SQL:</p>
<pre class="brush: plsql; gutter: true; first-line: 1; highlight: []; html-script: false">declare
  myRecord myTable%rowtype;
begin
  select * into myRecord
  from   myTable
  where  id = 123;

  myRecord.id   := 124;
  myRecord.cost := 150;

  insert into myTable values myRecord;
end;</pre>
<p>Sorted! Obviously you incur some penalty from using a PL/SQL block, but it&#8217;s negligible compared to the saving in coding and probably equivalent to the CTAS approach (but without the implicit <code>commit</code>). Moreover, by changing only the field values that you need to, it reduces the scope for programmer error: so lazy is good!</p>
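<p>The same &#8220;fetch the whole row, change two fields, put it back&#8221; idea carries over to other environments. As a rough sketch only, here it is against SQLite using the <code>sqlite3</code> module from the Python standard library; this is not Oracle, and the table contents are made up for illustration.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dicts
conn.execute(
    "create table myTable (id integer primary key, description text, cost integer)"
)
conn.execute("insert into myTable values (123, 'widget', 100)")  # made-up data

# Fetch the whole row without naming its columns...
record = dict(conn.execute("select * from myTable where id = 123").fetchone())

# ...change only the fields that must differ...
record["id"] = 124
record["cost"] = 150

# ...and insert it, building the column list from the record itself.
columns = ", ".join(record)
placeholders = ", ".join(":" + name for name in record)
conn.execute(
    "insert into myTable ({}) values ({})".format(columns, placeholders), record
)

rows = [tuple(row) for row in conn.execute("select * from myTable order by id")]
print(rows)
```

<p>As with <code>%rowtype</code>, the column names are never typed out, so a 50-field table needs no more code than a three-field one.</p>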
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120308/lazy-duplication-of-records-in-oracle/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>你好</title>
		<link>http://xoph.co/20120216/ni-hao/</link>
		<comments>http://xoph.co/20120216/ni-hao/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 21:20:34 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Phonetics and Phonology]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[中文]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=668</guid>
		<description><![CDATA[My other half is Chinese, so naturally I am learning Mandarin &#8212; and some Shanghainese &#8212; from her. (What can I say: I&#8217;m the scholarly type and she&#8217;s a good teacher!) Anyway, like, I&#8217;m sure, many a Westerner embarking on &#8230; <a href="http://xoph.co/20120216/ni-hao/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>My other half is Chinese, so naturally I am learning Mandarin &#8212; and some Shanghainese &#8212; from her. (What can I say: I&#8217;m the scholarly type and she&#8217;s a good teacher!) Anyway, like, I&#8217;m sure, many a Westerner embarking on learning a tonal language, I find it difficult to properly modulate my voice: so let me share a phonetic technique that I&#8217;ve developed!</p>
<p>Try to retract the base of your tongue, flattening and broadening the back, while speaking. This has a three-fold effect, conducive to speaking Chinese:</p>
<ul>
<li>By tensing the muscle, you have slightly more control over your tongue, so you can move it more agilely. This is necessary for the quick response required when changing tones.</li>
<li>Retracting the tongue opens the oral cavity, giving more potential to change your voice&#8217;s frequency and hence produce the different tones.</li>
<li>A bonus effect is that doing this puts the tip of your tongue in a better place to produce the many sibilant and retroflex sounds which are common in Mandarin.</li>
</ul>
<p>Having said all that, I&#8217;m still pretty bad at it! I find it particularly difficult to do falling tones at the beginning of words. (I cannot say 姐姐 to save my life!) My guess is that, while English obviously doesn&#8217;t have tones, its intonation pattern is manifested through stress, giving &#8212; as English usually stresses the first syllable &#8212; a rising tonal quality.</p>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120216/ni-hao/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>My Friend Lily</title>
		<link>http://xoph.co/20120208/my-friend-lily/</link>
		<comments>http://xoph.co/20120208/my-friend-lily/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 22:28:31 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Linguistics]]></category>
		<category><![CDATA[Syntax]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=662</guid>
		<description><![CDATA[Here&#8217;s a curious bit of syntax. I make the following grammaticality judgements: I made it friendly. I made something. That thing is friendly. I made it friendlily. I made something. I did so in a friendly manner. * I wrote &#8230; <a href="http://xoph.co/20120208/my-friend-lily/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a curious bit of syntax.<br />
<span id="more-662"></span><br />
I make the following grammaticality judgements:</p>
<ol>
<li>I made it friendly.<br />
<em>I made something. That thing is friendly.</em></li>
<li>I made it friendlily.<br />
<em>I made something. I did so in a friendly manner.</em></li>
<li>* I wrote it friendly.</li>
<li>I wrote it friendlily.<br />
<em>I wrote something. That thing is friendly. ≫ I wrote something. I did so in a friendly manner.</em></li>
</ol>
<p><em>Friendlily</em> is a bit of an unusual word, but it is a legitimate adverb. What&#8217;s strange is that in my usage in (4), my default reading is that of an adjective; whereas the adverbial reading is marked. Stranger still is that when using the real adjective &#8212; <em>friendly</em> &#8212; I find this ungrammatical; or, at best, barely acceptable. I can correct (3), according to my grammar, by adding an infinitival copula or by being (presumably) explicit about the semantic selection:</p>
<ul>
<li>I wrote it <em>to be</em> friendly.</li>
<li>I wrote it <em>in a</em> friendly style.</li>
</ul>
<p>However, paradoxically, this has changed the semantics. Something along the lines of, &#8220;I wrote something. I did so to be friendly.&#8221; This is particularly true in the first example, whereas the binding in the second example is ambiguous; both err towards the adverbial reading.</p>
<p>What&#8217;s going on here? Why does <em>make</em> work as you&#8217;d expect, but <em>write</em> doesn&#8217;t? (Indeed, it seems that most verbs don&#8217;t work.) Why is it that, if you change <em>it</em> to something non-pronominal, it fixes things:</p>
<ul>
<li>I wrote something friendly.<br />
<em>I wrote something. That thing is friendly.</em></li>
<li>I wrote something friendlily.<br />
<em>I wrote something. I did so in a friendly manner.</em></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120208/my-friend-lily/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The World&#8217;s Prose Mix</title>
		<link>http://xoph.co/20120123/the-worlds-prose-mix/</link>
		<comments>http://xoph.co/20120123/the-worlds-prose-mix/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 15:07:35 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Ethnography]]></category>
		<category><![CDATA[Random]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=552</guid>
		<description><![CDATA[According to Wikipedia, at the time of writing there are a total of 205 sovereign states; including those that are disputed. That&#8217;s at least 205 different cultures, so the question is: Are there any concentrations of peoples in the world &#8230; <a href="http://xoph.co/20120123/the-worlds-prose-mix/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>According to <a href="http://en.wikipedia.org/wiki/List_of_sovereign_states">Wikipedia</a>, at the time of writing there are a total of 205 sovereign states, including those that are disputed. That&#8217;s <em>at least</em> 205 different cultures, so the question is: Are there any concentrations of peoples in the world that represent all of these? If I were to look, the first places which I&#8217;d examine are the so-called <a href="http://en.wikipedia.org/wiki/Global_city">&#8220;Global Cities&#8221;</a>, the most prominent and, anecdotally, most diverse of which are London and New York City.</p>
<p>Now, suppose we can select 205 people from one of these cities, each representing their own home country, what now? Well, we&#8217;re all human and we all have stories, flavoured by our cultural heritage; so we write! That is, we set up a <a href="http://en.wikipedia.org/wiki/Round-robin_story">round-robin story</a> where everyone in our group gets an equal share (e.g., a page each).</p>
<p>Potentially, all our authors would write in their own language, so we&#8217;d need a cohort of translators so everyone understands what&#8217;s going on, as well as producing a coherent end product. Furthermore, some kind of editorial control would need to be exercised, that steers and plans the characters and plot; however, beyond that, creative freedom is afforded to our authors.</p>
<p>An alternative &#8212; although much more ambitious &#8212; idea would be to focus on linguistic diversity, rather than sovereign cultures. This pushes the number of potential contributors through the roof and, with it, the potentially impossible task of translation! (Let alone the logistical burden of finding volunteers across the world.) Therefore, instead of writing a round-robin story, how about a dialogue-free film, where each person acts as screenwriter and co-director for a scene?</p>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120123/the-worlds-prose-mix/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Apostrophe&#8217;s War</title>
		<link>http://xoph.co/20120116/apostrophes-war/</link>
		<comments>http://xoph.co/20120116/apostrophes-war/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 10:45:18 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Linguistics]]></category>
		<category><![CDATA[Orthography and Grammatology]]></category>
		<category><![CDATA[Rants]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=582</guid>
		<description><![CDATA[With a title like &#8220;Apostrophe&#8217;s War&#8221;, you&#8217;ll be forgiven in thinking that this is yet another trite rant on the supposed abuse of the apostrophe in contemporary written English. However, those who know me as a linguist may think I &#8230; <a href="http://xoph.co/20120116/apostrophes-war/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>With a title like &#8220;Apostrophe&#8217;s War&#8221;, you&#8217;ll be forgiven for thinking that this is yet another trite rant on the supposed abuse of the apostrophe in contemporary written English. However, those who know me as a linguist may think I am writing to counter this barbarian prescriptivism with scientific purity.</p>
<p>No and no!<br />
<span id="more-582"></span><br />
Many will be familiar with the notion of &#8220;Grammar Nazis&#8221;: a cohort of lunatics, and those who have too much time on their hands, who believe that proper English usage is that which has been passed on to them through hallowed, Victorian texts. Immutable laws that afford derision towards anyone who doesn&#8217;t adhere!</p>
<p>Of course, these laws are largely bullshit. They were derived from what were considered, at the time, the pure languages of the classical world: Latin and Ancient Greek. Things like, &#8220;don&#8217;t split infinitives&#8221; and &#8220;you can&#8217;t end a sentence with a preposition&#8221;. These processes, if they can be done at all, lead to ungrammaticality in these languages; but English is not Latin or Ancient Greek. Of course, it borrows a lot and shares its ancestry with their Indo-European roots, but the point about language is that it is not immutable. There is a notion of grammaticality &#8212; you can&#8217;t just randomly string words together &#8212; but it is much more subtle and innate than these prescriptive rules allow. Language is a fluid, living entity.</p>
<p>So what about the humble apostrophe?</p>
<p>Well, in <em>written</em> English, the apostrophe is used in several circumstances: it indicates possession (i.e., a genitive case marker, of sorts) and also contraction (morphosyntactic elision). The confusion comes when both of these effects are applied simultaneously, or with plurals, which share a morpheme. For example, this infamously occurs with pronouns: &#8220;its&#8221; is the possessive form of &#8220;it&#8221;, whereas &#8220;it&#8217;s&#8221; is short for &#8220;it is&#8221;; given that that genitive usage is perhaps more marked, this is a mistake familiar to any English teacher, proofreader and editor alike!</p>
<p>This misuse is what upsets people. There&#8217;s even an &#8220;<a href="http://www.apostrophe.org.uk/">Apostrophe Protection Society</a>&#8221; for the truly militant! A recent article on <a href="http://languagelog.ldc.upenn.edu/nll/?p=3703">Language Log</a>, however, highlights how misinformed this attitude is. The author makes the point very well by countering the argument with numerous and esteemed counterexamples and, importantly, without turning it into an argument between prescriptivists and descriptivists. Allow me to build on this:</p>
<p>From a morphological and phonological point-of-view, it is clear that the genitive morpheme is identical to the plural morpheme and the contracted copula. That is, for example, you can&#8217;t disambiguate between &#8220;dogs&#8221; [dɒgz] and &#8220;dog&#8217;s&#8221; [dɒgz], nor &#8220;its&#8221; [ɪts] and &#8220;it&#8217;s&#8221; [ɪts]. Syntactically, however, we can easily make a distinction based on the local context: our aforementioned innate ability to parse a sentence&#8217;s grammatical structure.</p>
<p>You&#8217;ll have noticed that I emphasised &#8220;written English&#8221; above: this wasn&#8217;t an accident! We thus see the point that the apostrophe is nothing more than an orthographic convention &#8212; merely part of the writing system &#8212; used to disambiguate between the various, overloaded forms. Just as commas are used to hint at prosodic phrase boundaries, its purpose is to make assimilating linguistic information in written form (i.e., reading) easier through consistency.</p>
<p>So now, back to the argument against apostrophe misuse. Personally, I would have to agree: not from a linguistic point-of-view, but from the belief that orthographic conventions, even those as bizarre as English&#8217;s, aren&#8217;t an inherently bad thing. By using a common style &#8212; spelling and punctuation &#8212; we can help to ensure comprehensibility and understanding in writing.</p>
<p>Of course, it&#8217;s not up to me (nor anyone) to say what is orthographically right and wrong. It, like any aspect of linguistics, must be allowed to evolve organically: in Shakespeare&#8217;s time, as the Language Log article demonstrates, apostrophes were not used; nowadays, they are expected, but seemingly in decline. The momentum of the corpus is of course a factor &#8212; writing systems obviously change more slowly than natural, spoken language &#8212; but change is indeed inevitable.</p>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120116/apostrophes-war/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Endianness</title>
		<link>http://xoph.co/20120110/endianness/</link>
		<comments>http://xoph.co/20120110/endianness/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 13:59:14 +0000</pubDate>
		<dc:creator>Xophmeister</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Orthography and Grammatology]]></category>
		<category><![CDATA[Random]]></category>

		<guid isPermaLink="false">http://xoph.co/?p=571</guid>
		<description><![CDATA[In computer science, there&#8217;s a concept known as &#8220;endianness&#8221;. It refers to the order in which data is stored: &#8220;big endian&#8221; means that we start with the most significant bit and move to the least; &#8220;little endian&#8221; is the other &#8230; <a href="http://xoph.co/20120110/endianness/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In computer science, there&#8217;s a concept known as &#8220;endianness&#8221;. It refers to the order in which data is stored: &#8220;big endian&#8221; means that we start with the most significant bit and move to the least; &#8220;little endian&#8221; is the other way around. For example, the decimal number 123, in 8-bits, can be represented as either 01111011 (<img src='http://s0.wp.com/latex.php?latex=2%5E6+%2B+2%5E5+%2B+2%5E4+%2B+2%5E3+%2B+2%5E1+%2B+2%5E0&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='2^6 + 2^5 + 2^4 + 2^3 + 2^1 + 2^0' title='2^6 + 2^5 + 2^4 + 2^3 + 2^1 + 2^0' class='latex' />) in big endian, or 11011110 (<img src='http://s0.wp.com/latex.php?latex=2%5E0+%2B+2%5E1+%2B+2%5E3+%2B+2%5E4+%2B+2%5E5+%2B+2%5E6&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='2^0 + 2^1 + 2^3 + 2^4 + 2^5 + 2^6' title='2^0 + 2^1 + 2^3 + 2^4 + 2^5 + 2^6' class='latex' />) in little. There are engineering reasons as to why one would choose one representation over the other.</p>
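<p>The two bit strings above are easy to reproduce in Python. A caveat: in everyday engineering usage, endianness normally describes the order of <em>bytes</em> within a multi-byte value rather than bits within a byte, so the sketch shows both. The helper name <code>to_bits</code> is my own.</p>

```python
def to_bits(value: int, width: int = 8) -> str:
    """Return value as a bit string, most significant bit first."""
    return format(value, "0{}b".format(width))


big = to_bits(123)
little = big[::-1]  # reversed, so the least significant bit comes first
print(big, little)

# The same idea at byte granularity, using the built-in int.to_bytes:
# 123 stored in two bytes, in each byte order.
big_bytes = (123).to_bytes(2, "big")
little_bytes = (123).to_bytes(2, "little")
print(big_bytes, little_bytes)
```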
<p>Human data, rather than binary-encoded data, often &#8212; at least from my anecdotal experience &#8212; follows the big endian model. For example, while this may be an artefact of the technology behind it, phone numbers follow a pattern like &#8220;country, area, number, extension&#8221;. Arguably, it&#8217;s much more logical as the most important part comes first, then tailing off. However, there are a number of inconsistencies in some widely used formats; perhaps arising from the definition of &#8220;most important&#8221; differing from a purely quantitative scale.<br />
<span id="more-571"></span></p>
<h1>Examples</h1>
<h2>Time and Date</h2>
<p>In British English, we say the date as &#8220;day of the month, month, year&#8221;: today is 10<sup>th</sup> January, 2012. This is little endian as the year is the category that has the most weight. Paradoxically, time (as in the time of day) is expressed in big endian: &#8220;hours, minutes, seconds, etc.&#8221;. Time of day and date are both forms of time &#8212; indeed, the date can be seen as the next order of magnitude, after time of day &#8212; so why the mix?</p>
<p>Presumably, this is because the time of day and date are not often expressed simultaneously, in everyday conversation, and because we (or, at least, the British) have different priorities when referring to each. That is, it&#8217;s more important to know the day &#8212; something one tends to forget &#8212; over the year.</p>
<p>Of course, in other cultures, we see different models. Chinese, as we shall see in the next section, is consistently big endian: the (Western) date is expressed in the form &#8220;2012年1月10日&#8221;, the same order as time. This is also the format taken by the ISO, the International Organization for Standardization, in <a href="http://en.wikipedia.org/wiki/ISO_8601">ISO 8601</a>.</p>
<p>Then there&#8217;s US English, to really muddle things, which expresses the date as &#8220;month, day of the month, year&#8221; (e.g., January 10<sup>th</sup>, 2012). This ordering is grossly inconsistent &#8212; it&#8217;s neither big nor little endian &#8212; and my only assumption behind its inception is some kind of prescriptive or stylistic custom.</p>
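<p>One practical consequence of a consistently big endian date, such as ISO 8601, is that plain text sorting agrees with chronological order. The following Python sketch illustrates this with three arbitrary dates; the little endian British format, by contrast, sorts by day of the month first.</p>

```python
from datetime import date

# Three arbitrary dates, deliberately out of order.
dates = [date(2012, 1, 10), date(2011, 12, 25), date(2012, 1, 9)]

iso = [d.isoformat() for d in dates]               # year-month-day
british = [d.strftime("%d/%m/%Y") for d in dates]  # day/month/year

print(sorted(iso))      # text order matches chronological order
print(sorted(british))  # text order is driven by the day of the month
```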
<h2>Postal Addresses</h2>
<p>When we send a letter, in the West, the person we send it to is the most important, followed by their home, town, region and country. With the exception of postal codes, which are usually appended near the end of the address as a mechanical routing aid, this little endian format is ubiquitous.</p>
<p>In China, however, it&#8217;s written the other way around: the country comes first, reducing finally to the recipient&#8217;s name. At first glance, the big endian format appears to be a tad impersonal. Perhaps, but think how much easier it is for the postal service to route: they can look at the first line and know which district to send it to, then that district office can forward it to the appropriate city by reading the next line, continuing the process until we reach our destination. In a country the size of China, the big endian format could well speed up delivery time; which doesn&#8217;t seem so impersonal, after all!</p>
<h2>Domain Names, URLs and E-Mail Addresses</h2>
<p>Again, these examples are somewhat biased by their technological underpinnings, but bear with me!</p>
<p>The domain name system is little endian: the root (<code>.</code>) comes at the right and we traverse downwards by writing leftwards. For example, in <code>xoph.co</code>, <code>.xoph</code> is a subdomain of the top-level domain <code>.co</code>, which sits directly below the root. Again, this is probably because the lower nodes in the domain tree are &#8220;more important&#8221;, in human terms, than the umbrella levels: You don&#8217;t care about <code>.co</code> &#8212; nothing sits there, anyway &#8212; but the <code>.xoph</code> subdomain is where things get interesting. This is also the way in which we, at least in English, read: e.g., the Amazon company, in the UK (<code>amazon.co.uk</code>), or the mail server at the Wikipedia organisation (<code>mail.wikipedia.org</code>), etc.</p>
<p>However, then it&#8217;s mixed into the big endian URL system, which starts with the domain name &#8212; which itself is in little endian but is the most significant part of the URL &#8212; followed by an increasingly variegated filesystem path. E-mail addresses, on the other hand, preserve the little endianness of the domain name only if the account name, which is largely arbitrary, maintains little endianness: e.g., <code>j.doe@example.com</code> is little endian, but <code>doej@example.com</code> is mixed.</p>
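<p>To see the little endianness of domain names concretely, here is a small Python sketch that rewrites a name in big endian order, that is, with the label nearest the root first. The function name <code>big_endian_labels</code> is hypothetical, and the examples are the ones used above.</p>

```python
def big_endian_labels(domain: str) -> list:
    """Split a domain name into labels, the one nearest the root first."""
    return domain.strip(".").split(".")[::-1]


print(big_endian_labels("amazon.co.uk"))        # ['uk', 'co', 'amazon']
print(big_endian_labels("mail.wikipedia.org"))  # ['org', 'wikipedia', 'mail']
```

<p>Read this way, a domain name would follow the same general-to-specific order as the filesystem path that comes after it in a URL.</p>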
<h2>Bookshelves</h2>
<p>If you read from left-to-right, the front cover of a book is the left side and you progress through. When you&#8217;re done, you put your book on the shelf and, because the spine faces outwards, the front of the book is now on the right-hand-side. This effect is compounded with multiple volumes, which would be stacked left-to-right, but whose contents (from the perspective of someone looking at them in a bookshelf) are now mixed endian.</p>
<p>This effect isn&#8217;t limited to left-to-right writing systems. Right-to-left would have the same problem, just that everything is in the opposite direction. The only possible exception I can think of, when limited to horizontal writing systems, is modern Japanese: this is usually written left-to-right (like English), but starts on the far-right page (what a Westerner would call the last page) and works its way leftwards (towards the Western front page). Thus, when shelved, endianness is monotonic; but, of course, at the expense of a mixed endian page order!</p>
<p>The change in orientation, when shelving, means that we must always have mixed endianness somewhere, in this situation!</p>
]]></content:encoded>
			<wfw:commentRss>http://xoph.co/20120110/endianness/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

