Just Married

:)

Posted in Personal | 5 Comments

The Xophmeister Distance

A couple of posts back, we briefly touched upon the Levenshtein Distance: an orthographic metric which measures the minimum number of single-character edit operations (insertions, deletions and substitutions) needed to turn one string into another. In computer science and information theory, this (and variations on the theme) proves to be quite useful. In this post, I propose an eponymous* linguistic counterpart which may be useful in the fields of phonology and natural language processing.

(*) If Vladimir can do it, so can I…and because I’m too lazy to check Wikipedia to see whether it already exists :P
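
For reference, the Levenshtein Distance mentioned above can be sketched in a few lines of Python. This is only a minimal illustration of the standard dynamic-programming approach, not the metric proposed in this post; the function name levenshtein and the example words are my own choices.

```python
def levenshtein(source: str, target: str) -> int:
    """Return the minimum number of single-character edits
    (insertions, deletions, substitutions) turning source into target."""
    # previous[j] is the distance between the characters of source seen
    # so far (excluding the current one) and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning i characters of source into the empty string takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at any time, so memory use grows with the length of the target string rather than with the product of both lengths.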

Posted in Computational, Human-Computer Interaction, NLP, Phonetics and Phonology, Theory | 2 Comments

Life with a Smart Phone

Loading...

Posted in Rants | Leave a comment

Shame

Suffer through green,
Gradient and light;
Sun and shadow
Fuelling the fight.

I exchange blood
To taste limitless sky.
Sweat, metal and rubber.
Three weeks in July.

An eruption of grey
And cacophonous noise:
A circuitous route
Without nature or poise.

Choked and bloated
Is my life now.
I traded horizons
For a joyless cash cow.

Posted in Cycling, Personal, Writing | Leave a comment

Lazy Duplication of Records in Oracle

Occasionally, one needs to duplicate a table record, either in the same table or into another table (e.g., a history or audit table). Of course, when copying data into the same table, you will need to modify a few fields to avoid breaking any uniqueness constraints.

In vanilla Oracle SQL, this can be done like so:

insert into myTable (id, description, cost)
select id + 1,
       description,
       150
from   myTable
where  id = 123;

Fine. However, I’m a lazy programmer and what happens if there are 50 fields in your table and you only need to change two of them? Even in the above example, having three fields is pushing it!

One approach would be to create a scratch table with CTAS (create table as select) and then insert from there:

create table temp as select *
                     from   myTable
                     where  id = 123;

update temp set id   = 124,
                cost = 150;

insert into myTable select * from temp;

drop table temp;

What a pain in the arse! Moreover, DDL commands like create and drop will commit your transaction, which isn’t necessarily what you want. So here’s an alternative, using PL/SQL:

declare
  myRecord myTable%rowtype;
begin
  select * into myRecord
  from   myTable
  where  id = 123;

  myRecord.id   := 124;
  myRecord.cost := 150;

  insert into myTable values myRecord;
end;

Sorted! Obviously you incur some penalty from using a PL/SQL block, but it’s negligible compared to the saving in coding and probably equivalent to the CTAS approach (but without the implicit commit). Moreover, by changing only the field values that you need to, it guards against programmer error. So lazy is good!
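
If the duplication has to happen from client code rather than in a PL/SQL block, the same lazy idea (fetch the whole row, override only the fields that change, insert it back) can be sketched in Python. Everything here is an assumption for illustration: the helper duplicate_row is hypothetical, and the standard library's sqlite3 module stands in for Oracle purely so that the example runs on its own.

```python
import sqlite3


def duplicate_row(conn, table, key_column, key_value, overrides):
    """Copy one row of `table`, changing only the columns named in `overrides`.

    Hypothetical helper. Table and column names cannot be bound as
    parameters, so they are interpolated: pass trusted identifiers only.
    """
    cursor = conn.execute(
        f"select * from {table} where {key_column} = ?", (key_value,)
    )
    row = cursor.fetchone()
    if row is None:
        raise LookupError(f"no row with {key_column} = {key_value!r}")

    # The column list comes from the cursor, so it never has to be typed out.
    columns = [description[0] for description in cursor.description]
    unknown = set(overrides) - set(columns)
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")

    record = dict(zip(columns, row))
    record.update(overrides)

    placeholders = ", ".join("?" for _ in columns)
    conn.execute(
        f"insert into {table} ({', '.join(columns)}) values ({placeholders})",
        [record[column] for column in columns],
    )


# In-memory SQLite database standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table myTable (id integer primary key, description text, cost integer)"
)
conn.execute("insert into myTable values (123, 'widget', 100)")

duplicate_row(conn, "myTable", "id", 123, {"id": 124, "cost": 150})
print(conn.execute("select * from myTable order by id").fetchall())
# [(123, 'widget', 100), (124, 'widget', 150)]
```

As with the PL/SQL version, only the overridden fields are ever named, so adding a column to the table does not require touching this code.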

Posted in Oracle, PL/SQL | Leave a comment

你好

My other half is Chinese, so naturally I am learning Mandarin — and some Shanghainese — from her. (What can I say: I’m the scholarly type and she’s a good teacher!) Anyway, like, I’m sure, many a Westerner embarking on learning a tonal language, I find it difficult to properly modulate my voice: so let me share a phonetic technique that I’ve developed!

Try to retract the base of your tongue, flattening and broadening the back, while speaking. This has a three-fold effect, conducive to speaking Chinese:

  • By tensing the muscle, you have slightly more control over your tongue, so you can move it more agilely. This is necessary for the quick response required when changing tones.
  • Retracting the tongue opens the oral cavity, giving more potential to change your voice’s frequency and hence produce the different tones.
  • A bonus effect is that doing this puts the tip of your tongue in a better place to produce the many sibilant and retroflex sounds which are common in Mandarin.

Having said all that, I’m still pretty bad at it! I find it particularly difficult to do falling tones at the beginning of words. (I cannot say 姐姐 to save my life!) My guess is that, while English obviously doesn’t have tones, its intonation pattern is manifested through stress and, as English usually stresses the first syllable, this gives a rising tonal quality.

Posted in Personal, Phonetics and Phonology, Random, 中文 | 2 Comments

My Friend Lily

Here’s a curious bit of syntax.

Posted in Linguistics, Syntax | 2 Comments

The World’s Prose Mix

According to Wikipedia, at the time of writing there are a total of 205 sovereign states, including those that are disputed. That’s at least 205 different cultures, so the question is: Are there any concentrations of peoples in the world that represent all of these? If I were to look, the first places which I’d examine are the so-called “Global Cities”, the most prominent and, anecdotally, most diverse of which are London and New York City.

Now, suppose we can select 205 people from one of these cities, each representing their own home country. What now? Well, we’re all human and we all have stories, flavoured by our cultural heritage; so we write! That is, we set up a round-robin story where everyone in our group gets an equal share (e.g., a page each).

Potentially, all our authors would write in their own language, so we’d need a cohort of translators, both so that everyone understands what’s going on and to produce a coherent end product. Furthermore, some kind of editorial control that steers and plans the characters and plot would need to be exercised; beyond that, however, creative freedom is afforded to our authors.

An alternative — although much more ambitious — idea would be to focus on linguistic diversity, rather than sovereign cultures. This pushes the number of potential contributors through the roof and, with it, the potentially impossible task of translation! (Let alone the logistical burden of finding volunteers across the world.) Therefore, instead of writing a round-robin story, how about a dialogue-free film, where each person acts as screenwriter and co-director for a scene?

Posted in Ethnography, Random | Leave a comment

Apostrophe’s War

With a title like “Apostrophe’s War”, you’d be forgiven for thinking that this is yet another trite rant on the supposed abuse of the apostrophe in contemporary written English. However, those who know me as a linguist may think I am writing to counter this barbarian prescriptivism with scientific purity.

No and no!

Posted in Linguistics, Orthography and Grammatology, Rants | Leave a comment

Endianness

In computer science, there’s a concept known as “endianness”. It refers to the order in which data is stored (in practice, usually the order of bytes within a multi-byte value, though the same idea applies to bits): “big endian” means that we start with the most significant bit and move to the least; “little endian” is the other way around. For example, the decimal number 123, in 8 bits, can be represented as either 01111011 (2^6 + 2^5 + 2^4 + 2^3 + 2^1 + 2^0) in big endian, or 11011110 (2^0 + 2^1 + 2^3 + 2^4 + 2^5 + 2^6) in little. There are engineering reasons as to why one would choose one representation over the other.
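
The 123 example above can be checked with a short Python sketch. The helper names to_bits and reverse_bits are my own, and the second half (using the arbitrary value 0x12345678) shows the byte-level ordering that the term most often refers to in practice.

```python
def to_bits(value: int, width: int = 8) -> str:
    """Render value as a fixed-width bit string, most significant bit first."""
    return format(value, f"0{width}b")


def reverse_bits(bits: str) -> str:
    """Flip a bit string so that the least significant bit comes first."""
    return bits[::-1]


big = to_bits(123)
little = reverse_bits(big)
print(big)     # 01111011
print(little)  # 11011110

# Byte-level endianness: the same 32-bit value laid out in both orders.
value = 0x12345678
print(value.to_bytes(4, "big").hex())     # 12345678
print(value.to_bytes(4, "little").hex())  # 78563412
```

Reading the bytes back with the wrong order gives a different number, which is why both sides of an exchange have to agree on the convention.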

Human, rather than binary-encoded, data often (at least in my anecdotal experience) follows the big endian model. For example, while this may be an artefact of the technology behind it, phone numbers follow a pattern like “country, area, number, extension”. Arguably, it’s much more logical, as the most important part comes first and the rest tails off. However, there are a number of inconsistencies in some widely used formats, perhaps arising from a definition of “most important” that differs from a purely quantitative scale.

Posted in Computer Science, Orthography and Grammatology, Random | Leave a comment