What does it mean to think? 

It may surprise you to know, but I was once a philosopher.  To be more accurate, I was once a clueless college student who thought “philosophy” would be a good major.  I eventually switched to a science major, but not before I took more philosophy classes than most folks ever intend to. 

A concept that was boring back then, but relevant now, is that of the “Chinese Room.” John Searle devised this thought experiment to prove that machines cannot actually think, even if they pass Turing Tests. The idea goes something like this:

Say we produce a computer program which takes in Chinese Language inputs and returns Chinese Language outputs, outputs which any speaker of Chinese can read and understand.  These outputs would be logical responses to whatever inputs are given, such that the answers would pass a Turing Test if given in Chinese.  Through these inputs and outputs, this computer can hold a conversation entirely in Chinese, and we might describe it as being “fluent” in Chinese, or even say it can “think” in Chinese. 

But a computer program is fundamentally a series of mathematical operations, “ones and zeros” as we say. The Chinese characters which are taken in will be converted to binary numbers, and mathematical operations will be performed on those numbers to create an output in binary numbers, which more operations will then turn back into Chinese characters.

The math and conversions done by the computer must be finite in scope, because no program can be infinite. So in theory all of that math and all of those conversions can themselves be written down as rules and functions in several (very long) books, such that any person can follow along and perform the operations themselves. So a person could use the rules and functions in these books to: 1.) take in a series of Chinese characters, 2.) convert the Chinese to binary, 3.) perform mathematical operations to create a binary output, and 4.) convert that binary output back into Chinese.
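The four steps above can be sketched in a few lines of Python. This is only a toy, not Searle's setup or any real program: the `RULE_BOOK` table, its two entries, and the names `to_bits`, `from_bits` and `chinese_room` are all made up for illustration, and a plain lookup stands in for the "mathematical operations" of step 3. A real rule book would be astronomically larger, but the operator's job would be the same: match bit strings and copy out bit strings.

```python
def to_bits(text: str) -> str:
    """Step 2: convert characters into a string of ones and zeros (via UTF-8)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Step 4: convert a string of ones and zeros back into characters."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


# Hypothetical rule book: binary input on the left, binary reply on the right.
# The entries are invented for this sketch.
RULE_BOOK = {
    to_bits("你好"): to_bits("你好"),
    to_bits("你会说中文吗"): to_bits("会一点"),
}


def chinese_room(prompt: str) -> str:
    bits_in = to_bits(prompt)      # steps 1 and 2: take in characters, convert to binary
    bits_out = RULE_BOOK[bits_in]  # step 3: follow the book, no understanding required
    return from_bits(bits_out)     # step 4: convert the binary reply back to characters


print(chinese_room("你会说中文吗"))
```

Nothing in `chinese_room` refers to what any character means, which is exactly the situation the person with the books is in.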

Now comes the “Chinese Room” experiment. Take John Searle and place him in a room with all the books described above. John sits in this room and receives prompts in Chinese. He follows the rules of the books and produces an output in Chinese. John doesn’t know Chinese himself, but he fools any speaker/reader into believing he does. The question is: is this truly a demonstration of “intelligence” in Chinese? John says no.

It should be restated  that the original computer program could pass a Turing Test in Chinese, so it stands to reason that John can also pass such a test using the Chinese Room.  But John himself doesn’t know Chinese, so it’s ridiculous to say (says John) that passing this Turing Test demonstrates “intelligence.”   

One natural response is to say that “the room as a whole” knows Chinese, but John pushed back against this. The Chinese Room only has instructions in it and cannot take action on its own, so it cannot be said to “know” anything. John doesn’t know Chinese and only follows written instructions; the room doesn’t know Chinese, and in fact it doesn’t “know” anything. Two things which don’t know Chinese cannot add up to one thing that does, right?

But here is where John and I differ. While I’m certainly not the first one to argue so, I would say that the real answer to the Chinese Room problem is either “yes, the room does know Chinese” or “it is impossible to define what ‘knowing’ even is.”

Let’s take John out of his Chinese Room and put him into a brain.  Let’s shrink him down to the size of a neuron, and place him in a new room hooked up to many other neurons.  John now receives chemical signals delivered from the neurons behind him.  His new room has a new set of books which tell him what mathematical operations to perform based on those signals.  And he uses that math to create new signals which he sends on to the neurons in front of him.  In this way he can act like a neuron in the dense neural network that is the brain. 

Now let’s say that our shrunken-down John-neuron is actually in my brain, and he’s replaced one of my neurons. I actually do speak Chinese. And if John can process chemical signals as fast as a neuron can, I would be able to speak Chinese just as well as I can now. Certainly we’d still say that John doesn’t speak Chinese, and it’s hard to argue that the room as a whole speaks Chinese (it’s just replacing a neuron after all). But I definitely speak Chinese, and I like to think I’m intelligent. So where, then, does this intelligence come from?

In fact every single neuron in my brain could be replaced with a John-neuron, each one of which is now a room full of mathematical rules and functions, each one of which takes in a signal, does math, and gives an output to the neurons further down the line. And if all these John-neurons can act as fast as my neurons, they could all do the job of my brain, which contains all of my knowledge and intelligence, even though John himself (and his many rooms) know nothing about me.

Or instead each one of my neurons could be examined in detail and turned into a mathematical operation. “If you receive these specific impulses, give this output.” A neuron can only take finitely many actions, and all the actions of a neuron can be defined purely mathematically (if we believe in realism).
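As a rough illustration of “if you receive these specific impulses, give this output,” here is a minimal sketch of a neuron written as a plain function. It uses a simple threshold unit, which is a textbook simplification and not a claim about how real neurons behave; the function name, the weights and the threshold are all assumptions chosen for the example.

```python
def neuron(impulses: list[float], weights: list[float], threshold: float) -> int:
    """Fire (1) if the weighted sum of incoming impulses reaches the threshold, else stay quiet (0)."""
    total = sum(impulse * weight for impulse, weight in zip(impulses, weights))
    return 1 if total >= threshold else 0


# Two upstream neurons feed this one; it only fires when both of them do.
print(neuron([1, 1], [0.6, 0.6], threshold=1.0))
print(neuron([1, 0], [0.6, 0.6], threshold=1.0))
```

A rule like this could be copied into a book and followed by hand, one multiplication and one comparison at a time, which is all the John-neuron would need.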

Thus every single neuron of my brain could be represented mathematically, their actions forming a complete mathematical function, and yet again all these mathematical operations and functions could be written down in books to be placed in a room for John to sit in. Sitting in that room, John would be able to take in any input and respond to it just as I would, and that includes taking in Chinese inputs and responding in Chinese.

You may notice that I’m not really disproving John’s original premise of the Chinese Room; instead I’m just trying to point out an absurdity in it. It is difficult to even say where knowledge begins in the first place.

John asserts that the Chinese Room is just books with instructions, so it cannot be said to “know” anything. And so if John doesn’t know Chinese, and the Room doesn’t know Chinese, then you cannot say that John-plus-the-Room knows Chinese either. Where would this knowledge come from?

But in the same sense none of my neurons “knows” anything; they are simply chemical instructions that respond to chemical inputs and create chemical outputs. Yet surely I can be said to “know” something? At the very least (as Descartes once said) can’t I Know that I Am?

And replacing any neuron with a little machine doing a neuron’s job doesn’t change anything. The neural net of my brain still works so long as the replacement (from the outside) is fundamentally indistinguishable from a “real” neuron, just as John’s Chinese Room (from the outside) is fundamentally indistinguishable from a “real” knower of Chinese.

So how do many things that don’t know anything sum up to something that does?  John’s Chinese Room  is really just asking this very question.  John doesn’t have an answer to this question, and neither do I.  But because John can’t answer the question, he decides that the answer is “it doesn’t,” and I don’t agree with that.   

When I first heard about the Chinese Room my answer was that “obviously John *can’t* fool people into thinking he knows Chinese. If he has to do all that math and all those calculations to produce an output, then any speaker will realize that he isn’t answering fast enough to actually be fluent.” My teacher responded that we should assume John can do the math and stuff arbitrarily fast. But that answer really just brings me back to my little idea about neurons from above: if John can do stuff arbitrarily fast, then he could also take on the job of any neuron using a set of rules, just as he could take on the job of a Chinese-knower.

And so really the question just comes back to “where does knowledge begin?” It’s an interesting question to raise, but raising the question doesn’t provide an answer. John attempts a proof by contradiction by saying that the Room and John don’t know Chinese individually, so you cannot say that together they know Chinese. I respond by saying that none of my individual neurons know Chinese, yet taken together they (meaning “I”) do indeed know Chinese. I don’t agree that he’s created an actual contradiction here, so I don’t agree with his conclusion.

I don’t know where knowledge comes from, but I disagree with John that his Chinese Room thought experiment disproves the idea that “knowledge” underlies the Turing Test. Maybe John is right and the Turing Test isn’t useful, but he needs more than the Chinese Room to prove that.

Ultimately this post has been a huge waste of time, like any good philosophy. But I think wasting time is sometimes important, and I hope you’ve had as much fun reading this as I had writing it. Until next time.

More Arabic

Every day I feel very tired. That is the reason I haven’t been writing here all week. Because I go to work by bike, at this time of year I get home well after the morning hours, and I ride in the dark too.

I wish I had the energy. I wish that when I got home I wanted to do something like write on my blog or write a computer program. But these days I don’t have any energy for anything except watching something on YouTube.

I also want to practice languages, which is why I am writing this post.

I also feel angry. The roads in this city are very dangerous. But the city cares.

I wanted to write more than this, but I want to post now.

The words of Jesus on the Cross

It’s a little late, but since it’s still Easter season I was thinking about languages and in particular the language of Jesus. The gospels of course record different versions of what exactly Jesus said when he died on the cross. But Matthew and Mark record a version that sounds like it could be historical.

Matthew and Mark both record Jesus’s final sentence as “Eli, Eli, lema Sabachthani,” which means “My God, My God, why have you forsaken me?”, although they differ slightly in their spelling. Now, Jesus’s native language was Aramaic, but I’ve always been intrigued by how similar this phrase sounds to the Arabic I learned in school.

To start with, “Eli” is almost identical to how you would say “My God” in modern Arabic: El = God, and the noun ending -i makes it possessive for “my.” Sabach isn’t a verb I ever learned in Arabic, but if it does mean “to forsake,” then Sabachthani is also very close to how you would conjugate it in Arabic for this sentence.

But the part I’ve always been most interested in is “lema.” Now the Arabic word for “why” is “lematha,” but it’s made up of two pieces: “le” means “for” and “matha” means “what.” So “lematha” = “for what” = “why.” But there’s another word for “what” in Arabic, you can say “ma” instead of “matha” in many cases. So can you also say “lema” = “for what” = “why” in Arabic as well? I don’t know for sure, but it sounds like a likely etymology for the Aramaic word as well.

The Bible gives us very few direct quotes in Aramaic, the native language of Jesus and many people in his day. It’s good to hear what we can in their own tongue.

Happy Easter

My kingdom for a venv

I’ve never enjoyed using Python. I think my feelings on it can be summed up by this video. But for whatever reason, Python is unavoidable if you want to do anything with AI/machine learning. And so as someone wanting to get into AI, I have no choice but to use it.

But I don’t have to learn to code it of course, because all the tools you need for AI are already written and available. ChatGPT is of course easy to use on the web. But what if you wanted to have a version of ChatGPT that was snarkier, or wrote better jokes, or was in whatever way tuned specifically for your needs and wants? In that case, you can always make a fine-tuned language model and use it yourself.

But that’s where Python rears its ugly head. I wanted to fine tune a language model. So I installed LLaMA, downloaded a simple model from huggingface, and got to work. 

To fine-tune a model for your own needs, you need to have data and you need to annotate that data. No time to explain how annotations work, but there are programs that make it easy. There is a program called Label Studio that I thought I could use. The instructions say to just download Python, make a venv (virtual environment) and have pip (a Python package installer) install Label Studio. Sounds easy, right? Just 3 lines of code.
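For reference, here is a rough sketch of what those steps amount to, written with Python's standard `venv` and `subprocess` modules rather than shell commands. It is not copied from Label Studio's instructions: the package name `label-studio`, the folder name `label-env` and the helper names are assumptions for illustration, and `set_up` needs a network connection to actually install anything.

```python
import subprocess
import sys
import venv
from pathlib import Path

PACKAGE = "label-studio"  # assumed PyPI name; check the project's own install docs


def venv_python(env_dir: Path) -> Path:
    """Path of the interpreter inside a venv (Windows and Linux lay this out differently)."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir: Path, package: str = PACKAGE) -> list[str]:
    """The pip command that installs `package` into the venv's own site-packages."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", package]


def set_up(env_dir: Path) -> None:
    """Create the venv, then install the package into it."""
    venv.EnvBuilder(with_pip=True).create(env_dir)
    subprocess.run(install_command(env_dir), check=True)


# Show the command without running it; call set_up(Path("label-env")) to do it for real.
print(install_command(Path("label-env")))
```

Calling the venv's own interpreter with `-m pip` sidesteps the "activate" step, which is the part that is written differently for Windows and Linux shells.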

The trouble started almost immediately because despite Label Studio telling me it was available for Windows, the install instructions were actually written for Linux. I realized this and corrected it, but the trouble didn’t stop. Once I created the venv, I tried to install Label Studio, but one of the dependencies failed to install so the whole process failed.

Uh… what? Why is this program, which is available as a paid enterprise product by the way, failing to install itself due to a dependency issue? I find the missing dependency and try installing it directly to the venv, hoping that fixes the issue. But no, it still errors out. What am I missing?

So it turns out that when I directly install that dependency, it installs the latest version of it. But Label Studio is looking for a specific older version, so it still tries to install the older version when installing itself. I tried to install the specific older version, and that fails too. Apparently I can install the new version with no issues, but not the old version.

Reading the message closely, it says that to install the old version I need to have another Python module installed and also add that other module to the system path. Now we’re getting into part of why I hate venv. The thing about Python is that if you install it outside of a contained environment, it infects your computer and doesn’t get out. Ask an amateur pythonist how to remove an old version of Python, and see the blank look on their face. Just deleting the folder doesn’t fix it.

And this old version/new version bs can mess you up something fierce, because some other python module will start looking for what it needs, and find the old version instead of the new version. Or it will be sent to where the old version used to be, but finding nothing there it will error out. Venv is supposed to fix all this so you only install things into designated containers where they can’t escape.
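One way to see which Python is actually being picked up is to ask the interpreter itself. The sketch below uses only the standard library; the function names and the shape of the returned dictionary are made up for this example. Comparing `sys.prefix` with `sys.base_prefix` is a common way to tell whether the running interpreter belongs to a venv, and `shutil.which` reports what a bare `python` command would resolve to on the system path.

```python
import shutil
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv rather than a system-wide install."""
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> dict[str, object]:
    """Collect the facts that matter when old and new Python installs get mixed up."""
    return {
        "running": sys.executable,                 # the interpreter executing this script
        "first_on_path": shutil.which("python"),   # what typing `python` would launch, or None
        "in_venv": in_virtualenv(),
        "version": ".".join(map(str, sys.version_info[:3])),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

If `running` and `first_on_path` point at different folders, some other tool launching plain `python` may be finding a different install than the one you think you are using.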

But I can’t do that, because to install something into this venv, I have to install another package and add it to the path of my entire Windows system. So the venv isn’t even doing what it’s supposed to do!

So I gave up. I hate having to use Python like this. Normal programs will just come to you as an executable or a zip and you use them. Python always needs to install itself everywhere and then usually fails even then. So I won’t use Label Studio and will look for another tool instead.

If anyone knows of a good annotation tool for LLM data, hit me up.

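I don’t feel so good

I don’t feel so good. My work isn’t going well right now; it goes badly when I do it, but I don’t know why. I’m supposed to purify these proteins, but I only ever purify the wrong protein. My ferritin post happened because I don’t know how to purify the right protein: every day I try to purify the right protein and all I can find is ferritin. So that is my problem.

My computer can only write simplified characters, but most of my Chinese-speaking friends are Taiwanese, and they use traditional characters. So I hope they aren’t offended that I’m using simplified characters.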

Writing in Arabic is very hard

Yesterday I wrote in Chinese about some things, and today I want to write a little in Arabic. But writing in Arabic is very hard. When I write in Chinese, I type English letters and the computer gives me the Chinese words. But I don’t do this in Arabic. In fact, Arabic doesn’t have any English letters the way Chinese does. Because of that, I can’t look at my keys and use them. I have to use an Arabic keyboard or know where all the letters are without reading them.

So how am I writing this? On an iPhone. My iPhone has the Arabic keyboard, so I can see it. But Android doesn’t have this keyboard. So I have to use an iPhone if I want to do this.

I write in Arabic very slowly. Also, my skill in Arabic isn’t great and I make a lot of spelling mistakes. But I want to try this, so I won’t stop doing it in the future.

That was very short. What else do I want to say? I’m playing a video game about things in the industrial era. It’s “alternate history,” and in it I play as Egypt after a war with the Ottomans. I united all the Egyptians, Bedouins and Mashriqis (is “Mashriqi” a real word?) into an Arab country and fought the European countries. There was a lot of war in the industrial era. My dad says he doesn’t like my games because they’re all about war, but he is always watching movies about or set in the World War in 1940 or around that time. In truth he doesn’t like video games, and that’s OK, but this isn’t about war or no war.

So that was fun, and I know a lot of it isn’t correct Arabic, but I want to do this again!

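I want to write a 报 in Chinese

I think 报 means “report,” so because I don’t know how to say “post” in Chinese, I’m saying 报.

Right now I have to write a report for work, but I told myself I would write a 报 on my blog every day. So this is my 报. I want to find new friends at work, but that is especially hard. All of us workers go home after work and don’t do anything fun. So I don’t spend time with my coworkers, and so my coworkers and I aren’t friends.

I also don’t know what I want to say in Chinese. I don’t know how to write about science in Chinese. My work uses proteins, I can say that. We study viruses, I can say that. But how do I say other things?

This 报 isn’t very long, but I didn’t write last night, so I’m writing today. And I should be working, so not very long isn’t too bad.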

How do you read in a language you only half understand?

Whenever I learn a new language, there always comes a time when I start to get good enough at it to recognize and understand certain words, but not good enough to know every word I come across.  I can read half a sentence but not the whole sentence, understand half a paragraph but not the whole paragraph.  This is a difficult time for a learner because you’re just on the cusp of truly using the language to read, but you don’t feel good enough to actually use it because you only understand half of what you read.  How do you get better?

The answer (so I’ve been taught) is you still try to read. Even if you don’t understand everything, even if you only understand half of it, you try to read what you can so you can get familiar with the language and start learning by using. Most words we know were probably never defined to us specifically. Did anyone ever define the word “anyone” to you? Instead, as learners we pick them up by context clues and other hints, and start using them the way we read or heard them. This can occasionally lead to hilarity, like how I once heard someone describe a child as homely instead of comely, but it can also lead to learning as you start to use and understand each new word you read.

So if I’m reading something and I come upon words I don’t understand, I was taught not to look each one of them up, but instead to just keep reading and try to figure them out as I go. I may read a sentence that says “he went to the 餐厅, and after he’d finished his meal he…”. Although I don’t know what 餐厅 means directly, it seems that “he” ate there, so it must be some sort of eating place. Now whenever I see that word again I check whether it seems to have something to do with eating, and if it does then I can learn by usage that 餐厅 means “a place where you eat.” Through this process I can slowly pick up the language through usage rather than trying to stop and look up every word.

But here’s the secret: this trick also works with scientific writing. Scientific writing is filled to the brim with jargon and odd definitions. What is an SDS-PAGE? What is an HPLC? And not only are the words difficult, the concepts are difficult too. Why did they use centrifugation to separate out the nucleus? Why does electron microscopy not let you visualize the less-rigid parts of a protein? When you start out as a scientist, you are often told to read scientific papers, and scientific papers can feel like you’re reading a foreign language! But the same rules apply as when reading a foreign language: you don’t always have to know every word when you’re starting out, or even every concept. It’s more important to develop scientific language fluency so that you can get the big idea out of a paper and understand it when speaking with others.

For example, say a paper reports that they used HPLC to separate a protein of interest from all the other proteins in a cell. OK, so HPLC is a purification technique; I don’t need to know how it works if all I’m interested in is that protein of interest. I can move on to what the paper says about the protein, secure in the knowledge that it is indeed pure. If later on HPLC becomes more important, then I can do a quick search or deep dive to understand more of it, but it isn’t always necessary to know every single word or technique in a paper. Reading scientific papers is a skill, one I’ve had to devote a lot of time to getting better at, but once you develop knowledge of the jargon and techniques it gets a lot easier, and importantly you develop the skills necessary to learn any new jargon or techniques that you come across. And that is the real skill: not the knowledge of specific things but the ability to learn new things. That is what truly makes a scientist.

Who is the “protagonist” of a narrative spanning over a century?

A few days ago I posted about my favorite Chinese-language media, and included in that list the Three Kingdoms TV show that can be watched on YouTube. The TV show is heavily based on the “Romance of the Three Kingdoms” novel written hundreds of years ago, and I remember reading an abridged version of the novel when I was in university.

One of the most interesting conversations I had was with a Chinese friend of mine who had read the book in middle school. I basically told him “I really like this book, and it’s got some cool characters like Cao Cao, he seems to be the main character.” My friend said “really? I remember Zhuge Liang being the main character.” At that point I hadn’t even met Zhuge Liang in the book, so I was confused. In the sections I read, Cao Cao was in many ways the driving force behind the narrative: he tried to assassinate Dong Zhuo, he helped raise a rebel army, and many of the plot threads were from his perspective as he warred across the Central Plain.

And yet my friend’s memory was correct: as soon as Zhuge Liang enters the narrative, HE is the clear protagonist of the story. He is very clearly shown as the smartest, wisest, most dedicated general, and anyone who is in any way cool will at some point get shown up by Zhuge Liang to prove that Zhuge Liang is even cooler. Perhaps his only drawback is that he is too smart. I remember a conversation sometime before the Battle of Red Cliffs where someone admonishes him to remember that not everyone understands what he’s saying or doing because they aren’t as smart as him.

But of course the narrative lasts a very long time, many of these characters grow old and die before it is finished. So in a long-running character-spanning narrative, how do you even define who the main protagonist is? I guess in a way you don’t, Three Kingdoms is more an ensemble cast of characters who rise and fall throughout the narrative, and that’s part of what makes it so great.

Language Post: my favorite Chinese media

Whenever I’ve studied a language, the most common advice I’ve been given is to consume as much media as possible in that language so that I can learn to use it naturally and with more fluency than how it is taught in a classroom. There are only so many hours in a day for in-class teaching, and most classes don’t have enough time to dedicate to actual language use, rather you spend most of your time studying the structure and fundamentals of the language so you can better pick up the language when you do use it, which the teacher hopes will be done outside the classroom. Also language, like most skills, operates on “use it or lose it,” and the more you use it (by consuming foreign-language media) the less likely you are to lose those lessons you picked up in the classroom.

But I live in a predominantly English speaking society, and don’t have much exposure to foreign-language media, so for a long time I didn’t know where or how I could find foreign language media. I’ve eventually found some media that I enjoy, and I’d like to share it so anyone else learning languages can also practice and enjoy. Pretty much all the media I’ve found is Chinese-language mainly, so if anyone has their own media from other languages, feel free to share.

In terms of TV, there was a long-running Chinese TV drama called Three Kingdoms, a retelling of the famous Romance of the Three Kingdoms novel. The whole TV show can still be watched on YouTube for free, and is an excellent way to at least listen to some Chinese, even if the speaking style is old-fashioned.

For music, the band Transition is a really fun band made up of some English gentlemen living in Taiwan. They sing in Chinese but what’s also great is that as English speakers themselves, they have a bit of an “English accent” to their Chinese which is recognizable to me since that’s how all my friends spoke in Chinese class. I sometimes recognize a word sung by them where I wouldn’t recognize it otherwise because the accent is familiar.

For video games, Pokemon is actually a really good one. The 3DS games (and possibly the newer ones too) usually have an option to pick your language settings before the game starts. The games are simple enough that you shouldn’t have a problem beating them even in a foreign language, and it gives you a lot of opportunities to read the language. I played Pokemon Ultra Sun in Chinese, which is also great as the story of that game is that your character is an immigrant from Kanto to Alola, and playing in a foreign language lets you roleplay some of the immigrant experience. The game is also notable for pretending your character has agency while never actually letting you talk, so I pretended I was someone who was unconfident in the language and so didn’t speak as much.

For books, I actually haven’t found as many good ones. I’ve read a few Chinese/Taiwanese kids books as well as the first Harry Potter book in Chinese, but they don’t keep my interest as much as something like Pokemon. If anyone has any suggestions for good Chinese lit that’s accessible for a non-fluent foreign speaker, let me know.