The Martian Way - Novella

Enjoyed reading "The Martian Way" story by Asimov. Takes place in a future setting where there are dwellers in mars, and Earth has decided to ration the supply of water to mars. Martians get worried, but a group, adventurous enough, decide to get the water from "elsewhere" in the galaxy.

Still thinking like an earthling?

Get out of your rut, open your mind—there’s a whole universe waiting. It’s waiting for people who aren’t afraid of thinking in new ways, of doing new things.

Can you imagine…

…mining the skies for water?

…building a new world beneath the surface of a strange planet?

…making pets out of alien explorers?

…putting your life in the hands of a teen-aged computer?

If you can, you’re well on your way to becoming a member of the space generation—you’re thinking The Martian Way.

Also, asimovreviews.net review for the martian way

There will be code

There is a wired story which predicted that advancements in artificial intelligence will result in programs getting trained rather than being written. This is keeping in line with machine learning advancements popular these days where the number of people who are doing "machine learning" need not learn "how to write the machine learning algorithms", instead they just need to learn "how to use machine learning algorithms" on data. They should be efficient at using "algorithms".

That was total alien concept to me until I saw a whole category and industry flourish with that concept. Given the background, the wired articles prediction was that programs will be trained instead of being written.

I found an alternate argument, from a credible and respectable source Clean Code By: Robert C. Martin It goes like this.

There Will Be Code

One might argue that a book about code is somehow behind the times—that code is no longer the issue; that we should be concerned about models and requirements instead. Indeed some have suggested that we are close to the end of code. That soon all code will be generated instead of written. That programmers simply won’t be needed because business people will generate programs from specifications.

Nonsense! We will never be rid of code, because code represents the details of the requirements. At some level those details cannot be ignored or abstracted; they have to be specified. And specifying requirements in such detail that a machine can execute them is programming. Such a specification is code.

I expect that the level of abstraction of our languages will continue to increase. I also expect that the number of domain-specific languages will continue to grow. This will be a good thing. But it will not eliminate code. Indeed, all the specifications written in these higher level and domain-specific language will be code! It will still need to be rigorous, accurate, and so formal and detailed that a machine can understand and execute it.

The folks who think that code will one day disappear are like mathematicians who hope one day to discover a mathematics that does not have to be formal. They are hoping that one day we will discover a way to create machines that can do what we want rather than what we say. These machines will have to be able to understand us so well that they can translate vaguely specified needs into perfectly executing programs that precisely meet those needs.

This will never happen. Not even humans, with all their intuition and creativity, have been able to create successful systems from the vague feelings of their customers. Indeed, if the discipline of requirements specification has taught us anything, it is that well-specified requirements are as formal as code and can act as executable tests of that code!

Remember that code is really the language in which we ultimately express the requirements. We may create languages that are closer to the requirements. We may create tools that help us parse and assemble those requirements into formal structures. But we will never eliminate necessary precision—so there will always be code.

The Man Who Knew Infinity

"An equation for me has no meaning, unless it represents a thought of God." - Srinivasa Ramanujan

"The man who knew infinity" is the story of Srinivasa Ramanujan. This phrase, associated with Ramanujan should be familiar to many due to a very popular book with the same title. There is a motion picture directed by Matt Brown based upon this book.

I watched his movie on a Saturday at Berkeley. I was excited to listen to the QA with the director, and Mathematicians such as Richard Borcheds , Olga Holz and Ken Ribet following the screening.

In India, we have been exposed to Ramanujan quite early in our lives. I had stories in school book about great Indian mathematicians like Aryabhata, Brahmangupta, and Bhaskaracharya. Indian Satellites to Space launched by ISRO bear their names too. Along with those personalities, we have Srinivasa Ramanujan and whom we know for his invention of Magic Squares, and Ramanujan Number, the smallest number expressible as sum of two cubes.

\begin{equation*} 1729 = 1^3 + {12}^3 = 9^3 + 10^3 \end{equation*}

Beyond that, I had not known much about the significance of this number or Ramanujan's work.

Years later, when taking online courses, I had come to know about Hardy and Ramanujan's research in Number Theory being at the core of secure communication and cryptography. Do you use Credit Cards online and feel confident that someone will not maliciously take your money? All this is was because of the research in number theory.

There is also famous anecdote associated with Ramanujan, known to many from South India, Ramanujan prayed to a goddess, and she gave inspiration for his work. He mentions that his theorems had come up like a dream, a boon granted by the Goddess, and he would write formulae in his book. That's how he invented those theorems.

If it happened like that, imagine how excited young students from Madurai will be, where there a temple every hundred feet. :) It needs to be clarified that, Ramanujan was a genius and he also worked very hard on his subjects.

A better understanding of this phenonmemon, in general, is now known. The style of discovery is called " diffused mode learning", wherein after an intense work on challenging problem, the solution suddenly comes up during a restful time.

All these are portrayed well in this movie. The relationship between Hardy and Ramanujan, the scientific culture at Trinity College, London revealed in real detail. The movie has a significant portion dealing with how Hardy mentors Ramanujan, and strives to bring his work to the modern world.

In the question and answer session that followed, two questions were of interest to me.

How do mathematicians study Ramanujan's work when he has not left many formula proofs with his equation?

Richard Borcheds, who is an accomplished mathematician, replied that Ramanujan's work was published in the form of series of notebooks. He left behind three notebooks containing almost 3000 theorems, virtually all without proof. The reason he could have done that is perhaps he grew up in a time when the paper was very expensive for him and he wanted to be economical.

(It did not answer the question, but provided a good perspective).

Question 2: In the movie, Ramanujan is shown to be desisting writing formal proofs. Is that true?

Richard Borcheds shared that, it is bit exaggerated in the movie. Ramanujan always knew the proof of his work and could state it if he wanted to. But he usually did not.

I enjoyed watching this movie and listening to the perspectives associated with the genius from kumbakonam.

Story of Arthur's Self Discipline

This short story of Aurther, his transformation from a paratrooper, to a person who could not carry himself and later he "regaining himself" is truly inspiration.

His quote is a great to hold on to.

"If you cannot do it today, it does not mean you cannot do it someday." - Arthur

Oxford Comma

Oxford Comma is the final comma in the list of things mentioned in a sentence. In this statement,

I like computer science, maths, and programming.

The comma after the word maths and before the conjunction, and, is considered the Oxford comma. Other commonly used terms that refer to this style are the serial comma, a series comma, or a Harvard comma.

In my writing, before this essay, I had never added a comma on the final element of the conjunction. This topic piqued my interest, and I started doing research on this entity.

The term, Oxford comma, got its name from the Oxford University Press style guide. It was recommended by the Oxford style guide even when many British journals and publications were not recommending it. My first conscious encounter with it was during the diagnostic test. I was not able to correct the sentence by placing a serial comma. I had never considered a statement, with a list of items, wrong if it did not have a serial comma. We always try to infer the meaning from the context. But in fact, as will see shortly, it may not be possible at all times and the reason to include Oxford comma is to reduce ambiguity in the sentence.

The reason we do not notice the lack Oxford comma is, the Associated Press style guide, which many newspapers follow, does not require us to use Oxford comma, and thus, it is often left out. Let's notice an ambiguity when we lack the Oxford comma.

To my parents, Alicia and Steve Jobs.

There is an ambiguity about the writer's parentage here because Alicia and Steve Jobs can be read as in apposition to his parents, leading the reader to believe that writer claims Alicia and Steve Jobs are his parents. Whereas, placing the Oxford comma after Alicia will resolve the ambiguity.

To my parents, Alicia, and Steve Jobs.

It articulates clearly that the writer is referring to 3 separate entities in his dedication. There are cases when inclusion of Oxford comma can make a sentence ambiguous too. For example, this statement:

To my father, Steve Jobs, and Alicia.

The serial comma after my father creates ambiguity about the writer's father because it uses punctuation identical to that utilized for an appositive phrase, suggesting that Steve Jobs is the writer's father. It is unclear whether there are three people (1. my father; 2. Steve jobs; and 3. Alicia) or it is only two (1. Steve Jobs and 2. Alicia. Steve Jobs is writer's father). The common way to disambiguate this sentence is to refer to each entity as a noun. For e.g.

To my father, to Steve Jobs, and Alicia.

Thus, the placement of the comma can significantly affect the meaning of the entire sentence. By appropriately choosing words and with always using Oxford comma, we can write less ambiguous phrases. Also, Oxford comma matches with the spoken cadence of sentences better.

If you need only one reference to remember this concept, then keep this picture in your mind.

https://pbs.twimg.com/media/Cd_uZX6WAAIQN4f.jpg

One more thing.

Listen to "Weird Al" Yankovic - Word Crimes before you move beyond this page.

The Good Doctor

"Oh, yes," said Siddhartha, "I make one-off changes to Simulations on Earth."

Govind adjusted his glasses and wondered. "Really?."

"I mean it. I make real people come to readings and discussions of their stories."

"And?".

"The characters themselves evaluate how they are understood. I introduced Asimov into a class which was doing a discussion of his short story 'The Immortal Bard'."

"Oh wow, exciting."

"Asimov was excited too since it was his favorite subject. But, then."

"Hmm.?"

"He quickly became dismayed. His original writing of Immortal Bard was to illustrate, given enough time, how experts of this world create their meaning of original sources. When creators themselves come to face them, they so much believe in their interpretation and revisions that they fail the creator himself."

"That's what happened to Shakespeare in that story. Yeah, I know, that was a hilarious one. "

"Right, but what Asimov found was, the discussion was not about what he felt the readers might enjoy from his story. Instead, everyone went on to prove, why the immortal bard will fail when he brings him back."

"Amusing."

"Yeah, Asimov pondered, how the story could be perceived and readers change the intention when asked to prove something in an assignment. The Good Doctor that he was, continued with his writing, thinking that someday readers will be able to enjoy his stories as he meant it, perhaps with the help of an artificially intelligent agent."

Senthil Kumaran


The story is based upon The Immortal Bard by Issac Asimov.

Good Algorithms Matter

This screenshot from a slide presents a compelling reason as why good algorithms matter.

https://dl.dropbox.com/s/xav99k9cu4y3f9a/good_algorithms.png

Choice of the algorithm determines the difference between solving a problem in few minutes vs., not able to solve it over a lifetime. Just compare sorting a billion items using insertion sort and quick sort in the above chart. It speaks for itself.

Gravitational Waves

Our universe is complex and our knowledge of it is limited. It is very hard to keep track of various breakthroughs and connect the dots. Moreover, as we focus on narrow pursuits, we often tend to lose our path in fundamental sciences. We become five-year-olds in fundamental sciences. Reddit's "Explain like I am Five" is one of the first things I look for whenever the world comes across a major scientific breakthrough.

When the existence of gravitational waves was proven on Feb 11, 2016, the question on everyone's mind was "what are gravitational waves", "why is it a big deal?", "Should I even care?". Well, honestly, many of us and our children can live for 100s of years without caring, but generations beyond those, assuming humanity survives, will find Feb 11 2016, discovery to be useful. Such breakthroughs are not new, just a few hundred years ago, we did not believe that earth was spherical.

So, what are gravitational waves?

When you consider the space as matter, and when bodies like stars and sun move that matter, they produce waves on that surface. Using instruments it is confirmed that these waves exists and we actually "listened" to them for real!

To follow along in simple terms, this is the excerpt of the conversation on Reddit.

A user loljetfuel explained the concept like.

Since I actually tried to explain this to a pair of 5-year-olds today, I figure why not share :) You know how when you throw a rock in a pool, there are ripples? And how if we throw bigger rocks in, they make bigger ripples? Well, a long time ago, a really smart guy named Einstein said that stars and planets and stuff should make ripples in space, and he used some really cool math to explain why he thought that. Lots of people checked the math and agree that he was right. But we've never been able to see those ripples before. Now some people built a really sensitive measuring thing that uses lasers to see them, and they just proved that their device works by seeing ripples from a really big splash. So now we know how to see them and we can get better at it, which will help us learn more about space.

The scientist from LIGO, dwarfboy1717 further supported this answer stating:

LIGO scientist here! Great explanation! I'll add:

If Einstein is right (hint: HE IS), gravitational waves would travel outward from (for instance) two black holes circling each other just like the ripples in a pond. When they come to Earth and pass through the detectors, a signal can tell us not only that the gravitational wave has been found, but it can also tell us lots of information about the gravitational wave! As you track what the gravitational waves look like over a (very) short amount of time, you can tell what kind of event caused them, like if it was two black holes colliding or a violent supernova... along with other details, like what the mass of these stars/black holes would have been! This discovery has ushered in an awesome new era of astronomy. BEFORE we started detecting gravitational waves, looking out at the universe was like watching an orchestra without any sound! As our detectors start making regular observations of this stuff, it will be like turning on our ears to the symphony of the cosmos!

Big Deal!. It's wonderful we are able to share this news in a way that every one of us can understand and relate to!

Further Reading

Watching

Watch the video Allan Adams, theoretical physicist from MIT, explaining the significance of gravitational waves.

First day at Okta

Joined my new company, Okta, today. Eventful day with lot of people and process training, but a meaningful one wherein I got the feel of the pending tasks, by making change, going through the process, witnessing the test runs and committing the code on the first day itself.

Onwards with Always On. :)

Comma Free Codes

We awe at Donald Knuth. I wondered, if I can understand a subject taught by Knuth and derive satisfaction of learning something directly from the master. I attended his most recent lecture on "comma free codes", felt that it was accessible and could be understood by putting some effort. This is my attempt to grasp the topic of "comma free codes", taught by Knuth for his 21st annual christmas tree lecture on Dec 2015. We will use some definitions directly from Williard Eastman's paper, reference the topics in wikipedia, look at Knuth's explanation.

We talk of codes in the context of information theory. A code is a system of rules to convert information—such as a letter, word, sound, image, or gesture—into another form or representation. A sequence of symbols, like a sequence of binary symbols, sequence of base-10 decimals or a sequence of English language alphabets can all be termed as "code". A block code is a set of codes having the same length.

Comma Free Block Code

Comma free code is a code that can be easily synchronized without any external unit like comma or space, "likethis". Comma free block code is set of same length codes having the comma free property.

The four letter words in "goodgame" is recognizable, it easy to derive those as "good" and "game". Other possible substring four letter words in that phrase "oodg", "odga", "dgga" are invalid words in english (or non code-words) and thus we did not have any problem separating the codewords when they were not separated by delimiters like space or comma. Anecdotally, Chinese and Thai languages do not use space between words.

Take an alternate example, "fujiverb". Can you say deterministically if the word "jive" is my code word? Or my code words consists only of "fuji" and "verb". You cannot determine it from this message and thus, "fuji" and "verb" do not form valid a "comma free block codes".

The same applies to a periodic code word like "gaga". If a message "gagagaga" occurs, then the middle word "gaga" will be ambiguous as it is composed of 2-letter suffix and a 2-prefix of our code word and we wont be able to differentiate it.

Mathematical definition

Comma free code words are defined like this.

A block code, C containing words of length n is called comma free if, and only if, for any words \(w = w_1, w_2 ... w_n. \: and \: x = x_1, x_2 ... x_n\) belonging to C, the n letter overlaps \(w_k ... w_nx_1 .... x_{k-1} (k = 2, ... n)\) are not words in the code.

This simply means that if two code words are joined together, than in that joined word, any substring from second letter to the last of the block code length should not be a code word.

How to find them?

Backtracking.

The general idea to find comma free block codes is use a backtracking solution and for every word that we want to add to the list, prune through through already added words and find if the new word can be a substring of two words joined together from the existing list. Knuth gave a demo of finding the maximum comma free subset of the four letter words.

commafree_check.py (Source)

def check_comma_free(input_string):
  if check_periodic(input_string):
    print("input string is periodic, it cannot be commafree.")
    return
  if len(comma_free_words) == 0:
    comma_free_words.append(input_string)
  else:
    parts = get_parts(input_string)
    for head, tail in parts:
      if (any_starts_with(head) and any_ends_with(tail)) or (any_starts_with(tail) and any_ends_with(head)):
        print("%s|%s are part of the previous words." % (head, tail))
        return
    comma_free_words.append(input_string)

This logic is dependent on the order in which comma free block codes are analyzed. For finding a maximal set in a given alphabet size in any order a proper backtracking based solution should be devised, which considers all the cases of insertions.

How many are there?

Backtracking based solution requires us to intelligently prune the search space. Finding effective strategies for pruning the search space becomes our the next problem in finding the comma free codes. We will have to determine how many comma free block codes are possible for a given alphabet size and for a given length.

For 4 letter words, (n = 4) of the alphabet size m, we know that there are \(m^4\) possible words (permutation with repetition). But we're restricted to aperiodic words of length 4, of which there are \(m^4 - m^2\). Notice further that if word, item has been chosen, we aren't allowed to include any of its cyclic shifts temi, emit*, or mite, because they all appear within itemitem. Hence the maximum number of codewords in our commafree code cannot exceed \((m^4 - m^2)/4\).

Let us consider the binary case, m = 2 and length n = 4, C(2, 4). We can choose four-bit "words" like this.

[0001] = {0001, 0010, 0100, 1000},

[0011] = {0011, 0110, 1100, 1001},

[0111] = {0111, 1100, 1101, 1011},

The maximum number of code words from our formula will be \(2^4 - 2^2/4 \: = \: 3\). Can we choose three four-bit "words" from the above cyclic classes? Yes and choosing the lowest in each cyclic class will simply do. But choosing the lowest will not work for all n and m.

In the class taught by Knuth, we analyzed the choosing codes when m = 3 {0, 1, 2} and for n = 3, C(3, 3). The words in the category were

000 111 222 # Invalid since they are periodic

001 010 100 # A set of cyclic shifts, only one can taken as a valid code word.

002 020 200

011 110 101

012 120 201

021 210 102

112 121 211

220 202 022

221 212 122

The number 3-alphabet code words of length 3 is 27 ( = \(3^3\)). The set of valid code words in this will be \((3^3-3) / 3 = 8\).

Choosing the lowest index will not work here for e.g, if we choose 021 and 220, and we send the word 220021 the word 002 is conflicting as it is part of our code word. With any back-tracking based solution, we will have to determine the correct non-cyclic words to choose in each set to form our maximal set of 8 code words.

The problem of finding comma free code words increases exponentially to the size of the length of the code word and on the code word size. For e.g, The task of finding all four-letter comma free codes is not difficult when m = 3, and only 18 cycle classes are involved. But it already becomes challenging when m = 4, because we must then deal with \((4^4 - 4^2) / 4 = 60\) classes. Therefore we'll want to give it some careful thought as we try to set it up for backtracking.

Willard Eastman came up with clever solution to find a code word for any odd word length n over an infinite alphabet size. Eastman proposed a solution wherein if we give a n letter word (n should be odd), the algorithm will output the correct shift required to make the n letter word a code word.

Eastman's Algorithm

Construction of Comma Free Codes

The following elegant construction yields a comma free code of maximum size for any odd block length n, over any alphabet.

Given a sequence of \(x =x_0x_1...x_{n-1}\) of nonnegative integers, where x differs from each of its other cyclic shifts \(x_k...x_{n-1}x_0..x_{k-1}\) for 0 < k < n, the procedure outputs a cyclic shift \(\sigma x\) with the property that the set of all such \(\sigma x\) is a commafree.

We regard x as an infinite periodic sequence \(<x_n>\) with \(x_k = x_{k-n}\) for all \(k \ge n\). Each cyclic shift then has the form \(x_kx_{k+1}...x_{k+n-1}\). The simplest nontrivial example occurs when n = 3, where \(x=x_0 x_1 x_2 x_0 x_1 x_2 x_0 ...\) and we don't have \(x_0 = x_1 = x_2\). In this case, the algorithm outputs \(x_kx_{k+1}x_{k+2}\) where \(x_k > x_{k+1} \le x_{k+2}\); and the set of all such triples clearly satisfies the commafree condition.

The idea expressed is to choose a triplet (a, b, c) of the form.

\begin{equation*} a \: \gt b \: \le c \end{equation*}

Why does this work?

If we take two words, xyz and abc following this property, combining them we have,

\begin{equation*} x \: \gt y \: \le z \quad a \: \gt b \: \le c \end{equation*}
  • yza cannot be a word because z cannot be > than y.
  • zab cannot be a word because a cannot be < than b.

There by none of the substrings will be a code word and we can satisfy the comma free property.

And if we use this condition to determine the code words in our C(3,3) set, we will come up with the following codes which can form valid code words.

000 111 222
001 010 100
002 020 200
011 110 101
012 120 201
021 210 102
112 121 211
220 202 022
221 212 122

The highlighted words will form valid code words and all of these satisfy the criteria, \(a \: \gt b \: \le c\) Now, if you are given a word like 211201212, you know for sure that they are composed of 211, 201 and 212 as none of other intermediaries like (112, 120, 201, 012, 121) occur in our set.

Eastman's algorithm helps in finding the correct shift required to make any word a code word.

For e.g,

Input: 001 Output: Shift by 2, thus producing 100

Input: 221 Output: Shift by 1, thus producing 212

And the beauty is, it is not just for words of length 3, but for any odd word length n.

The key idea is to think of x as partitioned into t substrings by boundary marked by \(b_j\) where \(0 \le b_0 \lt b_1 \lt ... \lt b_{t-1} < n\) and \(b_j = b_{j-t} + n\) for \(j \ge t\). Then substring \(y_j\) is \(x_{b_j} x_{b_{j+1}-1}\). The number t of substrings is always odd. Initially, t = n and \(b_j = j\) for all j; ultimately t = 1 and \(\sigma x = y0\) is the desired output.

Eastman's algorithm is based on comparison of adjacent substrings \(y_{j-1} and y_j\). If those substring have the same length, we use lexicographic comparison; otherwise we declare that the longer string is bigger.

The number of t substring is always odd because we went with an odd string length (n).

The comparison of adjacent substring form the recursive nature of the algorithm, we start with small substring of length 1 adjacent to each other and then we find compare higher length substring, whose markers have been found by the previous step. This will become clear as we look the hand demo.

http://ecx.images-amazon.com/images/I/41KZVIUGswL._SX332_BO1,204,203,200_.jpg

Basin and Ranges

It's convenient to describe the algorithm using the terminology based on the topograph of Nevada. Say that i is a basin if the substrings satisfy \(y_{i-1} \gt y_i \le y_{i+1}\). There must be at least one basin; otherwise all the \(y_i\) would be equal, and x would equal one of its cyclic shifts. We look at consecutive basins, i and j; this means that i < j and that i and j are basins, and that i+1 through j - 1 are not basins. If there's only one basin we have \(j = i + t\). The indices between consecutive basins are called ranges.

The basin and ranges is Knuth's terminology, taken from the book Basin and Ranges by John McPhee which describes the topology of Nevada. It is easier to imagine the construct we are looking for if we start to think in terms of basin and ranges.

Since t is odd, there is an odd number of consecutive basins for which \(j - i\) is odd. Each round of Eastman's algorithm retains exactly one boundary point in the range between such basins and deletes all the others. The retained point is the smallest \(k = i + 2l\) such that \(y_k \gt y_{k+1}\). At the end of a round, we reset t to the number of retained boundary points, and we begin another round if t > 1.

Word of length 19

Let's work through the algorithm by hand when n = 19 and x = 3141592653589793238

Phase 1

  • First markers differentiate each character.
  • We use . to denote the cyclic repetition of the 19 letter word.
3 | 1 | 4 |  1 | 5 | 9 | 2 | 6 | 5 | 3 | 5 | 8 | 9 | 7 | 9 | 3 | 2 | 3 | 8 . 3 | 1 | 4 | 1 | 5
  • Next we go about identifying basins. We identify the basins where for any 3 numbers (a, b, c), \(a \: \gt b \le c\) and put the markers below them
  • After the cyclic repetition we see the repetition of the basin. Like the last line below 1 is same as the first line. It is the basin that is repeated.
3  1  4  1  5  9  2  6  5  3  5  8  9  7  9  3  2  3  8  3  1  4  1 5

   |     |        |        |           |        |        .  |
  • We mark the ranges as odd length or even length ones.
3  1  4  1  5  9  2  6  5  3  5  8  9  7  9  3  2  3  8  3  1  4  1 5

---|--e--|---o----|---o----|-----e-----|---o----|-----e--.--|--------
  • Next, take all the odd length basin markers, go by steps of 2, 4, 6 so on and identify the first greater than number and place the new basin markers before them.

For e.g, in 1-5-9-2. The 2 length path is "1-5-9" and first higher will be 9 and we have to place the marker ahead of it. So, the phase 0 of eastman algorithm will output, 5, 8 and 15. denoting the indices where our basins are after the first phase.

If you are watching the video with Knuth giving a demo, there is a mistake in the video that second basin identifier is placed after 5, instead of before 5 (We should go by steps of 2 and place it before the first greater than number).

3  1  4  1  5  | 9  2  6  |  5  3  5  8  9  7  9  | 3  2  3  8  . 3  1  4 1  5

Phase 2

  • In the second phase, we use the basin markers of the previous phase and compare the sub strings denoted by the basin.
  • We take the substring of length 19, but now denoted by basins. The repetition of the string in the previous steps helped us here.
9  2  6  |  5  3  5  8  9  7  9  | 3  2  3  8  3  1  4 1  5
  • We apply the algorithm recursively on the strings 926, 5358979 and 323831415. We find that the string 323831415 is greater than the rest, so we can keep the basin marker ahead of it.
9  2  6  5  3  5  8  9  7  9  | 3  2  3  8  3  1  4 1  5

At the end of Phase 2, the algorithm outputs index 15, as the shift required to create the code word out of 19 word string. And thus our code word found by the eastman's algorithm is

3  2  3  8  3  1  4 1  5  9  2  6  5  3  5  8  9  7  9

Knuth's gave a demo with his implementation in CWEB. He shared a thought that even though algorithm is expressed recursively, the iterative implementation was straight forward. For the rest of the lecture he explores the algorithm on a binary string of PI of n = 19 and finds the shift required. Also, gives the probability of Eastman's algorithm finishing in one round, that is, just the phase 1.

All these are covered as exercises and answers in the pre-fascicle 5B of his volume 5 of The Art of Computer Programming, which can be explored in further depth.

Video

References

Tidbits